12. Data Partitioning (Sharding)

Data partitioning is the technique of splitting a large dataset across multiple databases or servers. Each partition (shard) holds a subset of the data. Partitioning enables horizontal scaling of databases beyond the capacity of a single machine.


Why Partition Data?

Problem How Partitioning Helps
Data too large for one machine Split across multiple machines
Too many queries for one DB Spread queries across shards
Geographic latency Place data closer to users
High availability Failure of one shard doesn't affect others
Independent scaling Scale hot shards independently

Partitioning Types

1. Horizontal Partitioning (Sharding)

Split rows across different databases. Each shard has the same schema but different data.

Original Table: users (10M rows)

Shard 1: users where id 1 - 3,333,333
Shard 2: users where id 3,333,334 - 6,666,666
Shard 3: users where id 6,666,667 - 10,000,000
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Shard 1   │  │   Shard 2   │  │   Shard 3   │
│ users: 1-3M │  │ users: 3-6M │  │ users: 6-10M│
│ id|name|... │  │ id|name|... │  │ id|name|... │
└─────────────┘  └─────────────┘  └─────────────┘

This is the most common form of partitioning in system design.

2. Vertical Partitioning

Split columns across different databases. Each partition has a subset of columns with all rows.

Original Table: users (id, name, email, bio, avatar_url, preferences_json)

Partition A: users_core (id, name, email)       ← Frequently accessed
Partition B: users_profile (id, bio, avatar_url) ← Occasionally accessed
Partition C: users_settings (id, preferences_json) ← Rarely accessed

When to use: When certain columns are accessed much more frequently than others, or when some columns are very large (BLOBs, JSON).

3. Functional Partitioning

Split data by function or domain into separate databases.

User Service     → Users Database
Order Service    → Orders Database
Payment Service  → Payments Database
Product Service  → Products Database

This naturally emerges in microservice architectures where each service owns its data.


Sharding Strategies

1. Range-Based Sharding

Assign contiguous ranges of a key to each shard.

Shard 1: user_id 1 - 1,000,000
Shard 2: user_id 1,000,001 - 2,000,000
Shard 3: user_id 2,000,001 - 3,000,000

Pros:

  • Simple to implement and understand.
  • Range queries are efficient (data sorted within each shard).
  • Easy to find which shard holds a key.

Cons:

  • Hot spots: If new users all get sequential IDs, the last shard gets all writes.
  • Uneven distribution if data isn't uniformly distributed.
  • Rebalancing requires splitting and migrating ranges.

Best for: Time-series data (shard by time range), alphabetical data.

2. Hash-Based Sharding

Use a hash function to determine the shard:

hash("user:alice") % 3 = 0 → Shard 0
hash("user:bob")   % 3 = 2 → Shard 2
hash("user:carol") % 3 = 1 → Shard 1

Pros:

  • Uniform distribution across shards.
  • No hot spots (if hash function is good).
  • Works well for key-value lookups.

Cons:

  • Range queries are expensive — must query all shards.
  • Adding/removing shards remaps many keys (unless using consistent hashing).
  • Losing the natural ordering of data.

Best for: Key-value lookups, user data partitioned by user_id.

3. Consistent Hashing

A variant of hash-based sharding that minimizes key redistribution. (See: 06-consistent-hashing.md)

Pros:

  • Only keys move when adding/removing a node.
  • Good for dynamic clusters.

4. Directory-Based Sharding

A lookup service maps each key to its shard.

┌────────────────────────────┐
│      Shard Directory       │
│  user:alice → Shard 2      │
│  user:bob   → Shard 1      │
│  user:carol → Shard 3      │
└────────────┬───────────────┘
             │
    ┌────────┼────────┐
    ↓        ↓        ↓
 Shard 1  Shard 2  Shard 3

Pros:

  • Maximum flexibility — can move any key to any shard.
  • Easy to handle hot spots by moving data.

Cons:

  • Single point of failure: The directory must be highly available.
  • Extra hop: Every query must first consult the directory.
  • Directory becomes a bottleneck at scale.

5. Geo-Based Sharding

Partition data by geographic region.

Shard US:     Users in North America → us-east-1 database
Shard EU:     Users in Europe → eu-west-1 database
Shard APAC:   Users in Asia-Pacific → ap-southeast-1 database

Pros:

  • Low latency for users (data close to them).
  • Regulatory compliance (data sovereignty).

Cons:

  • Users who travel see different shards.
  • Cross-region queries are expensive.

Choosing a Shard Key

The shard key determines how data is distributed. A good shard key is critical.

Properties of a Good Shard Key

Property Description
High cardinality Many distinct values → even distribution
Uniform distribution Values are evenly spread, not clustered
Relevant to queries Most queries can target a single shard
Immutable Key doesn't change (avoids resharding)

Shard Key Examples

Entity Good Shard Key Bad Shard Key
Users user_id (uniform, immutable) country (skewed — most users in a few countries)
Orders order_id or user_id status (only a few values)
Messages conversation_id timestamp (hot spot on recent shard)
Logs hash(request_id) log_level (most are "INFO")

Compound Shard Key

Use multiple fields for better distribution:

Shard key: (tenant_id, user_id)
→ Data for a tenant is spread across shards
→ Queries for a specific tenant+user hit one shard

Challenges of Sharding

1. Cross-Shard Queries (Scatter-Gather)

When a query needs data from multiple shards:

Query: "Find all orders with amount > $100"

App → Query Shard 1 → Results
   → Query Shard 2 → Results    → Merge all results
   → Query Shard 3 → Results
  • Higher latency (must wait for slowest shard).
  • More complex application logic.
  • Pagination across shards is very difficult.

2. Cross-Shard Joins

Joins across shards are expensive or impossible:

-- This is efficient (same shard):
SELECT * FROM users u JOIN orders o ON u.id = o.user_id WHERE u.id = 123;

-- This is expensive (cross-shard):
SELECT * FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100;

Solutions:

  • Denormalize data to avoid joins.
  • Use application-level joins.
  • Co-locate related data on the same shard (e.g., user and their orders on the same shard).

3. Cross-Shard Transactions

ACID transactions across shards require distributed coordination:

  • 2PC (Two-Phase Commit): Slow, coordinator is SPOF.
  • Saga Pattern: Eventually consistent, compensating transactions.
  • Avoid if possible: Design shard boundaries so transactions stay within a single shard.

4. Rebalancing / Resharding

When shards become uneven (data skew or hot spots), data must move:

  • Add shards: Split overloaded shards.
  • Consistent hashing: Minimizes data movement.
  • Shadow reads: New shard reads from old shard during migration.
  • Dual writes: Write to both old and new shard during migration.
Rebalancing steps:
1. Create new shard
2. Copy data from overloaded shard to new shard
3. Update routing rules
4. Switch reads to new shard
5. Switch writes to new shard
6. Remove data from old shard

5. Hot Spots

Celebrity problem: A very popular user's data receives disproportionate traffic.

Solutions:

  • Split hot keys: Store hot key's data across multiple shards with a suffix.
    celebrity_user:123:shard_a, celebrity_user:123:shard_b
    
  • Caching: Cache hot data aggressively.
  • Read replicas: Add replicas for the hot shard.

6. ID Generation

Auto-increment IDs don't work across shards. Solutions:

Strategy Example Pros Cons
UUID 550e8400-e29b-41d4-a716-446655440000 Simple, globally unique 128 bits, not sortable
Snowflake ID Timestamp + machine ID + sequence Sortable, 64-bit Clock dependency
ULID 01ARZ3NDEKTSV4RRFFQ69G5FAV Sortable, 128-bit Slightly larger
Database sequences Shard 1: 1,4,7; Shard 2: 2,5,8; Shard 3: 3,6,9 Small IDs Requires coordination
ID service Central service assigns ranges Predictable Single point of failure

Sharding in Practice

System Sharding Approach
DynamoDB Hash-based partitioning on partition key; automatic splitting
Cassandra Consistent hashing with virtual nodes
MongoDB Range or hash-based sharding; config servers store metadata
MySQL/Vitess Application-level sharding with Vitess as proxy
PostgreSQL/Citus Extension that adds sharding to PostgreSQL
Redis Cluster Hash slots (16384 slots distributed across nodes)

Sharding Decision Framework

Can a single database handle the load?
├── Yes → Don't shard! Use read replicas, caching, query optimization first.
└── No → What's the bottleneck?
          ├── Reads → Read replicas first, then shard
          ├── Writes → Shard by write-heavy entity (e.g., user_id)
          └── Storage → Shard by data volume

Choose sharding strategy:
├── Need range queries? → Range-based sharding
├── Need uniform distribution? → Hash-based sharding
├── Dynamic cluster? → Consistent hashing
└── Complex routing? → Directory-based sharding

Summary

Concept Key Point
Horizontal partitioning Split rows across databases — the primary meaning of "sharding"
Vertical partitioning Split columns across databases
Range-based Good for range queries; risk of hot spots
Hash-based Uniform distribution; bad for range queries
Consistent hashing Minimal redistribution on node changes
Shard key Must be high-cardinality, uniformly distributed, query-aligned
Challenges Cross-shard queries, joins, transactions, rebalancing

Rule of thumb: Avoid sharding as long as possible. Use read replicas, caching, and vertical scaling first. When you must shard, choose the shard key very carefully — it's extremely hard to change later.