12. Data Partitioning (Sharding)
Data partitioning is the technique of splitting a large dataset across multiple databases or servers. Each partition (shard) holds a subset of the data. Partitioning enables horizontal scaling of databases beyond the capacity of a single machine.
Why Partition Data?
| Problem | How Partitioning Helps |
|---|---|
| Data too large for one machine | Split across multiple machines |
| Too many queries for one DB | Spread queries across shards |
| Geographic latency | Place data closer to users |
| High availability | Failure of one shard doesn't affect others |
| Independent scaling | Scale hot shards independently |
Partitioning Types
1. Horizontal Partitioning (Sharding)
Split rows across different databases. Each shard has the same schema but different data.
Original Table: users (10M rows)
Shard 1: users where id 1 - 3,333,333
Shard 2: users where id 3,333,334 - 6,666,666
Shard 3: users where id 6,666,667 - 10,000,000
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Shard 1 │ │ Shard 2 │ │ Shard 3 │
│ users: 1-3M │ │ users: 3-6M │ │ users: 6-10M│
│ id|name|... │ │ id|name|... │ │ id|name|... │
└─────────────┘ └─────────────┘ └─────────────┘
This is the most common form of partitioning in system design.
2. Vertical Partitioning
Split columns across different databases. Each partition has a subset of columns with all rows.
Original Table: users (id, name, email, bio, avatar_url, preferences_json)
Partition A: users_core (id, name, email) ← Frequently accessed
Partition B: users_profile (id, bio, avatar_url) ← Occasionally accessed
Partition C: users_settings (id, preferences_json) ← Rarely accessed
When to use: When certain columns are accessed much more frequently than others, or when some columns are very large (BLOBs, JSON).
3. Functional Partitioning
Split data by function or domain into separate databases.
User Service → Users Database
Order Service → Orders Database
Payment Service → Payments Database
Product Service → Products Database
This naturally emerges in microservice architectures where each service owns its data.
Sharding Strategies
1. Range-Based Sharding
Assign contiguous ranges of a key to each shard.
Shard 1: user_id 1 - 1,000,000
Shard 2: user_id 1,000,001 - 2,000,000
Shard 3: user_id 2,000,001 - 3,000,000
Pros:
- Simple to implement and understand.
- Range queries are efficient (data sorted within each shard).
- Easy to find which shard holds a key.
Cons:
- Hot spots: If new users all get sequential IDs, the last shard gets all writes.
- Uneven distribution if data isn't uniformly distributed.
- Rebalancing requires splitting and migrating ranges.
Best for: Time-series data (shard by time range), alphabetical data.
2. Hash-Based Sharding
Use a hash function to determine the shard:
hash("user:alice") % 3 = 0 → Shard 0
hash("user:bob") % 3 = 2 → Shard 2
hash("user:carol") % 3 = 1 → Shard 1
Pros:
- Uniform distribution across shards.
- No hot spots (if hash function is good).
- Works well for key-value lookups.
Cons:
- Range queries are expensive — must query all shards.
- Adding/removing shards remaps many keys (unless using consistent hashing).
- Losing the natural ordering of data.
Best for: Key-value lookups, user data partitioned by user_id.
3. Consistent Hashing
A variant of hash-based sharding that minimizes key redistribution. (See: 06-consistent-hashing.md)
Pros:
- Only
keys move when adding/removing a node. - Good for dynamic clusters.
4. Directory-Based Sharding
A lookup service maps each key to its shard.
┌────────────────────────────┐
│ Shard Directory │
│ user:alice → Shard 2 │
│ user:bob → Shard 1 │
│ user:carol → Shard 3 │
└────────────┬───────────────┘
│
┌────────┼────────┐
↓ ↓ ↓
Shard 1 Shard 2 Shard 3
Pros:
- Maximum flexibility — can move any key to any shard.
- Easy to handle hot spots by moving data.
Cons:
- Single point of failure: The directory must be highly available.
- Extra hop: Every query must first consult the directory.
- Directory becomes a bottleneck at scale.
5. Geo-Based Sharding
Partition data by geographic region.
Shard US: Users in North America → us-east-1 database
Shard EU: Users in Europe → eu-west-1 database
Shard APAC: Users in Asia-Pacific → ap-southeast-1 database
Pros:
- Low latency for users (data close to them).
- Regulatory compliance (data sovereignty).
Cons:
- Users who travel see different shards.
- Cross-region queries are expensive.
Choosing a Shard Key
The shard key determines how data is distributed. A good shard key is critical.
Properties of a Good Shard Key
| Property | Description |
|---|---|
| High cardinality | Many distinct values → even distribution |
| Uniform distribution | Values are evenly spread, not clustered |
| Relevant to queries | Most queries can target a single shard |
| Immutable | Key doesn't change (avoids resharding) |
Shard Key Examples
| Entity | Good Shard Key | Bad Shard Key |
|---|---|---|
| Users | user_id (uniform, immutable) |
country (skewed — most users in a few countries) |
| Orders | order_id or user_id |
status (only a few values) |
| Messages | conversation_id |
timestamp (hot spot on recent shard) |
| Logs | hash(request_id) |
log_level (most are "INFO") |
Compound Shard Key
Use multiple fields for better distribution:
Shard key: (tenant_id, user_id)
→ Data for a tenant is spread across shards
→ Queries for a specific tenant+user hit one shard
Challenges of Sharding
1. Cross-Shard Queries (Scatter-Gather)
When a query needs data from multiple shards:
Query: "Find all orders with amount > $100"
App → Query Shard 1 → Results
→ Query Shard 2 → Results → Merge all results
→ Query Shard 3 → Results
- Higher latency (must wait for slowest shard).
- More complex application logic.
- Pagination across shards is very difficult.
2. Cross-Shard Joins
Joins across shards are expensive or impossible:
-- This is efficient (same shard):
SELECT * FROM users u JOIN orders o ON u.id = o.user_id WHERE u.id = 123;
-- This is expensive (cross-shard):
SELECT * FROM users u JOIN orders o ON u.id = o.user_id WHERE o.amount > 100;
Solutions:
- Denormalize data to avoid joins.
- Use application-level joins.
- Co-locate related data on the same shard (e.g., user and their orders on the same shard).
3. Cross-Shard Transactions
ACID transactions across shards require distributed coordination:
- 2PC (Two-Phase Commit): Slow, coordinator is SPOF.
- Saga Pattern: Eventually consistent, compensating transactions.
- Avoid if possible: Design shard boundaries so transactions stay within a single shard.
4. Rebalancing / Resharding
When shards become uneven (data skew or hot spots), data must move:
- Add shards: Split overloaded shards.
- Consistent hashing: Minimizes data movement.
- Shadow reads: New shard reads from old shard during migration.
- Dual writes: Write to both old and new shard during migration.
Rebalancing steps:
1. Create new shard
2. Copy data from overloaded shard to new shard
3. Update routing rules
4. Switch reads to new shard
5. Switch writes to new shard
6. Remove data from old shard
5. Hot Spots
Celebrity problem: A very popular user's data receives disproportionate traffic.
Solutions:
- Split hot keys: Store hot key's data across multiple shards with a suffix.
celebrity_user:123:shard_a, celebrity_user:123:shard_b - Caching: Cache hot data aggressively.
- Read replicas: Add replicas for the hot shard.
6. ID Generation
Auto-increment IDs don't work across shards. Solutions:
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| UUID | 550e8400-e29b-41d4-a716-446655440000 |
Simple, globally unique | 128 bits, not sortable |
| Snowflake ID | Timestamp + machine ID + sequence | Sortable, 64-bit | Clock dependency |
| ULID | 01ARZ3NDEKTSV4RRFFQ69G5FAV |
Sortable, 128-bit | Slightly larger |
| Database sequences | Shard 1: 1,4,7; Shard 2: 2,5,8; Shard 3: 3,6,9 | Small IDs | Requires coordination |
| ID service | Central service assigns ranges | Predictable | Single point of failure |
Sharding in Practice
| System | Sharding Approach |
|---|---|
| DynamoDB | Hash-based partitioning on partition key; automatic splitting |
| Cassandra | Consistent hashing with virtual nodes |
| MongoDB | Range or hash-based sharding; config servers store metadata |
| MySQL/Vitess | Application-level sharding with Vitess as proxy |
| PostgreSQL/Citus | Extension that adds sharding to PostgreSQL |
| Redis Cluster | Hash slots (16384 slots distributed across nodes) |
Sharding Decision Framework
Can a single database handle the load?
├── Yes → Don't shard! Use read replicas, caching, query optimization first.
└── No → What's the bottleneck?
├── Reads → Read replicas first, then shard
├── Writes → Shard by write-heavy entity (e.g., user_id)
└── Storage → Shard by data volume
Choose sharding strategy:
├── Need range queries? → Range-based sharding
├── Need uniform distribution? → Hash-based sharding
├── Dynamic cluster? → Consistent hashing
└── Complex routing? → Directory-based sharding
Summary
| Concept | Key Point |
|---|---|
| Horizontal partitioning | Split rows across databases — the primary meaning of "sharding" |
| Vertical partitioning | Split columns across databases |
| Range-based | Good for range queries; risk of hot spots |
| Hash-based | Uniform distribution; bad for range queries |
| Consistent hashing | Minimal redistribution on node changes |
| Shard key | Must be high-cardinality, uniformly distributed, query-aligned |
| Challenges | Cross-shard queries, joins, transactions, rebalancing |
Rule of thumb: Avoid sharding as long as possible. Use read replicas, caching, and vertical scaling first. When you must shard, choose the shard key very carefully — it's extremely hard to change later.