Database Sharding — Learn System Design

Overview

Sharding partitions a single logical dataset across multiple independent databases (shards), each holding a subset of the rows. It scales writes and storage beyond what one machine can handle, unlike read replicas which only scale reads. The cost is significant complexity: cross-shard queries, transactions, and rebalancing all become harder, so sharding should be a last resort after vertical scaling, caching, and replication.

Syntax / Usage

A shard key determines which shard a row lives on. A routing layer maps the key to a shard, usually via hashing or ranges. Consistent hashing minimizes how much data must move when shards are added or removed.

shard_for(key, num_shards):
    return hash(key) % num_shards        # hash-based: even spread, no range scans

Ranges:   users A-H -> shard1,  I-P -> shard2,  Q-Z -> shard3
Directory: lookup table maps key -> shard  (flexible, extra hop)

[ App ] --> [ Router ] --> shard1 | shard2 | shard3

Choose a shard key with high cardinality and even access so no single shard becomes a hotspot. Modulo hashing forces a full reshuffle when num_shards changes, which is why consistent hashing or virtual buckets are preferred.

Examples

A social app shards the posts table by user_id, so all of a user's posts live together and a profile page reads from one shard.

A multi-tenant SaaS uses a directory shard map keyed by tenant_id, letting a huge tenant be moved to a dedicated shard without changing the hashing scheme.

A time-series system range-shards by month, keeping recent hot data on fast storage and archiving old shards cheaply—accepting that "last 90 days" queries may fan out across a few shards.

Common Mistakes

Picking a low-cardinality or skewed shard key, creating hotspots (e.g. sharding by country)
Using hash % N, forcing a massive data migration whenever shard count changes
Assuming cross-shard joins and transactions work like single-node ones (they need scatter-gather or sagas)
Sharding prematurely instead of first trying replicas, caching, and bigger hardware
Forgetting that unique constraints and auto-increment IDs no longer hold across shards

Overview

Syntax / Usage

Examples

Common Mistakes

See Also