Sharding & Consistent Hashing

When a single database node cannot hold your data or sustain your write throughput, sharding splits the dataset across multiple nodes. It is the primary horizontal scaling strategy for databases, but it introduces significant complexity that should not be undertaken lightly — managed databases and read replicas should be exhausted first.

Sharding

Sharding (horizontal partitioning) splits a large dataset across multiple database instances — called shards — each holding a subset of the data. A single database that cannot hold 10 TB or sustain 100,000 writes/second can be replaced by ten shards, each holding 1 TB and handling 10,000 writes/second.

Sharding strategies

  1. Range-based — shard by a contiguous range of the key (e.g. user IDs 1–1M on shard A, 1M–2M on shard B). Simple to implement and supports range queries efficiently. Risk: hotspots if one range is much more active than others — newly registered users all land on the last shard, for example.
  2. Hash-based — apply a hash function to the shard key and take the modulus (hash(userId) % numShards). Distributes load evenly and eliminates hotspots. Downside: adding or removing shards requires rehashing and migrating most of the data — solved by consistent hashing (see below).
  3. Directory-based — a lookup table maps each key to its shard. Maximum flexibility; any key can be reassigned to any shard without rehashing. Downside: the directory itself is a bottleneck and single point of failure.

Choosing a shard key

The shard key is the most important design decision in a sharded system. A bad key is impossible to fix without a full migration.

  1. High cardinality — the key must have enough distinct values to distribute data evenly. A boolean flag is a catastrophic shard key.
  2. Write distribution — a monotonically increasing key (auto-increment ID, timestamp) with range sharding funnels all writes to the last shard. Use hash-based sharding or a random prefix to spread writes.
  3. Query locality — ideally, most queries touch only one shard. Shard on the dimension you query by most often (e.g. user_id if most queries filter by user).

Challenges

  1. Cross-shard queries — a query that spans multiple shards must be executed on each shard and the results merged in the application layer. Joins across shards are especially expensive and often infeasible at scale.
  2. Cross-shard transactions — ACID transactions across shards require a distributed transaction protocol (two-phase commit), which is complex, slow, and a potential single point of failure.
  3. Hotspots — a celebrity user, viral content, or monotonically increasing key can funnel disproportionate traffic to one shard. Mitigated by compound shard keys or splitting hot shards.
  4. Resharding — adding new shards after launch requires redistributing data, which is operationally complex and risky. Consistent hashing minimises the data movement required.
  5. Operational overhead — more database nodes means more backups, monitoring, and maintenance. Managed databases (Amazon Aurora, Google Spanner, PlanetScale) abstract much of this away.

Consistent Hashing

Standard hash-based sharding has a fatal flaw: if you change the number of shards, hash(key) % N produces different results for almost every key, requiring a massive data migration. Consistent hashing solves this by mapping both keys and shards onto a circular ring, so adding or removing a shard only relocates the keys that were closest to it — typically 1/N of the total keys.

How it works

  1. The hash space is imagined as a circle from 0 to 2³².
  2. Each server node is hashed to a position on the ring.
  3. Each key is hashed to a position on the ring and assigned to the first node encountered clockwise from that position.
  4. When a node is added, it takes over only the keys between itself and its predecessor — all other assignments are unchanged.
  5. When a node is removed, its keys move to the next node clockwise.

Virtual nodes

With few physical nodes, consistent hashing can produce uneven load distribution — some nodes end up responsible for large arcs of the ring. Virtual nodes (vnodes) solve this: each physical node is hashed to multiple positions on the ring, making the distribution statistically uniform even with a small cluster. Cassandra and DynamoDB both use this technique.

Consistent hashing is also used in load balancers that need session affinity — the same client always reaches the same backend — and in distributed caches to minimise cache misses when the server pool changes.