Consistent hashing
Learn how consistent hashing distributes data across nodes so that adding or removing a node rebalances only 1/N of the keyspace, and why virtual nodes fix the uneven-load problem.
The problem with modulo hashing
The obvious way to distribute keys across N nodes is `node = hash(key) % N`. Simple. Works fine when your fleet is static.
Add one node. N becomes N+1. Now `hash(key) % (N+1)` produces a different result for almost every key, so nearly every key maps to a different node than before. In a 10-node cache cluster, adding an 11th node invalidates ~91% of cached entries at once. Every cache miss goes to your database: a routine 3 a.m. scale-out has just triggered a thundering herd.
Remove one node. Same problem in reverse. The redistribution is total and immediate.
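The ~91% figure is easy to verify empirically. A quick sketch (the key format and the use of md5 are arbitrary illustrative choices):

```python
# Count keys whose modulo bucket changes when a 10-node fleet grows to 11.
import hashlib

def bucket(key: str, n: int) -> int:
    # md5 gives a stable hash; Python's built-in hash() is salted per process
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(1 for k in keys if bucket(k, 10) != bucket(k, 11))
print(f"{moved / len(keys):.0%} of keys changed buckets")  # ~91%
```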
This is why you need consistent hashing.
How consistent hashing works
Consistent hashing maps both keys and nodes onto the same abstract ring. The ring runs from 0 to 2^32 - 1 (or some large number). Every node is assigned a position on the ring by hashing its identifier (hostname, IP, node ID). Every key is also hashed to a position on the ring.
Lookup rule: For a given key, walk clockwise around the ring until you hit the first node. That node owns the key.
Adding a node: Place the new node on the ring. Only keys between the new node and its predecessor on the ring need to move to the new node. All other keys stay put.
Removing a node: Keys on the removed node move to its successor. All other keys are unaffected.
With N nodes, adding or removing one node moves only about 1/N of the keyspace on average. That property is what makes consistent hashing valuable.
```python
# Consistent hashing lookup. A sorted list of (position, node_id)
# pairs stands in for the tree map a real implementation might use.
import bisect
import hashlib

RING = 2**32
ring = []  # sorted (position, node_id) pairs

def h32(s: str) -> int:
    # Stable uniform hash; md5 is an arbitrary choice here
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING

def add_node(node_id: str) -> None:
    bisect.insort(ring, (h32(node_id), node_id))

def get_node(key: str) -> str:
    pos = h32(key)
    # Find the first node at or after pos (clockwise)
    i = bisect.bisect_left(ring, (pos, ""))
    if i == len(ring):
        i = 0  # wrap around to the first position
    return ring[i][1]
```
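A self-contained check of the redistribution property: build a ring of 10 nodes, add an 11th, and count how many of 50,000 keys change owners. Node and key names are made up, and with one position per node the exact fraction depends on where the new node's hash lands, but it stays far below the ~91% that modulo rehashing causes.

```python
import bisect
import hashlib

RING = 2**32

def h32(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING

def build_ring(node_ids):
    return sorted((h32(n), n) for n in node_ids)

def owner(ring, key: str) -> str:
    # First node clockwise from the key's position
    i = bisect.bisect_left(ring, (h32(key), ""))
    return ring[i % len(ring)][1]  # wrap past the last position

nodes = [f"node-{i}" for i in range(10)]
before = build_ring(nodes)
after = build_ring(nodes + ["node-10"])

keys = [f"key:{i}" for i in range(50_000)]
moved = sum(1 for k in keys if owner(before, k) != owner(after, k))
print(f"{moved / len(keys):.1%} of keys moved")
```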
The problem with naive consistent hashing: hot spots
With 4 nodes placed by hashing their IDs, the ring is divided into 4 arc segments, each covering a different fraction of the keyspace. Statistically, those arcs are very unlikely to be equal: one node might own 40% of the ring, another only 10%.
Unequal arc ownership means unequal load. The node owning 40% of keys handles 4x the traffic of the one owning 10%. This is a real problem in production: a small number of nodes become hot while others are underutilized.
A second problem: when a node fails, all its load shifts to its single clockwise neighbor. That neighbor, which was already handling its share, now handles double. It may become overwhelmed and fail too, cascading the failure.
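The unevenness is easy to see by computing each node's arc directly. A sketch (node names are arbitrary; the exact percentages depend on the hash values, but the shares are reliably unequal):

```python
# Each node's share of a 2^32 ring when every node gets a single position.
import hashlib

RING = 2**32

def h32(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING

nodes = ["node-a", "node-b", "node-c", "node-d"]
positions = sorted((h32(n), n) for n in nodes)
for i, (pos, name) in enumerate(positions):
    # Keys between the predecessor and this node belong to this node
    arc = (pos - positions[i - 1][0]) % RING
    print(f"{name}: {arc / RING:.1%} of the ring")
```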
Virtual nodes solve uneven distribution
Instead of placing each physical node once on the ring, place it many times at many positions. Each placement is a "virtual node" (vnode). A typical configuration uses 100-300 vnodes per physical node.
```python
# Virtual nodes: place each physical node at many ring positions
import bisect
import hashlib

VNODES_PER_NODE = 150
ring = []  # sorted (position, node_id) pairs, as in the lookup code

def add_node(node_id: str) -> None:
    for i in range(VNODES_PER_NODE):
        vnode_key = f"{node_id}:vnode:{i}"
        pos = int(hashlib.md5(vnode_key.encode()).hexdigest(), 16) % (2**32)
        bisect.insort(ring, (pos, node_id))  # the physical node owns this position
```
The ring with virtual nodes
With 150 vnodes per node and 4 nodes, the ring has 600 entries. Each physical node owns ~150 ring positions spread around the full circumference. The law of large numbers ensures each node covers roughly 25% of the keyspace regardless of the hash values of node IDs.
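A sketch that demonstrates this: the same four nodes, now placed 150 times each. Summing the arcs owned by each physical node shows every share landing close to 25%. The vnode key format and node names are illustrative choices.

```python
import hashlib
from collections import defaultdict

RING = 2**32
VNODES = 150

def h32(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING

nodes = ["node-a", "node-b", "node-c", "node-d"]
positions = sorted(
    (h32(f"{n}:vnode:{i}"), n) for n in nodes for i in range(VNODES)
)
share = defaultdict(int)
for i, (pos, name) in enumerate(positions):
    share[name] += (pos - positions[i - 1][0]) % RING  # arc owned by this vnode
for name in nodes:
    print(f"{name}: {share[name] / RING:.1%} of the keyspace")
```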
Benefits of vnodes:
- Balanced load. With enough vnodes, each node handles nearly equal key share.
- Graceful failure. When a node is removed, its 150 vnodes are distributed across all other working nodes. Load spreads evenly instead of doubling on one neighbor.
- Heterogeneous hardware. Give a more powerful node more vnodes. It proportionally handles more keys.
Without vnodes, one node may own 40% of the ring
With 4 nodes placed by a single hash of their IDs, the ring arcs are statistically unequal. In one production case I saw, a node owned 38% of the keyspace and handled roughly 4x the load of another node owning 9%. Using 150 vnodes per node brings each node's share to within a few percent of 25%. This is not optional at production scale.
How Cassandra uses consistent hashing
Cassandra uses a consistent hash ring called the "token ring." Each node owns a set of token ranges. Data is written to the node whose token range includes the partition key's hash.
Replication factor (RF) determines how many nodes store each key. With RF=3, a write goes to the first 3 clockwise nodes from the key's hash position. Reads can be satisfied by any replica, with the consistency level controlling how many must agree.
```
Key:           user_id=42
Hash:          1,847,234,902
Ring position: falls in Node B's token range
RF=3:          write to Node B, Node C, Node D (the next two clockwise)
```
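The replica-selection rule can be sketched as follows: walk clockwise from the key's hash and collect the first RF distinct physical nodes, skipping further vnodes of a node already chosen. This illustrates the idea, not Cassandra's actual implementation; node names and vnode counts are assumptions.

```python
import bisect
import hashlib

RING = 2**32

def h32(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING

nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
ring = sorted((h32(f"{n}:vnode:{i}"), n) for n in nodes for i in range(8))

def replicas(key: str, rf: int = 3):
    start = bisect.bisect_left(ring, (h32(key), ""))
    owners = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in owners:  # at most one copy per physical node
            owners.append(node)
        if len(owners) == rf:
            break
    return owners

print(replicas("user_id=42"))  # three distinct physical nodes
```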
When a node is added, Cassandra needs to "stream" (transfer) the keys that now hash to the new node. With vnodes, this streaming is parallel: the new node receives slices from many different existing nodes simultaneously instead of a single large transfer from one.
Where consistent hashing is used
| System | Use |
|---|---|
| Cassandra | Partition key → token ring → replica set |
| Amazon DynamoDB | Internal partition routing |
| Memcached (with client-side library) | Client-side key-to-server routing |
| Nginx (consistent hashing upstream) | `hash $request_uri consistent;` for sticky routing |
| Riak | Dynamo-style ring with virtual nodes for partition ownership |
| CDN edge node selection | Some CDN implementations route cache miss origin pulls via consistent hash to reduce origin fan-out |
When consistent hashing is the wrong tool
Consistent hashing does not solve:
- Skewed access patterns (hot keys). If 90% of requests are for the same 1% of keys, consistent hashing does nothing. Those keys still land on the same node. Use read replicas, local caching, or key spreading (appending a random suffix and scatter-gathering).
- Range queries. Consistent hashing distributes keys by hash, destroying natural key ordering. Range scans require querying all nodes and merging results.
- Small key spaces. With fewer than a few thousand distinct keys, consistent hashing has no advantage over a static partition table.
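The key-spreading mitigation mentioned above can be sketched like this: writers populate several suffixed copies of the hot key so they hash to different nodes, and each reader picks one copy at random. The fan-out count and the `#i` suffix format are illustrative choices; a plain dict stands in for the distributed cache.

```python
import random

FANOUT = 8

def spread_write(cache: dict, hot_key: str, value: str) -> None:
    # Write every copy so any suffix a reader picks is populated
    for i in range(FANOUT):
        cache[f"{hot_key}#{i}"] = value  # each suffix hashes to its own ring position

def spread_read(cache: dict, hot_key: str) -> str:
    # Each reader picks one random copy, spreading load across nodes
    return cache[f"{hot_key}#{random.randrange(FANOUT)}"]

cache: dict = {}
spread_write(cache, "popular:item", "payload")
print(spread_read(cache, "popular:item"))  # prints "payload"
```

The cost is FANOUT times the write traffic and storage for that key, which is usually acceptable precisely because the key is hot.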
Consistent hashing vs range sharding: choose the right strategy
Consistent hashing does not fix hot keys
A hot key that gets 50,000 requests/second will always land on the same node regardless of how many nodes you add or how many vnodes you configure. Adding nodes to a cluster with a hot key makes every other node less loaded but leaves the hot node exactly as loaded. Solve hot keys at the application tier, not the sharding layer.
Interview cheat sheet
- Consistent hashing moves only `1/N` of keys when a node is added or removed. Modulo hashing moves nearly all keys.
- Virtual nodes give each physical node multiple ring positions, ensuring near-equal load distribution and graceful failure handling.
- Cassandra and DynamoDB are the canonical production examples.
- Mention virtual nodes immediately after consistent hashing. Interviewers who ask about consistent hashing expect to hear about the hot-spot problem and the vnode solution.
- Hot keys cannot be solved by consistent hashing alone. Name the right solution (request coalescing, key spreading, local caching) when asked.
- When describing a scale-out event, state the fraction that moves: adding one node to a 10-node cluster moves only about 1/11 (~9%) of keys, not all keys. Explicitly contrast with modulo hashing, where nearly all keys relocate.
- Range-based sharding preserves sort order; consistent hashing destroys it. Hash-sharded clusters cannot serve range queries without scatter-gather across all nodes. Know which your system needs before choosing.
- In Cassandra, replication factor determines how many clockwise successors receive each write. RF=3 means data loss requires all three replicas to fail simultaneously, which is why RF=3 with QUORUM is the standard production setting.
Quick recap
- Modulo hashing (`key % N`) invalidates nearly all key assignments when the cluster size changes. Consistent hashing invalidates only `1/N`.
- Keys and nodes both map onto the same hash ring. A key belongs to the first node clockwise from its ring position.
- Without virtual nodes, ring positions are uneven and one node may own much more key space than others.
- Virtual nodes (typically 100-300 per physical node) ensure near-equal load distribution and spread failure load across many nodes.
- Consistent hashing solves redistribution overhead. It does not solve hot key concentration. Hot keys require application-level solutions.
- Cassandra and DynamoDB use token rings with virtual nodes as their core partitioning mechanism.
Related concepts
- Sharding — Consistent hashing is one of the two strategies for sharding (the other being range-based). Understanding the tradeoff is essential for data modeling decisions.
- Load Balancing — Some load balancers use consistent hashing to provide sticky routing (always send the same client to the same server), useful for session affinity without shared session state.
- Caching — Memcached client libraries commonly implement consistent hashing for cache key routing, which is why adding a cache node does not cause a full cache invalidation storm.