Consistent Hashing

Let's first understand what is hashing in the system design.

In a traditional hashing in the system design, we take the hash of the request or IP, then take the modulo against the total number of nodes (server or databases). This will give us the node where we want to route our request.

Hash(key 1)-> H1 mod N = Node(0)
Hash(key 2)-> H2 mod N = Node(1)
..
Hash(key n)-> Hn mod N = Node(n-1)

Where, 

key: Request ID or IP.

H: Hash function result.

N: Total number of nodes.

Node: The node where the request will be routed.

However, there is a problem with this approach:

If we add or remove a node, the total number of nodes changes, which disrupts our current mapping strategy because the same request might then map to a different server. This would force the majority of requests to be redistributed—a highly inefficient process.

Our goal is to distribute requests uniformly across all nodes, enabling us to add or remove nodes with minimal disruption. To achieve this, we need a distribution scheme that doesn't depend directly on the number of nodes, ensuring that only a small portion of keys need to be relocated when changes occur. This is where Consistent Hashing comes into the play.

Consistent Hashing