LRU Cache Design

Design –

An LRU cache should support the operations: lookup, insert and delete. Apparently, in order to achieve fast lookup, we need to use hash. By the same token, if we want to make insert/delete fast, something like linked list should come to your mind. Since we need to locate the least recently used item efficiently, we need something in order like queue, stack or sorted array.

To combine all these analyses, we can use queue implemented by a doubly linked list to store all the resources. Also, a hash table with resource identifier as key and address of the corresponding queue node as value is needed.

Here’s how it works. when resource A is requested, we check the hash table to see if A exists in the cache. If exists, we can immediately locate the corresponding queue node and return the resource. If not, we’ll add A into the cache. If there are enough space, we just add a to the end of the queue and update the hash table. Otherwise, we need to delete the least recently used entry. To do that, we can easily remove the head of the queue and the corresponding entry from the hash table.

We maintain a hash table(each node can itself be a linked list if we want to handle collisions). Assuming we are not handling collision for now, each node in hash table will have two value – 1. data 2. node address of DDL. Each node in DDL will have data, previous and next pointer.

Eviction policy – 

When the cache is full, we need to remove existing items for new resources. In fact, deleting the least recently used item is just one of the most common approaches. So are there other ways to do that?

As mentioned above, The strategy is to maximum the chance that the requesting resource exists in the cache. I’ll briefly mention several approaches here:

Random Replacement (RR) – As the term suggests, we can just randomly delete an entry.
Least frequently used (LFU) – We keep the count of how frequent each item is requested and delete the one least frequently used.
W-TinyLFU – I’d also like to talk about this modern eviction policy. In a nutshell, the problem of LFU is that sometimes an item is only used frequently in the past, but LFU will still keep this item for a long while. W-TinyLFU solves this problem by calculating frequency within a time window. It also has various optimizations of storage.

We are using a doubly linked list, for implementing our eviction policy. Suppose we use the case LFU. We maintain the head and tail node address for this DDL. Each node in DDL will have data, previous and next pointer.
If a new element is added to hash table, we create a new node and add it to DDL as tail node. Then we save this data and the DDL node address in the hash table.
If an element is to be read and it exists. We read the data from hash table. We move the node corresponding to this data to the end of DDL. The latest data used will be at the end of the DDL.
If we need to delete a data, we get that node address in DDL and remove it from DDL and update the DDL accordingly(next and previous pointer).
Suppose we have hit the limit and we want to delete LFU keys from our cache, we fetch the head of the DDL and delete it from the DDL and change the head of DDL. We delete the nodes value from hash table.

Concurrency – 

To discuss concurrency, I’d like to talk about why there is concurrency issue with cache and how can we address it.
It falls into the classic reader-writer problem. When multiple clients are trying to update the cache at the same time, there can be conflicts. For instance, two clients may compete for the same cache slot and the one who updates the cache last wins.

1. Lock -The common solution of course is using a lock. The downside is obvious – it affects the performance a lot.
2. Compare and Set –  We will have to expand our design to implement Compare and Set. Each key in hash table will have a time stamp as to when it was updated last. Read the above link to know how memcache handles it internally.

The gets() operation internally receives two values from the memcache service: the value stored for the key (in our example the counter value), and a timestamp (also known as the cas_id). The timestamp is an opaque number; only the memcache service knows what it means. The important thing is that each time the value associated with a memcache key is updated, the associated timestamp is changed. The gets() operation stores this timestamp in a Python dict on the Client object, using the key passed to gets() as the dict key.

The cas() operation internally adds the timestamp to the request it sends to the memcache service. The service then compares the timestamp received with a cas() operation to the timestamp currently associated with the key. If they match, it updates the value and the timestamp, and returns success. If they don’t match, it leaves the value and timestamp alone, and returns failure. (By the way, it does not send the new timestamp back with a successful response. The only way to retrieve the timestamp is to call gets().)

Distributed cache – 

When the system gets to certain scale, we need to distribute the cache to multiple machines.

The general strategy is to keep a hash table that maps each resource to the corresponding machine. Therefore, when requesting resource A, from this hash table we know that machine M is responsible for cache A and direct the request to M. At machine M, it works similar to local cache discussed above. Machine M may need to fetch and update the cache for A if it doesn’t exist in memory. After that, it returns the cache back to the original server.

We use commit log which is used across all distributed system design. Check the below image for the two basic modelJADM_Volume 2_Issue 1_Pages 25-31    active_and_passive_arch

There’s a primary-backup model. The master node(concerned with write as per our design), updates the data and sends out final result to the slave node.

State-Machine replication model – Here the master node(who handles writes) sends out commit log to each of the slave node so that all the nodes can executed the same command and be in the same state as the master node.

We can talk about CAP theorem here. It’s not possible to attain consistency and availability in distributed system at the same time. So suppose we read from slave and write to master, the master makes a commit log which is sent out to all slaves. If the data is read before the update was sent from master to all slave then the data will be stale( Consistency: every read would get you the most recent write. This is not guaranteed in this case but availability is a guaranteed, as slave node will give stale data). If we try to update the slave immediately then we might have availability issues(till all the updates are complete don’t show stale data, so will have to send out an error – so 100% availability not achieved).


Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s