Press "Enter" to skip to content

Caching 101

Some time ago I wrote a post on scalability. In it, I mentioned caching as one of the most important techniques to help our system scale. Today, I want to take advantage of that fact and dive deeper into caching itself.

Today, I will walk you through basic definitions, the most common problems with caching, different caching eviction algorithms (or cache algorithms), mention the most commonly used tools, and share some best practices when working with cache.

What Is Caching?

It is a technique of storing frequently accessed data or computation results in fast, temporary storage. After doing so, we can easily retrieve the data needed by our clients without spending additional CPU cycles, reducing the work that has to be done on the application side, whether in the form of calculations or external queries.

Usually, the cache takes the form of an external service put in front of our application, effectively working as a middleware between us and the client. However, we can also have an application-level cache that stores results in application memory. The trade-off in the latter case is that requests still reach our service and need to be processed at least to a certain point. In the former case, the cache service is responsible for processing the request and only calls our service when the data is missing.

Missing data, and what happens next, is just one of the possible problems in the caching space. Besides that, we can have problems with consistency between our service state and the cache state, or with picking when and how we evict entries from our cache. Before we go deeper into these problems, let’s spend a moment discussing when and why adding a cache makes sense.

What Can Be Cached?

Let’s start with a key consideration for using caching: the data we want to store in the cache must not have real-time guarantees. If it does, and the cached copy may become inconsistent within a couple of seconds, then caching may not be the best idea. We would end up with two sources of truth (the main data store and the cache), inconsistent data stores, and a caching service that is essentially an unnecessary intermediary with no added value.

In my opinion, there are two key characteristics that may indicate data is a good fit for caching:

  • Frequently Accessed Data – simply put, data that is used often, whether by internal services or by end users.
  • Expensive Computations – full or partial results of heavy mathematical calculations, DB queries, third-party service calls or other outgoing requests, but also operations on big datasets. Caching is especially beneficial if multiple processes reuse the computed data.

Side note: DB queries and outgoing requests are among the most expensive operations we perform. They are usually a good place to start looking for “low-hanging fruit” in performance optimization.
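As a quick illustration, here is a minimal sketch of application-level caching in Python, using functools.lru_cache to memoize an expensive computation. The monthly_report function and its two-second delay are made-up stand-ins for a heavy query or calculation, not anything from a specific system.

```python
from functools import lru_cache
import time

# Simulated expensive operation (stand-in for a heavy DB query or outgoing request).
@lru_cache(maxsize=128)
def monthly_report(year: int, month: int) -> str:
    time.sleep(2)  # pretend this is a slow aggregation
    return f"report for {year}-{month:02d}"

start = time.perf_counter()
monthly_report(2024, 5)                      # miss: takes ~2 s
print(f"first call:  {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
monthly_report(2024, 5)                      # hit: served instantly from application memory
print(f"second call: {time.perf_counter() - start:.2f}s")
```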

Problems With Caching

Now that you know what caching is and why to use it, we can talk a bit more about caching problems, starting with eviction.

Eviction

Eviction focuses on when and how we should invalidate our cache entries. I think it is the single most important problem in the topic of caching, as it directly contributes to almost all the problems below. As the old “proverb” says, there are only two hard problems in computer science: cache invalidation and naming things.

The crucial problem with eviction is that it is very hard to tune properly. If we evict entries too early, we will start getting cache misses that stress our service, while if we evict too late, we can end up with a lot of inconsistencies.

There are multiple strategies for tackling this issue, for example LRU and LFU. However, they are still based on different trade-offs in eviction – for example, LRU evicts the least recently used entries, while LFU evicts the least frequently used ones. We will dive deeper into caching algorithms below.

Cache Inconsistency

Another big problem in caching is inconsistency, or to be more specific, the risk of inconsistencies occurring. The state of the cache may end up out of line with the state on our application side. The potential implications of such an event vary from nothing at all to a lawsuit.

The solution to inconsistent state boils down to answering one question: how often should our cache query our service for fresh data? As you can see, this is where eviction starts to rear its head again.

There are two main approaches here – the pull and the push model. In the pull model, the cache periodically checks whether the data it holds is still correct; the important factor here is time, the same as in the case of eviction. In the push model, we push a new version of the data to the cache every time a change occurs on the backend side.

While the push model may seem like the obvious pick, it also has its own drawbacks, like putting additional work on the service side. I believe that picking one over the other should depend on the exact use case and the possible repercussions of inconsistent data.
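To make the two models a bit more concrete, here is a minimal in-memory sketch of both. The class names and the loader callback are illustrative, not part of any particular library.

```python
import time

class PullCache:
    """Pull model: the cache refreshes an entry itself once it is older than refresh_after."""

    def __init__(self, refresh_after: float, loader):
        self.refresh_after = refresh_after
        self.loader = loader          # function that asks the backend for fresh data
        self.entries = {}             # key -> (value, fetched_at)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None or time.time() - entry[1] > self.refresh_after:
            value = self.loader(key)                  # periodically re-pull from the backend
            self.entries[key] = (value, time.time())
            return value
        return entry[0]

class PushCache:
    """Push model: the backend pushes every change into the cache as it happens."""

    def __init__(self):
        self.entries = {}

    def get(self, key):
        return self.entries.get(key)

    def push(self, key, value):
        # called by the backend right after it commits a change
        self.entries[key] = value
```

In the pull variant, staleness is bounded by refresh_after; in the push variant, freshness depends entirely on the backend remembering to call push after every write.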

Cache Miss

A cache miss is a situation when the data requested by the user is not in the cache. In such a case, the caching service has to ask our service directly and retrieve the needed data. Singular misses are not a problem.

However, there can be cases where the number of misses is so high that it overloads the main service. This particular problem is called the Thundering Herd, or Cache Stampede – like a herd of animals running over something and stamping it into the ground.
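One common mitigation is to let only one caller recompute a missing key while the others wait for its result (sometimes called single flight or request coalescing). Below is a minimal, thread-based sketch of the idea; the SingleFlightCache name and the loader callback are illustrative.

```python
import threading

class SingleFlightCache:
    """On a miss, only one thread recomputes a given key; the rest wait for its result."""

    def __init__(self, loader):
        self.loader = loader               # expensive backend call
        self.values = {}
        self.locks = {}                    # key -> lock guarding its recomputation
        self.guard = threading.Lock()      # protects the two dicts above

    def get(self, key):
        with self.guard:
            if key in self.values:
                return self.values[key]
            lock = self.locks.setdefault(key, threading.Lock())

        with lock:                          # only one thread per key recomputes
            with self.guard:
                if key in self.values:      # another thread filled it while we waited
                    return self.values[key]
            value = self.loader(key)        # single backend call instead of a stampede
            with self.guard:
                self.values[key] = value
                self.locks.pop(key, None)
            return value
```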

Cache Overhead

Despite the potentially tremendous benefits of a properly working caching layer, actually having and maintaining one can be costly. Deploying, managing, fine-tuning, and applying configuration changes can be a pain – the same as with every complex and important software service. In short, it introduces more complexity, but it can be a worthy complexity.

Caching algorithms

In this section, I will focus on a couple of different caching algorithms. Caching algorithms describe how our cache should behave while adding and removing elements, while ensuring that the cache stays within its specified size limit. Each of them has its own set of advantages and disadvantages.

LRU

Let’s start with the Least Recently Used cache. In this approach, we evict the elements that have gone unused the longest. The assumption is that the longer an element has not been used, the bigger the chance that it will not be needed soon, so we can safely remove it from our store.

In this approach, we keep track of all elements within our cache and update their last-usage time. When the cache is full and a new item arrives, we remove the least recently used item and then safely add the new element. LRU often performs well across real-world access patterns, so it is a good starting algorithm for a cache service.
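A minimal LRU sketch in Python, using collections.OrderedDict to track recency of use:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest access at the front, newest at the back

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # every access refreshes the entry's position
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the least recently used entry
```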

LFU

Next we have the Least Frequently Used cache, which opposes LRU by evicting the elements with the lowest usage frequency. The assumption in this case is that if an element is rarely used, it can be safely removed from the cache. Here we keep track of all elements within our cache and update their usage count.

When the cache is full and a new item arrives, we remove the item with the lowest usage count and then safely add the new element. The biggest problem here is that LFU may lead to hot-spot-heavy caches, which can increase the chances of a cache stampede occurring.
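A minimal LFU sketch along the same lines, with a plain counter per key. This naive version scans for the minimum on eviction; production LFU caches use more efficient bookkeeping.

```python
from collections import Counter

class LFUCache:
    """Evicts the entry with the lowest usage count once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = {}
        self.hits = Counter()        # key -> how many times it has been read or written

    def get(self, key):
        if key not in self.items:
            return None
        self.hits[key] += 1
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim, _ = min(self.hits.items(), key=lambda kv: kv[1])  # least frequently used
            del self.items[victim], self.hits[victim]
        self.items[key] = value
        self.hits[key] += 1
```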

FIFO

The oldest and simplest eviction model is First In, First Out. We evict the entries that have been stored in the cache for the longest time. It does not account for any “complex” traits like usage count or recency; it just removes the oldest entries.

On the plus side, it is extremely simple; however, it is suboptimal for most use cases.
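For completeness, a FIFO sketch; note that, unlike the LRU example above, reads do not touch the ordering at all.

```python
from collections import OrderedDict

class FIFOCache:
    """Evicts whichever entry has been in the cache the longest, regardless of usage."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()   # insertion order only; reads never reorder anything

    def get(self, key):
        return self.items.get(key)

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # drop the oldest insertion
        self.items[key] = value
```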

SIEVE

Last, and by far the most complex type of caching algorithm, is Selective Invalidation and Eviction (SIEVE). It is not a single algorithm but rather a family of algorithms following certain principles.

Instead of focusing only on frequency or on last usage time, these algorithms try to notice the usage patterns of particular data, taking both metrics into account. The algorithm then evicts elements based on the insight it draws from these metrics.

In theory, it is the best possible solution, as it promises to evict only data that will not be used again, greatly increasing our cache efficiency. While it may sound pretty complex, mostly due to metrics collection and drawing insight, there is actually an in-depth paper showing that SIEVE algorithms are simpler than LRU.
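Based on my reading of that paper, the core eviction rule can be sketched roughly as follows: entries sit in a FIFO queue with a per-entry “visited” bit, and an eviction hand walks from the tail toward the head, clearing visited bits and evicting the first unvisited entry it finds. Treat this as an illustrative sketch rather than a reference implementation.

```python
class SieveCache:
    """Sketch of SIEVE eviction: FIFO order, a 'visited' bit per entry, and a moving hand."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.values = {}            # key -> value
        self.visited = {}           # key -> accessed since the hand last passed it?
        self.queue = []             # keys in insertion order, index 0 = newest (head)
        self.hand = None            # key the eviction hand currently points at

    def get(self, key):
        if key not in self.values:
            return None
        self.visited[key] = True    # a hit only flips a bit, nothing is moved
        return self.values[key]

    def put(self, key, value):
        if key in self.values:
            self.values[key] = value
            self.visited[key] = True
            return
        if len(self.values) >= self.capacity:
            self._evict()
        self.queue.insert(0, key)   # new entries enter at the head
        self.values[key] = value
        self.visited[key] = False

    def _evict(self):
        # The hand walks from the tail toward the head, clearing visited bits,
        # and evicts the first entry it finds that has not been visited.
        idx = self.queue.index(self.hand) if self.hand in self.values else len(self.queue) - 1
        while self.visited[self.queue[idx]]:
            self.visited[self.queue[idx]] = False
            idx = idx - 1 if idx > 0 else len(self.queue) - 1   # wrap around at the head
        victim = self.queue.pop(idx)
        del self.values[victim], self.visited[victim]
        # Next eviction resumes from the neighbouring entry toward the head.
        if self.queue:
            self.hand = self.queue[idx - 1] if idx > 0 else self.queue[-1]
        else:
            self.hand = None
```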

Nevertheless, I would still recommend starting with LRU and migrating to SIEVE only if that approach turns out not to be sufficient.

| Algorithm | Assumption | Pros | Cons |
|---|---|---|---|
| LRU | Evict least recently used items | Good starting point for most use cases | Can sometimes evict hot spots |
| LFU | Evict least frequently used items | Will not remove hot spots | Tends to create hot-spot-heavy caches |
| FIFO | Remove oldest items | Extremely simple | Poor performance for most real-world workloads |
| SIEVE | Selectively evict items | Potentially the best solution | The most computationally heavy |

There are also approaches like MRU, RR, Clock, or 2Q but let’s not dig too deep into all of them. Let’s move to the most common tools that you can use as a caching service for your application.

Tools

Redis

Redis is an in-memory data store, most commonly used as a simple key-value store. However, it can also be successfully used as a vector database or queue. Redis is primarily known for its high scalability and simplicity.

It is easy to set up and integrate, and it is capable of handling tens of thousands of requests with ease. A combination of all the traits above makes Redis an excellent pick for a caching utility. When correctly applied, it can greatly reduce the load put on our application.
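A typical cache-aside usage with the redis-py client might look roughly like this. The key naming, the 5-minute TTL, and load_user_from_db are illustrative assumptions, not part of Redis itself.

```python
import json
import redis

# Assumes a local Redis instance on the default port.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def load_user_from_db(user_id: int) -> dict:
    # Stand-in for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id: int) -> dict:
    cache_key = f"user:{user_id}"
    cached = r.get(cache_key)
    if cached is not None:
        return json.loads(cached)                 # cache hit, Redis answers directly

    user = load_user_from_db(user_id)             # cache miss, fall back to the database
    r.set(cache_key, json.dumps(user), ex=300)    # store with a 5-minute TTL
    return user
```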

Memcache

Memcache is simpler than Redis; it supports only simple key-value mapping and focuses on handling small data, usually in string or binary format. In all other characteristics, it goes head to head with Redis.

It is even simpler to set up and use than Redis. However, it lacks the cluster support present in Redis. There is a very interesting paper by Facebook on how they scaled their own Memcache deployment.
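For comparison, the equivalent basic usage with the pymemcache client is just a few lines; the key name and TTL below are arbitrary examples.

```python
from pymemcache.client.base import Client

# Assumes a local Memcached instance on the default port 11211.
client = Client(("localhost", 11211))

client.set("user:42", b"Alice", expire=300)   # simple key -> bytes mapping with a 5-minute TTL
print(client.get("user:42"))                  # b'Alice' on a hit, None on a miss
```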

| Category | Redis | Memcached |
|---|---|---|
| Data model | Key-value store with rich data types | Key-value store only |
| Clustering | Redis Cluster | No built-in clustering or replication |
| Performance | Very similar | Very similar |
| Algorithms | Configurable – LRU, LFU | LRU only |
| Memory usage | Better at reclaiming used memory | Slightly lower memory use for plain strings |
| Advanced features | Pub/Sub, transactions, geospatial support | None |

Best practices

Cache only what’s necessary

Think carefully about what you want to store in the cache. Ask whether all the elements are truly needed – some may not need to be put in the cache at all. Keep in mind that each element in the cache results in multiple operations that have to be handled by your application. I mentioned some key considerations on what to cache in the first section.

Lazy caching

Do not pre-populate the cache; store items only after they are requested. Additionally, you can skip storing an item in the cache during its first retrieval and do it on the second one, as in the sketch below. This helps achieve two things: first, it avoids storing a lot of potentially unused data in the cache; second, it reduces the chances of caching items that will be accessed once or twice at most.
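One way to implement the “cache on the second request” idea is to remember keys that have already missed once. The sketch below uses Redis plus an in-process set, and load_product_from_db is a made-up stand-in for the real data source.

```python
import json
import redis

r = redis.Redis(decode_responses=True)
seen_once = set()   # keys that have missed exactly once so far (illustrative, in-process)

def load_product_from_db(product_id: int) -> dict:
    # Stand-in for a real database query.
    return {"id": product_id, "price": 9.99}

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    product = load_product_from_db(product_id)
    if key in seen_once:
        # Second miss for this key: it is clearly being reused, so cache it now.
        r.set(key, json.dumps(product), ex=600)
        seen_once.discard(key)
    else:
        # First miss: do not cache yet, so one-off requests never reach the cache.
        seen_once.add(key)
    return product
```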

Favor push model

Update the data stored in the cache directly after its respective entry in the database is updated. Doing so reduces the chances of cache inconsistency and thundering herd problems occurring. While it addresses some of the problems that may occur with lazy caching, it also brings its own set of issues, mostly related to storing multiple, potentially unused items in the cache and putting additional load on the caching service.

A good idea is to mix the push model with lazy loading to get the best out of both concepts: we build our cache lazily, and then keep existing entries fresh in a push fashion, combining the strengths of both approaches.
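The write path of such a mix could look roughly like this: after the database write succeeds, the fresh value is pushed into the same key the lazy read path uses. The save_price_to_db helper is, again, a made-up stand-in.

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def save_price_to_db(product_id: int, new_price: float) -> None:
    # Stand-in for the real persistence layer.
    pass

def update_product_price(product_id: int, new_price: float) -> None:
    key = f"product:{product_id}"
    save_price_to_db(product_id, new_price)
    # Push the fresh value straight into the cache so readers never see stale data.
    r.set(key, json.dumps({"id": product_id, "price": new_price}), ex=600)
```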

Use Time-to-live

Always set a TTL for every element added to the cache. With a correctly configured TTL, if for some reason you do not delete a particular entry, the caching service will do it for you. It helps reduce the long-term load put on the service and effectively works as a garbage collector for you. What is more, if you need short-term storage, you can just use a very short TTL instead of working with caching algorithms.

Monitor

Keep a close eye on your cache’s performance. Track which keys are the hottest and which are not. Besides metrics related to caching optimization, monitor it the same way as any other production service. Optimize your approach according to the insights gathered: try and test the behavior of different caching algorithms, approaches to updating values, and TTL values.
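A simple way to start is to wrap whichever cache you use with a thin layer that counts hits and misses. The sketch below wraps any object exposing get/put, such as the LRUCache sketch from earlier; the class name is illustrative.

```python
class InstrumentedCache:
    """Wraps a cache exposing get/put and counts hits and misses per key."""

    def __init__(self, inner):
        self.inner = inner
        self.hits = 0
        self.misses = 0
        self.key_hits = {}            # key -> hit count, to spot the hottest keys

    def get(self, key):
        value = self.inner.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
            self.key_hits[key] = self.key_hits.get(key, 0) + 1
        return value

    def put(self, key, value):
        self.inner.put(key, value)

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```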

Summary

Let’s do a quick walkthrough of what we have learned.

  • Problems with caching:

    • Eviction
    • Inconsistency
    • Misses
    • Overhead
  • Caching algorithms: LRU, LFU, FIFO, SIEVE, MRU, RR, Clock, 2Q, and others.

    • Start with LRU, keep a close eye on performance, and then adapt
  • Prefer Redis over Memcached unless your use case is very simple

  • Best practices:

    • Cache what is really needed
    • Combine lazy caching with a push model
    • Use proper TTL
    • Monitor

Thank you for your time.

