Intermediate

Cache Eviction

When the cache fills up, something has to go — and which one you pick decides your hit rate. Ten classic policies, side-by-side.

performancememorypatterns

What is Cache Eviction?

The 60-second primer

Cache eviction is the policy a cache uses to decide what to throw out when it runs out of room. A cache is a small, fast store sitting in front of something larger and slower. Once it's full, every new entry has to push an old one out — and which one you push out is what makes or breaks the hit rate.

The goal is simple to state: keep the items most likely to be requested next, evict the rest. The hard part is predicting the future from the access pattern you've seen so far. Each algorithm is a different bet about what the past tells you about the future.

There are ten classic policies worth knowing. Some lean on recency (LRU, MRU, FIFO, Clock). Some lean on frequency (LFU). Some try to balance both (ARC, 2Q, LIRS). And a few intentionally ignore the access pattern (Random, TTL). Pick the wrong one and you spend most of your time fetching from the slow store — you have a cache in name only.

Why the policy choice matters

  • Hit rate is everything — a cache that hits 95% of the time is twenty times better than one that hits 50%. The eviction policy is the lever that moves that number.
  • Caches are finite by design — RAM costs money, CPU caches are tiny, CDN edge nodes have hard disk caps. You will run out of room; the question is what you keep.
  • Wrong policy = thrashing — if your policy evicts the hot keys, every request becomes a miss and the cache turns into pure overhead.
  • Workload-dependent — there's no universal winner. A policy that crushes web traffic can lose to FIFO on a database scan workload.

Where eviction lives

Operating systems use it for page caches (Linux uses a Clock-variant). Databases use it for buffer pools (Postgres uses Clock-Sweep, MySQL InnoDB uses a 2Q-like LRU). CDNs use it at every edge. Your CPU's L1/L2 caches use pseudo-LRU in hardware. Redis lets you pick the policy at runtime.

Side-by-side

How they compare

The same concepts, on the same axes. Use this as a map; the individual pages are the territory.

01LFU
Signal used
Frequency
Hit rate
Great for stable hotspots
Overhead
O(1) — freq buckets
Best for
Long-tailed, skewed traffic
02LRU
Signal used
Recency
Hit rate
Good on most workloads
Overhead
O(1) — HashMap + DLL
Best for
General-purpose default
03MRU
Signal used
Anti-recency
Hit rate
Crushes LRU on scans/loops
Overhead
O(1) — HashMap + DLL
Best for
Sequential scans, looping workloads
04FIFO
Signal used
Insertion order
Hit rate
Mediocre — ignores re-use
Overhead
O(1) — single queue
Best for
Streaming, one-shot reads
05Random
Signal used
None
Hit rate
Surprisingly close to LRU
Overhead
O(1)
Best for
Embedded, hot loops, sampling
06TTL
Signal used
Age
Hit rate
Depends on TTL choice
Overhead
O(1) — expiry per key
Best for
Time-bounded data (DNS, sessions)
07Clock
Signal used
Recency (approx)
Hit rate
≈ LRU, cheaper
Overhead
O(N) — 1 bit/entry
Best for
OS page caches, hot paths
082Q
Signal used
Recency, two-tier
Hit rate
Better than LRU on scans
Overhead
O(N) — two queues
Best for
Buffer pools that face scans
09ARC
Signal used
Recency + frequency
Hit rate
Excellent, self-tuning
Overhead
O(N) — 4 lists
Best for
Mixed workloads (ZFS, Postgres-adj)
10LIRS
Signal used
Inter-reference recency
Hit rate
Beats LRU on weak-locality
Overhead
O(N) — stack + queue
Best for
DB buffer pools, MySQL-like

Decision guide

Which one should you use?

A practical tour of when each algorithm wins.

Decision guide

  • You don't know your workload yetLRU. Boring, well-understood, hard to beat without measurement.
  • A small set of keys is asked for much more often than the restLFU. Frequency dominates recency for skewed traffic.
  • Sequential scans or loops larger than the cacheMRU. The counter-intuitive opposite of LRU — drop what you just touched, because you won't come back to it for a while.
  • One-shot reads, streams, append-only logsFIFO. Re-use doesn't happen, so don't pay to track it.
  • You want LRU but in the OS kernel or a hot lock-free pathClock. One reference bit per page, no list shuffling on hits.
  • Big scans are wrecking your buffer pool's hit rate2Q or LIRS. Both protect frequently-used pages from being flushed by one-shot scans.
  • Mixed workload, you'd rather not tuneARC. Adapts the recency/frequency balance on the fly.
  • You need rock-bottom overhead and can tolerate noisy evictionsRandom. Try it before assuming it's bad.
  • Data has a natural lifetime (DNS records, session cookies, OTP codes) → TTL. Eviction by clock, not by capacity.

When in doubt, start with LRU and measure

LRU is the default in nearly every cache library (Caffeine, Guava, Redis with allkeys-lru) for good reason. Get a baseline hit rate, then only swap in something fancier if the data shows you should.

Related tracks

If this one clicks, try these next.