Overview
What this concept solves
The leaky bucket flips the token bucket on its head. Instead of tokens coming in and being consumed by requests, requests themselves fill the bucket, and they drain out at a fixed rate, like water leaking through a pinhole. However spiky the input is, the output never exceeds that rate.
If you've ever seen a network operator talk about 'traffic shaping' or a 'committed information rate' — they're talking about a leaky bucket. It's the algorithm that turns a chaotic input stream into a clean, steady output stream.
Mechanics
How it works
Water in, water out — at a fixed rate
The classic implementation is a FIFO queue with two parameters:
- Capacity (B) — the maximum number of queued requests. Overflow gets dropped.
- Leak rate (r) — how many requests are processed per second, regardless of arrival pattern.
The algorithm is two independent flows:
- On arrival: if queue.length < B, append the request. Otherwise drop it.
- On a timer (every 1/r seconds): if the queue is non-empty, dequeue one request and forward it downstream.
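The two flows can be sketched as a small class. This is a minimal illustration rather than a production implementation: the names LeakyQueue, arrive, and tick are hypothetical, and the caller is assumed to invoke tick() once per leak interval (every 1/r seconds).

```typescript
// Minimal sketch of the two independent flows.
// LeakyQueue, arrive, and tick are illustrative names, not a library API.
class LeakyQueue<T> {
  private readonly queue: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity; // B: maximum number of queued requests
  }

  // Arrival flow: accept while there is room, otherwise drop.
  arrive(request: T): boolean {
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(request);
    return true;
  }

  // Drain flow: assumed to be called once per leak interval.
  // Returns the request to forward, or undefined if the queue is empty.
  tick(): T | undefined {
    return this.queue.shift();
  }

  get length(): number {
    return this.queue.length;
  }
}

const q = new LeakyQueue<string>(2); // B = 2
const accepted = ["a", "b", "c"].map((r) => q.arrive(r)); // third arrival overflows
const forwarded = [q.tick(), q.tick(), q.tick()]; // three timer ticks
// accepted is [true, true, false]; forwarded is ["a", "b", undefined]
```

Keeping the timer outside the class makes the two flows easy to test in isolation; a real deployment would wire tick() to a scheduler.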
The smoothing property
Whatever the input pattern (a steady stream, a sudden burst, a trickle), the output never exceeds r per second, and while the queue is non-empty it is metronome-steady at exactly r. This is the leaky bucket's whole reason for existing: it makes downstream systems' job easy.
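To make the smoothing concrete, here is a small per-second simulation. It is a sketch under simplifying assumptions: time advances in whole seconds, all of a second's arrivals land before that second's leak, and simulateLeakyBucket is a hypothetical helper name.

```typescript
// arrivalsPerSec[i] requests arrive at the start of second i; the bucket then
// leaks at most ratePerSec requests during that second.
function simulateLeakyBucket(
  arrivalsPerSec: number[],
  capacity: number,
  ratePerSec: number,
): { output: number[]; dropped: number } {
  let queued = 0;
  let dropped = 0;
  const output: number[] = [];
  for (const n of arrivalsPerSec) {
    const acceptedNow = Math.min(n, capacity - queued); // room left in the bucket
    dropped += n - acceptedNow; // overflow is dropped
    queued += acceptedNow;
    const leaked = Math.min(queued, ratePerSec); // never more than the leak rate
    queued -= leaked;
    output.push(leaked);
  }
  return { output, dropped };
}

// A burst of 5 into a bucket with capacity 10, leaking 1 per second.
const smoothed = simulateLeakyBucket([5, 0, 0, 0, 0, 0, 0], 10, 1);
// smoothed.output is [1, 1, 1, 1, 1, 0, 0]; smoothed.dropped is 0

// A burst of 15 into the same bucket: 5 requests overflow.
const overflowed = simulateLeakyBucket([15, 0, 0], 10, 1);
// overflowed.output is [1, 1, 1]; overflowed.dropped is 5
```

The output holds at the leak rate while work is queued and falls to zero once the queue is empty.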
Two variants in the wild
The version shown here (a queue that drains at a fixed rate) is the leaky bucket as a queue, used for traffic shaping. There's also a leaky bucket as a meter variant, which keeps a counter instead of a queue and behaves almost identically to a token bucket. Be careful when reading older papers; they sometimes conflate the two.
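For contrast, the counter-based variant can be sketched without any queue: a level rises by one per conforming request and drains continuously over time. This is an illustrative sketch, not a reference implementation; LeakyBucketMeter and allow are hypothetical names, and the caller passes the clock in so the example is deterministic.

```typescript
// Counter-based leaky bucket: no queue, just a level that drains over time.
// Requests are allowed or rejected immediately, never delayed.
class LeakyBucketMeter {
  private level = 0;
  private lastMs = 0;
  private readonly capacity: number;
  private readonly leakPerSec: number;

  constructor(capacity: number, leakPerSec: number) {
    this.capacity = capacity;
    this.leakPerSec = leakPerSec;
  }

  // Returns true if a request arriving at nowMs conforms.
  allow(nowMs: number): boolean {
    if (nowMs > this.lastMs) {
      // Lazily drain whatever leaked since the previous call.
      const leaked = ((nowMs - this.lastMs) * this.leakPerSec) / 1000;
      this.level = Math.max(0, this.level - leaked);
      this.lastMs = nowMs;
    }
    if (this.level + 1 > this.capacity) return false; // would overflow
    this.level += 1;
    return true;
  }
}

const meter = new LeakyBucketMeter(3, 1); // capacity 3, leaks 1 unit per second
const atOnce = [0, 0, 0, 0].map((t) => meter.allow(t)); // burst of 4 at t = 0
const oneSecondLater = meter.allow(1000); // one unit has leaked by t = 1 s
// atOnce is [true, true, true, false]; oneSecondLater is true
```

Note that the burst of three passes straight through, which is the token-bucket-like behavior: this variant polices traffic rather than shaping it.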
Interactive prototype
Run it. Break it. Tune it.
Sandboxed simulation embedded right in the page. No setup, no install.
About this simulation
Requests fill the bucket. It drains at a constant 1 per second — no faster, no matter how hard you push. Overflow is dropped. Try 'Burst of 15' to see the smoothing effect.
Hands-on
Try these on your own
Open the prototype above, run each experiment, predict the answer, then verify.
Watch the smoothing
Click 'Burst of 15'. Notice the queue fills, then drains at exactly 1/sec for the next 10 seconds. The input was bursty, the output is metronomic.
Force an overflow
Hit 'Burst of 15' on a full queue. The 'Dropped' counter should jump. This is the trade-off: leaky bucket rejects under load instead of slowing down.
Compare with token bucket
Open the token bucket prototype in another tab. Burst both with the same input. Token bucket lets the burst through then runs out; leaky bucket smooths it out over time. Same goal, opposite philosophy.
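If the second prototype is not at hand, the same comparison can be approximated in code. The sketch below is a simplified per-second model, not either prototype's actual logic; tokenBucketOutput and leakyBucketOutput are hypothetical helpers, and the token bucket is assumed to start full.

```typescript
// Per-second model: both limiters see the same arrivals.
function tokenBucketOutput(perSec: number[], capacity: number, refillPerSec: number): number[] {
  let tokens = capacity; // assumption: the bucket starts full
  return perSec.map((n) => {
    const passed = Math.min(n, tokens); // excess is rejected immediately
    tokens = Math.min(capacity, tokens - passed + refillPerSec);
    return passed;
  });
}

function leakyBucketOutput(perSec: number[], capacity: number, leakPerSec: number): number[] {
  let queued = 0;
  return perSec.map((n) => {
    queued = Math.min(capacity, queued + n); // overflow is dropped
    const leaked = Math.min(queued, leakPerSec);
    queued -= leaked;
    return leaked;
  });
}

const sameInput = [5, 0, 0, 0, 0, 0];
const viaTokens = tokenBucketOutput(sameInput, 5, 1);
const viaLeaky = leakyBucketOutput(sameInput, 5, 1);
// viaTokens is [5, 0, 0, 0, 0, 0]: the burst passes at once
// viaLeaky  is [1, 1, 1, 1, 1, 0]: the burst is spread over five seconds
```

Both limiters admit the same five requests here; the difference is when the downstream sees them.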
In practice
When to use it — and what you give up
When to reach for it
- Smoothing bursts before a slow downstream — a database, an external API, a payment processor with strict TPS limits.
- Network traffic shaping — historically the original use case.
- Egress queues where you want predictable, constant outflow.
Real-world example
NGINX's limit_req module uses the leaky bucket method: excess requests are delayed so that they are processed at the configured rate, up to a configured burst size, and rejected beyond it. Many message brokers and ETL pipelines use the same shape for the same reason.
Pros
- Output rate is capped at r by construction, and is exactly r while the queue is non-empty, which makes it well suited to protecting fragile downstreams.
- Implementation is a queue plus a timer — trivially understandable.
- Memory is bounded by capacity (the queue cannot exceed B entries).
Cons
- No bursting — even when downstream is idle, the leak rate is the leak rate. Capacity goes unused.
- Adds latency: a request that arrives behind 9 queued requests waits about 9 leak intervals before being served.
- Requires a timer or scheduler, which is more moving parts than token bucket's lazy refill.
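The latency cost can be made concrete, and the timer can be avoided, by computing each request's departure time lazily. The sketch below is one possible approach under stated assumptions, not the method used elsewhere on this page: LazyLeakyBucket and schedule are hypothetical names, the caller supplies the clock, and the caller is responsible for actually waiting out the returned delay.

```typescript
// Timer-free shaping sketch: compute when each accepted request may depart
// instead of draining a queue on a timer.
class LazyLeakyBucket {
  private nextDepartureMs = 0;
  private readonly capacity: number;
  private readonly leakIntervalMs: number;

  constructor(capacity: number, leakIntervalMs: number) {
    this.capacity = capacity;
    this.leakIntervalMs = leakIntervalMs;
  }

  // Returns the delay in ms before the request may be forwarded,
  // or null if the backlog already holds `capacity` requests.
  schedule(nowMs: number): number | null {
    const departMs = Math.max(nowMs, this.nextDepartureMs);
    const waitMs = departMs - nowMs;
    // Requests still waiting ahead of this one, derived from the wait time.
    const backlog = Math.ceil(waitMs / this.leakIntervalMs);
    if (backlog >= this.capacity) return null; // bucket full: drop
    this.nextDepartureMs = departMs + this.leakIntervalMs;
    return waitMs;
  }
}

const lazy = new LazyLeakyBucket(10, 1000); // capacity 10, one departure per second
const waits = Array.from({ length: 12 }, () => lazy.schedule(0)); // burst of 12 at t = 0
// waits[0] is 0, waits[9] is 9000 (nine intervals behind nine requests),
// waits[10] and waits[11] are null (dropped)
```

The trade-off is that delayed callers must hold their own timers or sleep, so the scheduling work moves rather than disappears.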
Reference
Code & further reading
A minimal reference implementation and pointers worth bookmarking.
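// A simple leaky bucket (FIFO queue + steady drain).
class LeakyBucket {
  private queue: (() => void)[] = [];
  private timer: ReturnType<typeof setInterval>;

  constructor(
    private capacity: number,
    private leakIntervalMs: number,
  ) {
    // Drain at most one item per interval, whether or not anything is queued.
    this.timer = setInterval(() => this.leak(), this.leakIntervalMs);
  }

  // Returns false when the queue is full and the work was dropped.
  enqueue(work: () => void): boolean {
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(work);
    return true;
  }

  // Stop the drain timer so the process can shut down cleanly.
  stop(): void {
    clearInterval(this.timer);
  }

  private leak() {
    const next = this.queue.shift();
    if (next) next();
  }
}

// Process at most 10 requests per second (one every 100 ms), max queue 50.
// handleRequest and req are assumed to be defined by the surrounding application.
const bucket = new LeakyBucket(50, 100);
if (!bucket.enqueue(() => handleRequest(req))) {
  // queue full: drop the request
}

References & further reading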
- Article (en.wikipedia.org)
Wikipedia: Leaky bucket
The article distinguishes the two variants (meter vs. queue) clearly.
- Docs (nginx.org)
NGINX: ngx_http_limit_req_module
Production-grade leaky bucket via limit_req_zone and burst; NGINX names the method explicitly.
- Article (en.wikipedia.org)
Wikipedia: Generic cell rate algorithm (GCRA)
The leaky-bucket meter formalized: same behavior, half the storage and no queue.
- Spec
ATM Forum: Traffic Management Specification
The dense original definition from telecom networks. Skim section 4 for the math.
Knowledge check
Did the prototype land?
question 01 / 03
What is the defining property of a leaky bucket compared to a token bucket?
question 02 / 03
A leaky bucket has capacity 10 and leak rate 1/sec. Twenty requests arrive at the same instant. What happens?
question 03 / 03
Why might you choose token bucket over leaky bucket for a public API?