Leaky Bucket Algorithm — Rate Limiting

Overview

What this concept solves

The leaky bucket flips the token bucket on its head. Instead of tokens coming in and being consumed by requests, the requests themselves fill the bucket and drain out at a fixed rate, like water leaking through a pinhole. However spiky the input, the output never exceeds that rate, and it stays constant for as long as anything is queued.

If you've ever heard a network operator talk about 'traffic shaping' or a 'committed information rate', this is the family of algorithms behind it. A leaky bucket turns a chaotic input stream into a clean, steady output stream.

Mechanics

How it works

Water in, water out — at a fixed rate

The classic implementation is a FIFO queue with two parameters:

Capacity (B) — the maximum number of queued requests. Overflow gets dropped.
Leak rate (r) — how many requests are processed per second, regardless of arrival pattern.

The algorithm is two independent flows:

On arrival: if queue.length < B, append the request. Otherwise drop it.
On a timer (every 1/r seconds): if the queue is non-empty, dequeue one request and forward it downstream.
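
The two flows above can be sketched as a tick-based simulation, which avoids real timers and makes the behavior easy to check by hand. This is only an illustrative sketch: `simulate`, `arrivalsPerTick`, `extraTicks`, and `SimulationResult` are names invented here, one tick stands for one leak interval (1/r seconds), and arrivals within a tick are assumed to be handled before that tick's leak.

```typescript
// Tick-based sketch of the two flows: arrivals fill the queue, each tick leaks one.
// All names here are hypothetical; one tick represents one leak interval (1/r seconds).
interface SimulationResult {
  forwardedPerTick: number[]; // requests forwarded downstream on each tick (0 or 1)
  dropped: number;            // requests rejected because the queue was full
}

function simulate(
  arrivalsPerTick: number[],
  capacity: number,
  extraTicks = 0,
): SimulationResult {
  const queue: number[] = [];
  const forwardedPerTick: number[] = [];
  let dropped = 0;
  let nextId = 0;

  const totalTicks = arrivalsPerTick.length + extraTicks;
  for (let tick = 0; tick < totalTicks; tick++) {
    // Flow 1, on arrival: append if there is room, otherwise drop.
    const arrivals = arrivalsPerTick[tick] ?? 0;
    for (let i = 0; i < arrivals; i++) {
      if (queue.length < capacity) queue.push(nextId++);
      else dropped++;
    }
    // Flow 2, on the timer: forward at most one queued request per tick.
    forwardedPerTick.push(queue.shift() !== undefined ? 1 : 0);
  }
  return { forwardedPerTick, dropped };
}

// A burst of 15 into a bucket of capacity 10, followed by 11 quiet ticks.
const result = simulate([15], 10, 11);
console.log(result.dropped);          // 5
console.log(result.forwardedPerTick); // ten 1s followed by two 0s
```

The burst arrives all at once, yet the output is one request per tick until the queue is empty, after which nothing is forwarded.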

The smoothing property

Whatever the input pattern, steady stream or sudden burst, the output never exceeds r per second, and while the queue holds work it is metronome-steady at exactly r. If the input flatlines and the queue empties, the output simply stops. This is the leaky bucket's whole reason for existing: it makes the downstream system's job easy.

Two variants in the wild

The version shown here (a queue that drains at a fixed rate) is the leaky bucket as a queue, the form used for traffic shaping. There's also a leaky bucket as a meter: a counter that rises with each arrival and drains over time, which behaves almost identically to a token bucket. Be careful when reading older papers; they sometimes conflate the two.
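
For contrast with the queue, here is a minimal sketch of the counter-based variant (commonly called the meter). It keeps no queue and no timer: the level is drained lazily whenever a request arrives, and a request is either admitted immediately or rejected. `LeakyBucketMeter` and `tryAdmit` are hypothetical names, and the caller passes in the current time so the example stays deterministic.

```typescript
// Leaky bucket as a meter: a counter that drains lazily. No queue, no timer.
// Class and method names are hypothetical; time is supplied by the caller.
class LeakyBucketMeter {
  private level = 0;   // current "water" in the bucket
  private lastMs = 0;  // timestamp of the previous update
  private readonly capacity: number;
  private readonly leakPerSecond: number;

  constructor(capacity: number, leakPerSecond: number) {
    this.capacity = capacity;           // maximum level, i.e. burst tolerance
    this.leakPerSecond = leakPerSecond; // drain rate
  }

  // Returns true if the request conforms and may proceed right away.
  tryAdmit(nowMs: number): boolean {
    // Drain whatever has leaked since the previous call.
    const elapsedSeconds = Math.max(0, nowMs - this.lastMs) / 1000;
    this.level = Math.max(0, this.level - elapsedSeconds * this.leakPerSecond);
    this.lastMs = nowMs;

    if (this.level + 1 > this.capacity) return false; // would overflow: reject
    this.level += 1;
    return true;
  }
}

const meter = new LeakyBucketMeter(3, 1);
const burst = [0, 0, 0, 0].map((t) => meter.tryAdmit(t));
console.log(burst);                // [ true, true, true, false ]
console.log(meter.tryAdmit(1000)); // true: one unit has leaked after a second
```

Unlike the queue, admitted requests are not delayed, so a burst up to the capacity passes straight through. That is why this variant is usually treated as equivalent to a token bucket.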

Interactive prototype

Run it. Break it. Tune it.

Sandboxed simulation embedded right in the page. No setup, no install.

About this simulation

Requests fill the bucket. It drains at a constant 1 per second — no faster, no matter how hard you push. Overflow is dropped. Try 'Burst of 15' to see the smoothing effect.

Hands-on

Try these on your own

Open the prototype above. For each experiment, predict the outcome, run it, then verify.

try 01

Watch the smoothing

Click 'Burst of 15'. Notice the queue fills, then drains at exactly 1/sec for the next 10 seconds. The input was bursty, the output is metronomic.

try 02

Force an overflow

Hit 'Burst of 15' on a full queue. The 'Dropped' counter should jump. This is the trade-off: once the queue is full, the leaky bucket rejects new requests rather than letting the backlog grow.

try 03

Compare with token bucket

Open the token bucket prototype in another tab. Burst both with the same input. Token bucket lets the burst through then runs out; leaky bucket smooths it out over time. Same goal, opposite philosophy.

In practice

When to use it — and what you give up

When to reach for it

Smoothing bursts before a slow downstream — a database, an external API, a payment processor with strict TPS limits.
Network traffic shaping — historically the original use case.
Egress queues where you want predictable, constant outflow.

Real-world example

NGINX's limit_req module uses the leaky bucket method: by default it releases requests at the configured rate, delays excess requests up to the configured burst size, and rejects anything beyond that. Many message brokers and ETL pipelines use the same shape for the same reason.

Pros

Output rate is bounded by construction: never more than r, and exactly r while work is queued. That makes it a good fit for protecting fragile downstreams.
Implementation is a queue plus a timer — trivially understandable.
Memory is bounded by capacity (the queue cannot exceed B entries).

Cons

No bursting: even when the downstream is idle, the leak rate is the leak rate, so spare downstream capacity goes unused.
Adds latency: a request that arrives behind 9 queued requests is served on the tenth tick, so it waits between 9 and 10 leak intervals.
Requires a timer or scheduler, which is more moving parts than token bucket's lazy refill.
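
The latency cost can be estimated up front. The sketch below assumes exactly one request leaks per interval and that a new arrival may have to wait almost a full interval for the next tick; `worstCaseWaitMs` and `maxQueueDelayMs` are hypothetical helper names, not part of any library.

```typescript
// Worst-case queueing delay for a request that arrives behind `queuedAhead` requests.
// Assumes one request leaks per interval; helper names are hypothetical.
function worstCaseWaitMs(queuedAhead: number, leakIntervalMs: number): number {
  // Each request ahead consumes one tick, and the new request needs a tick of its own.
  return (queuedAhead + 1) * leakIntervalMs;
}

// Upper bound on delay for any accepted request in a bucket of the given capacity.
function maxQueueDelayMs(capacity: number, leakIntervalMs: number): number {
  // The last free slot in the queue has capacity - 1 requests ahead of it.
  return worstCaseWaitMs(capacity - 1, leakIntervalMs);
}

console.log(worstCaseWaitMs(9, 100));  // 1000
console.log(maxQueueDelayMs(50, 100)); // 5000
```

A bucket of capacity 50 draining every 100 ms can therefore hold an accepted request for up to about five seconds, which is worth comparing against any client-side timeout before choosing B.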

Reference

Code & further reading

A minimal reference implementation and pointers worth bookmarking.

leaky-bucket.ts

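// A simple leaky bucket (FIFO queue + steady drain).
class LeakyBucket {
  private queue: (() => void)[] = [];
  private timer: ReturnType<typeof setInterval>;

  constructor(
    private capacity: number,
    private leakIntervalMs: number,
  ) {
    // Tick once per interval; each tick forwards at most one queued item.
    this.timer = setInterval(() => this.leak(), this.leakIntervalMs);
  }

  // Returns false when the queue is full and the work was dropped.
  enqueue(work: () => void): boolean {
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(work);
    return true;
  }

  // Stops the drain timer so the process can exit; queued work is discarded.
  stop(): void {
    clearInterval(this.timer);
    this.queue.length = 0;
  }

  private leak() {
    const next = this.queue.shift();
    if (next) next();
  }
}

// Process at most 10 requests per second (one every 100 ms), max queue 50.
// handleRequest and req are placeholders for the caller's own handler and request.
const bucket = new LeakyBucket(50, 100);
if (!bucket.enqueue(() => handleRequest(req))) {
  // queue full: drop the request
}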

References & further reading

Article (en.wikipedia.org)
Wikipedia — Leaky bucket
The article distinguishes the two variants (meter vs. queue) clearly.
Docs (nginx.org)
NGINX — ngx_http_limit_req_module
Production-grade leaky bucket via limit_req_zone and burst — NGINX names the method explicitly.
Article (en.wikipedia.org)
Wikipedia — Generic cell rate algorithm (GCRA)
The leaky-bucket meter formalized: same behavior, half the storage and no queue.
Spec
ATM Forum — Traffic Management Specification
The dense original definition from telecom networks. Skim section 4 for the math.

Knowledge check

Did the prototype land?

question 01 / 03

What is the defining property of a leaky bucket compared to a token bucket?

question 02 / 03

A leaky bucket has capacity 10 and leak rate 1/sec. Twenty requests arrive at the same instant. What happens?

question 03 / 03

Why might you choose token bucket over leaky bucket for a public API?
