Glossary System Design

Rate Limiting

Controls the rate of requests to a service. Common algorithms: Token Bucket, Leaky Bucket, Fixed Window, and Sliding Window.

Why Rate Limiting?

Protects services from abuse, DoS attacks, and runaway clients. Required in every API gateway design question.

Algorithms

Token Bucket

A bucket holds N tokens. Each request consumes one token. Tokens replenish at a fixed rate. Allows bursting up to bucket capacity.

Leaky Bucket

Requests enter a queue and are processed at a fixed rate. Smooths out bursts but introduces latency.

Fixed Window Counter

Count requests per time window (e.g., 100 req/min). Simple but has a boundary spike problem: 200 requests can slip through at the window boundary.

Sliding Window Log

Store a timestamp for each request. Count requests in the last N seconds. Accurate but memory-intensive at scale.

Sliding Window Counter

Hybrid: fixed window + weighted previous window. Best accuracy/memory trade-off for production systems.

Distributed Rate Limiting

For multi-node APIs, use Redis INCR with TTL for atomic counters. Lua scripts for atomic check-and-increment to prevent race conditions.