GlossaryDistributed Systems

CAP Theorem

A distributed system can only guarantee two of three: Consistency, Availability, and Partition Tolerance.

Why This Theorem Exists

Distributed systems fail in ways that single-node systems cannot. A network partition means some nodes cannot communicate with others, even though all nodes are individually healthy. The question CAP answers is: what guarantees can a system make to its clients during that partition?

Eric Brewer proposed the CAP conjecture in 2000. Gilbert and Lynch formally proved it in 2002. The proof is narrow but the intuition matters for every system design decision involving replication.

The Three Properties

Consistency (C): Every read returns the most recent write or an error. Not "eventual consistency," meaning strict linearizability. A client that writes a value and then reads it must see that value, regardless of which node handles the read.

Availability (A): Every request to a non-failed node receives a response. The response may not reflect the most recent write, but it will not be an error. The system continues to serve requests even during degraded conditions.

Partition Tolerance (P): The system continues operating when the network drops or delays messages between nodes. Individual nodes may be healthy; the communication link between them is not.

The Real Trade-Off: C vs A, Not C vs A vs P

Here is the part most explanations get wrong: partition tolerance is not optional. Any system that runs on more than one machine across a real network will experience partitions. Hardware fails, switches drop packets, cross-datacenter links degrade. Designing a system that "gives up" partition tolerance means designing a system that stops working entirely when any network issue occurs. That is not a viable production system.

The actual forced choice is between C and A during a partition. When a partition occurs and two nodes disagree on the current state:

  • A CP system refuses to serve requests from the partitioned side until consistency is restored. It returns errors or blocks. It will not serve stale data.
  • An AP system continues to serve requests from all nodes, accepting that some responses may return stale data. Nodes reconcile when the partition heals.

This is not a one-time architectural decision. Different components of the same system can make different choices. A payment processing path should be CP (never show a stale balance). A social feed can be AP (showing a post 2 seconds late is acceptable).

Practical System Classification

System Classification What Happens During a Partition
Cassandra AP (tunable) Continues serving reads/writes; may return stale data
DynamoDB AP (tunable) Eventual consistency by default; strong consistency optional per-read
HBase CP Halts writes to partitioned region servers
ZooKeeper CP Minority partition stops accepting writes
etcd CP Minority partition stops accepting writes (Raft quorum required)
CockroachDB CP Transactions blocked until quorum is re-established
MongoDB (default) CP Primary election required before writes resume

"Tunable" systems like Cassandra and DynamoDB let you configure per-operation consistency. QUORUM reads require a majority of replicas to agree, pushing toward CP behaviour at the cost of availability. ONE reads return whatever the first responding replica has, trading consistency for lower latency and higher availability.

What CAP Does Not Model

CAP is a binary model: a partition either happens or it doesn't; a node is either available or it isn't. Real systems operate on a spectrum.

PACELC (Daniel Abadi, 2010) extends CAP to describe behaviour during normal operation. The full statement: if there is a Partition, choose between Availability and Consistency; Else (during normal operation), choose between Latency and Consistency. This models the real-world observation that even without a partition, replicating a write synchronously to 3 nodes before acknowledging it adds 5–20ms of latency compared to acknowledging after writing to 1 node.

Cassandra is PA/EL: available during partitions, low latency during normal operation. CockroachDB is PC/EC: consistent during partitions, consistent (higher latency) during normal operation.

Consistency Models Below Linearizability

CAP uses "consistency" to mean linearizability, the strictest model. Production systems operate on weaker guarantees:

  • Sequential consistency: All operations appear to have executed in some order, but not necessarily real-time order. Cheaper than linearizability.
  • Causal consistency: Causally related operations are seen in causal order. "If you saw my write, you will see the writes that caused it." Used in systems like MongoDB's causal sessions.
  • Eventual consistency: Given no new writes, all replicas converge to the same value. No timing guarantee. What Cassandra and DynamoDB provide by default.

Each step down the hierarchy reduces replication overhead and latency while accepting weaker guarantees to the client.

Interview Tip

The mistake most candidates make is treating CAP as a system label: "Cassandra is AP, therefore I'll use it." Interviewers at L5+ expect you to reason about CAP at the operation level, not the system level. A payment write needs CP semantics; a view count increment tolerates AP semantics. Calling out that Cassandra supports tunable consistency and explaining what QUORUM versus ONE means in terms of availability and staleness will put your answer in the top tier. Bonus: mention PACELC to show you understand the latency trade-off during normal operation, not just during partitions.