← Week 1: Consistency Models

Day 3: CAP Theorem

Phase 1 · May 22, 2026

← Week 1: Consistency Models

Agenda (2–3 hours)

  • Read (45 min): Brewer "CAP Twelve Years Later: How the Rules Have Changed" (IEEE Computer 2012); Gilbert & Lynch 2002 proof
  • Study (45 min): Work through the impossibility proof; understand where each real system sits
  • Practice (45 min): Draw the CAP triangle; plot Redis Cluster, Cassandra, CockroachDB, and DynamoDB
  • Challenge (30 min): Write a short explanation of why a network partition forces a choice between C and A
← Week 1: Consistency Models

CAP Theorem (Brewer 2000, Gilbert & Lynch 2002)

A distributed data store can provide at most two of:

  • Consistency — every read receives the most recent write or an error
  • Availability — every request receives a (non-error) response
  • Partition tolerance — the system continues to operate despite dropped messages

Catch: in any real network, partitions will happen. So the real choice is CP vs AP when a partition occurs.

CAP's "C" means linearizability (not just "no stale reads") — a common source of confusion.

← Week 1: Consistency Models

Proof Sketch

Assume we have two nodes N1 and N2, and a network partition between them.

If we require availability:

  • Both N1 and N2 must respond to all writes
  • A write to N1 cannot propagate to N2 during the partition
  • A subsequent read from N2 will not see N1's write → consistency violated

If we require consistency:

  • We must refuse writes to one side during the partition, or refuse reads that can't be confirmed
  • Some requests receive errors or timeouts → availability violated
← Week 1: Consistency Models

CP vs AP Systems

System Partition behavior Rationale
etcd / ZooKeeper Reject writes if no quorum Linearizability required for coordination
CockroachDB Reject writes if no quorum Serializable SQL transactions
Cassandra Accept writes to any available node Tunable consistency; AP by default
DynamoDB Accept writes; eventual sync AP for availability; strong mode available
HBase Block writes without ZooKeeper CP; built on HDFS/ZooKeeper
← Week 1: Consistency Models

What CAP Doesn't Tell You

CAP is a worst-case, partition-time model. It omits:

  • Latency — a system can be "available" but slow enough to be useless
  • Duration — partitions are transient; some systems recover gracefully
  • Granularity — consistency can vary per key, per operation, or per session
  • Partial availability — a quorum of 3/5 can serve reads even if 2 nodes are down

CAP became a marketing term ("we're AP!") — the nuance matters more than the label.

← Week 1: Consistency Models

Key Takeaways

  • In the presence of partitions, you must choose consistency or availability
  • CP: refuse requests you can't answer correctly (strong consistency guarantee)
  • AP: answer all requests, accept possible staleness (high availability)
  • Most production systems offer tunable consistency — the choice is per-operation
  • CAP is a framework for reasoning, not a binary classification

Tomorrow: PACELC — an extension of CAP that captures the latency/consistency tradeoff in the non-partition case.