# 5. CAP Theorem & Consistency Models
The CAP theorem is a fundamental principle in distributed systems that defines the trade-offs between consistency, availability, and partition tolerance.
## CAP Theorem
Proposed by Eric Brewer in 2000 and proven by Seth Gilbert and Nancy Lynch in 2002, the theorem states that in a distributed data store, it is impossible to simultaneously guarantee all three of the following:
### The Three Guarantees
| Property | Definition |
|---|---|
| Consistency (C) | Every read receives the most recent write or an error. All nodes see the same data at the same time. |
| Availability (A) | Every request receives a (non-error) response, without guarantee that it contains the most recent write. |
| Partition Tolerance (P) | The system continues to operate despite network partitions (communication breakdowns between nodes). |
```
              Consistency
                  /\
                 /  \
                /    \
               / CP   \
              /  zone  \
             /----------\
            / CA      AP \
           /  zone  zone  \
          /________________\
    Availability ──── Partition Tolerance
```
### Why "Pick 2 out of 3" is Misleading
In real-world distributed systems, network partitions are inevitable. You cannot choose to "not have" partition tolerance. Therefore, the real choice is:
During a network partition, do you prioritize Consistency or Availability?
- CP system: Refuses to respond (sacrifices availability) to guarantee consistency during partitions.
- AP system: Responds with potentially stale data (sacrifices consistency) to remain available during partitions.
- CA system: Only possible in a single-node (non-distributed) system where partitions cannot occur.
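The CP-versus-AP choice can be illustrated with a toy replica that, during a partition, either refuses reads (CP) or serves its possibly stale local copy (AP). This is a minimal sketch for illustration only; the class and its behavior are not modeled on any particular database.

```python
class Replica:
    """Toy replica illustrating CP vs AP behavior during a partition."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP" (illustrative labels)
        self.local_value = "stale"  # last value replicated before the partition
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise RuntimeError("unavailable: cannot reach quorum")
        # AP (or a healthy network): answer from the local copy.
        return self.local_value

cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True

print(ap.read())          # AP stays available but may return stale data
try:
    cp.read()
except RuntimeError as e:
    print(e)              # CP sacrifices availability to stay consistent
```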
## CAP in Practice

### CP Systems (Consistency + Partition Tolerance)
When a partition occurs, the system blocks or returns an error rather than serving potentially inconsistent data.
| System | Description |
|---|---|
| ZooKeeper | Leader-based coordination; unavailable during leader election |
| etcd | Raft consensus; writes blocked without quorum |
| HBase | Strong consistency; unavailable if region server is partitioned |
| MongoDB (default) | Primary handles writes; unavailable if primary is unreachable |
| Redis Cluster | Slots become unavailable when master is down |
Use cases: Banking, inventory management, leader election — where stale data is unacceptable.
### AP Systems (Availability + Partition Tolerance)
When a partition occurs, the system continues serving requests but may return stale or conflicting data.
| System | Description |
|---|---|
| Cassandra | Tunable consistency; can favor availability |
| DynamoDB | Eventually consistent reads by default |
| CouchDB | Multi-master replication, conflict resolution |
| Riak | Masterless, vector clocks for conflict resolution |
| DNS | Serves cached records even when authoritative server is unreachable |
Use cases: Social media feeds, shopping carts, DNS — where serving stale data is acceptable.
### CA Systems (Consistency + Availability)
Only possible in systems that don't need partition tolerance — essentially single-node databases.
| System | Description |
|---|---|
| Single-node PostgreSQL | ACID on one machine; no partitions possible |
| Single-node MySQL | Same as above |
| SQLite | Embedded, single-process database |
## PACELC Theorem

An extension of CAP by Daniel Abadi (2012):

If there is a Partition (P), choose between Availability (A) and Consistency (C); Else (E), when the system is running normally, choose between Latency (L) and Consistency (C).

```
PACELC: if (Partition) then {A or C} else {L or C}
```
| System | Partition? → | Else (Normal) → | Classification |
|---|---|---|---|
| DynamoDB | A (Available) | L (Low latency) | PA/EL |
| Cassandra | A (Available) | L (Low latency) | PA/EL |
| MongoDB | C (Consistent) | C (Consistent) | PC/EC |
| PNUTS (Yahoo) | C (Consistent) | L (Low latency) | PC/EL |
| VoltDB | C (Consistent) | C (Consistent) | PC/EC |
Key insight: Even when there's no partition, there's a trade-off between latency and consistency in distributed systems.
## Consistency Models

### Strong Consistency (Linearizability)
Every read returns the value of the most recent completed write. All nodes appear as a single, atomic data store.
```
Time →
Writer:   WRITE(x=1) ───────── WRITE(x=2)
Reader A:     READ(x) → 1
Reader B:                          READ(x) → 2
```
- All readers see writes in the same order.
- After a write completes, all subsequent reads return the new value.
- Implementation: Consensus protocols (Paxos, Raft), synchronous replication.
- Cost: Higher latency, lower throughput.
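The synchronous-replication approach can be sketched as a store that acknowledges a write only after every replica has applied it, so any later read from any replica returns the new value. This is a single-threaded toy (no real network, failures, or consensus protocol); names like `SyncReplicatedStore` are invented for illustration.

```python
class SyncReplicatedStore:
    """Sketch: a write completes only after ALL replicas apply it,
    so any subsequent read from any replica sees the new value."""

    def __init__(self, n_replicas=3):
        self.replicas = [dict() for _ in range(n_replicas)]

    def write(self, key, value):
        # Waiting for every replica is the source of the higher
        # latency mentioned above.
        for replica in self.replicas:
            replica[key] = value
        return "ack"

    def read(self, key, replica_id=0):
        # Any replica may serve the read and still return the latest write.
        return self.replicas[replica_id].get(key)

store = SyncReplicatedStore()
store.write("x", 2)
assert all(store.read("x", i) == 2 for i in range(3))
```

Real systems get the same guarantee under failures by running a consensus protocol such as Paxos or Raft rather than naively writing to every node.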
### Sequential Consistency
All operations appear to execute in some sequential order, and each individual process's operations appear in the order they were issued.
- Weaker than linearizability: The total order doesn't need to respect real-time ordering.
- Useful for multi-processor systems.
### Causal Consistency
Operations that are causally related are seen by all nodes in the same order. Concurrent operations may be seen in different orders.
```
A posts:   "Anyone want pizza?"   ← Event 1
B replies: "I do!"                ← Event 2 (caused by Event 1)

All nodes must see Event 1 before Event 2.
But an unrelated Event 3 by C can appear anywhere.
```
- Implementation: Vector clocks, version vectors.
- Use case: Social media (replies must appear after the post).
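One way to enforce this is causal delivery: a node buffers an incoming message until everything it causally depends on has been delivered. A minimal sketch, assuming each message carries the ids of its dependencies (the `deps` field is an invented representation, simpler than real vector-clock metadata):

```python
def causal_deliver(incoming):
    """Deliver messages only after all their causal dependencies."""
    delivered, buffered = [], list(incoming)
    progress = True
    while buffered and progress:
        progress = False
        for msg in list(buffered):
            done = {m["id"] for m in delivered}
            if all(dep in done for dep in msg["deps"]):
                delivered.append(msg)   # dependencies satisfied → deliver
                buffered.remove(msg)
                progress = True
    return [m["id"] for m in delivered]

# The reply arrives first, but is held back until the post is delivered:
events = [
    {"id": "reply", "deps": ["post"]},   # B: "I do!"
    {"id": "post", "deps": []},          # A: "Anyone want pizza?"
    {"id": "unrelated", "deps": []},     # C: independent event
]
order = causal_deliver(events)
assert order.index("post") < order.index("reply")
```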
### Eventual Consistency
If no new writes are made, all replicas will eventually converge to the same value. No guarantee on when.
```
Time →
Writer:   WRITE(x=5)
Replica1: x=5 (immediately)
Replica2: x=3 (stale) ──→ x=5 (after replication)
Replica3: x=3 (stale) ──→ x=5 (after replication)
```
- Convergence time: Typically milliseconds to seconds.
- Read-your-writes: A session guarantee where a client always sees its own writes.
- Monotonic reads: A client never sees an older value after seeing a newer one.
- Implementation: Conflict resolution (last-writer-wins, CRDTs, application-level merge).
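The simplest of these resolution strategies, last-writer-wins, can be sketched in a few lines: each replica keeps a (value, timestamp) pair, and merging always keeps the pair with the higher timestamp, so repeated pairwise merges converge every replica to the newest write.

```python
def lww_merge(a, b):
    """Last-writer-wins: keep the (value, timestamp) pair written later."""
    return a if a[1] >= b[1] else b

r1 = ("x=5", 100)   # newest write
r2 = ("x=3", 90)    # stale replica
r3 = ("x=3", 90)    # stale replica

# Anti-entropy: pairwise merges converge all replicas to the newest pair.
r2 = lww_merge(r2, r1)
r3 = lww_merge(r3, r2)
assert r1 == r2 == r3 == ("x=5", 100)
```

Note the caveat from above: if two writes carry the same timestamp region, one of them is silently discarded, which is why LWW can lose data.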
### Read-Your-Writes Consistency
A client always sees the effects of its own writes.
```
Client A: WRITE(name="Alice") → READ(name) → "Alice" ✅ (guaranteed)
Client B: READ(name) → may see old value (not guaranteed)
```
### Monotonic Read Consistency
Once a client reads a value, it will never see an older value in subsequent reads.
```
Client: READ(x) → 5 ... READ(x) → 5 or a newer value (never the older 3) ✅
```
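Both session guarantees can be provided by the same mechanism: the client remembers the highest version it has written or read, and refuses answers from replicas that have not caught up to that version. A sketch under the assumption that every replica exposes a monotonically increasing integer version (the `SessionClient` class and the dict-based replica format are invented for illustration):

```python
class SessionClient:
    """Tracks the highest version this session has seen; rejecting
    replicas behind that version yields both read-your-writes and
    monotonic reads."""

    def __init__(self):
        self.min_version = 0  # highest version written or read so far

    def read(self, replica):
        if replica["version"] < self.min_version:
            # Serving this would show an older value → violate the session
            # guarantees, so the client would retry another replica instead.
            raise LookupError("replica too stale for this session")
        self.min_version = replica["version"]
        return replica["value"]

client = SessionClient()
fresh = {"value": 5, "version": 10}
stale = {"value": 3, "version": 7}

assert client.read(fresh) == 5   # session is now pinned at version 10
try:
    client.read(stale)           # older value → refused
except LookupError:
    pass
```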
### Consistency Model Comparison
| Model | Strength | Latency | Use Case |
|---|---|---|---|
| Strong (Linearizable) | Strongest | Highest | Banking, leader election |
| Sequential | Strong | High | Multi-processor cache coherence |
| Causal | Moderate | Moderate | Social media, collaborative editing |
| Eventual | Weakest | Lowest | DNS, shopping carts, CDN |
| Read-Your-Writes | Session-level | Low | User profile updates |
```
Strongest ◄────────────────────────────────────────► Weakest
Linearizable → Sequential → Causal → Eventual
Highest Latency                        Lowest Latency
```
## Conflict Resolution
When concurrent writes occur in an eventually consistent system, conflicts must be resolved.
### Last-Writer-Wins (LWW)
The write with the latest timestamp wins. Simple but can lose data.
### Vector Clocks
Each node maintains a vector of counters to track causality.
```
Node A: {A:1, B:0} writes x=1
Node B: {A:0, B:1} writes x=2
Conflict: {A:1, B:0} and {A:0, B:1} are concurrent → resolve
```
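The comparison rule is standard: clock A happens-before clock B if every counter in A is ≤ the corresponding counter in B (and at least one is strictly less); if neither dominates, the writes are concurrent and must be resolved. A minimal sketch:

```python
def compare(vc_a, vc_b):
    """Compare two vector clocks (dicts of node → counter)."""
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a happens-before b"
    if b_le_a:
        return "b happens-before a"
    return "concurrent"  # neither dominates → conflict to resolve

# The example above: neither write dominates, so they conflict.
assert compare({"A": 1, "B": 0}, {"A": 0, "B": 1}) == "concurrent"
# If B had seen A's write first, causality would be clear instead:
assert compare({"A": 1, "B": 0}, {"A": 1, "B": 1}) == "a happens-before b"
```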
### CRDTs (Conflict-Free Replicated Data Types)
Data structures designed to automatically merge without conflicts.
| CRDT Type | Description | Example |
|---|---|---|
| G-Counter | Grow-only counter | Like count |
| PN-Counter | Positive-negative counter | Upvote/downvote |
| G-Set | Grow-only set | Tags added to a post |
| OR-Set | Observed-remove set | Shopping cart items |
| LWW-Register | Last-writer-wins register | User profile field |
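The G-Counter from the table is the simplest CRDT to sketch: each node increments only its own slot, and merge takes the element-wise maximum, so merges commute and every replica converges to the same total regardless of merge order. An illustrative toy implementation:

```python
class GCounter:
    """Grow-only counter CRDT: per-node counters, merge = element-wise max."""

    def __init__(self, node_ids):
        self.counts = {n: 0 for n in node_ids}

    def increment(self, node_id, amount=1):
        # A node only ever increments its own slot.
        self.counts[node_id] += amount

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # which is exactly why no conflict resolution is needed.
        for n, c in other.counts.items():
            self.counts[n] = max(self.counts.get(n, 0), c)

    def value(self):
        return sum(self.counts.values())

a = GCounter(["A", "B"])
b = GCounter(["A", "B"])
a.increment("A", 3)   # e.g., 3 likes recorded on node A
b.increment("B", 2)   # 2 likes recorded on node B
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5  # both replicas converge
```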
### Application-Level Resolution
The application presents conflicting versions to the user (e.g., Amazon's shopping cart "merge all items" approach).
## Quorum Consistency

In replicated systems, consistency can be tuned using quorums. Reads are guaranteed to see the latest write when the read and write quorums overlap:

```
W + R > N
```

Where:
- N = Total number of replicas
- W = Number of replicas that must acknowledge a write
- R = Number of replicas that must respond to a read

| Configuration | Consistency | Availability |
|---|---|---|
| W = N, R = 1 | Strong consistency for reads | Writes fail if any node is down |
| W = 1, R = N | Strong consistency for writes | Reads fail if any node is down |
| W = R = majority | Balanced (majority quorum) | Tolerates minority failures |
| W = 1, R = 1 | Eventual consistency | Highest availability |
Example with N=3:
- W=2, R=2: Strong consistency (W + R = 4 > 3), tolerates 1 node failure.
- W=1, R=1: Eventual consistency, tolerates 2 node failures.
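The quorum mechanics can be sketched as follows, assuming each replica stores a (value, version) pair and a read returns the value with the highest version among the R responses. `QuorumStore` and its `reachable` parameter (which replicas the client can currently contact) are invented for illustration; real systems like Cassandra also handle versioning and repair very differently.

```python
class QuorumStore:
    """Toy quorum store: writes need W acks, reads need R responses."""

    def __init__(self, n=3, w=2, r=2):
        self.n, self.w, self.r = n, w, r
        self.replicas = [{"value": None, "version": 0} for _ in range(n)]

    def write(self, value, reachable):
        if len(reachable) < self.w:
            raise RuntimeError("write failed: quorum not reached")
        version = max(rep["version"] for rep in self.replicas) + 1
        for i in reachable[: self.w]:
            self.replicas[i] = {"value": value, "version": version}

    def read(self, reachable):
        if len(reachable) < self.r:
            raise RuntimeError("read failed: quorum not reached")
        responses = [self.replicas[i] for i in reachable[: self.r]]
        # With W + R > N, some response must hold the latest version.
        return max(responses, key=lambda rep: rep["version"])["value"]

store = QuorumStore(n=3, w=2, r=2)       # W + R = 4 > N = 3 → strong
store.write("x=1", reachable=[0, 1])     # node 2 is down during the write
assert store.read(reachable=[1, 2]) == "x=1"  # overlap guarantees freshness
```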
## Summary
| Concept | Key Point |
|---|---|
| CAP Theorem | During partitions, choose consistency OR availability |
| Real choice | CP (block on partition) vs AP (serve stale) |
| PACELC | Even without partitions: latency vs consistency trade-off |
| Strong consistency | Safest but slowest — for critical data |
| Eventual consistency | Fastest but may serve stale — for non-critical data |
| Quorums | Tunable consistency via W + R > N |
| Conflict resolution | LWW (simple), Vector Clocks (accurate), CRDTs (automatic) |
Rule of thumb: Use strong consistency for financial/critical data. Use eventual consistency for social feeds, analytics, and caches. Most systems use a mix of both.