Chapter 9: Consistency and Consensus

Summary: This chapter explores the strongest consistency models and the fundamental problem of getting nodes in a distributed system to agree on something, even in the presence of faults.

Sources: chapter9

Last updated: 2026-04-17


The best way of building fault-tolerant systems is to find some general-purpose abstractions with useful guarantees, implement them once, and then let applications rely on those guarantees (source: chapter9, p. 321). This chapter focuses on abstractions that allow an application to ignore some of the problems with distributed systems, most importantly consensus.

Key Concepts

linearizability

The strongest consistency model, providing a recency guarantee: as soon as one client successfully completes a write, all clients reading from the database must be able to see the value just written (source: chapter9, p. 324). It makes a distributed system appear as if there is only one copy of the data (source: chapter9, p. 325).

causality and Ordering

causality defines a partial order of events (happened-before relationship) (source: chapter9, p. 339). It is a weaker but more efficient guarantee than linearizability. linearizability implies causality (source: chapter9, p. 342).

total-order-broadcast

A protocol for exchanging messages such that all nodes receive the same messages in the same order (source: chapter9, p. 348). It is equivalent to consensus (source: chapter9, p. 350).

distributed-transactions and atomic-commit

The problem of ensuring that a transaction either commits on all participating nodes or aborts on all of them (source: chapter9, p. 352). two-phase-commit (2PC) is a common but blocking algorithm for solving this (source: chapter9, p. 355).

Fault-Tolerant consensus

The goal of getting nodes to agree on a value despite failures (source: chapter9, p. 364). Key properties include uniform agreement, integrity, validity, and termination (source: chapter9, p. 365).

Membership and Coordination Services

Services like zookeeper and etcd implement consensus to provide linearizable atomic operations, locking, and service discovery (source: chapter9, p. 370).