Chapter 9: Consistency and Consensus
Summary: This chapter explores the strongest consistency models and the fundamental problem of getting nodes in a distributed system to agree on something, even in the presence of faults.
Sources: chapter9
Last updated: 2026-04-17
The best way of building fault-tolerant systems is to find some general-purpose abstractions with useful guarantees, implement them once, and then let applications rely on those guarantees (source: chapter9, p. 321). This chapter focuses on abstractions that allow an application to ignore some of the problems with distributed systems, most importantly consensus.
Key Concepts
Linearizability
One of the strongest consistency models in common use. It provides a recency guarantee: as soon as one client successfully completes a write, all clients reading from the database must be able to see the value just written (source: chapter9, p. 324). It makes a distributed system appear as if there is only one copy of the data (source: chapter9, p. 325).
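The recency guarantee is easiest to see by watching it fail. The following is a minimal sketch, assuming a toy leader/follower pair with manually triggered replication; the class LaggingReplica and its methods are hypothetical names for illustration, not part of any real database API.

```python
class LaggingReplica:
    """Toy leader/follower pair (hypothetical). Replication only happens
    when replicate() is called, which models replication lag."""

    def __init__(self, initial):
        self.leader = initial
        self.follower = initial

    def write(self, value):
        # The write is acknowledged as soon as the leader has applied it.
        self.leader = value

    def replicate(self):
        # Copy the leader's state to the follower.
        self.follower = self.leader

    def read_leader(self):
        return self.leader

    def read_follower(self):
        # May return a stale value if replicate() has not run yet.
        return self.follower


store = LaggingReplica(initial=0)
store.write(1)                      # the write has completed
fresh = store.read_leader()         # sees the new value
stale = store.read_follower()       # still sees the old value
linearizable = (fresh == stale)     # False: the two reads disagree
```

A read from the follower after the write has completed still returns the old value, so the system does not behave as if there were a single copy of the data.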
Causality and Ordering
Causality defines a partial order of events, the happened-before relationship: some events are ordered relative to each other, while concurrent events are incomparable (source: chapter9, p. 339). Causal consistency is a weaker guarantee than linearizability but is cheaper to provide. Linearizability implies causality (source: chapter9, p. 342).
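One common way to represent a partial order of events is with vector clocks. The sketch below is an illustration under that assumption and is not necessarily the mechanism the chapter describes; the function names and the dict-based clock representation are hypothetical.

```python
def _compare(a, b):
    """Return (a_behind, b_behind) for two vector clocks given as dicts
    mapping node name to counter. Missing entries count as 0."""
    nodes = set(a) | set(b)
    a_behind = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    b_behind = any(b.get(n, 0) < a.get(n, 0) for n in nodes)
    return a_behind, b_behind


def happened_before(a, b):
    """True if the event stamped a causally precedes the event stamped b."""
    a_behind, b_behind = _compare(a, b)
    return a_behind and not b_behind


def concurrent(a, b):
    """True if neither event causally precedes the other."""
    a_behind, b_behind = _compare(a, b)
    return a_behind and b_behind


w1 = {"A": 1}            # write on node A
w2 = {"A": 1, "B": 1}    # node B saw w1, then wrote
w3 = {"A": 2}            # node A wrote again without seeing w2

ordered = happened_before(w1, w2)
incomparable = concurrent(w2, w3)
```

Here w1 precedes w2, but w2 and w3 are concurrent, which is exactly what makes the order partial rather than total.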
Total Order Broadcast
A protocol for exchanging messages such that all nodes receive the same messages in the same order (source: chapter9, p. 348). It is equivalent to consensus, in the sense that a solution to one can be turned into a solution to the other (source: chapter9, p. 350).
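The delivery side of total order broadcast can be sketched as a hold-back buffer. This example assumes that sequence numbers have already been assigned by some hypothetical sequencer; assigning them in a fault-tolerant way is the hard part and is not shown. OrderedReceiver is an illustrative name.

```python
class OrderedReceiver:
    """Delivers messages strictly in sequence-number order, buffering
    any message that arrives before its predecessors (hypothetical sketch)."""

    def __init__(self):
        self.next_seq = 0      # next sequence number to deliver
        self.pending = {}      # seq -> message, received but not yet deliverable
        self.delivered = []    # messages in delivery order

    def receive(self, seq, msg):
        if seq < self.next_seq:
            return             # duplicate of an already delivered message
        self.pending[seq] = msg
        # Deliver as many consecutive messages as are now available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


a, b = OrderedReceiver(), OrderedReceiver()
for seq, msg in [(0, "x"), (1, "y"), (2, "z")]:
    a.receive(seq, msg)
for seq, msg in [(2, "z"), (0, "x"), (1, "y")]:   # different arrival order
    b.receive(seq, msg)
same_order = a.delivered == b.delivered
```

Both receivers end up with the same delivery order even though the network handed them the messages in different orders.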
Distributed Transactions and Atomic Commit
Atomic commit is the problem of ensuring that a distributed transaction either commits on all participating nodes or aborts on all of them (source: chapter9, p. 352). Two-phase commit (2PC) is a common algorithm for solving it, but it is blocking: if the coordinator fails after participants have voted to commit, they must wait for it to recover (source: chapter9, p. 355).
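The two phases can be sketched in a few lines. This is a simplified in-memory illustration that omits the durable logging, timeouts, and recovery that a real implementation needs; Participant, Vote, and two_phase_commit are hypothetical names.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """In-memory stand-in for a node taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: promise to commit if asked, or refuse.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction committed on every participant."""
    # Phase 1: collect votes from all participants.
    votes = [p.prepare() for p in participants]
    decision = all(v is Vote.YES for v in votes)
    # A real coordinator would durably log the decision at this point.
    # Phase 2: broadcast the decision to everyone.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


nodes = [Participant("db1"), Participant("db2", can_commit=False)]
committed = two_phase_commit(nodes)
states = [p.state for p in nodes]
```

A single "no" vote aborts the transaction everywhere. The blocking behavior arises between the two phases: a participant in the prepared state cannot decide on its own and has to wait for the coordinator.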
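Fault-Tolerant Consensus
The goal of getting nodes to agree on a value despite failures (source: chapter9, p. 364). Key properties include uniform agreement (no two nodes decide differently), integrity (no node decides twice), validity (any decided value was proposed by some node), and termination (every node that does not crash eventually decides) (source: chapter9, p. 365).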
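The four properties can be expressed as checks over a finished run. The sketch below is a hypothetical checker, not a consensus algorithm: it assumes the run is summarized as the value each node proposed, the list of values each node decided, and the set of nodes that crashed.

```python
def check_consensus_run(proposals, decisions, crashed=frozenset()):
    """Check a recorded run against the four consensus properties.

    proposals: dict of node -> proposed value
    decisions: dict of node -> list of values that node decided
    crashed:   set of nodes that crashed during the run
    """
    decided = [v for values in decisions.values() for v in values]
    proposed = list(proposals.values())
    return {
        # No two nodes decide differently.
        "uniform_agreement": len(set(decided)) <= 1,
        # No node decides more than once.
        "integrity": all(len(values) <= 1 for values in decisions.values()),
        # Every decided value was proposed by some node.
        "validity": all(v in proposed for v in decided),
        # Every node that did not crash has decided.
        "termination": all(
            len(decisions.get(node, [])) >= 1
            for node in proposals
            if node not in crashed
        ),
    }


result = check_consensus_run(
    proposals={"n1": "a", "n2": "b", "n3": "a"},
    decisions={"n1": ["a"], "n2": ["a"], "n3": []},
    crashed={"n3"},
)
```

In this run n3 crashed before deciding, which termination permits, so all four checks pass.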
Membership and Coordination Services
Services like ZooKeeper and etcd implement consensus to provide linearizable atomic operations, locking, and service discovery (source: chapter9, p. 370).
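A lock built on a linearizable atomic operation can be sketched with compare-and-set. This is an in-memory, single-process stand-in and does not use the ZooKeeper or etcd client APIs; CasRegister, try_acquire, and release are hypothetical names, and a real service would also need leases or session expiry so that a crashed holder does not keep the lock forever.

```python
import threading


class CasRegister:
    """In-memory register with an atomic compare-and-set operation,
    standing in for a linearizable key in a coordination service."""

    def __init__(self):
        self._value = None
        self._mutex = threading.Lock()

    def compare_and_set(self, expected, new):
        # Atomically set the value only if it currently equals expected.
        with self._mutex:
            if self._value == expected:
                self._value = new
                return True
            return False

    def get(self):
        with self._mutex:
            return self._value


def try_acquire(lock, client_id):
    # Succeeds only if nobody currently holds the lock.
    return lock.compare_and_set(None, client_id)


def release(lock, client_id):
    # Succeeds only if this client is the current holder.
    return lock.compare_and_set(client_id, None)


lock = CasRegister()
first = try_acquire(lock, "client-a")    # True: lock was free
second = try_acquire(lock, "client-b")   # False: client-a holds it
```

Because the compare-and-set is atomic, at most one of several competing clients can see the register as free and claim it.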