Chapter 5: Replication
Summary: This chapter explores the challenges and solutions for keeping multiple copies of the same data on different machines, focusing on leader-based, multi-leader, and leaderless architectures.
Sources: chapter5
Last updated: 2026-04-15
Replication means keeping a copy of the same data on multiple machines connected via a network. It serves three purposes: reducing latency by keeping data geographically close to users, keeping the system available even if some of its parts fail, and increasing read throughput by scaling out the number of machines that can serve read queries (source: chapter5, p. 151).
Main Replication Architectures
There are three popular algorithms for replicating changes between nodes:
- single-leader-replication: All writes are sent to one node (the leader), which sends a stream of data change events to the other replicas (followers). This is the most common approach and is used in databases like MySQL, PostgreSQL, and MongoDB.
- multi-leader-replication: More than one node can accept writes. This is useful for multi-datacenter operations, offline-capable applications, and collaborative editing.
- leaderless-replication: Any node can accept writes, and clients send their writes to several nodes in parallel. This is inspired by Amazon’s Dynamo system and is used in databases like Cassandra and Riak.
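The single-leader flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not how any of the named databases implement replication: the class names `Replica` and `Leader`, the `replicate_to` method, and its `max_entries` parameter are all hypothetical, and the "network" is just a method call.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A node holding a copy of the data (hypothetical, in-memory only)."""
    name: str
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of change-log entries applied so far

    def apply(self, entry):
        # Apply one data change event of the form (key, value).
        key, value = entry
        self.data[key] = value
        self.applied += 1


class Leader(Replica):
    """Accepts all writes and keeps them in an ordered change log."""

    def __init__(self, name):
        super().__init__(name)
        self.log = []

    def write(self, key, value):
        entry = (key, value)
        self.log.append(entry)  # record the change first
        self.apply(entry)       # then apply it locally

    def replicate_to(self, follower, max_entries=None):
        # Ship the entries the follower has not applied yet, in log order.
        # max_entries is an assumption used here to simulate replication lag.
        pending = self.log[follower.applied:]
        if max_entries is not None:
            pending = pending[:max_entries]
        for entry in pending:
            follower.apply(entry)


leader = Leader("leader")
follower = Replica("follower")
leader.write("x", 1)
leader.write("x", 2)
leader.replicate_to(follower, max_entries=1)  # follower now lags by one entry
stale_value = follower.data["x"]
leader.replicate_to(follower)                 # follower catches up
```

Because followers only ever apply entries in the leader's log order, they pass through the same states as the leader, just later. Multi-leader and leaderless designs give up this single ordering, which is why they need conflict resolution.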
Replication Lag & Consistency Guarantees
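In systems with asynchronous replication, followers may lag behind the leader, so a read from a follower can return stale data. The replicas converge only once the followers have caught up, which is known as eventual consistency. Several stronger guarantees can be implemented to improve the user experience: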
- read-after-write-consistency: Ensures that a user always sees any updates they submitted themselves.
- monotonic-reads: Guarantees that a user who has seen data as of one point in time will not later see the data as of some earlier point in time.
- consistent-prefix-reads: Guarantees that if a sequence of writes happens in a certain order, anyone reading those writes will see them appear in the same order.
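One possible way to obtain the first two guarantees is for each user session to remember the newest log position it has written or observed, and to read only from replicas that have applied at least that position. The sketch below assumes every replica exposes such a position; the names `Replica`, `Session`, `record_write`, and `min_position` are hypothetical and chosen only for illustration.

```python
class Replica:
    """Hypothetical replica that reports how far it has applied the log."""

    def __init__(self, name, position=0, data=None):
        self.name = name
        self.position = position  # log position applied so far
        self.data = dict(data or {})


class Session:
    """Tracks the newest log position this user has written or seen."""

    def __init__(self):
        self.min_position = 0

    def record_write(self, position):
        # Read-after-write: later reads must reflect the user's own write.
        self.min_position = max(self.min_position, position)

    def read(self, key, replicas):
        # Monotonic reads: skip any replica older than what was already seen.
        for replica in replicas:
            if replica.position >= self.min_position:
                self.min_position = replica.position
                return replica.data.get(key)
        raise RuntimeError("no replica is sufficiently up to date")


leader = Replica("leader", position=2, data={"x": 2})
lagging = Replica("follower", position=1, data={"x": 1})

writer = Session()
writer.record_write(2)                           # this user wrote position 2
own_write = writer.read("x", [lagging, leader])  # lagging follower is skipped

reader = Session()
first = reader.read("x", [leader, lagging])      # sees the newer value
second = reader.read("x", [lagging, leader])     # must not go back in time
```

A session that has not yet written or read anything may still be served stale data by the lagging follower, which is permitted: both guarantees are about a single user's own view, not about global freshness.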
Conflict Resolution
In multi-leader and leaderless systems, concurrent writes can lead to conflicts that must be resolved to ensure all replicas eventually converge to the same state. Techniques include:
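- last-write-wins: Attaching a timestamp to each write, keeping the write with the highest timestamp, and discarding the other concurrent writes. Replicas converge, but the discarded writes are lost.
- version-vectors: Keeping a version counter per replica to track causal dependencies between writes, which distinguishes a write that overwrites another from two writes that are concurrent.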
- crdts: Specialized data structures that automatically resolve conflicts in sensible ways.
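The three techniques above can be illustrated with small pure functions. This is a simplified sketch under stated assumptions: writes are represented as `(timestamp, value)` tuples, version vectors and counters as `{replica_id: count}` dictionaries, and the CRDT shown is a grow-only counter, one of the simplest examples of the kind. The function names are hypothetical.

```python
def last_write_wins(a, b):
    """Keep the (timestamp, value) pair with the higher timestamp.

    The losing write is silently discarded. Ties are broken in favor of
    the first argument, which is an arbitrary choice made for this sketch.
    """
    return a if a[0] >= b[0] else b


def compare_version_vectors(v1, v2):
    """Classify the causal relationship between two version vectors."""
    replicas = v1.keys() | v2.keys()
    v1_le_v2 = all(v1.get(r, 0) <= v2.get(r, 0) for r in replicas)
    v1_ge_v2 = all(v1.get(r, 0) >= v2.get(r, 0) for r in replicas)
    if v1_le_v2 and v1_ge_v2:
        return "equal"
    if v1_le_v2:
        return "before"      # v2 has seen everything v1 has: safe overwrite
    if v1_ge_v2:
        return "after"
    return "concurrent"      # neither saw the other: a real conflict


def merge_grow_only_counter(c1, c2):
    """Merge two grow-only counters by taking the per-replica maximum."""
    return {r: max(c1.get(r, 0), c2.get(r, 0)) for r in c1.keys() | c2.keys()}


def counter_value(counter):
    """The value of a grow-only counter is the sum over all replicas."""
    return sum(counter.values())


winner = last_write_wins((5, "draft"), (7, "final"))
relation = compare_version_vectors({"a": 2, "b": 0}, {"a": 1, "b": 1})
merged = merge_grow_only_counter({"a": 2}, {"a": 1, "b": 3})
```

The counter merge is commutative, associative, and idempotent, so replicas reach the same state regardless of the order in which they exchange updates. Last-write-wins also converges, but only by dropping one of the writes.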