System Models

Summary: Abstract frameworks that formalize the assumptions we make about the timing and failure characteristics of a distributed system.

Sources: chapter8

Last updated: 2026-04-17


Algorithms need to be written in a way that doesn’t depend too heavily on the details of the hardware and software. We use system models to specify what an algorithm may assume about the environment (source: chapter8, p. 306).

Timing Assumptions

  1. Synchronous model: Assumes bounded network delay, bounded process pauses, and bounded clock drift. This is not realistic for most practical systems.
  2. Partially synchronous model: Assumes that the system behaves like a synchronous model most of the time, but sometimes exceeds the bounds. This is the most useful model for reasoning about real-world systems (source: chapter8, p. 307).
  3. Asynchronous model: An algorithm is not allowed to make any timing assumptions at all (not even a clock). This is very restrictive.

(source: chapter8, p. 307)

Node Failure Assumptions

  1. Crash-stop faults: A node fails only by crashing, and it never comes back.
  2. Crash-recovery faults: Nodes may crash at any time and may resume responding after some unknown time. Durable storage is preserved across crashes, but in-memory state is lost. This is the most common model for real systems (source: chapter8, p. 307).
  3. Byzantine (arbitrary) faults: Nodes may do absolutely anything, including trying to trick other nodes.

(source: chapter8, p. 307)