Process Pauses

Summary: Temporary interruptions in the execution of a program that can cause leases to expire and nodes to appear dead.

Sources: chapter8

Last updated: 2026-04-17


A node in a distributed system may pause unexpectedly at any point in its execution. While paused, the node does not respond to messages, and any leases or locks it holds may expire without the node noticing (source: chapter8, p. 295).

Causes of Process Pauses

  • Garbage Collection (GC): “Stop-the-world” GC pauses can last for seconds, stopping all application threads.
  • Virtual Machine Suspension: A VM in a public cloud might be suspended for live migration to another host.
  • Operating System Scheduling: Context switches or heavy load can delay a process’s execution.
  • Synchronous Disk Access: A thread might be blocked waiting for a slow disk I/O operation.
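  • Paging (Swapping): Under memory pressure, the operating system may swap pages out to disk; a thread that touches a swapped-out page is paused until the page is loaded back, which can cause significant delays.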

(source: chapter8, p. 296)
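
A pause from any of the causes above can only be observed after the fact. The following is a minimal sketch, not something taken from the source: a loop sleeps for a fixed interval and reports how much longer than expected each iteration took. The function name find_pauses and its parameters are hypothetical, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, List


def find_pauses(
    ticks: int,
    interval: float,
    threshold: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Sleep `interval` seconds `ticks` times and collect large overshoots.

    Returns the extra delay (in seconds) of every iteration whose elapsed
    time exceeded `interval` by more than `threshold`.
    """
    pauses = []
    last = clock()
    for _ in range(ticks):
        sleep(interval)
        now = clock()
        # Anything beyond the requested sleep is time the process lost.
        overshoot = (now - last) - interval
        if overshoot > threshold:
            pauses.append(overshoot)
        last = now
    return pauses


class FakeTime:
    """Hypothetical test clock that injects a 2-second stall on the 3rd sleep."""

    def __init__(self) -> None:
        self.now = 0.0
        self.calls = 0

    def clock(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.calls += 1
        self.now += seconds
        if self.calls == 3:
            self.now += 2.0  # simulated process pause
```

With the fake clock, find_pauses(5, 0.5, 0.25, fake.clock, fake.sleep) reports a single 2-second pause. With the real clock, whether a given pause shows up depends on how the platform's monotonic clock behaves, for example across a VM suspension, so this is a diagnostic aid rather than a safety mechanism.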

The Danger of Pauses

If a node holds a lease (e.g., to be the leader) and pauses for longer than the lease duration, the other nodes may declare it dead and elect a new leader. When the paused node resumes, it may still believe it is the leader; if it continues to perform writes, it can corrupt data. This issue can be mitigated using fencing tokens (source: chapter8, p. 302).
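
The scenario above, together with the fencing-token mitigation, can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not an implementation from the source: LockService, Storage, StaleTokenError, and simulate are hypothetical names. The assumptions are that each lease grant returns a token larger than all earlier ones, and that the storage rejects any write whose token is lower than the highest it has already accepted.

```python
from typing import List, Optional


class StaleTokenError(Exception):
    """Raised when a write carries an older token than one already accepted."""


class LockService:
    """Hypothetical lease service: every grant returns a larger token."""

    def __init__(self) -> None:
        self._next_token = 0

    def acquire(self) -> int:
        self._next_token += 1
        return self._next_token


class Storage:
    """Hypothetical storage that remembers the highest token it accepted."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.value: Optional[str] = None

    def write(self, token: int, value: str) -> None:
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.value = value


def simulate() -> List[str]:
    locks, storage, log = LockService(), Storage(), []

    token_a = locks.acquire()  # node A becomes leader with token 1
    # Node A pauses here; its lease expires without A noticing.
    token_b = locks.acquire()  # node B is elected and gets token 2

    storage.write(token_b, "from B")
    log.append("B accepted")

    try:
        # Node A resumes and still believes it is the leader.
        storage.write(token_a, "from A")
        log.append("A accepted")
    except StaleTokenError:
        log.append("A rejected")
    return log
```

Calling simulate() returns ["B accepted", "A rejected"]: the resumed node's write is refused because its token is older than the one the storage has already seen. The check lives in the storage rather than in the paused node, since the paused node cannot be trusted to know that its lease has expired.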