Exactly-once Semantics

Summary: A guarantee in stream-processing that the effect of an event is reflected exactly once in the final output, even if failures and retries occur.

Sources: chapter11

Last updated: 2026-04-18


The Problem

In a distributed system, network failures and node crashes are common. If a consumer crashes after processing a message but before acknowledging it, the message will be redelivered. This can lead to double-counting or other inconsistencies.

Approaches

  1. Microbatching and Checkpointing: Breaking the stream into small batches and periodically saving the state of the processor and the consumer offset to durable storage. If a failure occurs, the processor can restart from the last successful checkpoint.
  2. Atomic Commit: Performing the message acknowledgment, state update, and side effects (like database writes) as a single atomic transaction. This is often difficult to implement across heterogeneous systems.
  3. Idempotence: Designing processing steps such that they can be safely executed multiple times with the same result. For example, setting a value in a key-value store is idempotent, while incrementing a counter is not (unless extra metadata like a sequence number is used).

Terminology Note

The term “exactly-once” is slightly misleading; “effectively-once” is more accurate, as the system may perform the action multiple times but ensures the final result is as if it happened only once (source: chapter11, page 476).