Exactly-once Semantics
Summary: A guarantee in stream-processing that the effect of an event is reflected exactly once in the final output, even if failures and retries occur.
Sources: chapter11
Last updated: 2026-04-18
The Problem
In a distributed system, network failures and node crashes are common. If a consumer crashes after processing a message but before acknowledging it, the message will be redelivered. This can lead to double-counting or other inconsistencies.
Approaches
- Microbatching and Checkpointing: Breaking the stream into small batches and periodically saving the state of the processor and the consumer offset to durable storage. If a failure occurs, the processor can restart from the last successful checkpoint.
- Atomic Commit: Performing the message acknowledgment, state update, and side effects (like database writes) as a single atomic transaction. This is often difficult to implement across heterogeneous systems.
- Idempotence: Designing processing steps such that they can be safely executed multiple times with the same result. For example, setting a value in a key-value store is idempotent, while incrementing a counter is not (unless extra metadata like a sequence number is used).
Terminology Note
The term “exactly-once” is slightly misleading; “effectively-once” is more accurate, as the system may perform the action multiple times but ensures the final result is as if it happened only once (source: chapter11, page 476).