Log-based Message Brokers
Summary: A type of message broker that stores events in an append-only log on disk and distributes them to consumers from that log, allowing for high throughput and message replay.
Sources: chapter11
Last updated: 2026-04-18
How it Works
A log-based broker (like Apache Kafka) partitions a topic into multiple logs. Each partition is an append-only sequence of records on disk.
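- Offsets: Within each partition, every message is assigned a monotonically increasing sequence number, or offset. Offsets order messages inside one partition only; there is no ordering across partitions.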
- Sequential Reads: Consumers read the log sequentially and keep track of their progress by periodically committing their current offset (source: chapter11, page 447).
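
The partition, offset, and commit mechanics described above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: `PartitionedLog`, its methods, and the group name `billing` are hypothetical names, not part of any real broker's API, and Python lists stand in for log files on disk.

```python
class PartitionedLog:
    """Toy in-memory stand-in for one topic split into append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.committed = {}  # (group, partition) -> next offset to read

    def append(self, partition, value):
        """Append a record and return the offset it was assigned."""
        records = self.partitions[partition]
        records.append(value)
        return len(records) - 1

    def poll(self, group, partition, max_records=10):
        """Read sequentially, starting at the group's last committed offset."""
        start = self.committed.get((group, partition), 0)
        batch = self.partitions[partition][start:start + max_records]
        return [(start + i, value) for i, value in enumerate(batch)]

    def commit(self, group, partition, next_offset):
        """Record progress so a restarted consumer can resume from here."""
        self.committed[(group, partition)] = next_offset


log = PartitionedLog(num_partitions=2)
for value in ["a", "b", "c"]:
    log.append(0, value)

batch = log.poll("billing", 0, max_records=2)   # offsets 0 and 1
log.commit("billing", 0, batch[-1][0] + 1)      # next read starts at offset 2
remaining = log.poll("billing", 0)              # only offset 2 is left
```

Because the broker only has to remember one number per consumer group and partition, tracking progress is cheap compared with acknowledging every message individually.
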
Advantages
- Throughput: Appending to and reading from a log uses sequential disk I/O, and partitions can be spread across multiple machines, so log-based brokers can handle millions of messages per second.
- Replayability: Unlike traditional brokers (e.g., RabbitMQ), messages are not deleted once they are consumed. A consumer can “rewind” to an older offset and reprocess data.
- Durability: Messages are persisted to disk, providing a buffer that can withstand consumer crashes or slow processing.
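
Replay follows from the fact that reading does not modify the log. A minimal sketch, assuming a single partition held in a Python list; `LogConsumer`, `poll`, and `seek` are hypothetical names chosen for this example rather than a real client API:

```python
class LogConsumer:
    """Reads a shared append-only list without ever removing records."""

    def __init__(self, log):
        self.log = log
        self.position = 0  # offset of the next record to read

    def poll(self):
        """Return every record from the current position to the end of the log."""
        records = self.log[self.position:]
        self.position = len(self.log)
        return records

    def seek(self, offset):
        """Move the read position, for example back to an older offset."""
        if not 0 <= offset <= len(self.log):
            raise ValueError(f"offset {offset} is outside the log")
        self.position = offset


events = ["created", "paid", "shipped"]
consumer = LogConsumer(events)

first_pass = consumer.poll()   # reads all three records
caught_up = consumer.poll()    # nothing new to read
consumer.seek(1)               # rewind to offset 1
replayed = consumer.poll()     # reprocesses "paid" and "shipped"
```

The only state a rewind changes is the consumer's own position, so other consumers of the same log are unaffected.
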
Limitations
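- Fixed Partitioning: Within a consumer group, each partition is assigned to only one consumer at a time, so the number of partitions is an upper bound on the number of consumers that can share the work of a topic.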
- Head-of-line Blocking: If a single message in a partition is slow to process, it blocks all subsequent messages in that partition.
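
The partition limit can be illustrated with a simple round-robin assignment. This is a sketch of one possible assignment strategy, not the algorithm any particular broker uses; `assign_partitions` and the consumer names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Give each partition to exactly one consumer, round-robin."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Fewer consumers than partitions: each consumer owns several partitions.
balanced = assign_partitions(4, ["c1", "c2"])

# More consumers than partitions: the extra consumer receives nothing.
oversubscribed = assign_partitions(3, ["c1", "c2", "c3", "c4"])
idle = [c for c, parts in oversubscribed.items() if not parts]
```

In the second call, adding `c4` does not increase parallelism, because a partition is never split between two consumers of the same group.
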
Comparison with Traditional Brokers
| Feature | Traditional (JMS/AMQP) | Log-based (Kafka) |
|---|---|---|
| Storage | Messages deleted after ack | Persistent log on disk |
| Consumption | Destructive read | Read-only (non-destructive) |
| Ordering | No guarantee with multiple consumers | Guaranteed within a partition |
| Load Balancing | Per-message | Per-partition |
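
The Ordering row can be made concrete with a small deterministic simulation. It is a deliberately simplified model with stated assumptions: all messages are available at time 0, message `i` is handed to consumer `i % num_consumers`, and the processing times are invented for the example. `completion_order` is a hypothetical helper, not a broker API.

```python
def completion_order(messages, processing_time, num_consumers):
    """Return messages in the order their processing finishes."""
    busy_until = [0] * num_consumers
    finished = []
    for index, message in enumerate(messages):
        consumer = index % num_consumers
        busy_until[consumer] += processing_time[message]
        finished.append((busy_until[consumer], index, message))
    return [message for _, _, message in sorted(finished)]


messages = ["m1", "m2", "m3"]
processing_time = {"m1": 5, "m2": 1, "m3": 1}  # m1 is slow to process

# Per-message balancing over two consumers: m2 finishes before m1.
per_message = completion_order(messages, processing_time, num_consumers=2)

# One consumer per partition: order is preserved, but m2 and m3 wait behind m1.
per_partition = completion_order(messages, processing_time, num_consumers=1)
```

The two results show the trade-off in the table: spreading individual messages over consumers finishes the fast messages sooner but loses ordering, while a single consumer per partition keeps ordering at the cost of waiting behind slow messages.
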