Log-based Message Brokers

Summary: A type of message broker that uses an append-only log on disk to store and distribute events, allowing for high throughput and message replay.

Sources: chapter11

Last updated: 2026-04-18


How it Works

A log-based broker (like Apache Kafka) partitions a topic into multiple logs. Each partition is an append-only sequence of records on disk.

  • Offsets: Within each partition, every message is assigned a monotonically increasing sequence number, or offset. Offsets are not comparable across partitions.
  • Sequential Reads: Consumers read the log sequentially and keep track of their progress by periodically committing their current offset (source: chapter11, page 447).
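
The offset and commit mechanics above can be sketched with a toy in-memory model. This is only an illustration, not how Kafka is implemented: the Topic class, its method names, the CRC32 key hashing, and the "analytics" group name are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy in-memory topic: each partition is an append-only list of records."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        # Committed position per (group, partition): the next offset to read.
        self.committed = defaultdict(int)

    def append(self, key, value):
        # A stable hash keeps all records for one key in the same partition
        # (hypothetical choice of hash, for illustration only).
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[partition]
        log.append(value)
        return partition, len(log) - 1  # the offset is the position in the log

    def poll(self, group, partition, max_records=10):
        # Read sequentially, starting from the group's committed offset.
        start = self.committed[(group, partition)]
        return start, self.partitions[partition][start:start + max_records]

    def commit(self, group, partition, next_offset):
        # Record progress so that a restarted consumer resumes from here.
        self.committed[(group, partition)] = next_offset


topic = Topic(num_partitions=2)
for i in range(5):
    partition, offset = topic.append("sensor-1", f"reading-{i}")

start, batch = topic.poll("analytics", partition, max_records=3)
topic.commit("analytics", partition, start + len(batch))
start2, batch2 = topic.poll("analytics", partition, max_records=3)
```

After the first commit, the second poll resumes at offset 3. If the consumer had crashed before committing, it would have re-read the first batch, which is why processing in this model is at-least-once.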

Advantages

  • Throughput: By relying on sequential disk I/O and spreading partitions across multiple machines, log-based brokers can reach throughput on the order of millions of messages per second.
  • Replayability: Unlike traditional brokers (e.g., RabbitMQ), messages are not deleted once they are consumed. A consumer can “rewind” to an older offset and reprocess data, as long as that data has not yet been discarded by the broker's retention policy.
  • Durability: Messages are persisted to disk, providing a buffer that can withstand consumer crashes or slow processing.
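
Replay can be shown with a very small sketch, assuming a single partition modelled as a Python list. The consume function and the message names are hypothetical, chosen only to show that reading does not alter the log.

```python
log = ["order-created", "order-paid", "order-shipped"]  # one partition


def consume(log, from_offset, handler):
    """Read from from_offset to the end of the log; return the next offset."""
    offset = from_offset
    while offset < len(log):
        handler(offset, log[offset])
        offset += 1
    return offset


# First pass: process everything from the beginning.
seen = []
next_offset = consume(log, 0, lambda off, msg: seen.append((off, msg)))

# Reading removed nothing, so the consumer can rewind to offset 1
# and reprocess the same records.
replayed = []
consume(log, 1, lambda off, msg: replayed.append(msg))
```

The only state that changes between the two passes is the consumer's own starting offset. With a broker that deletes messages on acknowledgement, the second pass would find nothing to read.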

Limitations

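  • Fixed Partitioning: Within a consumer group, each partition is assigned to a single consumer, so the number of partitions caps how many consumers can share the work of a topic. Any consumers beyond that number sit idle.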
  • Head-of-line Blocking: If a single message in a partition is slow to process, it blocks all subsequent messages in that partition.
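
The partition limit can be made concrete with a round-robin assignment sketch. This is a simplified stand-in and not Kafka's actual assignment strategy; assign_partitions and the consumer names are assumptions for the example.

```python
def assign_partitions(num_partitions, consumers):
    """Give each whole partition to exactly one consumer, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, two consumers: each consumer owns two partitions.
balanced = assign_partitions(4, ["c1", "c2"])

# Two partitions, three consumers: one consumer gets nothing to do.
oversubscribed = assign_partitions(2, ["c1", "c2", "c3"])
idle = [c for c, parts in oversubscribed.items() if not parts]
```

Because a partition is never split between consumers, a slow message in partition 0 delays only the consumer that owns partition 0, but it delays every later message in that partition.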

Comparison with Traditional Brokers

Feature        | Traditional (JMS/AMQP)               | Log-based (Kafka)
---------------+--------------------------------------+------------------------------
Storage        | Messages deleted after ack           | Persistent log on disk
Consumption    | Destructive read                     | Read-only (non-destructive)
Ordering       | No guarantee with multiple consumers | Guaranteed within a partition
Load Balancing | Per-message                          | Per-partition