Compaction
Summary: A background process in log-structured storage engines that reclaims disk space by merging segments and discarding obsolete or deleted values.
Sources: chapter3
Last updated: 2026-04-15
Process
In append-only storage, every update to a key appends a new record to the end of the log rather than overwriting the old one, so a segment can accumulate several records for the same key. Compaction rewrites a segment so that only the most recent value for each key is retained (source: chapter3).
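A minimal sketch of compacting one segment, assuming records are (key, value) pairs held in memory in append order. The `TOMBSTONE` sentinel and the `compact_segment` name are hypothetical and used only for illustration:

```python
# Hypothetical marker meaning "this key was deleted"; not taken from the source.
TOMBSTONE = object()

def compact_segment(records):
    """Return one record per key, keeping the value that was written last."""
    latest = {}
    for key, value in records:
        # Records are in append order, so a later record overwrites an earlier one.
        latest[key] = value
    # Drop keys whose most recent record is a deletion marker.
    return [(k, v) for k, v in latest.items() if v is not TOMBSTONE]

segment = [("cat", 1), ("dog", 5), ("cat", 2),
           ("dog", TOMBSTONE), ("emu", 7), ("cat", 3)]
print(compact_segment(segment))  # [('cat', 3), ('emu', 7)]
```

Six records shrink to two: the older values for "cat" are obsolete, and "dog" disappears because its latest record is a deletion.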
Segment Merging
Because compaction usually makes segments much smaller, several segments are often merged in the same pass. Segments are never modified in place: the result of a merge is written to a new file, reads can continue to be served from the old segments while the merge runs, and the old segments are deleted only after the merge completes (source: chapter3).
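The write-new-then-delete-old ordering can be sketched as follows. This is an illustration under assumptions, not a real engine's implementation: segments are small text files with one `key,value` record per line, they are passed oldest first, and the `merge_segments` name and `.tmp` suffix are hypothetical.

```python
from pathlib import Path

def merge_segments(segment_paths, output_path):
    """Merge segments (ordered oldest to newest) into a new file, then delete the inputs."""
    latest = {}
    for path in segment_paths:
        # Oldest segment is read first, so values from newer segments win.
        for line in Path(path).read_text().splitlines():
            key, value = line.split(",", 1)
            latest[key] = value

    # Write the merged result to a separate file; the inputs are never modified.
    tmp_path = Path(str(output_path) + ".tmp")
    tmp_path.write_text("".join(f"{k},{v}\n" for k, v in latest.items()))
    tmp_path.replace(output_path)  # publish the finished segment under its final name

    # Only after the new segment is complete are the old ones removed.
    for path in segment_paths:
        Path(path).unlink()
    return Path(output_path)
```

Deleting the inputs last means a crash partway through the merge leaves the original segments intact, at the cost of temporarily needing disk space for both copies.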
Strategies
Common strategies include:
- Size-tiered compaction: Newer and smaller segments are merged into older and larger ones (source: chapter3).
- Leveled compaction: The key range is split into smaller SSTables, and older data is moved into separate “levels”, which lets compaction proceed more incrementally and use less disk space (source: chapter3).
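As a rough illustration of the size-tiered idea, the sketch below groups segments into tiers of similar size and picks the smallest tier that has enough members to merge. It is a toy model, not the algorithm of any particular database: the tier boundaries (powers of `base`), the `min_segments` threshold, and both function names are assumptions made for the example.

```python
from collections import defaultdict

def tier_of(size, base):
    """Tier 0 holds sizes below base, tier 1 holds [base, base**2), and so on."""
    tier = 0
    while size >= base:
        size //= base
        tier += 1
    return tier

def pick_size_tiered_merge(segment_sizes, min_segments=4, base=4):
    """Return the names of segments to merge next, or [] if no tier is full enough."""
    tiers = defaultdict(list)
    for name, size in segment_sizes.items():
        tiers[tier_of(size, base)].append(name)
    # Prefer the smallest tier, so new small segments fold into larger ones over time.
    for tier in sorted(tiers):
        if len(tiers[tier]) >= min_segments:
            return sorted(tiers[tier])
    return []

sizes = {"a": 2, "b": 3, "c": 1, "d": 2, "e": 40, "f": 100}
print(pick_size_tiered_merge(sizes))  # ['a', 'b', 'c', 'd']
```

Merging the four small segments produces one larger segment that lands in a higher tier, which is how newer, smaller segments end up folded into older, larger ones in this model.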