Hash Indexes
Summary: A simple indexing strategy where an in-memory hash map stores the byte offsets of keys in an append-only log file on disk.
Sources: chapter3
Last updated: 2026-04-15
Mechanism
In a key-value store, the simplest indexing strategy is to maintain an in-memory hash map where every key maps to a byte offset in a data file. When a value is updated, the new data is appended to the log, and the hash map is updated with the new offset (source: chapter3).
This approach is used by storage engines like Bitcask. It provides high-performance reads and writes, provided all keys fit in the available RAM (source: chapter3).
Compaction
To prevent the log from growing indefinitely, the log is broken into segments. A background process performs compaction, throwing away duplicate keys and keeping only the most recent update (source: chapter3).
Limitations
- Memory Constraints: All keys must fit in RAM. Maintaining the hash map on disk is difficult to make performant due to random access I/O and collision handling (source: chapter3).
- Range Queries: Range queries are inefficient because you must look up each key individually in the hash map (source: chapter3).