Bloom Filters
Summary: A memory-efficient data structure used to approximate the contents of a set, helping to avoid unnecessary disk reads for non-existent keys in log-structured storage engines.
Sources: chapter3
Last updated: 2026-04-15
Mechanism
Bloom filters can tell you if a key might be in the database or if it is definitely not there. This allows a storage engine to skip checking multiple segments on disk if the Bloom filter indicates the key is absent (source: chapter3).
Application
They are particularly important for lsm-trees-based engines, where looking up non-existent keys can be slow because every segment may need to be checked from newest to oldest (source: chapter3).