Bloom Filters

Summary: A memory-efficient data structure used to approximate the contents of a set, helping to avoid unnecessary disk reads for non-existent keys in log-structured storage engines.

Sources: chapter3

Last updated: 2026-04-15


Mechanism

Bloom filters can tell you if a key might be in the database or if it is definitely not there. This allows a storage engine to skip checking multiple segments on disk if the Bloom filter indicates the key is absent (source: chapter3).

Application

They are particularly important for lsm-trees-based engines, where looking up non-existent keys can be slow because every segment may need to be checked from newest to oldest (source: chapter3).