Search Indexes
Summary: Specialized data structures optimized for text-based queries, such as full-text search or prefix matching.
Sources: chapter3
Last updated: 2026-04-18
While primary indexes (like b-trees) are optimized for exact key lookups or range scans, search indexes allow for more complex queries, such as finding all documents containing a specific word.
Inverted Index
The most common structure for full-text search. It maps terms to the list of documents where they occur (source: chapter3).
- Implementation: Usually implemented using a sorted structure like a Term Dictionary (similar to an sstables index).
- Complexity: Supporting synonyms, word stemming, and misspelling tolerance adds significant complexity to the index structure.
Distributed Search Indexes
- Local Secondary Index: Each partition has its own search index (scatter/gather) (source: chapter6, p. 206).
- Global Secondary Index: A search index that is partitioned independently of the data, allowing for term-partitioned queries (source: chapter6, p. 208).
Tools
Popular open-source search engines like Apache Lucene, Elasticsearch, and Solr use inverted indexes as their core data structure.