Search Indexes

Summary: Specialized data structures optimized for text-based queries, such as full-text search or prefix matching.

Sources: chapter3

Last updated: 2026-04-18


While primary indexes (like b-trees) are optimized for exact key lookups or range scans, search indexes allow for more complex queries, such as finding all documents containing a specific word.

Inverted Index

The most common structure for full-text search. It maps terms to the list of documents where they occur (source: chapter3).

  • Implementation: Usually implemented using a sorted structure like a Term Dictionary (similar to an sstables index).
  • Complexity: Supporting synonyms, word stemming, and misspelling tolerance adds significant complexity to the index structure.

Distributed Search Indexes

  • Local Secondary Index: Each partition has its own search index (scatter/gather) (source: chapter6, p. 206).
  • Global Secondary Index: A search index that is partitioned independently of the data, allowing for term-partitioned queries (source: chapter6, p. 208).

Tools

Popular open-source search engines like Apache Lucene, Elasticsearch, and Solr use inverted indexes as their core data structure.