Data Locality

Summary: A performance optimization where related data is stored together on disk to minimize disk seeks and speed up reads.

Sources: raw/chapter2

Last updated: 2026-04-15


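Data locality is a key performance advantage of the document model. A document is usually stored as a single continuous string, encoded as JSON, XML, or a binary variant such as BSON, so an application that needs the whole document can retrieve it together. When the same data is split across multiple tables in a relational schema, retrieving it all requires multiple index lookups, which may mean more disk seeks and more time (source: chapter2, p. 41).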
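The difference in lookup count can be sketched with a toy model. Everything here is an assumption made for illustration: the in-memory dicts stand in for on-disk storage, each dict access stands in for one index lookup, and the names (document_store, fetch_document, fetch_relational) and sample profile are hypothetical.

```python
import json

# Hypothetical stand-in for a document store: one encoded string per user.
document_store = {
    251: json.dumps({
        "user_id": 251,
        "name": "Ada",
        "positions": [{"title": "Engineer"}, {"title": "Analyst"}],
        "education": [{"school": "Example University"}],
    })
}

# Hypothetical stand-ins for three relational tables indexed by user_id.
users = {251: {"name": "Ada"}}
positions_by_user = {251: [{"title": "Engineer"}, {"title": "Analyst"}]}
education_by_user = {251: [{"school": "Example University"}]}


def fetch_document(user_id):
    """Return (profile, lookups) from one contiguous encoded document."""
    raw = document_store[user_id]  # a single lookup yields the whole profile
    return json.loads(raw), 1


def fetch_relational(user_id):
    """Return (profile, lookups) by assembling rows from three tables."""
    profile = {"user_id": user_id}
    lookups = 0
    profile.update(users[user_id])
    lookups += 1
    profile["positions"] = positions_by_user[user_id]
    lookups += 1
    profile["education"] = education_by_user[user_id]
    lookups += 1
    return profile, lookups
```

Both functions return the same profile; the toy counter reports one lookup for the document layout and three for the multi-table layout. Real databases cache pages and batch reads, so the counts indicate the shape of the cost rather than measured performance.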

Trade-offs

  • Locality Advantage: Applies only when the application needs large parts of the document at the same time.
  • Wasteful Access: The database typically loads the entire document even when only a small part is needed, which is wasteful for large documents.
  • Update Overhead: An update usually requires rewriting the entire document; only modifications that leave its encoded size unchanged can easily be performed in place (source: chapter2, p. 41).
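The update trade-off above can be illustrated with a minimal sketch that classifies an update by whether the encoded size changes. It assumes a compact JSON encoding and a simplified rule (same byte length means the update could be patched in place); the function names encode and update_strategy are hypothetical, and real storage engines apply more detailed rules.

```python
import json


def encode(doc):
    """Encode a document as compact, key-sorted JSON bytes (an assumed format)."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def update_strategy(old_doc, new_doc):
    """Return 'in_place' if the encoded size is unchanged, else 'rewrite'."""
    if len(encode(old_doc)) == len(encode(new_doc)):
        return "in_place"
    return "rewrite"


old = {"name": "Ada", "city": "Oslo"}
same_size = {"name": "Ada", "city": "Rome"}      # 4 bytes replaced by 4 bytes
larger = {"name": "Ada", "city": "Reykjavik"}    # value grows by 5 bytes
```

Under these assumptions, update_strategy(old, same_size) yields "in_place" and update_strategy(old, larger) yields "rewrite", which is why writes that grow a document are the expensive case.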

Locality in Other Models

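Locality is not limited to document databases. Google Spanner offers it within a relational model by letting the schema declare that a table's rows are interleaved (nested) within a parent table. Oracle offers the same kind of grouping through multi-table index cluster tables, and the column-family concept in the Bigtable data model (used in Cassandra and HBase) serves a similar purpose (source: chapter2, p. 41).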
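The idea behind table interleaving can be sketched as a sorted key space in which child rows share their parent's key prefix, so one range scan returns a parent and its children from adjacent positions. This is a toy model under stated assumptions, not the actual storage layout of Spanner or any other system; the key shape (user_id, row_kind, position_id) and the function scan_user are hypothetical.

```python
from bisect import bisect_left

PARENT, CHILD = 0, 1

# Rows from two logical tables, stored in one key-sorted sequence.
# Keys are (user_id, row_kind, position_id); sorting places each user's
# position rows directly after that user's own row.
table = sorted([
    ((1, PARENT, 0), "user: Ada"),
    ((2, PARENT, 0), "user: Lin"),
    ((1, CHILD, 1), "position: Engineer"),
    ((2, CHILD, 1), "position: Analyst"),
    ((1, CHILD, 2), "position: Manager"),
])
keys = [key for key, _ in table]


def scan_user(user_id):
    """Return a user's row and child rows with one contiguous range scan."""
    start = bisect_left(keys, (user_id, PARENT, 0))
    end = bisect_left(keys, (user_id + 1, PARENT, 0))
    return [value for _, value in table[start:end]]
```

Here scan_user(1) returns the user row followed by both position rows as one contiguous slice, which is the locality property that interleaving is meant to provide.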