Chapter 2: Data Models and Query Languages

Summary: This chapter explores the history and trade-offs of different data models, including relational, document, and graph-based models, and their respective query languages.

Sources: raw/chapter2

Last updated: 2026-04-15


Data models are a critical part of software development because they shape not only how the software is written but also how we think about the problem we are solving. Most applications are built by layering one data model on top of another, from application-level objects down to bytes on a disk, with each layer hiding the complexity of the layer below it (source: chapter2, p. 27).

Relational vs. Document Model

The relational-model (SQL) dominated for decades, but the NoSQL movement popularized the document-model to address needs for greater scalability, schema flexibility, and better data-locality (source: chapter2, p. 29).

  • impedance-mismatch: The friction between object-oriented application code and the table/row structure of relational databases (source: chapter2, p. 29).
  • schema-on-read: Document databases typically rely on an implicit schema that is interpreted only when data is read, whereas relational databases use an explicit schema that the database enforces when data is written (schema-on-write) (source: chapter2, p. 39).
  • normalization: The process of removing duplication by storing each piece of data in one place and referencing it via IDs. It depends on many-to-one relationships and joins, which the relational model supports well (source: chapter2, p. 33).
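
The contrast between normalization and schema-on-read can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the table names, field names, and sample user are all made up, and an in-memory SQLite database stands in for a relational store while a JSON string stands in for a stored document.

```python
import json
import sqlite3

# Relational side (hypothetical schema): the region name is stored once
# and referenced by ID, which is the idea behind normalization.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,          -- schema-on-write: enforced on insert
        region_id INTEGER REFERENCES regions(id)
    );
    INSERT INTO regions VALUES (1, 'Example Region');
    INSERT INTO users VALUES (100, 'Ada', 1);
""")


def load_user_relational(user_id):
    """Resolve the region ID back to a name with a join."""
    row = conn.execute(
        "SELECT u.first_name, r.name FROM users u "
        "JOIN regions r ON r.id = u.region_id WHERE u.id = ?",
        (user_id,),
    ).fetchone()
    return {"first_name": row[0], "region": row[1]}


# Document side (hypothetical document): everything lives in one
# self-contained record, and nothing checked its shape when it was written.
OLD_FORMAT_DOC = json.dumps(
    {"user_id": 100, "name": "Ada Lovelace", "region": "Example Region"}
)


def load_user_document(raw):
    """Schema-on-read: the reader copes with older document shapes."""
    doc = json.loads(raw)
    if "first_name" not in doc and "name" in doc:
        # Older documents stored a single full-name field.
        doc["first_name"] = doc["name"].split(" ")[0]
    return {"first_name": doc["first_name"], "region": doc["region"]}
```

Both loaders return the same application-level view. The difference is where the structure is enforced: by the database at write time in the relational case, and by application code at read time in the document case.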

Query Languages for Data

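Approaches to querying can be categorized as declarative or imperative. Declarative-query-languages (like SQL, Cypher, and SPARQL) describe what data is wanted and leave the choice of how to retrieve it to the query optimizer. Imperative-query-languages spell out the retrieval steps in order, as in the imperative code used to query IMS and CODASYL databases (source: chapter2, p. 42).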

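  • mapreduce: A programming model for processing large amounts of data in bulk across many machines. It is neither fully declarative nor fully imperative but somewhere in between, and a limited form of it is supported by some NoSQL datastores (source: chapter2, p. 46).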
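
The three styles above can be compared side by side on a toy dataset. The following is a rough sketch under stated assumptions: the animal records, the function names, and the tiny single-process `map_reduce` helper are all hypothetical, and the helper only imitates the map/group/reduce shape rather than any real datastore's MapReduce API.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (name, family) pairs.
ANIMALS = [("great white", "Sharks"), ("lion", "Cats"), ("hammerhead", "Sharks")]


def sharks_imperative(animals):
    """Imperative: spell out the loop, the test, and the accumulation."""
    result = []
    for name, family in animals:
        if family == "Sharks":
            result.append(name)
    return result


def sharks_declarative(animals):
    """Declarative: state the condition in SQL and let the engine decide how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
    conn.executemany("INSERT INTO animals VALUES (?, ?)", animals)
    rows = conn.execute(
        "SELECT name FROM animals WHERE family = 'Sharks' ORDER BY name"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def map_reduce(records, map_fn, reduce_fn):
    """In between: the caller supplies two small functions, and the
    framework (here a toy, single-process one) handles the grouping."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_family(record):
    """Map step: emit one (family, 1) pair per animal."""
    _name, family = record
    yield family, 1


def count(_key, values):
    """Reduce step: add up the emitted ones for a family."""
    return sum(values)
```

Calling `map_reduce(ANIMALS, emit_family, count)` counts animals per family. Because the map and reduce functions are pure functions of their inputs, a real framework is free to run them in any order and on any machine, which is what makes the model suitable for distributed execution.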

Graph-Like Data Models

For highly interconnected data with many-to-many relationships, graph-models are often more natural than relational or document models (source: chapter2, p. 49).

  • cypher: A declarative query language for the property graph model (source: chapter2, p. 52).
  • sparql: A query language for the triple-store model, often associated with the semantic-web (source: chapter2, p. 59).
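  • datalog: An older, rule-based query language for graph-like data. Queries are built up from small rules that can refer to each other, and its ideas provide a foundation that later query languages build on (source: chapter2, p. 60).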
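
To make the triple-store and rule-based ideas more concrete, here is a small Python sketch. It is an illustration only and does not use Cypher, SPARQL, or Datalog syntax: the triples, the predicate names `within` and `born_in`, and both functions are assumptions chosen for the example. The first function imitates a recursive rule by deriving new facts until nothing changes.

```python
# Hypothetical facts stored as (subject, predicate, object) triples.
TRIPLES = {
    ("idaho", "within", "usa"),
    ("usa", "within", "north_america"),
    ("lucy", "born_in", "idaho"),
}


def within_recursive(triples):
    """Derive every (place, enclosing_place) pair implied by 'within'.

    Mimics a recursive rule: if A is within B and B is within C,
    then A is within C. The rule is reapplied until no new facts appear.
    """
    facts = {(s, o) for s, p, o in triples if p == "within"}
    while True:
        derived = {(a, c) for a, b in facts for b2, c in facts if b == b2}
        if derived <= facts:
            return facts
        facts |= derived


def born_within(triples, place):
    """Find everyone born in `place` or in any location nested inside it."""
    closure = within_recursive(triples)
    return sorted(
        s
        for s, p, o in triples
        if p == "born_in" and (o == place or (o, place) in closure)
    )
```

With these facts, `born_within(TRIPLES, "north_america")` finds `lucy` even though no triple links her to the continent directly. Following a chain of edges of unknown length like this is the kind of query that graph query languages express concisely.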