Normalization

Summary: The process of removing data duplication by storing information in one place and referencing it via unique identifiers (IDs).

Sources: raw/chapter2

Last updated: 2026-04-15


The key idea behind normalization is that anything meaningful to humans (such as a name stored as a string) may need to change at some point, whereas an ID has no meaning to humans and can therefore stay the same even when the information it identifies changes (source: chapter2, p. 33).
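
A minimal sketch of this idea in Python. The `regions` and `users` structures, the ID value, and the names are hypothetical and chosen only for illustration; they are not taken from the source.

```python
# Hypothetical lookup table: the ID is the key, the human-readable name is stored once.
regions = {91: "Greater Seattle Area"}

# Records reference the region by ID instead of copying the name.
users = [
    {"name": "Alice", "region_id": 91},
    {"name": "Bob", "region_id": 91},
]


def region_name(user):
    """Resolve a user's region ID to the region's current display name."""
    return regions[user["region_id"]]


# Renaming the region is a single write. The ID, and every reference to it, is untouched.
regions[91] = "Seattle Metropolitan Area"
names_after_rename = [region_name(u) for u in users]
```

Because the user records hold only the ID, none of them had to be rewritten when the name changed.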

Benefits

  • Consistency: A name is updated in one place, and every record that references its ID picks up the change, so copies cannot drift out of sync.
  • Ambiguity Removal: IDs can distinguish between different entities with the same name.
  • Localization: Standardized lists can be localized into different languages more easily (source: chapter2, p. 33).
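
The ambiguity and localization benefits can be illustrated with a small sketch. The `CITY_NAMES` table, its IDs, and the `display_name` helper are assumptions made for this example, not something the source defines.

```python
# Hypothetical standardized list: each ID maps to per-language display names.
CITY_NAMES = {
    101: {"en": "Munich", "de": "München"},
    201: {"en": "Springfield"},  # e.g. the Springfield in Illinois
    202: {"en": "Springfield"},  # e.g. the Springfield in Missouri
}


def display_name(city_id, lang="en"):
    """Look up the display name for an ID, falling back to English."""
    names = CITY_NAMES[city_id]
    return names.get(lang, names["en"])


# Two profiles whose cities look identical on screen but are different entities.
profile_a = {"user": "Alice", "city_id": 201}
profile_b = {"user": "Bob", "city_id": 202}

same_label = display_name(profile_a["city_id"]) == display_name(profile_b["city_id"])
same_city = profile_a["city_id"] == profile_b["city_id"]

# The same ID renders differently depending on the reader's language.
localized = display_name(101, lang="de")
```

Here `same_label` is true while `same_city` is false: the names collide, but the IDs still tell the two cities apart. Adding a language only means extending the lookup table, not touching the profiles.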

Denormalization

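Denormalization is the opposite of normalization: information is deliberately duplicated rather than referenced by ID. It is often used in the document model to improve read performance through data locality, at the cost of potential inconsistencies and extra write overhead (source: chapter2, pp. 34, 39).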
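
A rough sketch of the trade-off, assuming hypothetical `organizations` and `profiles` collections where each profile document embeds a copy of the organization name:

```python
# Canonical record: the organization name as stored in one place.
organizations = {7: "Acme Corp"}

# Denormalized documents: each embeds a copy of the name, so a read needs no lookup.
profiles = [
    {"user": "Alice", "org_id": 7, "org_name": "Acme Corp"},
    {"user": "Bob", "org_id": 7, "org_name": "Acme Corp"},
]


def rename_organization(org_id, new_name):
    """Update the canonical name and every embedded copy.

    Returns the number of profile documents that had to be rewritten.
    """
    organizations[org_id] = new_name
    rewritten = 0
    for profile in profiles:
        if profile["org_id"] == org_id:
            profile["org_name"] = new_name
            rewritten += 1
    return rewritten


rewritten = rename_organization(7, "Acme Inc")
```

Reading a profile is cheap because the name is already in the document, but one rename turns into one write per copy. If any copy were skipped, the documents would disagree about the organization's name.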