Data Warehousing
Summary: A data warehouse is a separate database that contains a read-only copy of data from various transactional systems, optimized for analytical queries without affecting OLTP operations.
Sources: chapter3
Last updated: 2026-04-15
Purpose
Large enterprises often have many different oltp systems. Running expensive analytical queries directly on these databases can harm the performance of concurrent transactions. A data warehouse provides a separate environment for analysts to run queries (source: chapter3).
ETL Process
Data is moved from OLTP systems to the warehouse through a process called Extract-Transform-Load (ETL):
- Extract: Periodic data dumps or continuous updates from OLTP databases (source: chapter3).
- Transform: Data is cleaned and transformed into an analysis-friendly schema (source: chapter3).
- Load: The transformed data is loaded into the warehouse (source: chapter3).
Schemas for Analytics
Analytical databases often use formulaic schemas like the star schema (dimensional modeling), consisting of a large fact table at the center surrounded by smaller dimension tables (source: chapter3).