Scalability

Summary: A system’s ability to cope with increased load.

Sources: chapter1

Last updated: 2026-04-15

Scalability is not a one-dimensional label like “X is scalable.” Instead, discussing scalability means considering questions like “If the system grows in a particular way, what are our options for coping with the growth?” and “How can we add computing resources to handle the additional load?” (source: chapter1).

Key Concepts

load-parameters: Succinctly describe the current load on the system (requests per second, ratio of reads to writes, simultaneously active users, etc.).
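latency-and-response-time: Measuring how performance changes when load increases. Response time is what the client sees (service time plus network and queueing delays); latency is the time a request spends waiting to be handled.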
percentiles: Using p50 (median), p95, p99, and p99.9 to understand the distribution of response times.
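As an illustration of the percentiles entry above, here is a minimal sketch that computes p50, p95, p99, and p99.9 from a list of response times. The sample values and the `percentile` helper are hypothetical, and the nearest-rank method is only one of several common ways to define a percentile.

```python
import math

def percentile(sorted_times, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not sorted_times:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p / 100 * len(sorted_times)))
    return sorted_times[rank - 1]

# Hypothetical response times in milliseconds, with two slow outliers.
response_times_ms = sorted([12, 15, 11, 240, 14, 13, 16, 18, 900, 17])

summary = {
    name: percentile(response_times_ms, p)
    for name, p in [("p50", 50), ("p95", 95), ("p99", 99), ("p99.9", 99.9)]
}
mean_ms = sum(response_times_ms) / len(response_times_ms)

print(summary)
print(mean_ms)
```

With these made-up samples the median stays close to the typical request, while the mean and the high percentiles are dominated by the two outliers. That is why the distribution is described with percentiles rather than a single average.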

Strategies for Coping with Load

Scaling Up (Vertical Scaling): Moving to a more powerful machine.
Scaling Out (Horizontal Scaling): Distributing the load across multiple smaller machines (also known as a shared-nothing architecture).
replication: Increasing read throughput by serving queries from multiple read replicas (source: chapter5, p. 151).
Elastic Systems: Automatically adding computing resources when a load increase is detected (source: chapter1).
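The elastic-systems idea above can be sketched as a simple sizing rule: given a load parameter (requests per second) and an assumed per-machine capacity, work out how many machines are needed. The function name, the capacity figure, and the headroom fraction are all assumptions made for illustration; a real autoscaler would also smooth its measurements and limit how fast it scales.

```python
import math

def machines_needed(requests_per_sec, capacity_per_machine, headroom=0.2):
    """Return the number of machines needed to serve the given load.

    capacity_per_machine: requests per second one machine can handle (assumed).
    headroom: fraction of each machine's capacity kept in reserve (assumed).
    """
    if capacity_per_machine <= 0:
        raise ValueError("capacity_per_machine must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = capacity_per_machine * (1 - headroom)
    # Always keep at least one machine running, even with no load.
    return max(1, math.ceil(requests_per_sec / usable))

# Hypothetical load samples: the fleet grows as load increases.
for load in (0, 300, 1000, 2500):
    print(load, machines_needed(load, capacity_per_machine=500))
```

The same rule describes scaling out by hand: an operator reads the load parameter and adds machines manually, whereas an elastic system re-evaluates it automatically whenever a load increase is detected.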
