Percentiles
Summary: Statistics used to understand the distribution of response times.
Sources: chapter1
Last updated: 2026-04-15
Using an average (arithmetic mean) to measure response time is common but often misleading: a few very slow requests can pull the mean far from what most users see, and the mean does not tell you how many users actually experienced that delay. Percentiles are better for this purpose (source: chapter1).
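To illustrate how far the mean can drift from the typical experience, here is a minimal sketch. The response times are made-up values chosen for illustration, not measurements from the source.

```python
from statistics import mean, median

# Hypothetical response times in milliseconds:
# nine fast requests and one slow outlier.
response_times_ms = [20, 22, 23, 25, 25, 26, 27, 28, 30, 1500]

avg = mean(response_times_ms)
p50 = median(response_times_ms)

# Count how many requests were actually slower than the mean.
slower_than_mean = sum(1 for t in response_times_ms if t > avg)

print(f"mean:   {avg:.1f} ms")
print(f"median: {p50:.1f} ms")
print(f"requests slower than the mean: {slower_than_mean} of {len(response_times_ms)}")
```

In this sample the single outlier drags the mean well above every other request, while the median stays close to what nine of the ten requests experienced.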
Key Percentiles
- p50 (Median): Half of the requests return in less than this time, and half take longer. It represents the typical response time (source: chapter1).
- p95, p99, p99.9: These are tail latencies. They represent the experience of users who encounter the slowest responses (e.g., p99 = 1% of requests are slower than this) (source: chapter1).
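The percentiles listed above can be computed from a sorted list of response times. The following is a minimal sketch using the nearest-rank method, which is one of several common percentile definitions; the function name `percentile` and the sample data are assumptions made for illustration. Production monitoring systems typically use approximation algorithms over rolling windows rather than sorting every sample.

```python
import math

def percentile(sorted_times, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not sorted_times:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    # Round before taking the ceiling to guard against floating-point noise.
    rank = math.ceil(round(p * len(sorted_times) / 100, 9))
    return sorted_times[rank - 1]

# Hypothetical data: 1000 requests taking 1 ms, 2 ms, ..., 1000 ms.
times_ms = sorted(range(1, 1001))

for p in (50, 95, 99):
    print(f"p{p}: {percentile(times_ms, p)} ms")
```

With this definition, the p99 value is the response time that 99% of requests meet or beat, so roughly 1% of requests are slower.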
Why Tail Latencies Matter
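High percentiles matter because they directly affect users' experience of the service. The users with the slowest requests are often those with the most data on their accounts, for example because they have made many purchases, which makes them some of your most valuable customers (source: chapter1).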
Related Concepts
- Service Level Objectives (SLOs): Targets for the expected performance and availability of a service, often stated in terms of percentiles (source: chapter1).
- Service Level Agreements (SLAs): SLOs that are legally binding and may include financial penalties for failure to meet targets (source: chapter1).
- Tail Latency Amplification: When multiple backend calls are required to serve a single end-user request, just one slow backend request can make the entire end-user request slow (source: chapter1).
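Tail latency amplification can be quantified with a short calculation. The sketch below assumes that each backend call is slow independently with the same probability, which is a simplification (real backends often slow down together); the function name `prob_slow_request` is hypothetical.

```python
def prob_slow_request(num_backend_calls, slow_fraction):
    """Probability that at least one of the backend calls is slow,
    assuming each call is slow independently with probability slow_fraction."""
    return 1 - (1 - slow_fraction) ** num_backend_calls

# Assume 1% of backend calls are slow (i.e., slower than the backend's p99).
for n in (1, 10, 100):
    print(f"{n:>3} backend calls: {prob_slow_request(n, 0.01):.1%} of end-user requests are slow")
```

Under these assumptions, a backend that is slow only 1% of the time makes a majority of end-user requests slow once each request fans out to 100 backend calls.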