Latency and Response Time
Summary: Measuring performance as a distribution of response times rather than a single number.
Sources: chapter1
Last updated: 2026-04-15
In online systems, the service’s response time—the time between a client sending a request and receiving a response—is often more important than its throughput.
- Response Time: What the client sees. Besides the actual time to process the request (the service time), it includes network delays and queueing delays (source: chapter1).
- Latency: The duration that a request is waiting to be handled—during which it is latent, awaiting service (source: chapter1).
Measuring Response Time
Response times can vary widely, even between identical requests. It is therefore better to think of response time not as a single number, but as a distribution of values that we can measure. This distribution can be visualized using histograms or summarized using percentiles (e.g., the median p50, or the tail percentiles p95 and p99).
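As a minimal sketch (not from the source), the percentile summary described above can be computed by sorting the measured samples and reading off values at the appropriate ranks; the sample data and the nearest-rank method here are illustrative assumptions:

```python
import math

def percentile(samples, p):
    """Return the p-th percentile (0-100) of samples using nearest-rank."""
    ordered = sorted(samples)
    # Nearest-rank: the ceil(p/100 * n)-th smallest value, clamped to a valid index.
    k = max(0, min(len(ordered) - 1, math.ceil(p / 100 * len(ordered)) - 1))
    return ordered[k]

# Hypothetical response-time measurements in milliseconds; note the long tail.
response_times_ms = [12, 14, 15, 15, 17, 21, 23, 45, 120, 800]
for p in (50, 95, 99):
    print(f"p{p}: {percentile(response_times_ms, p)} ms")
```

The median (p50) tells you how long the typical request takes, while the high percentiles expose the slow outliers that a mean would hide; with a small sample like this, p95 and p99 both land on the worst observed value.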