Latency and Response Time

Summary: Measuring performance as a distribution of values when load increases.

Sources: chapter1

Last updated: 2026-04-15


In online systems, the service’s response time—the time between a client sending a request and receiving a response—is often more important than its throughput. Latency and response time are often used synonymously, but they are not the same (source: chapter1).

  • Response Time: What the client sees. Besides the actual time to process the request (the service time), it includes network delays and queueing delays (source: chapter1).
  • Latency: The duration that a request is waiting to be handled—during which it is latent, awaiting service (source: chapter1).

Measuring Response Time

Response times can vary widely, even between identical requests. It is therefore better to think of response time not as a single number, but as a distribution of values that we can measure. This distribution can be visualized as a histogram, or summarized with percentiles: the median (p50) describes the typical request, while high percentiles such as p95 and p99 describe how bad the outliers (the tail) are. Averages are a poor summary because a handful of slow outliers can inflate the mean well above what most users experience.
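The idea above can be sketched in a few lines. This is an illustrative example with simulated timings, not from the source: most requests cluster around 100 ms, but a few slow outliers drag the mean upward while the median stays close to the typical request.

```python
import random
import statistics

# Hypothetical response times in milliseconds: 990 "normal" requests
# plus 10 slow outliers (e.g. hitting a cold cache or a GC pause).
random.seed(0)
response_times = (
    [random.gauss(100, 20) for _ in range(990)]
    + [random.uniform(500, 2000) for _ in range(10)]
)

def percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

mean = statistics.mean(response_times)
p50 = percentile(response_times, 50)   # median: the typical request
p99 = percentile(response_times, 99)   # tail: 99% of requests are faster

# The mean is inflated by the 10 outliers; p50 stays near 100 ms,
# which is what most clients actually experience.
print(f"mean={mean:.1f}ms  p50={p50:.1f}ms  p99={p99:.1f}ms")
```

In practice you would not sort the full list of timings on every request; monitoring systems keep a rolling window or an approximate histogram and read percentiles from that.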