This specification introduces Server Timing, which enables the server to communicate performance metrics about the request, and an interface to allow JavaScript mechanisms to collect and process server-sent performance metrics for the document and its resources.
Accurately measuring performance characteristics of web applications is an important aspect of making web applications faster. [[!NavigationTiming]] and [[!ResourceTiming]] interfaces provide detailed timing information for the document and its resources, which includes time when the request was initiated, and time when first and last response byte were received. However, while the server processing time can be a significant fraction of the total request time, the user agent does not know where or how the time is spent on the server.
Server Timing consists of two parts: a definition of the `Server-Timing` HTTP response header, which allows the server to communicate custom performance metrics in a well-defined format, and a `ServerEntry` interface, together with a definition of how the `Server-Timing` HTTP response header is parsed to create associated metrics in the performance timeline.
The `Server-Timing` response header field describes the server metrics associated with processing of the request. The server is not required to provide server metrics and is in full control of which metrics are returned, when, and to whom - e.g. the server may provide detailed metrics only to correctly authenticated users, a different set of metrics to others, or none at all.
The `Server-Timing` field-value is a comma-delimited list of metrics, with a mandatory metric `name` and optional `duration` and `description` values for each. For example, the following response header communicates three metrics:
Server-Timing: db=53, app=47;customView, dc;atl
Name | Value | Description |
---|---|---|
db | 53 | |
app | 47 | customView |
dc | | atl |
The above metrics may indicate that the time to query the database ("db") was 53 ms, that the application server ("app") took 47 ms to process the data (via the "customView" template or function), and that the request was routed through the "atl" datacenter ("dc").
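For illustration, the header value above can be parsed with straightforward string splitting. This is a simplified sketch (the function name is hypothetical, and a conforming parser would also need to handle malformed input and other edge cases):

```javascript
// Parse a Server-Timing field-value into {name, duration, description} records.
// Each metric is "name", optionally followed by "=duration" and/or ";description".
function parseServerTiming(fieldValue) {
  return fieldValue.split(",").map((metric) => {
    const [nameAndValue, description] = metric.trim().split(";").map((s) => s.trim());
    const [name, value] = nameAndValue.split("=").map((s) => s.trim());
    return {
      name,
      duration: value !== undefined ? parseFloat(value) : undefined,
      description,
    };
  });
}
```

Applied to the example header, `parseServerTiming("db=53, app=47;customView, dc;atl")` yields three records matching the table above.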
...
Server processing time can be a significant fraction of the total request time. For example, a dynamic response may require one or more database queries, cache lookups, API calls, time to process relevant data and render the response, and so on. Similarly, even a static response can be delayed due to overloaded servers, slow caches, or other reasons.
Today, the user agent developer tools are able to show when the request was initiated, and when the first and last bytes of the response were received. However, there is no visibility into where or how the time was spent on the server, which means that the developer is unable to quickly diagnose if there is a performance bottleneck on the server, and if so, in which component. Today, to answer this question, the developer is required to use different techniques: check the server logs, embed performance data within the response (if possible), use external tools, and so on. This makes identifying and diagnosing performance bottlenecks hard, and in many cases impractical.
Server Timing defines a standard mechanism that enables the server to communicate relevant performance metrics to the client and allows the client to surface them directly in the developer tools - e.g. requests can be annotated with server-sent metrics to provide insight into where or how the time was spent while generating the response.
In addition to surfacing server-sent performance metrics in the developer tools, a standard JavaScript interface enables analytics tools to automatically collect, process, beacon, and aggregate these metrics for operational and performance analysis.
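As a sketch of what such collection might look like, the hypothetical helper below aggregates server-sent metrics by name across multiple entries; the commented-out portion shows how a browser `PerformanceObserver` could feed it (the `serverTiming` property name reflects one possible shape of the JavaScript interface, not a normative definition):

```javascript
// Aggregate server-sent metrics by name: given an array of per-entry metric
// lists, sum the reported durations for each metric name.
function aggregateServerTiming(serverTimingLists) {
  const totals = new Map();
  for (const list of serverTimingLists) {
    for (const { name, duration = 0 } of list) {
      totals.set(name, (totals.get(name) || 0) + duration);
    }
  }
  return totals;
}

// In a browser, a PerformanceObserver could feed this helper (sketch):
// new PerformanceObserver((list) => {
//   const lists = list.getEntries().map((e) => e.serverTiming || []);
//   const totals = aggregateServerTiming(lists);
//   // ...beacon `totals` to an analytics endpoint
// }).observe({ entryTypes: ["navigation", "resource"] });
```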
Server Timing enables origin servers to communicate performance metrics about where or how time is spent while processing the request. However, the same request and response may also be routed through one or more proxies (e.g. cache servers, load balancers, and so on), each of which may introduce its own delays and may want to provide insight into where or how its time is spent.
For example, a CDN edge node may want to report which data center was used, whether the resource was available in cache, and how long it took to retrieve the response from cache or from the origin server. Further, the same process may be repeated by other proxies, thus allowing full end-to-end visibility into how the request was routed and where the time was spent.
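A proxy that wants to report its own metrics can append them to any `Server-Timing` value already present on the response. A minimal sketch, using the metric syntax shown earlier (the function name and the "cdn-cache" metric are hypothetical):

```javascript
// Append one metric to an existing Server-Timing field-value, building
// "name", "name=duration", or "name=duration;description" as appropriate.
function appendServerTiming(existing, name, duration, description) {
  let metric = name;
  if (duration !== undefined) metric += `=${duration}`;
  if (description !== undefined) metric += `;${description}`;
  return existing ? `${existing}, ${metric}` : metric;
}
```

For example, a CDN edge node might turn an origin's `db=53` into `db=53, cdn-cache=5;HIT` before forwarding the response downstream.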
Similarly, when a Service Worker is active, some or all of the navigation and resource requests may be routed through it. Effectively, an active Service Worker is a local proxy that is able to reroute requests, serve cached responses, synthesize responses, and more. As a result, Server Timing enables a Service Worker to report custom performance metrics about how the request was processed: whether it was fetched from the server or served from local cache, the duration of the relevant processing steps, and so on.
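To sketch how a Service Worker might attach such a metric, the hypothetical helper below clones a response and appends a `Server-Timing` value before it is returned from a fetch handler (the helper name and the "sw-cache" metric are illustrative, not part of the specification):

```javascript
// Clone a Response and append a Server-Timing metric to its headers,
// so a Service Worker can report how it handled the request.
async function annotateResponse(response, metric) {
  const headers = new Headers(response.headers);
  headers.append("Server-Timing", metric);
  return new Response(await response.blob(), {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}

// Inside a Service Worker's fetch handler (sketch):
// event.respondWith(
//   caches.match(event.request).then((cached) =>
//     cached
//       ? annotateResponse(cached, "sw-cache;hit")
//       : fetch(event.request).then((r) => annotateResponse(r, "sw-cache;miss"))
//   )
// );
```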
...