The definitive guide to HTTP caching

Caching is a mechanism that lets web services reduce their bandwidth usage by not retransmitting data that has already been transferred. Understanding how caching works is essential to making modern web applications efficient.

A primer on caching

Using caches is an optimization for applications. In the HTTP ecosystem, there are two different types of caches to consider:

Browsers: When a visitor opens a website and clicks through its pages, a lot of resources remain stable; stylesheets, scripts and images are the most common. It is unnecessary to re-download those on every page load, so browsers benefit from caching them locally and only requesting the parts of the new page that weren't downloaded previously.

Reverse proxies: Websites may choose to deploy a reverse proxy in front of them that takes the public traffic and proxies requests. A common example of this is a CDN serving website assets from geographically near servers to reduce latency and distribute traffic among multiple endpoints. CDNs will typically request missing assets from the origin server, then serve a cached copy for as long as allowed.

It is important to understand that caching is a complex optimization mechanism: it doesn't merely reduce traffic on the origin server by not resending data, but also reduces load on the disk (from reading asset files) and RAM (for frequently used files). Even CPU load will decrease significantly, because traffic is almost always encrypted using TLS, which relies on computation-heavy math.

HTTP resource validation

In order to use caching, HTTP servers must include some information that tells whether the resource differs from a cached version. This can be achieved through one of two methods:

Setting an ETag header. The value of this header is an opaque string enclosed in double quotes and must be unique for every version of the resource. Using a hash of the contents is common because it meets the requirements and is cheap to compute, but any string works, including timestamps or version numbers. Example: ETag: "1hf9172f63bf0187921"
Setting a Last-Modified header. This header holds the timestamp of the last change to the resource, in the datetime format specified by RFC 7231, for example Last-Modified: Sun, 06 Nov 1994 08:49:37 GMT
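
As a minimal sketch of how a server might produce both validators, the Python snippet below derives an ETag from a hash of the response body and formats a modification time as an HTTP date. The function names make_etag and make_last_modified are hypothetical, and truncating a SHA-256 digest to 16 characters is an arbitrary choice for illustration, not a requirement.

```python
import hashlib
from email.utils import formatdate


def make_etag(body: bytes) -> str:
    """Hypothetical helper: build a quoted ETag from a hash of the body."""
    # Truncating the digest is an assumption made to keep the header short.
    digest = hashlib.sha256(body).hexdigest()[:16]
    return f'"{digest}"'


def make_last_modified(mtime: float) -> str:
    """Hypothetical helper: format a Unix timestamp as an HTTP date."""
    # usegmt=True emits the "GMT" suffix that HTTP dates use.
    return formatdate(mtime, usegmt=True)
```

For a file on disk, the timestamp could come from os.stat(path).st_mtime.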

When a server sends one of those header fields, the client (or reverse proxy) may cache the response (depending on the Cache-Control contents, which we ignore here to focus on validation).

Assuming a client sends this request:

GET /contact HTTP/1.1
Host: example.com

And gets this response:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 59500
Date: Wed, 18 Dec 2024 12:59:23 GMT
ETag: "o6unkxtVmzd3TMWqeXeDHw"

<response body ...>

The next time a request is made to this location, the client may choose to reuse the downloaded contents from last time and only validate that the version on the server hasn't changed in the meantime. It can check this by sending a GET request with the previous ETag value in the If-None-Match header field:

GET /contact HTTP/1.1
Host: example.com
If-None-Match: "o6unkxtVmzd3TMWqeXeDHw"

The server can answer in one of two ways: if the resource changed, it responds normally with the resource contents in the body. If the resource did not change, it skips the body and only responds with the headers, changing the status code to 304 Not Modified to signal that the cached version is up to date:

HTTP/1.1 304 Not Modified
Content-Type: text/html; charset=utf-8
Content-Length: 59500
Date: Wed, 18 Dec 2024 12:59:23 GMT
ETag: "o6unkxtVmzd3TMWqeXeDHw"

Note the body of the response is empty in this case.
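
The server side of this exchange can be sketched in a few lines of Python. The function conditional_get and its (status, headers, body) return shape are assumptions made for illustration, not part of any particular framework, and splitting If-None-Match on commas is a simplification that would break for tags containing a comma.

```python
def _strip_weak(tag: str) -> str:
    """Drop the optional W/ prefix so weak and strong tags compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(request_headers: dict, body: bytes, etag: str) -> tuple:
    """Hypothetical handler: return (status, headers, body) for a GET."""
    response_headers = {"ETag": etag}
    header = request_headers.get("If-None-Match")
    if header is not None:
        # Simplification: assumes no tag contains a comma.
        candidates = {_strip_weak(t) for t in header.split(",")}
        if "*" in candidates or _strip_weak(etag) in candidates:
            # The client's copy is current: send headers only.
            return 304, response_headers, b""
    response_headers["Content-Length"] = str(len(body))
    return 200, response_headers, body
```

A request without If-None-Match, or with a tag that no longer matches, falls through to the normal 200 response with the full body.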

Validation using the Last-Modified header behaves similarly: the client supplies the previous date in the If-Modified-Since header of the request. Resources may be validated using either approach, with the ETag taking precedence if both are present.
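
The precedence rule can be illustrated with a short Python sketch. The function is_not_modified is a hypothetical name, and the sketch assumes the server knows the resource's current tag and an aware UTC modification time; it uses an exact string match for tags and treats an unparseable date as "modified".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(request_headers: dict, etag: str, last_modified: datetime) -> bool:
    """Hypothetical check: True if a 304 Not Modified response is appropriate."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The tag comparison wins; If-Modified-Since is ignored entirely.
        return etag in [t.strip() for t in if_none_match.split(",")]
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is None:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable date: fall back to sending the full response.
        return False
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_modified.replace(microsecond=0) <= since
```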

Controlling cache behavior

Simply enabling caches is not ideal for all types of resources and responses of a web application. Different parts have different caching needs: for example, the dashboard for a user's bank account is likely fine to cache in their browser, but it would be extremely problematic if a CDN distributed it to hundreds of other geographically close visitors.

To get more control over the behavior of caches, the Cache-Control header can be set with one or more directives. It takes the form Cache-Control: directive1, directive2, directive3. The following directives are available:

public allows all caches to store this response, private will only permit the visitor's browser to cache it (not intermediate caches like CDNs or proxies). They are mutually exclusive.
no-cache requires that a stored response be validated with the server before every use, while no-store forbids storing the response entirely. They are mutually exclusive.
max-age defines a duration in seconds during which browsers do not need to revalidate a resource (example Cache-Control: max-age=60). s-maxage does the same, only for intermediate caches like CDNs or proxies. Caching can work without ETag or Last-Modified headers just through these expiry durations, but once a response becomes stale, the resource has to be downloaded again in full.
must-revalidate demands that a browser validate the resource after it became stale (e.g. max-age expired) and not use the expired version in the meantime. proxy-revalidate does the same but for intermediate caches like CDNs and proxies.
no-transform is a directive to tell intermediate caches to keep the response unchanged. Some reverse proxies like Varnish or Squid will try to optimize responses, for example by adding better compression (Brotli or gzip with higher levels) to optimize storage and bandwidth usage. This directive explicitly forbids those kinds of optimizations, demanding all parts of the response be kept as received.
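
To make the directive list more concrete, here is a simplified Python sketch of how a cache might read the header and decide how long a response may be reused without revalidation. Both function names are hypothetical, and the logic deliberately ignores details a real cache must handle (quoted arguments containing commas, heuristic freshness, and so on).

```python
def parse_cache_control(value: str) -> dict:
    """Hypothetical parser: map directive names to an int, a string, or True."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        arg = arg.strip().strip('"')
        if arg.isdigit():
            directives[name.strip().lower()] = int(arg)
        else:
            directives[name.strip().lower()] = arg if sep else True
    return directives


def freshness_lifetime(directives: dict, shared_cache: bool) -> int:
    """Seconds a stored response may be served without contacting the origin."""
    # Simplification: no-store and no-cache both yield 0 here, although
    # no-store additionally forbids keeping the response at all.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    if shared_cache and "private" in directives:
        return 0
    if shared_cache and "s-maxage" in directives:
        return directives["s-maxage"]
    return directives.get("max-age", 0)
```

A browser would call freshness_lifetime with shared_cache=False, a CDN or proxy with shared_cache=True.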

The Cache-Control header is theoretically all you need to enable caching, but combining it with either an ETag or Last-Modified header allows for even more efficient cache handling: stale resources can be refreshed without downloading them again (unless they changed).

Be extra careful with the public and private directives; setting a response containing a session cookie to public could cause a reverse proxy to effectively share that user's session with any number of its users. Anything involving cookies should be marked private as a precaution.

Common caching policies

Whether and how a resource can be cached depends heavily on its type and contents. A good caching policy will almost always combine Cache-Control with either ETag or Last-Modified headers, and use Cache-Control to outline where and how the data may be cached.

Static pages can use a simple caching policy to keep cache contents for one hour:

Cache-Control: max-age=3600, public

Dynamic publicly accessible pages, for example the index of a blog, can benefit from the same approach with an added revalidation requirement, so outdated page contents aren't served to visitors:

Cache-Control: max-age=3600, public, must-revalidate

Private dynamic pages like the dashboard of a banking application should switch to the private caching strategy with the same approach (and probably reduce the expiry duration):

Cache-Control: max-age=60, private, must-revalidate

Static assets can be heavily optimized through the use of unique filenames. Imagine you have a file script.js: you could compute a hash of the file's contents, say hev52a1, and include that in the filename like script.hev52a1.js, then set the caching policy to a 10-year expiry duration:

Cache-Control: max-age=315360000, public

Since a change to the file contents would also change the name of the file and thus the request path, you can treat every file as immutable this way. Making use of this technique requires either an advanced CDN setup, or a step in the build/deployment process to dynamically compute hashes and rename files accordingly, but the performance gains can be tremendous, as each asset file only ever needs to be downloaded once.
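
The renaming step in a build process could look roughly like the following Python sketch. The function hashed_name is hypothetical, and the choice of SHA-256 truncated to 8 hex characters is an assumption; any content hash that changes with the file would serve the same purpose.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Hypothetical build helper: insert a content hash before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    # script.js becomes script.<digest>.js; directories are preserved.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

The build would then write the file under the new name and update every reference to it (for example in HTML templates), since the old path no longer exists.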

Outdated caching headers

Older approaches to caching used the Expires HTTP header to define a date until which the resource may be cached, along with the Pragma header to supply rules for how caching may occur. Both of these are outdated and should not be used for modern applications; use Cache-Control to achieve the same effects.

Varying responses

For some applications, a request to the same resource will produce different responses based on request headers. To give a few examples, a server may decide to:

serve a different translation of the file based on the Accept-Language header from the request
serve a different text encoding or compression depending on the Accept-Encoding header in the request
serve a different content type format depending on the request's Accept header
serve a different page version for different User-Agent values (automatically switching to mobile/desktop variants)
serve different page contents depending on the Cookie value, like a user's username in a profile menu

The HTTP protocol includes the Vary header to signal that responses may change for different request headers. It contains the names of header(s) that can affect the response:

Vary: Accept-Language, Cookie

Caches respect the Vary header as well, in this example by treating responses to requests with differing Accept-Language or Cookie request header values as different resources. Properly setting up varying response contents can be a complex task, and many situations are better served by just setting the Cache-Control policy to private (since, for example, responses including cookies should really never be stored in intermediate caches, only the user's browser).
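
One way to picture this is as a cache key that includes the values of the listed request headers. The Python sketch below is an illustration under that assumption; cache_key is a hypothetical function, and real caches typically normalize header values more carefully before comparing them.

```python
from typing import Optional


def cache_key(method: str, url: str, request_headers: dict, vary: str) -> Optional[tuple]:
    """Hypothetical key builder: None means the response must not be reused."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    if "*" in names:
        # Vary: * signals that no stored response can be matched to a request.
        return None
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method.upper(), url, varied)
```

Two requests share a stored response only if their keys are equal, so a different Cookie value produces a separate cache entry while an unrelated header such as User-Agent does not.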