HTTP caching basics

It’s important to understand HTTP caching because at some point an HTTP cache will protect your web platform from going down. Whether this HTTP cache is a reverse caching proxy like Varnish or a full-blown CDN, you need to understand the rules of the game.

Luckily there are conventions for this. There are even standardized headers that are part of HTTP’s specification that will allow you to control the behavior of a web cache.

What is HTTP caching?

HTTP has caching capabilities built into the protocol to ensure that clients or proxies can store the HTTP response in the cache for a certain amount of time. By caching the response, clients don’t have to connect to the web server every time they want to access that content.

HTTP caching reduces network traffic and server load, which results in lower response times.

Browser cache versus caching proxies

Historically HTTP responses were cached by the web browser to reduce network traffic. In the early days of the web, bandwidth was limited. Being able to cache HTTP responses in the browser avoided expensive HTTP roundtrips.

Unfortunately browser cache is not reliable: users can flush the cache at any time, and they can even disable the cache. Another disadvantage is the fact that the cache is hosted locally, which means there is a cache per user.

By installing caching proxy servers closer to the user, either in the office or at the internet provider’s data center, clients can retrieve centrally cached copies of the requested content. These proxies act on behalf of the origin web server.

As broadband internet became more common, local caching proxy servers were no longer crucial. Instead the increase of bandwidth shifted the pressure from the client to the server: traffic spikes started jeopardizing the stability of servers. As a consequence, caching proxies also shifted from the client side to the server side.

Nowadays reverse caching proxies are put in front of the origin web platform to protect it against traffic spikes and prevent the platform from caving in under pressure.

HTTP’s caching policies allow HTTP responses to be cached by both clients and proxies using the same syntax. However, there are also specific instructions that only apply to proxies.

HTTP caching concepts

Cacheability

Not all HTTP responses can or should be cached: if the content is private, it should not be stored in the cache. If the type of request (for example an HTTP POST request) implies a change of the resource, it shouldn’t be cached either. If the returned response uses a Set-Cookie header to change state, the response shouldn’t be cached.

On the one hand you can decide whether or not to store a response in the cache. On the other hand, you can decide whether or not to serve a cached response from the cache.

These rules can be enforced in the implementation or configuration of the cache. However, the HTTP protocol also allows you to control cacheability through specific headers.
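
The cacheability rules described in this section can be sketched in a few lines of Python. This is a simplified illustration for a shared cache, not how any particular proxy implements it: the function name shared_cache_may_store is hypothetical, and treating only GET and HEAD requests as cacheable is an assumption of this sketch.

```python
from typing import Dict


def shared_cache_may_store(method: str, response_headers: Dict[str, str]) -> bool:
    """Apply the cacheability rules described above (simplified sketch)."""
    # Header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value for name, value in response_headers.items()}

    # Assumption: only read requests are cached; POST and similar are not.
    if method.upper() not in ("GET", "HEAD"):
        return False

    # A Set-Cookie header changes state, so the response is not shared.
    if "set-cookie" in headers:
        return False

    # Collect directive names such as "private" or "max-age" (arguments dropped).
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in headers.get("cache-control", "").split(",")
        if part.strip()
    }
    return not directives & {"private", "no-store"}
```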

Public versus private content

The scope of cacheable content is either public or private.

  • Publicly cacheable content can be cached by both the requesting client as well as reverse caching proxies.
  • Privately cacheable content can only be cached by the requesting client and not by reverse caching proxies.

Cache lifetime

Cached objects are only valid for a limited amount of time. The time to live of a cached object can be defined in the implementation or configuration of the cache. But as expected, the HTTP protocol has ways to enforce the time to live through specific cache header syntax.

Revalidation

As long as the time to live of a cached object has not expired, the content is considered fresh. This means it can be served from the cache to requesting clients.

The remaining lifetime of an object is a value that changes every second. Once it hits zero, the content is no longer fresh. Instead it is considered stale and in need of revalidation.

Cache revalidation is the process of connecting back to the origin web server and fetching potentially updated content. As soon as the revalidation is finished, the object is considered fresh again for as long as the time to live allows.

Conditional revalidation

Revalidation can also be done conditionally. This means that the origin web server will only send the payload if the requested resource has changed. If the resource hasn’t changed, a 304 Not Modified status code will be returned without a response body.

This reduces the amount of data sent over the wire, and it can also result in a lower server resource consumption at the origin level.

When the cache receives a 304 Not Modified response, the time to live that is defined by the Cache-Control or Expires header will be used to set the lifetime of the object after revalidation.

HTTP response headers like Etag and Last-Modified allow web servers to identify a specific version of a resource. These values can be presented by the client or a reverse caching proxy in the form of If-None-Match and If-Modified-Since request headers to compare versions. If these versions differ, a regular 200 OK response will be sent, otherwise the 304 Not Modified status is returned.

Identifying cached objects

Objects in the cache are generally identified by their URI and Host header values. These values are part of the HTTP request, as illustrated below:

GET /about HTTP/1.1
Host: example.com

This example is a request to http://example.com/about. Both /about and example.com are used to create a hash that identifies the object in the cache.
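
Here’s a minimal sketch of how such an identifier could be derived. It assumes a SHA-256 hash over the Host header and the URI; the cache_key function is hypothetical, and real caches such as Varnish use their own hashing logic.

```python
import hashlib


def cache_key(host: str, uri: str) -> str:
    """Build a lookup key from the Host header and the request URI."""
    # Host names are case-insensitive, so normalize before hashing.
    material = f"{host.lower()}\n{uri}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


key = cache_key("example.com", "/about")
```

Two requests with the same Host header and URI produce the same key, so they map to the same cached object.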

Cache variations

Sometimes, an HTTP resource can have multiple versions that depend on values coming from request headers. One example of this is a multilingual website that uses the Accept-Language request header to present the resource in multiple languages.

If a resource has multiple versions but is only identified by its URI and Host header, the cached output can be inconsistent: a client may receive a version that was cached for a different language. That’s where cache variations come into play.

A cache variation will extend the hash that is used to identify an object in the cache by adding the value of a request header. Per version of the resource, a variation is added to the cache.

The origin web server can issue a Vary header to tell the cache what request header it should use to base its variations on. In the case of the multilingual website, Vary: Accept-Language is the logical choice.

The goal is to store enough cache variations to cover the available versions of a resource. But if a resource has too many variations, caching all variations will have a detrimental effect on the hit rate and will fill up the cache.

Make sure you have your variations under control, otherwise you’re better off not caching the response at all.
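
As an illustration, a cache variation could extend the lookup key along these lines. This is a sketch under assumptions: the variation_key function and the SHA-256 hash are hypothetical choices, not the mechanism of a specific cache.

```python
import hashlib
from typing import Dict


def variation_key(host: str, uri: str, vary: str, request_headers: Dict[str, str]) -> str:
    """Extend the Host + URI key with the request headers named in Vary."""
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    parts = [host.lower(), uri]
    for name in (n.strip().lower() for n in vary.split(",") if n.strip()):
        # A missing request header still yields a distinct, stable variation.
        parts.append(f"{name}={lowered.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


dutch = variation_key("example.com", "/about", "Accept-Language", {"Accept-Language": "nl"})
english = variation_key("example.com", "/about", "Accept-Language", {"Accept-Language": "en"})
```

Every distinct Accept-Language value produces a new key in this sketch, which shows why a header with many possible values can quickly fill up the cache.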

HTTP caching headers

Here’s an overview of HTTP caching headers that you can leverage to control the cache from your origin web platform.

Cache-Control

The Cache-Control header is probably the most common HTTP caching header. Its syntax is quite extensive and has directives to control the following aspects of HTTP caching:

  • Time To Live
  • Cacheability
  • Scope (public or private)
  • Revalidation

Public and private

The public keyword is used to announce that the resource can be cached by both web clients and caching proxies. If private is used instead, a caching proxy will not store the object in cache whereas a web client will.

Here’s an example where a public resource is announced through the Cache-Control header:

Cache-Control: public

Here’s the equivalent for private content:

Cache-Control: private

Max-age and s-maxage

The max-age directive is used to set the lifetime of an object in the cache. Here’s an example:

Cache-Control: public, max-age=3600

This header instructs the cache to store the object for 3600 seconds, which corresponds to an hour.

The s-maxage directive does the same as max-age, but it is intended for caching proxies rather than for web clients.

Here’s an example where a caching proxy is instructed to cache the object for a day:

Cache-Control: public, s-maxage=86400

It is also possible to combine these directives:

Cache-Control: public, max-age=3600, s-maxage=86400

This will result in the web client caching the resource for an hour and the caching proxy applying a time to live of a day.
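
The difference between both directives can be sketched as follows. The parse_cache_control and ttl_seconds helpers are hypothetical. The parser splits on commas, which is a simplification that ignores quoted arguments, and returning 0 when no lifetime is given is an assumption of this sketch.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: argument} mapping."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip().strip('"') or None
    return directives


def ttl_seconds(value: str, shared_cache: bool) -> int:
    """Return the time to live for a web client or a caching proxy."""
    directives = parse_cache_control(value)
    # A caching proxy prefers s-maxage and falls back to max-age.
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return 0  # Assumption: no explicit lifetime means no caching here.


header = "public, max-age=3600, s-maxage=86400"
browser_ttl = ttl_seconds(header, shared_cache=False)  # 3600
proxy_ttl = ttl_seconds(header, shared_cache=True)     # 86400
```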

Stale-while-revalidate

The stale-while-revalidate directive allows a cached object to be served from the cache for a limited time after it has expired.

The stale-while-revalidate value sets the amount of seconds past the expiration time that stale content can be served while the cache is revalidating the content.

Here’s an example:

Cache-Control: public, max-age=900, stale-while-revalidate=100

In this example the cached object is considered fresh for 900 seconds. After that revalidation needs to take place. But because of the stale-while-revalidate=100 directive, the object can be served from the cache for another 100 seconds while the cache is asynchronously revalidating with the origin web server.
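
The three possible states can be expressed as a small decision function. This is a sketch: the freshness_state name and the state labels are hypothetical, and the age is assumed to be known in seconds.

```python
def freshness_state(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a cached object by its age in seconds."""
    if age < max_age:
        # Within the time to live: serve from the cache.
        return "fresh"
    if age < max_age + stale_while_revalidate:
        # Expired, but may still be served while revalidating in the background.
        return "stale-while-revalidate"
    # Past the allowed staleness: fetch from the origin before serving.
    return "expired"


# Values from the example header: max-age=900, stale-while-revalidate=100
state_at_600 = freshness_state(600, 900, 100)    # "fresh"
state_at_950 = freshness_state(950, 900, 100)    # "stale-while-revalidate"
state_at_1200 = freshness_state(1200, 900, 100)  # "expired"
```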

Stale-if-error

When staleness is allowed, the end user will not be impacted by potentially slow backends during the revalidation process. Thanks to stale-while-revalidate, the cache can be instructed to serve stale data while a new version of the resource is being fetched.

But what happens when the origin web server is down?

As long as the stale-while-revalidate value is high enough, stale content will be served and the failed revalidation will go unnoticed as far as the user is concerned.

However, a very high stale-while-revalidate value also applies when the origin web server is healthy, which means content may be served that is staler than your business rules allow.

The stale-if-error directive sets the allowed staleness when the origin is down. Here’s an example:

Cache-Control: public, max-age=900, stale-while-revalidate=100, stale-if-error=86400

In this example an object is considered fresh for 900 seconds. If the origin is healthy, the object may be served up to 100 seconds past its expiration.

If the backend is down, the staleness can be drastically increased. In this case, a stale object may be served a full day past its expiration time.
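
Here’s a sketch of how a cache could combine both staleness windows. The may_serve_cached function and its origin_healthy flag are hypothetical; how a cache detects an unhealthy origin is outside the scope of this example.

```python
def may_serve_cached(
    age: int,
    max_age: int,
    stale_while_revalidate: int,
    stale_if_error: int,
    origin_healthy: bool,
) -> bool:
    """Decide whether a cached object may be served, given its age in seconds."""
    staleness = age - max_age
    if staleness < 0:
        return True  # Still fresh.
    if origin_healthy:
        window = stale_while_revalidate
    else:
        # When the origin is down, the larger of both windows applies.
        window = max(stale_while_revalidate, stale_if_error)
    return staleness < window


# Values from the example header above, with an object that is 1100 seconds old.
healthy = may_serve_cached(1100, 900, 100, 86400, origin_healthy=True)    # False
during_outage = may_serve_cached(1100, 900, 100, 86400, origin_healthy=False)  # True
```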

Must-revalidate and proxy-revalidate

When staleness is not allowed, the must-revalidate keyword is used to enforce this. Here’s an example:

Cache-Control: public, max-age=3600, must-revalidate

In this case the cached object is fresh for an hour, but as soon as it expires synchronous revalidation is mandatory.

If the web client is allowed to serve stale objects from the cache, but intermediary caches aren’t, you can use the proxy-revalidate keyword to enforce this.

Cache-Control: public, max-age=3600, stale-while-revalidate=100, proxy-revalidate

In this case the object is cached for an hour. After that an extra 100 seconds of staleness is allowed while revalidation takes place. Because of proxy-revalidate, staleness is only allowed by the web client. Caching proxies are not allowed to serve stale content.

No-cache and no-store

If the web server returns an uncacheable response, the Cache-Control: no-cache, no-store syntax can be used to instruct the cache not to cache this response.

The no-cache directive allows the response to be stored, but forces the cache to revalidate it with the origin web server before serving it to the requesting client.

The no-store directive instructs the cache not to store this resource in the cache.

Both directives can be used separately, but they can also be combined.
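
The effect of both directives can be summarized in a small lookup. This is a sketch; the cache_behavior function and its return labels are hypothetical.

```python
def cache_behavior(cache_control: str) -> str:
    """Describe how a cache treats a response with this Cache-Control value."""
    # Collect the directive names and drop any arguments.
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    if "no-store" in directives:
        return "do not store"
    if "no-cache" in directives:
        return "store, but revalidate before serving"
    return "store and serve while fresh"
```

When both directives are combined, no-store wins in this sketch because an object that is never stored cannot be revalidated.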

Expires

The Expires header can also be used to set the time to live of an object. The Expires header doesn’t use relative numbers like the Cache-Control header does. Instead it sets the date and time of expiration.

Here’s an example:

Expires: Sat, 4 May 2024 08:00:00 GMT

This cacheable resource is considered fresh until Saturday May 4th 2024 at 8 o’clock GMT.

By setting a date and time in the past, the Expires header marks the object as already expired. Caches will not serve it as fresh content and will typically not store it.
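
To turn an Expires value into a remaining lifetime, the date has to be parsed and compared to the current time. Here’s a sketch that uses Python’s standard library; the remaining_lifetime function is hypothetical, and the current time is passed in explicitly to keep the example reproducible.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def remaining_lifetime(expires: str, now: datetime) -> float:
    """Seconds until the Expires timestamp; zero or less means expired."""
    expiry = parsedate_to_datetime(expires)
    return (expiry - now).total_seconds()


# One hour before the expiration time of the example above.
now = datetime(2024, 5, 4, 7, 0, 0, tzinfo=timezone.utc)
seconds_left = remaining_lifetime("Sat, 4 May 2024 08:00:00 GMT", now)  # 3600.0
```

A date in the past results in a negative value, which this sketch treats as expired.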

Vary

The Vary header is used to create cache variations. As explained earlier, cache variations are used to create multiple variations of a cached object, based on a request header.

Here’s an example:

Vary: Accept-Language

This example will create a cache variation for this resource based on the value of the Accept-Language request header. This will allow multilingual websites that use the same URL structure for multiple languages to be properly cached.

Here’s another example:

Vary: X-Forwarded-Proto

This example will create a cache variation based on the value of the X-Forwarded-Proto request header. This header is not sent by the client, but by a TLS proxy that terminates the TLS connection. Possible values are https and http.

This cache variation ensures that there’s an HTTP and an HTTPS version of each page to avoid mixed content.

Etag and If-None-Match

The Entity tag that is returned through the Etag response header is used to identify a specific version of the resource.

The Etag value could be any value, but it must be unique to the version it represents. Consider it a fingerprint of the content.

When a web server returns an Etag header, the value can be presented by the client upon subsequent requests in the form of an If-None-Match request header.

This If-None-Match header represents the version of the resource that the client currently has. The web server compares this value to the current Etag of the resource. If the values are identical, the content hasn’t changed and a 304 Not Modified status code can be returned without attaching a body to the HTTP response.

If the values of If-None-Match and Etag differ, the content has changed and a regular 200 OK response is returned that includes a body.
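
Here’s a sketch of this comparison on the origin side. It assumes a truncated SHA-256 digest of the body as the fingerprint, which is only one of many possible ways to generate an Etag. The make_etag and respond functions are hypothetical.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive an Etag value from the response body."""
    # Assumption: a truncated SHA-256 digest serves as the fingerprint.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, bytes]:
    """Return (status code, body) for the current version of a resource."""
    etag = make_etag(body)
    if if_none_match is not None:
        # The header may list several tags; a W/ prefix marks a weak validator.
        presented = [tag.strip().removeprefix("W/") for tag in if_none_match.split(",")]
        if "*" in presented or etag in presented:
            return 304, b""
    return 200, body


page = b"<h1>About us</h1>"
status_first, _ = respond(page)                                   # 200
status_same, _ = respond(page, make_etag(page))                   # 304
status_changed, _ = respond(b"<h1>New</h1>", make_etag(page))     # 200
```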

Last-Modified and If-Modified-Since

The Last-Modified response header also identifies a specific version of a resource. Unlike the Etag header, it uses a last modified date to identify that resource.

Here’s an example:

Last-Modified: Mon, 8 Nov 2021 18:28:00 GMT

This value represents the last time the resource was modified. The value of that response header can be presented to the web server upon subsequent requests in the form of an If-Modified-Since header.

Here’s an example:

If-Modified-Since: Sun, 7 Nov 2021 13:18:21 GMT

The value of the If-Modified-Since header is older than the one presented by the Last-Modified header. This means the content has changed and a 200 OK response should be returned.

If the values of If-Modified-Since and Last-Modified are identical, the client has the most recent version of the resource and a bodyless 304 Not Modified response can be returned.
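
The date comparison can be sketched with Python’s standard library. The status_for function is hypothetical and only looks at the two dates.

```python
from email.utils import parsedate_to_datetime


def status_for(last_modified: str, if_modified_since: str) -> int:
    """Return 200 if the resource changed after the client's copy, else 304."""
    changed = parsedate_to_datetime(last_modified) > parsedate_to_datetime(if_modified_since)
    return 200 if changed else 304


# The values from the examples above: the client's copy is older.
outdated = status_for("Mon, 8 Nov 2021 18:28:00 GMT", "Sun, 7 Nov 2021 13:18:21 GMT")  # 200
current = status_for("Mon, 8 Nov 2021 18:28:00 GMT", "Mon, 8 Nov 2021 18:28:00 GMT")   # 304
```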

Age

An Age header is used to inform the client how long the object has been stored in cache.

Here’s an example:

Age: 100

This means the object has been stored in the cache for 100 seconds.

Imagine the following example:

Cache-Control: max-age=300
Age: 100

We know that the max-age=300 directive sets the Time To Live of the cached object to 300 seconds. The fact that the Age header is set to 100 seconds means that the cache object has a remaining lifetime of 200 seconds.
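
The same calculation as a sketch in Python. The remaining_lifetime function is hypothetical and only considers the max-age directive and the Age header.

```python
import re
from typing import Dict


def remaining_lifetime(headers: Dict[str, str]) -> int:
    """Subtract the Age header from max-age; never return less than zero."""
    # The pattern does not match s-maxage, because that directive has no hyphen.
    match = re.search(
        r"(?:^|,)\s*max-age=(\d+)", headers.get("Cache-Control", ""), re.IGNORECASE
    )
    max_age = int(match.group(1)) if match else 0
    age = int(headers.get("Age", "0"))
    return max(max_age - age, 0)


seconds_left = remaining_lifetime({"Cache-Control": "max-age=300", "Age": "100"})  # 200
```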

HTTP caching flow

When a reverse caching proxy server like Varnish is used to accelerate your origin server, there is a specific flow depending on the scenario.

We’d like to present four scenarios:

  • The cache miss flow
  • The cache hit flow
  • The cache revalidation flow
  • The conditional revalidation flow

Cache miss flow

When a client requests content from an empty cache, a cache miss occurs and the cache has to fetch the content from the origin. The following diagram illustrates this process:

[Diagram: HTTP caching flow: miss]

Although we try to keep origin fetches to a minimum, a cache miss is not necessarily a bad thing. A cache miss is simply a hit that hasn’t happened yet.

When the origin web server responds, the caching proxy will store the response in the cache with a lifetime that was specified by the Cache-Control or Expires header and will serve the cached object to clients requesting it.

Cache hit flow

Once the object is stored in cache, subsequent requests will result in a cache hit, as illustrated in the diagram below:

[Diagram: HTTP caching flow: hit]

As you can see, no connection to the web server is needed. This is by design and takes away the pressure from that origin web server while the caching proxy is serving the cached version of those origin responses.
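
The miss and hit flows can be simulated with a tiny in-memory cache. This is a sketch under assumptions and not how Varnish works internally: the TinyCache class is hypothetical, and the origin is a plain function that returns a body and a time to live in seconds.

```python
import time
from typing import Callable, Dict, List, Tuple


class TinyCache:
    """Minimal in-memory cache that illustrates the miss and hit flows."""

    def __init__(
        self,
        fetch_origin: Callable[[str], Tuple[bytes, int]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_origin
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def get(self, key: str) -> Tuple[str, bytes]:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            # Hit: the object is fresh, so no origin connection is needed.
            return "hit", entry[0]
        # Miss: fetch from the origin and store the object with its lifetime.
        body, ttl = self._fetch(key)
        self._store[key] = (body, now + ttl)
        return "miss", body


origin_calls: List[str] = []


def origin(key: str) -> Tuple[bytes, int]:
    origin_calls.append(key)
    return b"<h1>About</h1>", 3600


cache = TinyCache(origin)
first = cache.get("example.com/about")   # miss: the origin is contacted
second = cache.get("example.com/about")  # hit: served from the cache
```

After both requests the origin has been called only once, which is the whole point of the cache hit flow.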

Cache revalidation flow

At some point the cached object will expire and the content will need to be revalidated with the origin web server.

This involves an origin fetch, just like a cache miss. But unlike the cache miss scenario, the cache can choose to serve the stale content while asynchronously revalidating with the origin.

The diagram below clarifies the revalidation flow:

[Diagram: HTTP caching flow: revalidation]

If you pay close attention, you’ll see that the order of execution is different: the client response can be returned before the origin revalidation response is received.

The following Cache-Control will enable asynchronous revalidation thanks to its stale-while-revalidate directive:

Cache-Control: public, s-maxage=3600, stale-while-revalidate=200

If we want to serve stale content when the origin web server is down, we could use the following Cache-Control header and leverage the stale-if-error directive:

Cache-Control: public, s-maxage=3600, stale-if-error=86400

Conditional revalidation flow

Revalidation can also be done conditionally. This means that the caching proxy will identify the version of the object through specific request headers, such as If-None-Match or If-Modified-Since.

The values of these headers come from the Etag or Last-Modified response headers that are part of the cached object.

If the latest version matches the version that is advertised by the proxy, the origin will acknowledge this and not send the full payload. Instead a 304 Not Modified response is returned. The stale content is then considered fresh again and no further revalidation is needed until the content expires again.

A version matches if the Etag and If-None-Match values are identical or if the Last-Modified and If-Modified-Since values are identical:

[Diagram: HTTP caching flow: conditional revalidation]

If the versions differ, the full response is sent by the origin and the content is considered fresh again:

[Diagram: HTTP caching flow: failed conditional revalidation]

Conditional revalidation allows backends to consume less bandwidth by only adding payload to the HTTP response if the content has changed. If the origin is optimized for conditional revalidation, CPU, memory and disk I/O consumption can also be reduced.
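
From the perspective of the caching proxy, both outcomes of a conditional revalidation can be sketched as follows. The CachedObject class, the revalidate function and the signature of the origin callable are hypothetical and only serve to illustrate the flow.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical origin signature: takes the If-None-Match value and returns
# (status code, body, Etag value, time to live in seconds).
Origin = Callable[[str], Tuple[int, bytes, str, int]]


@dataclass
class CachedObject:
    body: bytes
    etag: str
    expires_at: float


def revalidate(obj: CachedObject, origin: Origin, now: float) -> CachedObject:
    """Revalidate a stale object by presenting its Etag to the origin."""
    status, body, etag, ttl = origin(obj.etag)
    if status == 304:
        # Unchanged: keep the stored body and only extend the lifetime.
        return CachedObject(obj.body, obj.etag, now + ttl)
    # Changed: replace the stored object with the full response.
    return CachedObject(body, etag, now + ttl)


def origin(if_none_match: str) -> Tuple[int, bytes, str, int]:
    """Fake origin whose current version of the resource is "v2"."""
    current_etag = '"v2"'
    if if_none_match == current_etag:
        return 304, b"", current_etag, 3600
    return 200, b"new content", current_etag, 3600


stale = CachedObject(b"old content", '"v1"', expires_at=100.0)
updated = revalidate(stale, origin, now=100.0)        # versions differ: 200
unchanged = revalidate(updated, origin, now=3700.0)   # versions match: 304
```

In the second call the origin sends no body, yet the cached object is fresh again for another hour.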