Conditional requests

Caching can take a lot of stress off your origin servers, and as a consequence your overall stability increases and latency is reduced.

But caches don’t fill themselves up: content needs to be fetched from the origin. Depending on the hit rate of your cache, this can still result in heavy load on the origin, increased latency, and an overall decrease in stability.

Even when revalidating stale content, or when performing range requests, the full result needs to be fetched. This can be resource intensive on the origin side.

But if you optimize your origin for conditional requests, origin fetches will become a lot more efficient.

304 Not Modified

As this chapter is all about HTTP, it should come as no surprise to learn that HTTP has built-in support for conditional requests.

The idea is that you present the origin with a fingerprint of the stale content that you want to revalidate. When the fingerprint sent by the client matches the current fingerprint of the content, the origin can return an HTTP 304 Not Modified status.

An HTTP 304 response doesn’t require a body to be sent, as it implies that whatever the client has stored in cache is the most recent version of the content.

If the fingerprint doesn’t match, a regular HTTP 200 OK is returned instead.

Etag: the fingerprint

The fingerprint referred to is an arbitrary value that is set by the origin, which is returned through an Etag response header.

There is no conventional format for this header; the only requirement is that it is unique for the content that is returned.

Most web servers can automatically generate an Etag for resources that are files on disk. For web applications that use URL rewriting, it is up to the application itself to generate the Etag.

If you use an MVC framework, generating an Etag is quite simple. You could take a hash of the content before returning it, and use this hash as the value of the Etag response header. You can use algorithms like md5 or sha256 to create the hash.

Here’s an example of an HTTP response containing an Etag:

HTTP/1.1 200 OK
Cache-Control: max-age=3600, s-maxage=86400
Etag: "5985cb907f843bb60f776d385eea6c82"
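As an illustration of the hashing approach described above, here is a minimal sketch in Python. The function name make_etag and the example body are assumptions made for this example; they are not part of any framework.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Return a strong Etag: the SHA-256 hex digest of the body, wrapped in double quotes."""
    digest = hashlib.sha256(body).hexdigest()
    return f'"{digest}"'

# Hypothetical response body; a real application would hash its rendered output.
body = b"<h1>Hello</h1>"
headers = {
    "Cache-Control": "max-age=3600, s-maxage=86400",
    "Etag": make_etag(body),
}
```

Because the value is derived from the content itself, the Etag changes whenever the body changes, and stays the same otherwise.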

If-None-Match

When a client receives an HTTP response that contains an Etag header, it can keep track of the Etag value, and send it back to the server via an If-None-Match header.

Based on the value of the If-None-Match header, the server can compare this value to the Etag value it was about to send. If both values match, the client has the most recent version of the content. The request can be satisfied by an HTTP 304 Not Modified response.

If the values don’t match, an HTTP 200 OK will be returned, containing the full payload.

The If-None-Match request header is automatically added to requests by typical web browsers. When performing command-line HTTP requests using curl for example, the If-None-Match header should be added manually.

The workflow

On the one hand you have an Etag response header, on the other hand, you have an If-None-Match request header, and somehow the HTTP 304 Not Modified fits into the story as well.

Here’s the workflow that will help you make sense of it all:

Conditional request workflow

The client sends a first request to the server for /foo.
The server replies with an HTTP 200 OK.
The server attaches an Etag: 1234 header.
The client keeps track of the 1234 value.
The client sends another request to the server for /foo.
The client attaches an If-None-Match: 1234 request header to that request.
The server recognizes the fact that If-None-Match: 1234 was set by the client.
The server matches the 1234 value to whatever the Etag is supposed to be for this response.
The server notices that 1234 is still the up-to-date fingerprint of the content.
The server sends an HTTP 304 Not Modified to the client without any payload.
The client recognizes that the content hasn’t been modified, and keeps serving whatever is stored in its cache.
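The steps above can be sketched from the server's point of view. This is a simplified model, not a full implementation: the respond function is a hypothetical name, and it only handles a single Etag value in If-None-Match, not a comma-separated list or the * wildcard.

```python
def respond(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    headers = {"Etag": current_etag}
    if request_headers.get("If-None-Match") == current_etag:
        # The client already holds the current version: no payload needed.
        return 304, headers, b""
    return 200, headers, body

# First request: the client has no fingerprint yet.
status, headers, payload = respond({}, '"1234"', b"hello")

# Second request: the client sends back the Etag it received.
status2, headers2, payload2 = respond({"If-None-Match": headers["Etag"]}, '"1234"', b"hello")
```

In this sketch the first call yields a 200 with the full payload, and the second yields a 304 with an empty payload.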

Strong vs weak validation

An Etag is a specific validator in HTTP. So-called strong validation implies that the content that is represented by this validator is byte-for-byte identical.

Weak validation implies that the response is not byte-for-byte identical to the version it is comparing itself to, yet the content can be considered the same. For example, the uncompressed and compressed versions of an object.

A weakened Etag is prefixed with a W/. Here’s an example:

Etag: W/"5985cb907f843bb60f776d385eea6c82"

This means that if an If-None-Match request header is received containing this value, the server should know how to validate the content, knowing it will not be byte-for-byte identical.
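To make the difference concrete, here is a small sketch of the two comparison functions that the HTTP specification defines for Etags. The helper names are assumptions for this example, and the code assumes the W/ prefix sits outside the quoted value.

```python
def is_weak(etag: str) -> bool:
    """A weak Etag carries the W/ prefix in front of the quoted value."""
    return etag.startswith("W/")

def opaque_tag(etag: str) -> str:
    """Strip the optional W/ prefix, leaving only the quoted value."""
    return etag[2:] if is_weak(etag) else etag

def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither Etag is weak, and the values are identical."""
    return not is_weak(a) and not is_weak(b) and a == b

def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the quoted values are identical, ignoring any W/ prefix."""
    return opaque_tag(a) == opaque_tag(b)
```

If-None-Match uses the weak comparison, which is why a weakened Etag can still result in an HTTP 304 Not Modified.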

A practical use case is when the main content of a page remains the same, but certain ads, and certain information in the footer, might differ.

Weak validation can get quite complicated. It requires the validating system to be aware of the subtleties of content, and it must be able to spot content that is not modified, even if the payload differs.

Varnish also emits weakened Etags, when the requested content encoding differs from what was stored in cache. We’ll talk about this in just a minute.

Conditional request support in Varnish

Varnish supports conditional requests in both directions. This means Varnish will return an HTTP 304 Not Modified when a client sends a matching If-None-Match header.

But it also means that Varnish will send an If-None-Match header to the origin on certain cache misses, hoping to receive an HTTP 304 Not Modified.

Conditional request workflow in Varnish

Here’s a diagram that illustrates the workflow within Varnish:

Conditional request workflow in Varnish

Let’s walk through it:

When a client requests a resource for the first time, it will be a cache miss.
Varnish will fetch the content from the origin.
The origin returns the content, and adds a Cache-Control: s-maxage=10 header to indicate that the content should be cached for ten seconds.
The origin also includes an Etag: 1234 header to announce the fingerprint of the resource.
Varnish stores the object in cache for ten seconds and returns the response, including the Etag.
The client receives the response from Varnish, and keeps track of the Etag.
The client sends a new request to Varnish within ten seconds and adds If-None-Match: 1234.
Varnish can deliver the object directly from cache because it is a hit, and the content is fresh.
Because the Etag matches, Varnish will not return a response body and will use the HTTP 304 Not Modified status code to indicate that the client has the most recent version of the object.
Some time later, the client sends another request for the same resource with the same If-None-Match header.
Varnish finds the object in cache, but it has expired, so it results in a cache miss.
Varnish will send a backend request to the origin, including the If-None-Match: 1234 header.
The origin notices this header coming from Varnish, matches it to the Etag, and returns a bodyless HTTP 304 Not Modified response.
Varnish knows it still has the most recent version of the object in cache and can safely return an HTTP 304 Not Modified to the client.
For some reason the client no longer sends the If-None-Match header to Varnish for its next request.
Varnish finds the object in cache, but notices it has expired.
Because Varnish still has the Etag stored internally, it will pass it to the origin using the If-None-Match: 1234 request header.
The origin acknowledges that this is still the latest version of the content and responds with an HTTP 304 Not Modified response.

Varnish supports conditional requests at the client level and at the backend level. But what is even more interesting is that once Varnish stores an Etag, it can use it for conditional requests to the backend for client requests that didn’t contain an If-None-Match header.
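The walkthrough above can be imitated with a toy model. To be clear, this is not how Varnish is implemented: Origin and ToyCache are hypothetical classes, the cache holds a single object, time is passed in explicitly, and grace and keep are ignored so that an expired object always stays available for revalidation.

```python
class Origin:
    """Hypothetical origin that honours If-None-Match."""
    def __init__(self, body: bytes, etag: str):
        self.body, self.etag = body, etag
        self.full_responses = 0  # counts HTTP 200 responses with a payload

    def fetch(self, headers: dict):
        if headers.get("If-None-Match") == self.etag:
            return 304, {"Etag": self.etag}, b""
        self.full_responses += 1
        return 200, {"Etag": self.etag}, self.body


class ToyCache:
    """Single-object cache with a fixed TTL; a model, not Varnish itself."""
    def __init__(self, origin: Origin, ttl: float):
        self.origin, self.ttl = origin, ttl
        self.obj = None  # (etag, body, expires_at)

    def get(self, client_headers: dict, now: float):
        if self.obj is None or now >= self.obj[2]:
            # Use the stored Etag for the backend request,
            # even when the client sent no If-None-Match header.
            cond = {"If-None-Match": self.obj[0]} if self.obj else {}
            status, headers, body = self.origin.fetch(cond)
            if status == 304:
                body = self.obj[1]  # reuse the cached body
            self.obj = (headers["Etag"], body, now + self.ttl)
        etag, body, _ = self.obj
        if client_headers.get("If-None-Match") == etag:
            return 304, b""
        return 200, body


origin = Origin(b"hello", '"1234"')
cache = ToyCache(origin, ttl=10)
first = cache.get({}, now=0)                             # miss: full fetch
second = cache.get({"If-None-Match": '"1234"'}, now=5)   # fresh hit
third = cache.get({"If-None-Match": '"1234"'}, now=20)   # expired: revalidated
fourth = cache.get({}, now=40)                           # expired, no client validator
```

In this model the origin only sends the full payload once. Every later backend request is answered with a bodyless 304, including the last one, where the client itself sent no If-None-Match header.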

Grace vs keep

When the TTL of an object is still greater than zero, the content is still fresh, and it can be served from cache.

As we’ve seen in a previous section, expired objects that still have some grace left can be revalidated asynchronously. This revalidation can also be done conditionally.

If the grace period hasn’t expired, and Varnish has an Etag for this object, Varnish will send a background fetch to the origin, including the If-None-Match header. If the Etag matches, the origin will reply with an HTTP 304 Not Modified status code.

All of this happens in the background, while incoming client requests receive the stale object.

When the grace period has expired, the keep period kicks in, and revalidations to the origin become synchronous, meaning that clients will have to wait until the revalidation is finished.

But when that keep time has expired, the revalidation is unconditional, meaning that the response is supposed to be a regular HTTP 200 OK. This makes sense since after the keep period has expired, the object isn’t in cache any more, so it needs to be fetched fully again.

Here’s an overview to summarize TTL, grace, and keep:

TTL > 0 == fresh
TTL + grace > 0 == stale, (conditional) revalidation with background fetch possible
TTL + grace + keep > 0 == stale, conditional synchronous revalidation possible
TTL + grace + keep <= 0 == stale, unconditional synchronous revalidation
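The overview above translates directly into a small decision function. This is only a sketch: object_state is a hypothetical name, ttl is the remaining TTL in seconds (negative once the object has expired), and the returned labels are made up for this example.

```python
def object_state(ttl: float, grace: float, keep: float) -> str:
    """Classify a cached object using the TTL, grace, and keep rules above."""
    if ttl > 0:
        return "fresh"   # serve from cache
    if ttl + grace > 0:
        return "grace"   # serve stale, (conditional) background fetch
    if ttl + grace + keep > 0:
        return "keep"    # synchronous conditional revalidation
    return "gone"        # synchronous unconditional fetch
```

For example, with a grace of 10 seconds and a keep of 60 seconds, an object that expired 15 seconds ago is past its grace period but still within keep, so it can be revalidated conditionally.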

Optimizing the origin for conditional requests

If you optimize your web application for conditional requests, you can take away a lot of stress from the origin system.

Some context

The fact that an HTTP 304 Not Modified has no response body reduces the size of the response on the wire. The first observation is that conditional requests are good for your bandwidth.

But in a web acceleration context, the real problems are increased CPU usage, running out of memory, and disk I/O. When a web application is under heavy load, latency can severely increase, and at a certain point the application can become non-responsive, which affects stability.

The point of running Varnish is to avoid latency and stability issues, but even with a properly configured Varnish server, there can still be plenty of traffic to the origin system:

You can have a low hit rate because of diverse traffic patterns hitting non-cached resources.
You can have a low hit rate because of low TTLs.

In each of these situations, there will be more traffic on the origin, which can increase the load on that system.

Exit early

The crux of conditional requests is to send an HTTP 304 Not Modified as early as possible. In order to take advantage of this in your web application, you need your application framework to validate the Etag as quickly as possible and exit as early as possible.

This means that you should have quick access to the Etag without having to go through your entire application logic.

When content is created or edited, your application should store the Etag in a key-value store or database that has very quick read access. When your application logic has to validate the Etag, a very low overhead call is made to the Etag storage, the If-None-Match header is matched to the Etag, and if they match, an HTTP 304 Not Modified is returned immediately.

This so-called key-value store can either be the local memory of the application server, or well-known products like Redis or Memcached. As long as read access is fast, and the overhead for retrieval is low, you’ve got yourself a good solution.

When properly optimized, the application will exit early, without a full bootstrap of the framework being required, and without access to typical database systems. This will result in very quick response times, and a very low resource footprint on the server.

When the Etag doesn’t match, the regular application flow takes place, resulting in an HTTP 200 OK response, and no real performance gain.
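Here is a minimal sketch of the exit-early idea. Everything in it is an assumption made for illustration: plain dictionaries stand in for the key-value store and the primary database, and publish, render, and handle are hypothetical names rather than hooks of a real framework.

```python
import hashlib

pages = {}       # stand-in for the slow primary database: path -> content
etag_store = {}  # stand-in for Redis, Memcached, or local memory: path -> Etag
stats = {"renders": 0}

def publish(path: str, content: bytes) -> None:
    """Run when content is created or edited: save it and record its Etag."""
    pages[path] = content
    etag_store[path] = '"' + hashlib.sha256(content).hexdigest() + '"'

def render(path: str) -> bytes:
    """Placeholder for the expensive path: framework bootstrap, queries, templates."""
    stats["renders"] += 1
    return pages[path]

def handle(path: str, request_headers: dict):
    """Check the Etag first; fall through to the full application on a mismatch."""
    etag = etag_store.get(path)
    if etag is not None and request_headers.get("If-None-Match") == etag:
        return 304, {"Etag": etag}, b""  # exit early, nothing is rendered
    body = render(path)
    return 200, {"Etag": etag_store[path]}, body
```

The only work done on a matching request is a single lookup in etag_store. The render function, which represents the costly part of the application, is never called in that case.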

Conditional requests - application workflow

As you can see in the image above, there is a separate component that takes care of the Etag check. This component can be part of your regular application code, under the form of a pre-dispatch hook in your MVC framework.

It could also be a separate service, if need be.

Leveraging Varnish

As mentioned, Varnish supports conditional requests both on the client side and the backend side. But if you know how to optimize your web application for conditional requests, you can leverage Varnish to get even better results.

You could lower your TTLs, without the risk of destabilizing your origin with increased requests. For HTTP resources that are updated on a frequent basis, you could even set the TTL to one second, and still have a stable origin:

Grace mode will ensure asynchronous revalidation happens, while clients receive stale data.
Request coalescing will ensure that only one request per URL is sent to the origin server.

Varnish has purging and banning capabilities to remove specific objects from the cache, which we’ll cover in chapter 6. But by leveraging conditional requests in conjunction with low TTLs, there would be no need to actively remove content from the cache because the cache lifetime could just be a couple of seconds.

Last-Modified and If-Modified-Since as your backup plan

Etags aren’t the only way to perform conditional requests. There’s also the Last-Modified header to indicate when the content was last changed.

Here’s an example:

HTTP/1.1 200 OK
Cache-Control: max-age=3600, s-maxage=86400
Last-Modified: Mon, 24 Aug 2020 22:35:02 GMT

The origin indicates that the response is cacheable and can be cached by Varnish for a day, and by the browser for an hour. But the response also indicates that the content was last modified on Monday August 24th at 22:35:02 GMT.

The Last-Modified value can be stored by the client and will be sent to the server in the form of the If-Modified-Since header.

If the content wasn’t modified since the value of the If-Modified-Since header, an HTTP 304 Not Modified can be returned without any payload.

Here’s the diagram to illustrate the flow:

Conditional request workflow using last-modified

As you can see it’s quite similar to Etag and If-None-Match, but with timestamps instead of a content fingerprint.
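The timestamp comparison can be sketched with Python's standard library. The not_modified function is a hypothetical helper, and the sketch assumes the origin knows the last modification time of the resource as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def not_modified(if_modified_since: str, last_modified: datetime) -> bool:
    """True when the resource has not changed since the client's timestamp."""
    try:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have a resolution of one second.
        return last_modified.replace(microsecond=0) <= since
    except (TypeError, ValueError):
        # Unparseable or zone-less date: ignore the header, send a full response.
        return False

last_modified = datetime(2020, 8, 24, 22, 35, 2, tzinfo=timezone.utc)
header = format_datetime(last_modified, usegmt=True)
```

When not_modified returns True, the origin can reply with an HTTP 304 Not Modified. In every other case, including a malformed header, it falls back to a regular HTTP 200 OK.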

Varnish supports both Etag/If-None-Match and Last-Modified/If-Modified-Since. I personally prefer using Etags because they’re more precise, but both mechanisms do the job just fine.

Conditional range requests

In one of the previous sections we talked about range requests. The goal is to receive partial content by requesting one or more byte ranges from a resource.

The client issues a Range header to indicate which portion of the content it wants. When successful, an HTTP 206 Partial Content status is returned.

Range requests can also be done conditionally to ensure that the up-to-date version of a byte range is fetched.

The If-Range header can be used for validation purposes. This header either contains an Etag value or a Last-Modified value. If the If-Range matches the Etag or Last-Modified header, an HTTP 206 Partial Content is returned. If there’s no match, the full payload is sent via an HTTP 200 OK status code.

Here’s an example:

HTTP/1.1 200 OK
Etag: "5985cb907f843bb60f776d385eea6c82"
Accept-Ranges: bytes
Content-Length: 43

The response contains an Etag that can be used for conditional requests
Because of the Accept-Ranges: bytes header, we know the server supports range requests
The Content-Length header indicates the size of the response, which is 43 bytes. This is the upper limit that can be used for range requests
Because both the Etag and Accept-Ranges: bytes headers are there, we know we can perform conditional range requests

Here’s such a conditional range request:

GET / HTTP/1.1
Range: bytes=0-19
If-Range: "5985cb907f843bb60f776d385eea6c82"

This conditional range request will retrieve the first 20 bytes from the / resource, but will only do this if the Etag matches "5985cb907f843bb60f776d385eea6c82".

If that is the case, the following response can be expected:

HTTP/1.1 206 Partial Content
Etag: "5985cb907f843bb60f776d385eea6c82"
Accept-Ranges: bytes
Content-Range: bytes 0-19/43
Content-Length: 20

If the Etag doesn’t match, this means the content has changed. As a consequence, no partial content can be returned, and instead the full payload is returned:

HTTP/1.1 200 OK
Etag: "9985cb2a7f8413b60f7789aa5eea6c41"
Accept-Ranges: bytes
Content-Length: 43

Because Varnish only has built-in range support on the client side, conditional range requests are only performed on the client side.
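The If-Range exchange shown above can be sketched as a small server-side function. This is a simplified model under stated assumptions: serve is a hypothetical name, only a single range of the form bytes=first-last is supported, and only Etag values (not Last-Modified dates) are accepted in If-Range.

```python
def serve(body: bytes, etag: str, request_headers: dict):
    """Return (status, headers, payload), honouring Range and If-Range."""
    base = {"Etag": etag, "Accept-Ranges": "bytes"}
    range_header = request_headers.get("Range", "")
    if_range = request_headers.get("If-Range")
    # If-Range requires a strong match, so a weak Etag never qualifies.
    validator_ok = if_range is None or (
        if_range == etag and not etag.startswith("W/")
    )
    if range_header.startswith("bytes=") and validator_ok:
        first, last = range_header[len("bytes="):].split("-")
        start, end = int(first), min(int(last), len(body) - 1)
        part = body[start:end + 1]
        headers = dict(base)
        headers["Content-Range"] = f"bytes {start}-{end}/{len(body)}"
        headers["Content-Length"] = str(len(part))
        return 206, headers, part
    # No range requested, or the validator no longer matches: full payload.
    headers = dict(base)
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

With a 43-byte body, a request for bytes=0-19 with a matching If-Range yields a 206 with Content-Range: bytes 0-19/43, while a stale If-Range value yields a 200 with the complete body.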