The previous section featured the Varnish finite state machine. Every state has a corresponding subroutine that allows you to hook into that state to modify its behavior.
In this section, we’ll cover the various subroutines and their corresponding VCL code, and we’ll explain how this code fits into the Varnish finite state machine.
The VCL code you’re about to see is all part of what we call the built-in VCL. We’ve covered this behavior in the previous chapter; now you’ll see the actual code.
Remember: even if this code is not part of your VCL file, it will still be executed by Varnish if you don’t perform an explicit return call.
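For instance, a custom vcl_recv that only sets a request header and doesn't return will still fall through to the built-in vcl_recv logic shown below. The header name in this sketch is purely illustrative:

sub vcl_recv {
    # Custom logic without an explicit return statement
    set req.http.X-Example = "demo";
    # Because there is no return(...) here, Varnish continues
    # with the built-in vcl_recv and its pass/pipe/hash decisions
}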
vcl_recv is the first subroutine that is used in the built-in VCL.
It hooks into the request-handling logic. Based on certain criteria,
it transitions to another state by returning a specific action.
Let’s have a look at the vcl_recv VCL code:
sub vcl_recv {
    if (req.method == "PRI") {
        /* This will never happen in properly formed traffic (see: RFC7540) */
        return (synth(405));
    }
    if (!req.http.host &&
        req.esi_level == 0 &&
        req.proto ~ "^(?i)HTTP/1.1") {
        /* In HTTP/1.1, Host is required. */
        return (synth(400));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE" &&
        req.method != "PATCH") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}
There are two error cases that will result in synthetic responses being returned:
When the request method is PRI, it means an HTTP/2 request was
received, whereas Varnish wasn't configured to handle HTTP/2. This
is not supposed to happen, and an HTTP 405 Method Not Allowed error is
synthetically returned.
Here’s the VCL code for that:
if (req.method == "PRI") {
    /* This will never happen in properly formed traffic (see: RFC7540) */
    return (synth(405));
}
The other error case is when a top-level HTTP/1.1 request is made
without a Host header. This goes against the rules of the protocol and
results in an HTTP 400 Bad Request error being returned
synthetically.
Here’s the corresponding VCL code:
if (!req.http.host &&
    req.esi_level == 0 &&
    req.proto ~ "^(?i)HTTP/1.1") {
    /* In HTTP/1.1, Host is required. */
    return (synth(400));
}
We referred to the term top-level request. This is the main HTTP request. Varnish can also trigger subrequests, which are part of the ESI parsing logic.
The top-level check is done by inspecting the value of the req.esi_level variable: a value of zero means the request is not an ESI subrequest.
The next check that is performed in vcl_recv is also related to the
request method. There is a series of HTTP request methods that
Varnish accepts. If the request method doesn't match this list,
return(pipe) is executed, as illustrated below:
if (req.method != "GET" &&
    req.method != "HEAD" &&
    req.method != "PUT" &&
    req.method != "POST" &&
    req.method != "TRACE" &&
    req.method != "OPTIONS" &&
    req.method != "DELETE" &&
    req.method != "PATCH") {
    /* Non-RFC2616 or CONNECT which is weird. */
    return (pipe);
}
Piping means that Varnish no longer considers this an HTTP request. Instead, it just treats the data as TCP and shuffles the payload over the wire, without further interference. If dealing with HTTP requests, always consider using a pass instead of a pipe, as piping relinquishes your ability to manipulate the transaction in further steps, and your logs will be blind to the backend response.
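If you'd rather keep such requests inside the HTTP layer, you could short-circuit the built-in behavior with a pass. This is just a sketch; PROPFIND stands in for any method outside the accepted list:

sub vcl_recv {
    if (req.method == "PROPFIND") {
        # Keep the transaction in HTTP context instead of piping,
        # so logging and later subroutines still apply
        return (pass);
    }
}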
Varnish follows HTTP best practices. When it comes to caching, only safe requests may be cached: request methods that don't change the state of the resource on the origin.
As a result, GET and HEAD are the only two cacheable request
methods. This rule is enforced using the following VCL snippet in
vcl_recv:
if (req.method != "GET" && req.method != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
}
So if the request method is for example POST, the return(pass)
logic will kick in, and you’ll be sent to the vcl_pass subroutine.
Requests that end up in vcl_pass will bypass the cache, and will
result in a backend fetch.
Stateful content is always difficult to cache. As mentioned in chapter 3, cache variations allow you to have multiple variations on the same resource. But when content is for your eyes only, usually you will not cache this content.
In HTTP, there are two common ways to keep track of state:
- The Cookie header, which contains key-value pairs of user data
- The Authorization header, which contains an authentication token that authorizes the client

Technically, the Authorization header doesn't automatically convey state, but like a Cookie, it denotes a customization of the content; without deeper knowledge, it may be dangerous to cache the data.
In vcl_recv, any request containing a Cookie header or an
Authorization header will result in a return(pass) too. Here's the VCL
code to prove it:
if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
}
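A common workaround is stripping the Cookie header on requests where state doesn't matter, so the built-in VCL can still return(hash). The file extensions below are an assumption; adapt them to your application:

sub vcl_recv {
    if (req.url ~ "\.(css|js|png|jpg|svg|woff2)(\?.*)?$") {
        # Static assets don't need cookies; removing the header
        # keeps these requests cacheable
        unset req.http.Cookie;
    }
}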
So in the end, if you jumped through all those hoops, Varnish will consider your request cacheable and will look the corresponding object up in cache.
In VCL, this means performing a return(hash), which is exactly what
happens at the end of the vcl_recv subroutine.
Once a hash is created to identify the object in cache, it means that you have a stateless and idempotent request that complies with the HTTP spec in terms of the request method and the host header.
Although the diagram of the Varnish finite state machine uses
vcl_hash as a point-of-entry for many other states, there is only
one return statement that is actually used in the built-in VCL, and
that is return(lookup).
Here’s the VCL:
sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}
This subroutine will use the hash_data() function to create the hash
of the object that is requested.
As you know from chapter 3, the hash is composed using the request URL and the host header. If there is no host header, the server IP address will be used instead.
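You can add extra dimensions to the cache key in your own vcl_hash; without an explicit return, the built-in subroutine still runs afterwards and adds the URL and host. Hashing on Accept-Language is just an example:

sub vcl_hash {
    if (req.http.Accept-Language) {
        # Create one cached object per language,
        # on top of the URL/host hash from the built-in VCL
        hash_data(req.http.Accept-Language);
    }
}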
What happens next all depends on the result of return(lookup):
- vcl_hit
- vcl_pass
- vcl_purge
- vcl_miss

Whenever a requested object is found in cache, a transition will happen
from vcl_hash to vcl_hit.
In the diagram, a multitude of return actions are available for this
state. However, the built-in VCL only has two default outcomes for
vcl_hit:
sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A pure unadulterated hit, deliver it
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        // Object is in grace, deliver it
        // Automatically triggers a background fetch
        return (deliver);
    }
    // fetch & deliver once we get the result
    return (miss);
}
If it turns out the object still has some TTL left, the object will be
delivered. This means we’ll transition to vcl_deliver.
If the TTL has expired, but there’s still some grace left, the object will also be delivered while a background fetch happens for revalidation. This is the typical stale while revalidate behavior we discussed in the previous chapter.
If none of these conditions apply, we can conclude that the object has expired without any possibility of delivering a stale version. This is the same thing as a cache miss, so we fetch and deliver the new version of the object.
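How much grace an object gets is under your control. A minimal sketch, with an arbitrary one-hour value:

sub vcl_backend_response {
    # Keep expired objects around for an extra hour, so a stale
    # version can be served while a background fetch revalidates
    set beresp.grace = 1h;
}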
A dirty little secret about the VCL code in vcl_hit is that it
doesn’t really behave the way it is set up. The
if (obj.ttl + obj.grace > 0s) {} conditional will always evaluate to
true.
In reality, the built-in VCL for vcl_hit could be replaced by the
following snippet:
sub vcl_hit {
    return (deliver);
}
The VCL is just there to show the difference between a pure hit and a grace hit.
In newer Varnish versions, this is actually what vcl_hit looks like, as grace is handled internally.
There’s not a lot to say about vcl_miss, really. Although a transition
to vcl_pass is supported, the built-in VCL just does a
return(fetch) for vcl_miss.
Here’s the VCL:
sub vcl_miss {
    return (fetch);
}
When you enter the vcl_purge state, it means a request was made to
purge an object from cache. This is done by calling return(purge)
in vcl_recv.
Based on the URL and hostname of the corresponding request, the object
hash is looked up in cache. If found, all objects under that hash are
removed from cache, and the transition to vcl_purge happens. If the
hash didn’t exist we still transition to vcl_purge because the outcome
is the same: not having an object in cache for that hash.
As illustrated in the VCL example below, vcl_purge will return a
synthetic response:
sub vcl_purge {
    return (synth(200, "Purged"));
}
The response itself is very straightforward: HTTP/1.1 200 Purged.
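The built-in VCL never calls return(purge) itself; you have to wire it up in vcl_recv. A typical sketch, restricted by an ACL (the ACL name and addresses are assumptions):

acl purgers {
    "localhost";
    "192.168.0.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Not allowed"));
        }
        # Removes all objects under the matching hash,
        # then transitions to vcl_purge
        return (purge);
    }
}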
A lot of people assume that there is just hit or miss when it comes to caches. Hit or miss is the answer to the following question:
Did we find the requested object in cache?
But there are more questions to ask. The main question to ask beforehand is:
Do we want to serve this object from cache?
And that is where pass enters the proverbial chat.
As you know from the built-in VCL: when you don’t want something to be
served from cache, you just execute return(pass). This is where you
enter vcl_pass.
Apart from its intention, the built-in VCL implementation of
vcl_pass is identical to vcl_miss: you perform a return(fetch) to
fetch the content from the origin.
Here’s the VCL:
sub vcl_pass {
    return (fetch);
}
And during the lookup stage, when a hit-pass object is found,
instead of a regular one, an immediate transition to vcl_pass happens
as well.
The built-in VCL code for vcl_pipe has a big disclaimer in the form
of a comment:
sub vcl_pipe {
    # By default Connection: close is set on all piped requests, to stop
    # connection reuse from sending future requests directly to the
    # (potentially) wrong backend. If you do want this to happen, you can undo
    # it here.
    # unset bereq.http.connection;
    return (pipe);
}
The implementation and the comment are a bit special. But then again, piping only happens under special circumstances.
The fact that you ended up in vcl_pipe means that Varnish is under
the impression that the request is not a regular HTTP request. We've
learned from the built-in VCL that return(pipe) is used when the
request method is not recognized.
Piping steps away from the layer 7 HTTP implementation of Varnish and goes all the way down to layer 4: it treats the incoming request as plain TCP, it no longer processes HTTP, and just shoves the TCP packets over the wire.
When the transaction is complete Varnish will close the connection with the origin to prevent other requests from reusing this connection.
A lot of people think that return(pass) and return(pipe) are the same in terms of behavior and outcome. That's clearly not the case, as vcl_pass is still aware of the HTTP context, whereas vcl_pipe has no notion of HTTP.
You enter the vcl_synth state when you execute a return(synth())
using the necessary function parameters for synth().
As mentioned before, synthetic responses are HTTP responses that don’t originate from a backend response. The output is completely fabricated within Varnish.
In the built-in VCL, the vcl_synth subroutine adds some markup to
the output:
sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    set resp.http.Retry-After = "5";
    set resp.body = {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + resp.status + " " + resp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + resp.status + " " + resp.reason + {"</h1>
    <p>"} + resp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + req.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"};
    return (deliver);
}
The assumption of the built-in VCL is that the output should be in
HTML, which is also reflected in the Content-Type response header
that is set.
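If HTML doesn't suit your use case, you can override vcl_synth yourself. A minimal sketch that returns a plain-text body instead:

sub vcl_synth {
    # Replace the built-in HTML template with plain text
    set resp.http.Content-Type = "text/plain; charset=utf-8";
    set resp.body = resp.reason;
    return (deliver);
}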
Imagine the following synth call:
return(synth(200,"OK"));
The corresponding synthetic response would be the following:
HTTP/1.1 200 OK
Date: Tue, 08 Sep 2020 07:34:25 GMT
Server: Varnish
X-Varnish: 5
Content-Type: text/html; charset=utf-8
Retry-After: 5
Content-Length: 224
Accept-Ranges: bytes
Connection: keep-alive
<!DOCTYPE html>
<html>
  <head>
    <title>200 OK</title>
  </head>
  <body>
    <h1>Error 200 OK</h1>
    <p>OK</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 5</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
The Content-Type and the Retry-After headers were set in
vcl_synth, whereas all other headers are set behind the scenes by
Varnish.
When using the built-in VCL untouched, this is the HTML output that will be returned to the client.
Before a response is served back to the client, served from cache or
from the origin, it needs to pass through vcl_deliver.
The built-in VCL is not exciting at all:
sub vcl_deliver {
    return (deliver);
}
Most people use vcl_deliver to decorate or clean some response
headers before delivering the content to the client.
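A popular example is exposing whether the response came from cache. The X-Cache header name is a convention, not a standard:

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}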
When an object cannot be served from cache, a backend fetch will be
made. As a result, you’ll end up in vcl_backend_fetch, where the
original request is converted into a backend request.
Here’s the built-in VCL:
sub vcl_backend_fetch {
    if (bereq.method == "GET") {
        unset bereq.body;
    }
    return (fetch);
}
The fact that return(fetch) is called in this subroutine is not
surprising at all. But what is surprising is that the request body is
removed when a GET request is made.
Although a request body for a GET request is perfectly allowed in
the HTTP spec, Varnish decides to strip it off.
The reason for that makes a lot of sense from a caching point of view: if there’s a request body, the URL is no longer the only way to uniquely identify the object in cache. If the request body differs, so does the object. To make this work, one would have to perform a cache variation on the request body, which could seriously decrease the hit rate.
Since request bodies for GET requests aren’t all that common,
Varnish protects itself by conditionally running unset bereq.body.
Also, if the request is a cache miss, Varnish will automatically
turn the request into a GET request. If the request was a HEAD
request, this is what we expect because Varnish must have the response
body to operate correctly. However, if the request was a POST request
or something else, and you want to cache the response, you must save the
request method in a header and put it back in this subroutine.
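A sketch of that save-and-restore pattern; the X-Original-Method header name is purely illustrative, and actually caching POST responses also requires dealing with the request body, which is beyond this sketch:

sub vcl_recv {
    if (req.method == "POST") {
        # Remember the original method before forcing a lookup
        set req.http.X-Original-Method = req.method;
        return (hash);
    }
}

sub vcl_backend_fetch {
    if (bereq.http.X-Original-Method) {
        # Varnish turned the miss into a GET; put the method back
        set bereq.method = bereq.http.X-Original-Method;
    }
}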
You probably noticed that we're using the bereq object to identify the request, instead of the req object we used earlier. That's because we're now in backend context, and the original request has been copied over into the backend request. You'll learn all about objects and variables in VCL later in this chapter.
The vcl_backend_response subroutine is quite a significant one: it
represents the state after the origin successfully returned an HTTP
response. This means the request didn't result in a cache hit.
It is also the place where Varnish decides whether or not to store the response in cache. Based on the built-in VCL code below, you’ll see that there’s some decision-making in place:
sub vcl_backend_response {
    if (bereq.uncacheable) {
        return (deliver);
    } else if (beresp.ttl <= 0s ||
      beresp.http.Set-Cookie ||
      beresp.http.Surrogate-control ~ "(?i)no-store" ||
      (!beresp.http.Surrogate-Control &&
        beresp.http.Cache-Control ~ "(?i:no-cache|no-store|private)") ||
      beresp.http.Vary == "*") {
        # Mark as "Hit-For-Miss" for the next 2 minutes
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
There are two ways to mark an object as uncacheable. The first and more
common way is to set beresp.uncacheable = true;. This marks the object
as hit-for-miss.
You can also use the return(pass) syntax in this subroutine, which
marks the object as hit-for-pass. Hit-for-pass and hit-for-miss
are very similar in that they both instruct Varnish that the current
object is not to be inserted into cache and to disable request
serialization for future requests. The difference is that hit-for-miss
is allowed to change its mind and insert a cacheable object into cache
in the future. Hit-for-pass cannot: this object can never be cached,
now, or in the future. This trade-off gives hit-for-pass slightly
better performance when dealing with uncacheable objects.
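In Varnish 6, a hit-for-pass object is created by returning pass with a duration from vcl_backend_response. A sketch:

sub vcl_backend_response {
    if (beresp.http.Cache-Control ~ "private") {
        # Create a hit-for-pass object for two minutes:
        # matching requests bypass the cache entirely
        return (pass(120s));
    }
}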
The built-in VCL will perform a series of checks to decide whether or not the response is cacheable.
If it turns out it is not, the set beresp.uncacheable = true; logic is
triggered, which marks the object as hit-for-miss.
As explained earlier in the book, we’re caching the decision not to cache, which prevents future requests for this object ending up on the waiting list.
And vcl_backend_response checks for uncacheable objects with the
following built-in VCL code:
if (bereq.uncacheable) {
    return (deliver);
}
This logic can be triggered when a return(pass) is called in the
client-side logic, or for a hit-for-pass object. But by default, we
don’t perform hit-for-pass, but hit-for-miss, which is a more
forgiving approach.
The built-in VCL will make a response uncacheable when the TTL is zero (or less).
This can be caused by three things:
- A set beresp.ttl = 0s statement ended up in the VCL file, without performing a return(deliver).
- The max-age or s-maxage value of the Cache-Control header was set to zero.
- The Expires header contains a timestamp in the past.

This is the check that happens in the if-statement to validate the TTL:
if (beresp.ttl <= 0s) {
As expected, this is the result:
set beresp.uncacheable = true;
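If you know the content better than the origin does, you can override the zero TTL before the built-in logic runs. The URL pattern and one-minute value are assumptions:

sub vcl_backend_response {
    if (beresp.ttl <= 0s && bereq.url ~ "^/static/") {
        # Force a short TTL despite the origin's caching headers;
        # without an explicit return, the built-in checks still run
        set beresp.ttl = 60s;
    }
}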
When the origin adds a Set-Cookie header to the response, it implies
that the state of a cookie needs to change.
Whenever state is present, let alone changed, Varnish decides to bypass the cache. Both at the client side, and the backend side.
This is the cookie check that happens in the if-statement of the built-in VCL:
if (beresp.http.Set-Cookie) {
And again, this is the outcome:
set beresp.uncacheable = true;
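If the cookie is irrelevant for certain responses, dropping the Set-Cookie header on the backend side makes them cacheable again. The URL pattern is an assumption:

sub vcl_backend_response {
    if (bereq.url ~ "\.(css|js|png)$") {
        # A Set-Cookie on static assets prevents caching;
        # removing the header defuses the built-in check
        unset beresp.http.Set-Cookie;
    }
}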
A Surrogate-Control header takes precedence over any other caching
header. When such a header is set and its value contains no-store, the
built-in VCL will make the response uncacheable.
Here’s the if-check:
if (beresp.http.Surrogate-control ~ "(?i)no-store") {
And once again, here’s the outcome:
set beresp.uncacheable = true;
When your response doesn’t contain a Surrogate-Control header, the
built-in VCL will check if your response has a Cache-Control header.
If that is the case, the built-in VCL will make the response
uncacheable if the Cache-Control header contains one of the following
statements:
- no-cache
- no-store
- private

If you went through chapter 3, you already know about this. So, here's the VCL code to perform the check:
if (!beresp.http.Surrogate-Control &&
    beresp.http.Cache-Control ~ "(?i:no-cache|no-store|private)") {
The outcome is:
set beresp.uncacheable = true;
Cache variations are good, but as with all things in life, you shouldn’t exaggerate.
If you vary on all headers, there’s no point caching the response, which is exactly what the built-in VCL thinks as well. Here’s the code:
if (beresp.http.Vary == "*") {
Very predictably, the outcome is:
set beresp.uncacheable = true;
When you reach vcl_backend_error, it means you didn’t receive a valid
HTTP response from the selected backend. There’s a multitude of reasons
why that could be the case. Not being able to connect to the backend is
also part of that.
When this happens, we cannot return a regular response, and we have to
return a synthetic response again. That’s why the built-in VCL code
for vcl_backend_error is nearly identical to the vcl_synth one.
The main difference is that resp.http.Content-Type becomes
beresp.http.Content-Type because we’re operating in a backend-side
context, not a client-side context.
Here’s the built-in VCL code:
sub vcl_backend_error {
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    set beresp.http.Retry-After = "5";
    set beresp.body = {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + beresp.status + " " + beresp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
    <p>"} + beresp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + bereq.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"};
    return (deliver);
}
And here’s the output you’ll probably get when you run into a backend error:
HTTP/1.1 503 Backend fetch failed
Date: Tue, 08 Sep 2020 12:16:31 GMT
Server: Varnish
Content-Type: text/html; charset=utf-8
Retry-After: 5
X-Varnish: 5
Age: 0
Via: 1.1 varnish (Varnish/6.0)
Content-Length: 278
Connection: keep-alive
<!DOCTYPE html>
<html>
  <head>
    <title>503 Backend fetch failed</title>
  </head>
  <body>
    <h1>Error 503 Backend fetch failed</h1>
    <p>Backend fetch failed</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 6</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
There are slightly more response headers, but apart from that, it’s the same output template.
vcl_init is a subroutine that is called when the VCL is initialized,
before requests are processed. It is the place where VMODs can be
initialized, or where VCL objects can be created.
Out of the box, no VMODs are initialized, and no objects are created.
As a result, it just performs a return(ok), as you can see in the
example below:
sub vcl_init {
    return (ok);
}
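A typical use of vcl_init is creating a director that load-balances across backends. This sketch assumes two backends named web1 and web2 have been defined:

import directors;

sub vcl_init {
    # Create a round-robin director and register both backends
    new lb = directors.round_robin();
    lb.add_backend(web1);
    lb.add_backend(web2);
}

sub vcl_recv {
    # Route all client traffic through the director
    set req.backend_hint = lb.backend();
}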
Whereas vcl_init is used when VCL is loaded, vcl_fini is used
before the VCL is discarded.
This is the place where VMODs and VCL objects are cleaned up. By
default we just perform a return(ok). This is reflected in the example
below:
below:
sub vcl_fini {
    return (ok);
}