Cache Invalidation Tutorial

Introduction

A good caching strategy defines not only how content is cached but, more importantly, how it is invalidated and evicted from the cache. An object inserted into the cache can be served to clients until it expires, is evicted to make room for other objects, or is invalidated.

A TTL (Time to Live) defines how long an object can be cached. An object’s TTL is set when the content is generated (by the backend) or when it is inserted into the cache (by Varnish). TTLs can be set via HTTP caching headers (e.g. “Expires”) or via VCL. Either way, Varnish respects the defined TTL and evicts the object once it has expired, making room for fresher content.
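As a minimal sketch of the VCL route, the fragment below overrides the TTL for some responses. It assumes VCL 4.x syntax and is meant to be merged into an existing VCL file; the URL pattern and the one-hour duration are illustrative assumptions, not recommended values.

```vcl
sub vcl_backend_response {
        # Illustrative rule: cache image responses for one hour,
        # overriding the TTL derived from the backend's caching headers.
        if (bereq.url ~ "\.(png|jpg|gif)$") {
                set beresp.ttl = 1h;
        }
}
```

Responses that do not match the pattern keep the TTL Varnish derived from the backend's headers.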

Although Varnish handles content insertion and invalidation by default, you can still define a more specific eviction strategy. This tutorial shows how you can invalidate objects through various mechanisms.

You can choose among the following methods:

1. Purge
2. Ban
3. Ykey
4. Varnish Broadcaster

The following table compares them:

           | Purge                              | Ban            | Ykey                               | Broadcaster
Target     | Specific object with its variants* | Regex patterns | All objects with a common ykey tag | Specific object among different Varnish instances
VCL        | Yes                                | Yes            | Yes                                | No
CLI        | No                                 | Yes            | No                                 | No
VAC        | No                                 | Yes            | No                                 | Yes

* Variants are defined by the Vary header.

1. Purge

A purge immediately discards an object, with all its variants, from the cache, freeing up space. It is invoked over HTTP with the PURGE method, a request method just like GET.

If you use the verb PURGE instead of GET, the object that would otherwise be hit and served to the client is purged from the cache with all its variants. For this to work, the PURGE request needs to have the same hash as the cached object, as computed in vcl_hash. The default vcl_hash takes the Host header and the URL into account; the URL includes any query parameters present in the request. There also needs to be explicit VCL code to handle the PURGE method, as explained below.
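For reference, the built-in vcl_hash in Varnish 4.x and later behaves roughly like the sketch below (newer releases split it across helper subroutines, so treat this as an approximation rather than the exact shipped code). It shows why a PURGE only finds the object when both the URL, including the query string, and the Host header match the original request.

```vcl
sub vcl_hash {
        # The full URL, query string included, is part of the hash.
        hash_data(req.url);
        if (req.http.host) {
                # Same URL under a different Host is a different object.
                hash_data(req.http.host);
        } else {
                # No Host header: fall back to the receiving IP address.
                hash_data(server.ip);
        }
        return (lookup);
}
```

Under this hashing, a PURGE for /foo?page=1 would not remove an object cached under /foo?page=2.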

Purges cannot use regular expressions, and they evict content from the cache regardless of the availability of the backend. This means that if you purge some objects while the backend is down, Varnish will have no copy of the content left to serve.

How do we purge?
1. VCL configuration

You can apply the following snippet to your VCL file:

# Access Control List to define which IPs
# can purge content
acl purge {
        "localhost";
        "192.168.55.0"/24;
}

sub vcl_recv {
        if (req.method == "PURGE") {
                # Check whether the client is allowed to purge content
                if (!client.ip ~ purge) {
                        return(synth(405,"Not allowed."));
                }
                return (purge);
        }
}

NOTE: If you edit the VCL file via VAC, jump to step 3.

2. VCL reload

We need to reload the VCL to make sure the changes are applied. From the command line, run: $ systemctl reload varnish

3. Issue a PURGE request

We can now start purging content by issuing PURGE HTTP requests. You can use any HTTP client; here are two examples, using HTTPie and curl:

HTTPie: http PURGE "www.example.com/foo"
curl: curl -X PURGE "www.example.com/foo"

Both commands purge the /foo resource for the host www.example.com.

2. Ban

We can use bans to invalidate content in the cache. Once an object is banned, it is no longer used to fulfill incoming requests. Bans support regular expressions and can match on any object property we have available. A ban only affects objects already stored in the cache; it does not prevent new content from entering the cache or being served.

We can ban content based on either req or obj properties. Unlike purged content, banned content is not immediately evicted to free up memory. Bans on req properties can only be tested when a request looks up the object, so an object that is never requested again stays in the cache until its TTL expires. Bans on obj properties can be processed by a background thread, the ban lurker, which walks the cache, tests objects against the ban list, and evicts those that match. How aggressive the ban lurker is can be controlled with the ‘ban_lurker_sleep’ parameter; setting ‘ban_lurker_sleep’ to 0 disables the ban lurker.
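A common way to make bans usable by the ban lurker is to copy the request properties you want to match onto the object at fetch time and then ban on those obj fields. The sketch below assumes VCL 4.x and an ACL named purge defined elsewhere in the VCL; the header names x-url and x-host are illustrative choices, not required names.

```vcl
sub vcl_backend_response {
        # Store request properties on the object so bans can match on obj.*
        set beresp.http.x-url = bereq.url;
        set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
        # Do not expose the helper headers to clients.
        unset resp.http.x-url;
        unset resp.http.x-host;
}

sub vcl_recv {
        if (req.method == "BAN") {
                # Assumes "acl purge { ... }" is defined elsewhere.
                if (!client.ip ~ purge) {
                        return (synth(405, "Not allowed."));
                }
                # Only obj.* fields are used, so the ban lurker can process it.
                ban("obj.http.x-host == " + req.http.host +
                    " && obj.http.x-url ~ " + req.url);
                return (synth(200, "Ban added"));
        }
}
```

Because the expression never references req, the lurker can test it in the background without waiting for a client request to hit each object.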

How do we BAN?

There are several ways to issue a ban in Varnish. You can use a ban statement in VCL, use the ban command in the Varnish Command Line Interface (CLI), or issue the ban through the Varnish Administration Console (VAC).

1. VCL configuration

Content can be banned via HTTP requests. To do so, add a BAN handler to your VCL logic, similar to what we did in the purging section:

# Access Control List to define which IPs
# can purge content
acl purge {
        "localhost";
        "192.168.55.0"/24;
}

sub vcl_recv {
        if (req.method == "BAN") {
                # Same ACL check as above:
                if (!client.ip ~ purge) {
                        return(synth(405, "Not allowed."));
                }
                # Match the Host exactly and treat the request URL as a regular expression.
                ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url);

                # Throw a synthetic page so the request won't go to the backend.
                return(synth(200, "Ban added"));
        }
}

2. Service reload

Next, reload the VCL configuration: $ systemctl reload varnish

3. Issue a BAN request

As a final step, we can issue HTTP BAN requests to ban content, e.g.: curl -X BAN "http://example.com/path/.*" Provided the ban expression in vcl_recv matches req.url with the regex operator ~, this command invalidates every cached object on example.com whose URL matches “/path/.*”.

CLI

Support for bans is built into Varnish and available in the CLI, via varnishadm. To ban every “png” object belonging to “example.com”, issue the following command from the shell:

varnishadm ban req.http.host == example.com '&&' req.url '~' '\\.png$'

Make sure to use the right regular expression syntax, as defined by PCRE. In particular, note that in the example above the quotes are required for execution from the shell, and escaping the backslash in the regular expression is required by the Varnish CLI.

VAC

Bans can be issued via VAC as well, through its graphical interface.

Via VAC, a single ban can be broadcast to every Varnish instance included in a group.

3. Ykey

vmod_ykey is a Varnish module that adds a tag (a secondary key, in Varnish jargon) to objects, allowing fast purging of all objects matching the assigned tag. Like any other VMOD, it is used via VCL configuration.

The purge operation may be hard or soft. A hard purge immediately removes the matched objects from the cache completely. A soft purge expires the objects but keeps them around for their configured grace and keep timeouts (grace for stale object delivery to clients while the next fetch is in progress, and keep for conditional fetches). vmod_ykey also interfaces with the MSE stevedore, providing persistence of the Ykey data structure on disk for persisted caches.
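How long a soft-purged object remains usable depends on the grace and keep values stored with it. A minimal sketch of setting them, assuming VCL 4.x; the durations are illustrative assumptions only:

```vcl
sub vcl_backend_response {
        # After a soft purge, the object may still be served stale for up
        # to one hour while a fresh copy is being fetched.
        set beresp.grace = 1h;
        # Keep the object one more day so it can be used for conditional
        # (If-None-Match / If-Modified-Since) backend fetches.
        set beresp.keep = 1d;
}
```

With grace and keep both at zero, a soft purge leaves nothing usable behind and is effectively equivalent to a hard purge from the client's point of view.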

1. VCL configuration

To use Ykey, you need to import the Ykey VMOD into your VCL configuration. The keys to associate with an object need to be specified explicitly by calling one or more of the add-key VMOD functions in vcl_backend_response.

The following example adds all keys listed in the backend response header named Ykey, and a custom one for all URLs starting with “/content/image/”:

        import ykey;
        sub vcl_backend_response {
                ykey.add_header(beresp.http.Ykey);
                if (bereq.url ~ "^/content/image/") {
                        ykey.add_key("IMAGE");
                }
        }

The following example creates a simple purge interface. If a header called Ykey-Purge is present, it purges using ykey and the keys listed in the header. If not, it falls back to a regular purge:

import ykey;

# Access Control List to define which IPs
# can purge content
acl purge {
        "localhost";
        "192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		# Use the ACL defined above (named "purge").
		if (client.ip !~ purge) {
			return (synth(403, "Forbidden"));
		}
		if (req.http.Ykey-Purge) { 
			set req.http.n-gone =  ykey.purge_header(req.http.Ykey-Purge);
		
			# or for soft purge: 
			# set req.http.n-gone =  ykey.purge_header(req.http.Ykey-Purge, soft=true);
			return (synth(200, "Invalidated "+req.http.n-gone+" objects"));
		} else {
			return (purge);
		}
	}
}

2. Service reload

We made VCL changes, so the Varnish service has to be reloaded: $ systemctl reload varnish

3. Issue an invalidation request

To purge or soft purge one or more objects in the cache, we can issue an HTTP request that includes the Ykey-Purge header. For example: curl -X PURGE -H "Ykey-Purge: purging_key" "http://example.com/".

This request purges every object in the cache tagged with the key “purging_key” given in the header. With the VCL above, the request URL plays no role in the match when the Ykey-Purge header is present.

4. Varnish Broadcaster

Varnish Broadcaster broadcasts requests to multiple Varnish caches from a single entry point. Its original purpose is to make purging and banning across multiple Varnish Cache instances easier. The Broadcaster consists of a web server with a REST API that receives invalidation requests and distributes them to all configured caches.

1. Installation

The first step is to install the Varnish Broadcaster; the full installation guide can be found in the Varnish Broadcaster documentation.

2. Configuration

To use the Broadcaster, you have to define a configuration file at the path /etc/varnish/nodes.conf, which lists the nodes to broadcast to. The file format is similar to INI.

Below is an example of a valid configuration file. This configuration has two clusters (Europe and US), each with its own nodes:

# this is a comment 
[Europe]
First = 1.2.3.4:9090
Second = 9.9.9.9:6081
Third = example.com

[US]
Alpha = http://[1::2]
Beta = 8.8.8.8

3. Service reload

Once we have a configuration file, we have to start or reload the Broadcaster service: $ service broadcaster start

4. Invalidation

Now that the Broadcaster is properly configured and running, we can issue PURGE and BAN HTTP requests and have them replicated across the different Varnish nodes.
For HTTP PURGE requests, see section 1 (Purge) above; for HTTP BAN requests, see section 2 (Ban).