Cache invalidation (Deprecated using VAC) Tutorial

Introduction

A good caching strategy defines not only how content should be cached but, more importantly, how it should be invalidated and evicted from the cache. An object inserted into the cache can be served to other clients until it expires, is evicted to make room for other objects, or is invalidated.

A TTL (Time to Live) defines how long an object can be cached. An object’s TTL is set when the content is generated (by the backend) or when it is inserted (in Varnish). TTLs can be set via HTTP caching headers (e.g. “Expires”) or via VCL. Either way, Varnish will respect the defined TTL and evict the object once its Time to Live has expired, making room for fresher content to be inserted into the cache.
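
As a rough illustration of setting a TTL from VCL, the fragment below overrides the TTL for a subset of responses. The URL prefix and the one-hour value are placeholders chosen for this sketch, not recommendations, and the fragment is meant to be merged into an existing VCL file.

```vcl
sub vcl_backend_response {
	# Hypothetical rule: cache everything under /static/ for one hour,
	# regardless of the caching headers sent by the backend.
	if (bereq.url ~ "^/static/") {
		set beresp.ttl = 1h;
	}
}
```

Responses that do not match the condition keep the TTL Varnish derived from the backend's caching headers, or the default TTL if there are none.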

Although Varnish will, by default, handle content insertion and invalidation of the cache, you can still define a more specific eviction strategy. This tutorial shows how you can invalidate objects through various mechanisms.

You can choose among the following methods:

	Purge	Ban	Ykey	Broadcaster
Target	Specific object with its variants*	Regex patterns	All objects with a common ykey tag	Specific object among different Varnish instances
VCL	Yes	Yes	Yes	No
CLI	No	Yes	No	No
VAC	No	Yes	No	Yes

* Variants are defined by the Vary header.

Purge

A purge immediately discards an object, with all its variants, from the cache, freeing up space. It is invoked through HTTP with the PURGE method, which is simply another request method, just like GET.

If you use the verb PURGE instead of GET, the object that would otherwise be hit and served to the client will be purged from the cache with all its variants. For this to work, the requests need to have the same hash, as computed in vcl_hash. The default vcl_hash takes the Host header and the URL into account, and here the URL includes any query parameters present in the request. There also needs to be explicit VCL code to respect the PURGE keyword, as explained below.
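
For reference, the built-in vcl_hash in recent Varnish versions looks roughly like the sketch below. It is shown only to make the hashing rule concrete; you do not need to add it to your VCL.

```vcl
sub vcl_hash {
	# The full URL, including the query string, is part of the hash.
	hash_data(req.url);
	# So is the Host header, falling back to the server IP when absent.
	if (req.http.host) {
		hash_data(req.http.host);
	} else {
		hash_data(server.ip);
	}
	return (lookup);
}
```

In practice this means that a PURGE for /foo does not remove /foo?page=2, and a PURGE sent with a different Host header does not remove the object either.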

Purges cannot use regular expressions, and they evict content from cache regardless of the availability of the backend. That means that if you purge some objects while the backend is down, Varnish will end up having no copy of the content.

How do we purge?

VCL

You can apply the following snippet to your VCL file:

# Access Control List to define which IPs
# can purge content
acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		# check if the client is allowed to purge content
		if (!client.ip ~ purge) {
			return(synth(405,"Not allowed."));
		}
		return (purge);
	}
}

NOTE: If you edit the VCL file via VAC, skip to step 3 (Purge request).

Configuration reload

We need to reload the new VCL to make sure the changes get applied. From the command line, run the following command:

systemctl reload varnish

Purge request

We can now start purging content. To do so, we have to issue a PURGE HTTP request. You can use your preferred tool to send the request; here are two examples, using HTTPie and curl:

# HTTPie
http PURGE "www.example.com/foo"

# curl
curl -X PURGE "www.example.com/foo"

Both commands will purge the /foo resource for the host www.example.com.
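
As noted earlier, a purge removes the object even when the backend is unavailable. One possible way to soften this is a soft purge, which expires the object but leaves its grace and keep periods intact, so stale content can still be served. The sketch below assumes your Varnish version ships vmod_purge (bundled with recent Varnish Cache releases) and is meant to replace the PURGE handling shown above rather than be combined with it. The ACL is named purgers here to keep it distinct from the imported VMOD, and the soft_purge subroutine name and n-gone header are illustrative.

```vcl
import purge;

# Same role as the ACL shown earlier, under a different name.
acl purgers {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		if (client.ip !~ purgers) {
			return (synth(405, "Not allowed."));
		}
		# Look the object up instead of purging right away;
		# the purge itself happens in vcl_hit or vcl_miss.
		return (hash);
	}
}

sub soft_purge {
	# A TTL of 0s expires the object now; grace and keep are left untouched.
	set req.http.n-gone = purge.soft(0s);
	return (synth(200, "Soft-purged " + req.http.n-gone + " objects"));
}

sub vcl_hit {
	if (req.method == "PURGE") {
		call soft_purge;
	}
}

sub vcl_miss {
	if (req.method == "PURGE") {
		call soft_purge;
	}
}
```

How long stale content remains available after a soft purge depends on the grace period configured for the object.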

Ban

We can use bans to invalidate content in cache. Once an object is banned, it is no longer used to fulfill incoming requests. Bans leverage regular expression syntax to invalidate content, so we can use any object property we have to issue a ban. A ban only works on objects already stored in cache; it does not prevent new content from entering the cache or being served.

We can ban content based on either req.* or obj.* properties. Like purges, bans immediately stop the invalidated content from being served, but banned objects are not immediately evicted from cache to free up memory. Instead, objects are tested against the ban list when they are looked up for a request and, for bans that only reference obj.* properties, by a background thread, the ban lurker. The ban lurker walks the cache and evicts matching objects in the background, completing (and discarding) bans much faster than request-time testing alone. This in turn limits the number of simultaneously active bans, which can otherwise become a performance concern.

How aggressive the ban lurker is can be controlled by the parameter ‘ban_lurker_sleep’. The ban lurker can be disabled by setting ‘ban_lurker_sleep’ to 0.

There are several ways to issue a ban in Varnish. You can use a ban statement in VCL, use the ban command in the Varnish Command Line Interface (CLI), or issue the ban through the Varnish Administration Console (VAC).

Via HTTP(s) request

VCL

As explained above, it’s important to use ban-lurker-friendly expressions for good performance, so this VCL snippet copies the request URL and Host into object headers that we can refer to while banning, without using req.url.

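sub vcl_recv {
	if (req.method == "BAN") {
		# The URL pattern to ban is read from the "ban-url" request header.
		if (!req.http.ban-url) {
			return(synth(400, "ban-url header required"));
		}
		ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.http.ban-url);
		return(synth(200, "Ban added"));
	}
}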

sub vcl_backend_response {
	# Store URL and HOST in the cached response.
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
	# Prevent the client from seeing these additional headers.
	unset resp.http.x-url;
	unset resp.http.x-host;
}

To keep things simple, this snippet only handles one invalidation scheme, and doesn’t do any sort of access control. A more in-depth look at direct obj bans (including tests) is available here.
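
Since the snippet above accepts BAN requests from anyone, a minimal access-control sketch could look like the following. The ACL name ban_allowed and its entries are assumptions made for this example and mirror the purge ACL used earlier.

```vcl
# Hypothetical ACL listing the clients allowed to issue bans.
acl ban_allowed {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	# Reject BAN requests from clients outside the ACL before any
	# later vcl_recv code gets a chance to add the ban.
	if (req.method == "BAN" && client.ip !~ ban_allowed) {
		return (synth(403, "Forbidden"));
	}
}
```

Varnish concatenates multiple definitions of vcl_recv in the order they appear, so this fragment has to come before the one that calls ban().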

Configuration reload

Next, reload the VCL configuration:

systemctl reload varnish

Ban request

As a final step, we can issue HTTP BAN requests to ban content, e.g.:

curl -X BAN "http://example.com/" -H "ban-url: ^/path/"

The above command will invalidate every object whose URL matches the pattern “^/path/”, i.e. everything under “/path/”.

CLI

Support for bans is built into Varnish and available in the CLI via varnishadm. To ban every “png” object belonging to “example.com”, issue the following command from the shell:

varnishadm ban req.http.host == example.com '&&' req.url '~' '\\.png$'

Make sure to use the right regular expression syntax, as defined by PCRE. In particular, note that in the example above the quotes are required for execution from the shell, and escaping the backslash in the regular expression is required by the Varnish CLI.

VAC

Bans can be issued via VAC as well, through its graphical interface.

Via VAC, a single ban can be broadcast to every Varnish instance in a group.

YKey

vmod_ykey is a Varnish module that adds a tag, or secondary key in Varnish jargon, to objects, allowing fast purging of all objects matching the assigned tag. Like any other VMOD, it is used via the VCL configuration.

The purge operation may be hard or soft. A hard purge immediately and completely removes the matched objects from the cache. A soft purge expires the objects but keeps them around for their configured grace and keep periods (grace for stale object delivery to clients while the next fetch is in progress, and keep for conditional fetches). Ykey also interfaces with the MSE stevedore, persisting the Ykey data structure on disk for persisted caches.
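
The grace and keep periods that a soft purge preserves are ordinary object attributes and can be set from VCL. The fragment below is only a sketch with placeholder durations; suitable values depend on how stale your content is allowed to be.

```vcl
sub vcl_backend_response {
	# Placeholder values chosen for this example.
	# grace: how long an expired object may still be served stale
	# while a fresh copy is being fetched.
	set beresp.grace = 1h;
	# keep: how long the object is additionally kept so that it can
	# be revalidated with a conditional fetch.
	set beresp.keep = 1d;
}
```

With a grace period like this in place, a soft-purged object can still be delivered while Varnish fetches a replacement from the backend.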

VCL configuration

To use Ykey, you need to import the Ykey VMOD into your VCL configuration. The keys to associate with an object need to be specified explicitly by calling one or more of the VMOD's add-key functions during vcl_backend_response.

The following example adds all keys listed in the backend response header named Ykey, and a custom one for all URLs starting with “/content/image/”:

import ykey;

sub vcl_backend_response {
	ykey.add_header(beresp.http.Ykey);
	if (bereq.url ~ "^/content/image/") {
		ykey.add_key("IMAGE");
	}
}

The following example creates a simple purge interface. If a header called Ykey-Purge is present, it will purge using ykey and the keys listed in the header. If not, it falls back to a regular purge:

import ykey;

# Access Control List to define which IPs
# can purge content
acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		if (client.ip !~ purge) {
			return (synth(403, "Forbidden"));
		}
		if (req.http.Ykey-Purge) {
			set req.http.n-gone =  ykey.purge_header(req.http.Ykey-Purge);

			# or for soft purge:
			# set req.http.n-gone =  ykey.purge_header(req.http.Ykey-Purge, soft=true);
			return (synth(200, "Invalidated "+req.http.n-gone+" objects"));
		} else {
			return (purge);
		}
	}
}

Configuration reload

Since we changed the VCL, reload the Varnish service as before (systemctl reload varnish).

YKey request

To purge or soft-purge one or more objects in cache, we can issue an HTTP request that includes the Ykey-Purge header. For example: curl -X PURGE -H "Ykey-Purge: purging_key" "http://example.com/".

This curl request will purge every cached object tagged with the key “purging_key” given in the header. When the header is present, the URL path of the request is not used for matching.

varnish-broadcaster

The broadcaster replicates requests to multiple Varnish caches from a single entry point. The main goal is to facilitate purging and banning across multiple Varnish Cache instances. The broadcaster consists of a web server with a REST API that receives HTTP requests and distributes them to all configured caches.

Installation

The first step is to install the Varnish Broadcaster; the full guide and documentation can be found here.

Configuration

To use the Broadcaster, you have to define a configuration file, /etc/varnish/nodes.conf, which lists the nodes to broadcast to. The file format is similar to INI.

Below is an example of a valid configuration file. This configuration has two clusters (Europe/US), each with its own nodes:

# this is a comment
[Europe]
First = 1.2.3.4:9090
Second = 9.9.9.9:6081
Third = example.com

[US]
Alpha = http://[1::2]
Beta = 8.8.8.8

Service reload

Once we have a configuration file, we have to start or reload the broadcaster service:

service broadcaster start

Invalidation

Now that the Broadcaster is properly configured and running, we can issue PURGE and BAN HTTP requests to it the same way we would to a Varnish server. There’s one exception though: for PURGE requests, you need to force the Host header in the curl request (-H "host: my.domain.com") to ensure the request is hashed the same way as the objects you want to purge.