Banning

Banning is a concept in Varnish that allows expression-based cache invalidation. This means that you can invalidate multiple objects from the cache without the need for individual purge calls.

A ban is created by adding a ban expression to the ban list. All objects in the cache will be evaluated against the expressions in the ban list before being served. If the object is banned, Varnish will mark it as expired and fetch new content from the backend.

Ban expressions

A ban expression consists of fields, operators, and arguments. Expressions can be chained using the && operator. Only logical AND operations can be performed. Logical OR operations are done by evaluating multiple ban expressions.

Expression format

This is the format of ban expressions:

<field> <operator> <arg> [&& <field> <oper> <arg> ...]

The following fields are supported:

req.url: the request URL
req.http.*: any request header
obj.status: the cache object status
obj.http.*: the response headers stored in the cached object

These fields look familiar, and they represent some of the objects and variables in VCL.

The operator can be:

==: the field equals an arg
!=: the field is not equal to an arg
~: the field matches a regular expression defined by the arg
!~: the field doesn’t match a regular expression defined by the arg

The argument of a ban expression is either a literal string or a regular expression pattern. Strings are not delimited by double quotes " or the long string format {"…"}.
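
To make the format more tangible, here is a minimal Python sketch that splits an expression into its conditions and checks each one against the fields and operators listed above. The parse_ban_expression function and its header-name character set are our own illustration, not part of Varnish, and the sketch assumes that arguments contain no whitespace:

```python
import re

# Fields and operators as listed in the text above.
FIELD_RE = re.compile(
    r"^(req\.url|obj\.status|req\.http\.[A-Za-z0-9_-]+|obj\.http\.[A-Za-z0-9_-]+)$"
)
OPERATORS = {"==", "!=", "~", "!~"}


def parse_ban_expression(expression):
    """Split a ban expression into (field, operator, argument) tuples.

    Raises ValueError when a condition does not follow
    <field> <operator> <arg>. Assumes arguments contain no whitespace.
    """
    conditions = []
    for part in expression.split("&&"):
        tokens = part.split()
        if len(tokens) != 3:
            raise ValueError(
                f"expected '<field> <operator> <arg>', got {part.strip()!r}"
            )
        field, operator, argument = tokens
        if not FIELD_RE.match(field):
            raise ValueError(f"unsupported field: {field}")
        if operator not in OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        conditions.append((field, operator, argument))
    return conditions
```

A check like this can run in your own tooling before an expression is handed to Varnish. Varnish itself remains the authority on what it accepts.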

Expression examples

Let’s start with a very basic example that is the ban equivalent of a regular purge:

req.url == / && req.http.host == example.com

So the request’s URL equals /, and the request’s Host header equals example.com.

In another example we’ll invalidate all objects from the cache that have an HTTP 404 status:

obj.status == 404

We can also create an expression that uses response headers that are stored in the object. Let’s say we want to invalidate all images at once. We’d use the following expression:

obj.http.Content-Type ~ ^image/

This expression looks at the Content-Type response header and invalidates all items that match the ^image/ regular expression.

For the last example, we’ll match on a URL pattern instead of an individual URL:

req.url ~ ^/products(/.+|$) && req.http.host == example.com

This pattern will match all objects where the URL starts with /products/... or equals /products.
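
If you want to check such a pattern before issuing the ban, you can try it against a few sample URLs. The sketch below uses Python's re module as a stand-in for the regular expression engine in Varnish. We assume the two behave identically for a pattern this simple, and the sample URLs are made up:

```python
import re

# The URL pattern from the ban expression above.
PATTERN = re.compile(r"^/products(/.+|$)")


def is_banned_url(url):
    """Return True when the URL matches the ban pattern."""
    return PATTERN.search(url) is not None


for url in ["/products", "/products/42", "/products/",
            "/products?page=2", "/productsheet"]:
    print(url, is_banned_url(url))
# /products True
# /products/42 True
# /products/ False
# /products?page=2 False
# /productsheet False
```

Note that /products/ with a trailing slash and nothing after it does not match, because /.+ requires at least one character after the slash.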

Executing a ban from the command line

Now that you know what a ban is and what ban expressions look like, it’s time to explain how to execute such a ban.

The quickest way to do this is by using the varnishadm program. This program makes a connection to the CLI interface of varnishd.

You can choose to call the varnishadm program without any arguments, where you can enter individual commands. This is what happens in the example below:

$ varnishadm
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------

Type 'help' for command list.
Type 'quit' to close CLI session.

varnish> ban obj.status == 404
200

The ban obj.status == 404 command will issue a ban that aims to invalidate all objects with an HTTP 404 status code.

Another way you can ban using varnishadm is by adding the ban expression as an argument. Here’s an example of this:

$ varnishadm ban obj.status == 404

Sometimes certain characters in your ban expression might interfere with how your Linux shell interprets commands:

$ varnishadm ban obj.http.Content-Type ~ ^image/
expected conditional (~, !~, == or !=) got "/root"
Command failed with error code 106

In that case, you’re better off using quotes to avoid errors:

$ varnishadm ban "obj.http.Content-Type ~ ^image/"
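
When you call varnishadm from a script rather than from an interactive shell, you can sidestep shell interpretation altogether by passing the arguments as a list. Here is a minimal Python sketch. The ban_command and run_ban helpers are hypothetical names of our own, and run_ban assumes that varnishadm is on the PATH and can reach the CLI of varnishd:

```python
import shlex
import subprocess


def ban_command(expression):
    """Build the argument list for varnishadm; the expression is one argument."""
    return ["varnishadm", "ban", expression]


def run_ban(expression):
    """Run varnishadm without a shell, so ~ and ^ are passed through untouched."""
    return subprocess.run(
        ban_command(expression), capture_output=True, text=True, check=False
    )


# shlex.join shows what the equivalent, safely quoted shell command looks like.
print(shlex.join(ban_command("obj.http.Content-Type ~ ^image/")))
# varnishadm ban 'obj.http.Content-Type ~ ^image/'
```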

Ban VCL code

Although banning can be done using varnishadm and doesn’t require a VCL implementation, it would be nice to use the BAN method to invalidate objects via bans, much like in our purging example.

We could have exactly the same behavior, but we strip out the purge logic from under the hood, and replace it with ban logic. VCL provides a ban() function that takes the ban expression as the argument.

Purge replacement

The following example is an exact copy of the purge VCL example, but instead of performing a return(purge), we return a synthetic response and use the ban() function to execute the ban:

vcl 4.1;

acl banners {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		ban("req.url == " + req.url 
			+ " && req.http.host == " + req.http.host);
		return(synth(200,"Ban added"));
	}
}

We have now created a purge replacement, but didn’t gain any flexibility.

Invalidate URL patterns

A more flexible approach would be to invalidate URL patterns rather than individual URLs. We could send a custom request header that contains this pattern.

The following example requires the custom x-ban-pattern request header to be set:

vcl 4.1;

acl banners {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		if(!req.http.x-ban-pattern) {
			return(synth(400,"Missing x-ban-pattern header"));
		}
		ban("req.url ~ " + req.http.x-ban-pattern 
			+ " && req.http.host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}

This ban would be triggered using the following HTTP request:

BAN / HTTP/1.1
Host: example.com
X-Ban-Pattern: ^/products/[0-9]+

The ban we just issued using HTTP will result in the following ban expression:

req.url ~ ^/products/[0-9]+ && req.http.host == example.com

Long story short: we are banning all objects where the URL starts with /products/, followed by a numeric value. It is an open-ended regular expression, so URLs containing even more data after the numeric value will also match.
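
An application could send this BAN request with any HTTP client. The following Python sketch uses the standard http.client module. The function names, the varnish_address parameter, and the default port are assumptions for illustration, and the client must of course connect from an address that is allowed by the banners ACL:

```python
import http.client


def build_ban_request(host, pattern):
    """Return (method, path, headers) for a pattern-based BAN request."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    return "BAN", "/", {"Host": host, "X-Ban-Pattern": pattern}


def send_ban(varnish_address, host, pattern, port=80, timeout=5):
    """Send the BAN request to Varnish and return (status, reason)."""
    method, path, headers = build_ban_request(host, pattern)
    conn = http.client.HTTPConnection(varnish_address, port, timeout=timeout)
    try:
        conn.request(method, path, headers=headers)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()
```

Calling send_ban("127.0.0.1", "example.com", "^/products/[0-9]+") would issue the request shown above. With the VCL from this example, we would expect a 200 status with the Ban added reason in return.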

Complete flexibility

We can even take it up a notch, and allow even more flexibility, to the extent that the user is responsible for formulating the complete ban expression.

Here’s such an example:

vcl 4.1;

acl banners {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		if(!req.http.x-ban-expression) {
			return(synth(400,"Missing x-ban-expression header"));
		}
		ban(req.http.x-ban-expression);
		return (synth(200,"Ban added"));
	}
}

This VCL code would then be invoked using the following HTTP request:

BAN / HTTP/1.1
Host: example.com
X-Ban-Expression: obj.status == 404

The advantage here is that you’re not restricted to the request context, and you can also match other fields.

The downside is that you’re exposing the complexity of ban expressions to the end user. Instead, the concept of URL patterns would seem more intuitive for users.

The best of both worlds

Sometimes you don’t want to choose and just want to have it all:

Regular purges when you want to invalidate an individual URL
Bans when you want to invalidate a URL pattern

This can be done with a single implementation. Whereas the URL pattern example returned an HTTP 400 status when its x-ban-pattern header was not set, we can use purging as a fallback when the x-invalidate-pattern header is missing.

Here’s the code:

vcl 4.1;

acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-invalidate-pattern) {
			return(purge);
		}
		ban("req.url ~ " + req.http.x-invalidate-pattern 
			+ " && req.http.host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}

So if you just want to purge the /products page, you can issue the following HTTP request:

PURGE /products HTTP/1.1
Host: example.com

But if you want to invalidate all subordinate resources of /products/, you can add the x-invalidate-pattern header and specify the URL pattern:

PURGE / HTTP/1.1
Host: example.com
X-Invalidate-Pattern: ^/products/

The ban list

Unlike purges, bans will not immediately remove objects from the cache. The synthetic message from the VCL examples already gave it away: bans are added.

When you execute a ban, the ban expression is added to the ban list. This is a list containing all the bans that need to be evaluated, and matched against all the objects in cache.

The easiest way to see the contents of the ban list is by running varnishadm ban.list.

There is always an item on the list

Here’s the output of the ban list when the varnishd process was just started:

$ varnishadm ban.list
Present bans:
1603270370.244746     0 C

Although no bans were issued, and no objects are stored in the cache, there is already an item on the list. Let’s break it down:

1603270370.244746 is the time at which the ban was added. It is in Unix timestamp format and has microsecond precision.
0 is the refcount. There are currently 0 objects that refer to this ban.
C stands for Completed. This means the ban is fully evaluated.

The reason there is already a ban on the list is that every object in cache needs to refer to the last ban it has seen when entering the cache. This allows bans that are older than the object to be disregarded.

So as soon as the first object is stored in cache, the refcount increases to 1:

$ varnishadm ban.list
Present bans:
1603270370.244746     1 C

The refcount will increase as objects get inserted.
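
The bookkeeping can be sketched with a toy model in Python. This is a strong simplification for illustration only, not how varnishd is implemented: the class names are made up, a ban is reduced to a Python function, and bans are only tested when an object is looked up.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ban:
    timestamp: float
    test: Optional[Callable] = None  # None for the initial, empty ban
    refcount: int = 0


@dataclass
class CachedObject:
    url: str
    status: int
    last_ban: Ban  # the newest ban that existed when the object was stored


class ToyCache:
    def __init__(self, now):
        self.bans = [Ban(now)]  # newest first; there is always one item
        self.objects = {}

    def add_ban(self, timestamp, test):
        self.bans.insert(0, Ban(timestamp, test))

    def insert(self, url, status):
        newest = self.bans[0]
        newest.refcount += 1
        self.objects[url] = CachedObject(url, status, newest)

    def lookup(self, url):
        obj = self.objects.get(url)
        if obj is None:
            return None
        for ban in self.bans:
            if ban is obj.last_ban:
                break  # this ban and all older ones are disregarded
            if ban.test(obj):
                obj.last_ban.refcount -= 1
                del self.objects[url]
                return None  # banned: treated as a cache miss
        # The object survived every newer ban, so it now refers to the newest one.
        obj.last_ban.refcount -= 1
        obj.last_ban = self.bans[0]
        obj.last_ban.refcount += 1
        return obj
```

In this model, an object that was inserted after a ban never gets tested against it, which is exactly why the reference to the last seen ban is kept.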

Adding a first ban

The ban list will change as soon as the first ban is added.

The following example looks a bit weird:

$ varnishadm ban obj.status != 0

We’re banning all objects that do not have a 0 status. That’s literally every object in the cache.

When we consult the ban list, we see it has been added:

$ varnishadm ban.list
Present bans:
1603272627.622051     0 -  obj.status != 0
1603270370.244746     3 C

Initially all three objects still refer to the initial ban as the one they have seen last. But with the addition of the new ban, that will change.

After a short while, the ban list will look like this:

$ varnishadm ban.list
Present bans:
1603272627.622051     0 C
1603270370.244746     0 C

The newly added ban is completed, and no objects refer to it because we just removed all objects from the cache. The initial ban is also still around.

As soon as a new object enters the cache, it refers to the last one it has seen:

$ varnishadm ban.list
Present bans:
1603272627.622051     1 C

If you look at the timestamp, it is 1603272627.622051, which matches the ban we just executed.

Adding multiple bans

Let’s have a look at a ban list that already has some ban expressions on it:

$ varnishadm ban.list
Present bans:
1603273224.960953     2 -  req.url ~ ^/[a-z]$
1603273216.857785     0 -  req.url ~ ^/[a-z]+/[0-9]+
1603272627.622051     9 C

Nine objects saw 1603272627.622051 as their last ban. This means that up to two ban expressions should be evaluated for those objects.

For two objects, 1603273224.960953 was the last one they saw. These objects aren’t subject to any invalidation. These were objects that were inserted into the cache after the two recent bans were added.

There are zero objects that saw 1603273216.857785 as their last ban. This kind of makes sense because if you do the math between the last and the second-to-last ban, you’ll see there’s only an eight-second difference between the two bans. During those eight seconds, no new objects got added to the cache.

As time progresses, you’ll see that the req.url ~ ^/[a-z]+/[0-9]+ evaluation has completed, and that those nine objects have been processed:

$ varnishadm ban.list
Present bans:
1603273224.960953     2 -  req.url ~ ^/[a-z]$

This means that nine objects were invalidated since they are no longer referenced.

Any future bans that are executed will apply to the two remaining objects, as long as they have not expired.
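
If you want to monitor the ban list from a script, the output of varnishadm ban.list is easy to parse. The Python sketch below assumes the column layout shown in the examples above: timestamp, refcount, flags, and an optional expression. The BanEntry class and the parse_ban_list function are our own names, and other Varnish versions may print additional flags or columns:

```python
from dataclasses import dataclass


@dataclass
class BanEntry:
    timestamp: float
    refcount: int
    flags: str
    expression: str


def parse_ban_list(output):
    """Parse the text printed by 'varnishadm ban.list' into BanEntry items."""
    entries = []
    for line in output.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skips the "Present bans:" header and blank lines
        try:
            timestamp, refcount = float(parts[0]), int(parts[1])
        except ValueError:
            continue  # not a ban line
        expression = parts[3].strip() if len(parts) == 4 else ""
        entries.append(BanEntry(timestamp, refcount, parts[2], expression))
    return entries


SAMPLE = """Present bans:
1603273224.960953     2 -  req.url ~ ^/[a-z]$
1603273216.857785     0 -  req.url ~ ^/[a-z]+/[0-9]+
1603272627.622051     9 C
"""

entries = parse_ban_list(SAMPLE)
pending = [e for e in entries if "C" not in e.flags]
print(len(entries), "bans,", len(pending), "not yet completed")
print(round(entries[0].timestamp - entries[1].timestamp, 1), "seconds apart")
```

For the sample above, this reports three bans, two of which are not yet completed, and a gap of about eight seconds between the two most recent bans.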

The ban lurker

We have to be honest: there is one piece of crucial information we held back from you.

Throughout this section about banning, we talked about ban expressions, the ban list, and about objects being matched. But we never mentioned what mechanism is responsible for that.

There is a thread, which was mentioned in the Under the hood section of chapter 1, that is called the ban lurker.

This thread will inspect the ban list, and match the ban expression to the right objects.

Runtime parameters

The ban lurker thread has some runtime parameters that control its behavior:

ban_lurker_age is the minimum age a ban should have before it is processed by the ban lurker. Until then, the ban is only tested as part of regular cache lookups. The default value is 60 seconds.
ban_lurker_sleep is the number of seconds the ban lurker sleeps before processing another batch. The default value is 0.010 seconds.
ban_lurker_batch is the number of objects the ban lurker examines before going back to sleep. The default value is 1000.
ban_lurker_holdoff sets the number of seconds the ban lurker holds off when lock contention occurs during a cache lookup. The default value is 0.010 seconds.
ban_cutoff limits the number of bans the ban lurker evaluates normally: once it reaches the ban_cutoff-th ban on the list, it treats all objects as if they matched a ban and removes them from cache. The default value is 0, which disables the cutoff.
ban_dup eliminates older identical bans when a new ban is added. The default value is on.

Ban lurker workflow

Every 0.010 seconds the ban lurker wakes up and works through the bans that are at least one minute old. The lurker will examine 1000 objects at a time. For each object, it looks for the position of that object on the ban list and applies the more recent bans up until the point when a ban expression matches.

When a match is found that object is put on the expiry list and is removed from the cache shortly thereafter.

Ban lurker scope

Because the ban lurker is a separate thread that has no knowledge of any incoming HTTP request, its scope is limited to the object context.

Any ban expression that refers to an obj.http.* or an obj.status field can be processed by the ban lurker. Basically only the response information that is part of the object is available to the ban lurker.

This raises the question: how do expressions that contain req.url or req.http.* get processed? It’s obvious that these bans are not the responsibility of the ban lurker.

When the request context is used in a ban expression, the worker thread that handles the incoming request is responsible for this.

This means that such bans aren’t processed asynchronously and that space is only freed from the cache when a request comes in that matches one of these ban expressions.

We’ll talk about the performance impact of synchronous bans in one of the next sections.
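
Whether a ban expression falls within the scope of the ban lurker can be derived from its fields alone. Here is a minimal Python sketch of that rule. The is_lurker_friendly function is a hypothetical helper of our own, and it assumes conditions are separated by && as described earlier:

```python
def is_lurker_friendly(expression):
    """Return True when every condition refers to the object context (obj.*)."""
    fields = [
        condition.split()[0]
        for condition in expression.split("&&")
        if condition.strip()
    ]
    return bool(fields) and all(field.startswith("obj.") for field in fields)


print(is_lurker_friendly("obj.status == 404"))                            # True
print(is_lurker_friendly("req.url == / && req.http.host == example.com")) # False
print(is_lurker_friendly("obj.status == 404 && req.url ~ ^/products"))    # False
```

A single condition in the request scope is enough to take the whole expression out of the hands of the ban lurker.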

Enforcing asynchronous bans

Now that we know the scope of the ban lurker, and the fact that the worker thread is responsible for handling bans within the request scope, it seems as though request-based ban expressions cannot be used in an efficient way.

To that we say: “Use your imagination, and be creative.”

If this were a Tweet or a Facebook post, we would have added an emoji.

If the object has no information about the request, add this information in VCL:

sub vcl_backend_response {
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}

These two custom headers basically store the request context as custom response headers.

And then you can trust that the ban lurker will be able to evaluate the following expression:

obj.http.x-url == / && obj.http.x-host == example.com

These are what we call lurker-friendly bans. As this is quite the mouthful, we can also just call them asynchronous bans.

Let’s take our best-of-both-worlds example, and apply asynchronous banning:

vcl 4.1;

acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-invalidate-pattern) {
			return(purge);
		}
		ban("obj.http.x-url ~ " + req.http.x-invalidate-pattern
			+ " && obj.http.x-host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}

sub vcl_backend_response {
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
	unset resp.http.x-url;
	unset resp.http.x-host;
}

To make this work, we had to add the beresp.http.x-url and beresp.http.x-host headers, but we also have to clean them up upon delivery. And the fields we’re matching in the ban() function now reflect these two custom response headers.

And then you can send the following HTTP request to Varnish:

PURGE / HTTP/1.1
Host: example.com
X-Invalidate-Pattern: ^/products/

Once the ban is older than the ban_lurker_age value, the ban lurker will come in, evaluate the ban, and expire the matching objects. Unlike synchronous bans within the request scope, the worker thread won’t have to do the heavy lifting; the ban lurker will.
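
If you already have tooling that produces request-based ban expressions, the translation to their asynchronous counterparts is mechanical. The Python sketch below assumes the x-url and x-host headers from the VCL above. The to_lurker_friendly function and its mapping table are our own illustration, and any other request header would first have to be stored on the object in the same way:

```python
# Mapping from request fields to the custom response headers set in
# vcl_backend_response in the VCL example above.
FIELD_MAP = {
    "req.url": "obj.http.x-url",
    "req.http.host": "obj.http.x-host",
}


def to_lurker_friendly(expression):
    """Rewrite req.url and req.http.host conditions to their obj.http.* form."""
    rewritten = []
    for condition in expression.split("&&"):
        tokens = condition.split()
        if len(tokens) != 3:
            raise ValueError(f"malformed condition: {condition.strip()!r}")
        field, operator, argument = tokens
        key = field.lower()
        if key.startswith("req."):
            if key not in FIELD_MAP:
                raise ValueError(f"{field} is not stored on the object")
            field = FIELD_MAP[key]
        rewritten.append(f"{field} {operator} {argument}")
    return " && ".join(rewritten)


print(to_lurker_friendly("req.url ~ ^/products/ && req.http.host == example.com"))
# obj.http.x-url ~ ^/products/ && obj.http.x-host == example.com
```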

Tag-based invalidation

Most cache invalidation implementations focus solely on the URL as a way to identify and invalidate objects. This only works if the content in your application can easily be mapped to one or more URLs.

But for more advanced content that appears all over the place, it is nearly impossible to map this to the right URLs.

An alternative approach is to tag content, and ban objects based on these tags.

Consider the following HTTP response:

HTTP/1.1 200 OK
Cache-Control: public, s-maxage=60
Tags: tag1 tag2 tag3
...

This response will be stored in cache for a minute, but if you want to get the corresponding object out of cache earlier, you can issue a ban that matches a tag from the Tags header.

The following ban expression would remove every object from cache that has tag1 in its Tags header:

obj.http.tags ~ tag1

You can also include this in your VCL code:

vcl 4.1;

acl purge {
	"localhost";
	"192.168.55.0"/24;
}

sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-ban-tag) {
			return(synth(400,"x-ban-tag missing"));
		}
		ban("obj.http.tags ~ " + req.http.x-ban-tag);
		return (synth(200,"Ban added"));
	}
}

Triggering the ban via HTTP could result in the following HTTP request:

PURGE / HTTP/1.1
Host: example.com
X-Ban-Tag: tag1
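
One thing to keep in mind is that obj.http.tags ~ tag1 is an unanchored regular expression, so it also matches an object that carries tag10 or tag11. The Python sketch below shows one possible way to build a stricter pattern that only matches whole, space-separated tags. The tag_ban_expression helper is hypothetical, Python's re module stands in for the regular expression engine in Varnish, and how backslashes survive on their way into Varnish depends on whether you use the CLI or VCL, which this sketch does not cover:

```python
import re


def tag_ban_expression(tag, header="obj.http.tags"):
    """Build a ban expression that matches a whole tag in a space-separated list."""
    if not tag or any(ch.isspace() for ch in tag):
        raise ValueError("tag must be a non-empty string without whitespace")
    return f"{header} ~ (^|\\s){re.escape(tag)}(\\s|$)"


expression = tag_ban_expression("tag1")
print(expression)  # obj.http.tags ~ (^|\s)tag1(\s|$)

pattern = expression.split(" ~ ", 1)[1]
print(bool(re.search(pattern, "tag1 tag2 tag3")))  # True
print(bool(re.search(pattern, "tag10 tag2")))      # False
print(bool(re.search("tag1", "tag10 tag2")))       # True: the loose pattern overmatches
```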

Integrating bans in your application

Just like purges, you can call bans using command line HTTP clients:

# HTTPie
$ http PURGE "www.example.com" "X-Invalidate-Pattern:^/contact"
# curl
$ curl -X PURGE -H "X-Invalidate-Pattern: ^/contact" "www.example.com"

But as we’ve seen earlier, there are other command line tools in place to trigger bans:

$ varnishadm ban "obj.http.Content-Type ~ ^image/"

And for WordPress, Drupal, Magento, and many others, there are community-maintained plugins available that perform bans in Varnish.

But not all of these frameworks implement their cache purging logic using an HTTP endpoint. Magento, for example, connects to the Varnish command line over a TCP socket.

We’ll talk about the Varnish CLI socket protocol in chapter 7.

Ban limitations

If you factor in the scope of bans, and make sure request-based bans are rewritten to be lurker-friendly, it does seem like a great solution. For most people it is.

However, banning is architected in such a way that it can become a very CPU-intensive process.

Because every object (n) has to be matched against all remaining ban expressions (m), the complexity is n * m. This means that if you have a lot of objects and a lot of bans, a lot of computations need to happen.
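
To get a feel for the numbers, here is a back-of-the-envelope calculation in Python. The object count, the number of bans, and the cost of a single test are hypothetical values chosen purely for illustration. They are not measurements:

```python
def worst_case_evaluations(object_count, ban_count):
    """Upper bound on ban tests: every object against every ban (n * m)."""
    return object_count * ban_count


def estimated_seconds(object_count, ban_count, seconds_per_test):
    """Rough CPU time for the worst case, given an assumed cost per test."""
    return worst_case_evaluations(object_count, ban_count) * seconds_per_test


# Hypothetical: one million objects, 500 bans, one microsecond per test.
print(worst_case_evaluations(1_000_000, 500))   # 500000000
print(estimated_seconds(1_000_000, 500, 1e-6))  # about 500 seconds of CPU time
```

In practice, objects only have to be tested against the bans that are newer than the last one they have seen, so the real figure is usually lower than this upper bound.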

For asynchronous bans, the burden is on the ban lurker thread. But for synchronous bans, the worker thread is responsible for processing request-related items on the ban list. In that case, the computationally heavy logic might slow down the request.

Potential performance issues related to bans also depend on the quality of the regular expression that is used: the more complicated the regex, the longer it takes to process.

Header matching will also have an impact because the ban lurker or the worker thread needs to cycle through all headers until the matching header is found.

This adds extra complexity, and the more headers a request or response has, the more work needs to happen.

Long story short: the unparalleled flexibility of bans comes at a cost. Perhaps like all things in life.