Banning is a concept in Varnish that allows expression-based cache invalidation. This means that you can invalidate multiple objects from the cache without the need for individual purge calls.
A ban is created by adding a ban expression to the ban list. All objects in the cache will be evaluated against the expressions in the ban list before being served. If the object is banned, Varnish will mark it as expired and fetch new content from the backend.
A ban expression consists of fields, operators, and arguments.
Expressions can be chained using the && operator. Only logical AND
operations can be performed. Logical OR operations are done by
evaluating multiple ban expressions.
This is the format of ban expressions:
<field> <operator> <arg> [&& <field> <oper> <arg> ...]
The following fields are supported:
req.url: the request URL
req.http.*: any request header
obj.status: the cache object status
obj.http.*: the response headers stored in the cached object
These fields look familiar, and they represent some of the objects and variables in VCL.
The operator can be:
==: the field equals an arg
!=: the field is not equal to an arg
~: the field matches a regular expression defined by the arg
!~: the field doesn’t match a regular expression defined by the arg
The argument of a ban expression is either a literal string or a
regular expression pattern. Strings are not delimited by double quotes
" or the long string format {"…"}.
Let’s start with a very basic example that is the ban equivalent of a regular purge:
req.url == / && req.http.host == example.com
So the request’s URL equals /, and the request’s Host header
equals example.com.
In another example we’ll invalidate all objects from the cache that have an HTTP 404 status:
obj.status == 404
We can also create an expression that uses response headers that are stored in the object. Let’s say we want to invalidate all images at once. We’d use the following expression:
obj.http.Content-Type ~ ^image/
This expression looks at the Content-Type response header and
invalidates all items that match the ^image/ regular expression.
For the last example, we’ll match on a URL pattern instead of an individual URL:
req.url ~ ^/products(/.+|$) && req.http.host == example.com
This pattern will match all objects where the URL starts with
/products/... or equals /products.
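To make the expression format more concrete, here is a minimal Python sketch that parses a ban expression and evaluates it against a single object. It is only a model of the semantics described above, not how varnishd implements bans. The names parse_ban, matches and the dictionary layout of the sample object are assumptions made for illustration, and Python's re module stands in for the regular expression engine that Varnish uses.

```python
import re
from typing import NamedTuple


class Condition(NamedTuple):
    field: str
    operator: str
    arg: str


OPERATORS = {"==", "!=", "~", "!~"}


def parse_ban(expression):
    """Split '<field> <operator> <arg> [&& ...]' into a list of Conditions."""
    tokens = expression.split()
    conditions = []
    while tokens:
        if len(tokens) < 3 or tokens[1] not in OPERATORS:
            raise ValueError("expected '<field> <operator> <arg>'")
        conditions.append(Condition(*tokens[:3]))
        tokens = tokens[3:]
        if tokens:
            # Anything after a condition must be '&&' plus another condition.
            if tokens[0] != "&&" or len(tokens) == 1:
                raise ValueError("expected '&&' between conditions")
            tokens = tokens[1:]
    if not conditions:
        raise ValueError("empty ban expression")
    return conditions


def matches(expression, obj):
    """Return True when every condition holds (logical AND only)."""
    for cond in parse_ban(expression):
        # Header names are case-insensitive, so keys are stored in lowercase.
        value = obj.get(cond.field.lower(), "")
        if cond.operator == "==":
            ok = value == cond.arg
        elif cond.operator == "!=":
            ok = value != cond.arg
        elif cond.operator == "~":
            ok = re.search(cond.arg, value) is not None
        else:  # "!~"
            ok = re.search(cond.arg, value) is None
        if not ok:
            return False
    return True


# Hypothetical cached object, flattened into the fields a ban can refer to.
sample = {
    "req.url": "/products/42",
    "req.http.host": "example.com",
    "obj.status": "404",
    "obj.http.content-type": "image/png",
}
print(matches("req.url ~ ^/products(/.+|$) && req.http.host == example.com", sample))
```

The sample object matches the URL pattern, the status and the Content-Type examples, but not the first example, because its URL is not exactly /.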
Now that you know what a ban is and what ban expressions look like, it’s time to explain how to execute such a ban.
The quickest way to do this is by using the varnishadm program. This
program connects to the command line interface (CLI) of varnishd.
You can choose to call the varnishadm program without any arguments,
where you can enter individual commands. This is what happens in the
example below:
$ varnishadm
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Type 'help' for command list.
Type 'quit' to close CLI session.
varnish> ban obj.status == 404
200
The ban obj.status == 404 command will issue a ban that aims to
invalidate all objects with an HTTP 404 status code.
Another way you can ban using varnishadm is by adding the ban
expression as an argument. Here’s an example of this:
varnishadm ban obj.status == 404
Sometimes certain characters in your ban expression might interfere with how your Linux shell interprets commands:
$ varnishadm ban obj.http.Content-Type ~ ^image/
expected conditional (~, !~, == or !=) got "/root"
Command failed with error code 106
In that case, you’re better off using quotes to avoid errors:
varnishadm ban "obj.http.Content-Type ~ ^image/"
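If you trigger varnishadm from a script rather than an interactive shell, the same quoting concern applies. The following Python sketch is one possible approach: it keeps the whole ban expression as a single argument and uses shlex.join from the standard library to show the shell-safe form. The helper name varnishadm_ban_argv is an assumption made for this example.

```python
import shlex


def varnishadm_ban_argv(expression):
    """Build the argument vector; the expression stays one single argument."""
    return ["varnishadm", "ban", expression]


argv = varnishadm_ban_argv("obj.http.Content-Type ~ ^image/")

# Handing argv to subprocess.run(argv, check=True) would bypass the shell,
# so characters such as "~" are never expanded. That call is left out here
# because it needs a running varnishd.
print(shlex.join(argv))
```

The printed command wraps the expression in single quotes, which has the same effect as the double quotes used above.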
Although banning can be done using varnishadm and doesn’t require a
VCL implementation, it would be nice to use a BAN request method to
invalidate objects via bans, much like our purging example.
We could have exactly the same behavior, but we strip out the purge
logic from under the hood, and replace it with ban logic. VCL
provides a ban() function that takes the ban expression as the
argument.
The following example is an exact copy of the purge VCL example, but
instead of performing a return(purge), we return a synthetic response
and use the ban() function to execute the ban:
vcl 4.1;
acl banners {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		ban("req.url == " + req.url 
			+ " && req.http.host == " + req.http.host);
		return(synth(200,"Ban added"));
	}
}
We have now created a purge replacement, but didn’t gain any flexibility.
A more flexible approach would be to invalidate URL patterns rather than individual URLs. We could send a custom request header that contains this pattern.
The following example will enforce the custom x-ban-pattern request
header to be set:
vcl 4.1;
acl banners {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		if(!req.http.x-ban-pattern) {
			return(synth(400,"Missing x-ban-pattern header"));
		}
		ban("req.url ~ " + req.http.x-ban-pattern 
			+ " && req.http.host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}
This ban would be triggered using the following HTTP request:
BAN / HTTP/1.1
Host: example.com
X-Ban-Pattern: ^/products/[0-9]+
The ban we just issued using HTTP will result in the following ban expression:
req.url ~ ^/products/[0-9]+ && req.http.host == example.com
Long story short: we are banning all objects where the URL starts with
/products/, followed by a numeric value. It is an open-ended regular
expression, so URLs containing even more data after the numeric
value will also match.
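As an illustration of the client side, the Python sketch below sends such a BAN request with the standard http.client module and mirrors the string concatenation that the VCL performs. The function names, the address of the Varnish server and the whitespace check are assumptions made for this example. The check is a precaution: because the expression is assembled by plain concatenation, a pattern containing spaces would change how it splits into field, operator and argument.

```python
import http.client


def build_ban_headers(host, pattern):
    """Headers for the BAN request; reject patterns that contain whitespace."""
    if any(ch.isspace() for ch in pattern):
        raise ValueError("pattern must not contain whitespace")
    return {"Host": host, "X-Ban-Pattern": pattern}


def expected_expression(host, pattern):
    """Mirror the concatenation done by ban() in the VCL above."""
    return "req.url ~ " + pattern + " && req.http.host == " + host


def send_ban(varnish_host, varnish_port, host, pattern):
    """Send the BAN request and return the status code and reason phrase."""
    conn = http.client.HTTPConnection(varnish_host, varnish_port, timeout=5)
    try:
        conn.request("BAN", "/", headers=build_ban_headers(host, pattern))
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()


print(expected_expression("example.com", "^/products/[0-9]+"))
```

Calling send_ban("127.0.0.1", 80, "example.com", "^/products/[0-9]+") would issue the request against a local Varnish instance, assuming one listens on that address and the client IP is allowed by the ACL.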
We can even take it up a notch, and allow even more flexibility, to the extent that the user is responsible for formulating the complete ban expression.
Here’s such an example:
vcl 4.1;
acl banners {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "BAN") {
		if (!client.ip ~ banners) {
			return(synth(405));
		}
		if(!req.http.x-ban-expression) {
			return(synth(400,"Missing x-ban-expression header"));
		}
		ban(req.http.x-ban-expression);
		return (synth(200,"Ban added"));
	}
}
This VCL code would then be invoked using the following HTTP request:
BAN / HTTP/1.1
Host: example.com
X-Ban-Expression: obj.status == 404
The advantage here is that you’re not restricted to the request context, and you can also match other fields.
The downside is that you’re exposing the complexity of ban expressions to the end user. Instead, the concept of URL patterns would seem more intuitive for users.
Sometimes you don’t want to choose and just want to have it all.
This can be done with a single implementation. Whereas we returned an
HTTP 400 status when the x-ban-pattern header was not set, we can now
fall back on purging when the x-invalidate-pattern header is missing.
Here’s the code:
vcl 4.1;
acl purge {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-invalidate-pattern) {
			return(purge);
		}
		ban("req.url ~ " + req.http.x-invalidate-pattern 
			+ " && req.http.host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}
So if you just want to purge the /products page, you can issue the
following HTTP request:
PURGE /products HTTP/1.1
Host: example.com
But if you want to invalidate all subordinate resources of /products/,
you can add the x-invalidate-pattern header and specify the URL
pattern:
PURGE / HTTP/1.1
Host: example.com
X-Invalidate-Pattern: ^/products/
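The decision logic of this combined approach can be summarized in a few lines of Python. This is only a model of the control flow in the VCL above, meant to make the fallback explicit; the function name decide and its return values are made up for this sketch.

```python
def decide(method, client_allowed, host, pattern=None):
    """Return the action the VCL above would take, plus the ban expression."""
    if method != "PURGE":
        return ("continue", None)   # not an invalidation request
    if not client_allowed:
        return ("synth 405", None)  # client IP is not in the ACL
    if not pattern:
        return ("purge", None)      # no pattern header: purge a single URL
    expression = "req.url ~ " + pattern + " && req.http.host == " + host
    return ("ban", expression)


print(decide("PURGE", True, "example.com"))
print(decide("PURGE", True, "example.com", "^/products/"))
```

The first call models the plain purge of a single URL; the second one models the pattern-based ban.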
Unlike purges, bans will not immediately remove objects from the cache. The synthetic message from the VCL examples already gave it away: bans are added.
When you execute a ban, the ban expression is added to the ban list. This is a list containing all the bans that need to be evaluated, and matched against all the objects in cache.
The easiest way to see the contents of the ban list is by running
varnishadm ban.list.
Here’s the output of the ban list when the varnishd process was just
started:
$ varnishadm ban.list
Present bans:
1603270370.244746     0 C
Although no bans were issued, and no objects are stored in the cache, there is already an item on the list. Let’s break it down:
1603270370.244746 is the time at which the ban was added. It is in
Unix timestamp format and has microsecond precision.
0 is the refcount. There are currently 0 objects that refer to
this ban.
C stands for Completed. This means the ban is fully evaluated.
The reason there is already a ban on the list is because every object in cache needs to refer to the last ban it has seen when entering the cache. This allows bans that are older than the object to be disregarded.
So as soon as the first object is stored in cache, the refcount
increases to 1:
$ varnishadm ban.list
Present bans:
1603270370.244746     1 C
The refcount will increase as objects get inserted.
The ban list will change as soon as the first ban is added.
The following example looks a bit weird:
varnishadm ban obj.status != 0
We’re banning all objects that do not have a 0 status. That’s
literally every object in the cache.
When we consult the ban list, we see it has been added:
$ varnishadm ban.list
Present bans:
1603272627.622051     0 -  obj.status != 0
1603270370.244746     3 C
Initially all three objects still refer to the initial ban as the one they have seen last. But with the addition of the new ban, that will change.
After a short while, the ban list will look like this:
$ varnishadm ban.list
Present bans:
1603272627.622051     0 C
1603270370.244746     0 C
The newly added ban is completed, and no objects refer to it because we just removed all objects from the cache. The initial ban is also still around.
As soon as a new object enters the cache, it refers to the last one it has seen:
$ varnishadm ban.list
Present bans:
1603272627.622051     1 C
If you look at the timestamp, it is 1603272627.622051, which matches
the ban we just executed.
Let’s have a look at a ban list that already has some ban expressions on it:
$ varnishadm ban.list
Present bans:
1603273224.960953     2 -  req.url ~ ^/[a-z]$
1603273216.857785     0 -  req.url ~ ^/[a-z]+/[0-9]+
1603272627.622051     9 C
Nine objects saw 1603272627.622051 as their last ban. This means that
up to two ban expressions should be evaluated for those objects.
For two objects, 1603273224.960953 was the last one they saw. These
objects aren’t subject to any invalidation. These were objects that were
inserted into the cache after the two recent bans were added.
There are zero objects that saw 1603273216.857785 as their last ban.
This makes sense: the two most recent bans were added only about eight
seconds apart, and during those eight seconds no new objects were added
to the cache.
As time progresses, you’ll see that the req.url ~ ^/[a-z]+/[0-9]+
evaluation has completed, and that those nine objects have been
processed:
$ varnishadm ban.list
Present bans:
1603273224.960953     2 -  req.url ~ ^/[a-z]$
This means that nine objects were invalidated since they are no longer referenced.
Any future bans that are executed will apply to the two remaining objects, as long as they have not expired.
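The bookkeeping behind the refcount column can be modeled in a short Python sketch. It is a simplified illustration of the mechanism described above, not Varnish's actual data structure: every object remembers the newest ban it has seen, and only bans that are newer than that one are evaluated. The class and method names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ban:
    expression: str
    predicate: Optional[Callable[[dict], bool]]
    refcount: int = 0


@dataclass
class CachedObject:
    attrs: dict
    last_ban: Ban


class BanList:
    def __init__(self):
        # Newest ban first. The list starts with an empty placeholder ban,
        # like the entry that is present right after startup.
        self.bans = [Ban("", None)]

    def add(self, expression, predicate):
        self.bans.insert(0, Ban(expression, predicate))

    def insert_object(self, attrs):
        # A new object refers to the newest ban at insertion time.
        newest = self.bans[0]
        newest.refcount += 1
        return CachedObject(attrs, newest)

    def is_banned(self, obj):
        # Evaluate only the bans that are newer than the last one seen.
        for ban in self.bans:
            if ban is obj.last_ban:
                break
            if ban.predicate is not None and ban.predicate(obj.attrs):
                obj.last_ban.refcount -= 1  # the object leaves the cache
                return True
        # The object survived: it now refers to the newest ban.
        obj.last_ban.refcount -= 1
        obj.last_ban = self.bans[0]
        obj.last_ban.refcount += 1
        return False


ban_list = BanList()
objects = [ban_list.insert_object({"status": 200}) for _ in range(3)]
ban_list.add("obj.status != 0", lambda attrs: attrs["status"] != 0)
print([ban.refcount for ban in ban_list.bans])  # newest ban first
```

This mirrors the scenario shown earlier: three objects refer to the initial ban, and the freshly added ban has no references until objects are either invalidated or re-checked against it.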
We have to be honest: there is one piece of crucial information we held back from you.
Throughout this section about banning, we talked about ban expressions, the ban list, and about objects being matched. But we never mentioned what mechanism is responsible for that.
There is a thread, which was mentioned in the Under the hood section of chapter 1, that is called the ban lurker.
This thread will inspect the ban list, and match the ban expression to the right objects.
The ban lurker thread has some runtime parameters that control its behavior:
ban_lurker_age is the minimum age a ban should have before it is
processed by the ban lurker. The default value is 60 seconds.
ban_lurker_sleep is the number of seconds the ban lurker sleeps
before processing another batch. The default value is 0.010 seconds.
ban_lurker_batch is the number of objects the ban lurker examines
before going back to sleep. The default value is 1000.
ban_lurker_holdoff sets the number of seconds the ban lurker holds
off when lock contention occurs during a cache lookup. The default
value is 0.010 seconds.
ban_cutoff limits how far down the ban list the ban lurker evaluates
expressions: once it reaches the ban_cutoff-th ban, it treats all
objects as if they matched a ban and removes them from cache. The
default value is 0, which disables the cutoff.
ban_dups eliminates older identical bans when a new ban is added. The
default value is on.
With the default values, the ban lurker leaves a ban alone until it is at least one minute old. It then examines objects in batches of 1000, sleeping 0.010 seconds between batches. For each object, it looks up the object’s position on the ban list and applies the more recent bans until a ban expression matches.
When a match is found that object is put on the expiry list and is removed from the cache shortly thereafter.
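To get a feeling for what the batch size and sleep interval mean in practice, here is a small back-of-the-envelope calculation in Python. It uses the default values quoted above (a batch of 1000 and a sleep of 0.010 seconds) and only accounts for the sleep time between batches; the time spent evaluating ban expressions is ignored, so the result is a lower bound and not a measurement. The function name is made up for this sketch.

```python
import math


def minimum_sleep_time(num_objects, batch=1000, sleep=0.010):
    """Seconds spent sleeping while walking num_objects in batches."""
    batches = math.ceil(num_objects / batch)
    return batches * sleep


# One million cached objects at the default pacing.
print(minimum_sleep_time(1_000_000))
```

Under these assumptions, walking a million objects involves 1000 batches, which adds up to roughly ten seconds of sleep time alone.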
Because the ban lurker is a separate thread that has no knowledge of any incoming HTTP request, its scope is limited to the object context.
Any ban expression that refers to an obj.http.* or an obj.status
field can be processed by the ban lurker. Basically only the response
information that is part of the object is available to the ban
lurker.
This raises the question: how do expressions that contain req.url or
req.http.* get processed? It’s obvious that these bans are not the
responsibility of the ban lurker.
When the request context is used in a ban expression, the worker thread that handles the incoming request is responsible for this.
This means that such bans aren’t processed asynchronously and that space is only freed from the cache when a request comes in that matches one of these ban expressions.
We’ll talk about the performance impact of synchronous bans in one of the next sections.
Now that we know the scope of the ban lurker, and the fact that the worker thread is responsible for handling bans within the request scope, it seems as though request-based ban expressions cannot be used in an efficient way.
To that we say: "Use your imagination, and be creative."
If this were a Tweet or a Facebook post, we would have added an emoji.
If the object has no information about the request, add this information in VCL:
sub vcl_backend_response {
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}
These two custom headers basically store the request context as custom response headers.
And then you can trust that the ban lurker will be able to evaluate the following expression:
obj.http.x-url == / && obj.http.x-host == example.com
This is what we call lurker-friendly bans. As this is quite the mouthful, we can also just call them asynchronous bans.
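A quick way to check whether an expression is lurker friendly is to look at the fields it uses. The Python sketch below is a rough check based on the rule described above: an expression qualifies when every field lives in the object context. The function name is an assumption, and the parsing is deliberately naive.

```python
def is_lurker_friendly(expression):
    """True when every condition refers to an obj.* field."""
    # Take the first token of each '&&'-separated condition as its field.
    fields = [part.split()[0] for part in expression.split("&&") if part.split()]
    return bool(fields) and all(field.startswith("obj.") for field in fields)


print(is_lurker_friendly("obj.http.x-url == / && obj.http.x-host == example.com"))
print(is_lurker_friendly("req.url == / && req.http.host == example.com"))
```

A single req.* field is enough to disqualify an expression, because all conditions in it are combined with a logical AND.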
Let’s take our best-of-both-worlds example, and apply asynchronous banning:
vcl 4.1;
acl purge {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-invalidate-pattern) {
			return(purge);
		}
		ban("obj.http.x-url ~ " + req.http.x-invalidate-pattern
			+ " && obj.http.x-host == " + req.http.host);
		return (synth(200,"Ban added"));
	}
}
sub vcl_backend_response {
	set beresp.http.x-url = bereq.url;
	set beresp.http.x-host = bereq.http.host;
}
sub vcl_deliver {
	unset resp.http.x-url;
	unset resp.http.x-host;
}
To make this work, we had to add the x-url and x-host response headers
in vcl_backend_response, but we also have to clean them up upon
delivery. And the fields we’re matching in the ban() function now
reflect these two custom response headers.
And then you can send the following HTTP request to Varnish:
PURGE / HTTP/1.1
Host: example.com
X-Invalidate-Pattern: ^/products/
Once the ban_lurker_age threshold has passed, the ban lurker will
come in and expire the matching objects. Unlike synchronous bans within
the request scope, the worker thread won’t have to do the heavy lifting;
the ban lurker will.
Most cache invalidation implementations focus solely on the URL as a way to identify and invalidate objects. This only works if the content in your application can easily be mapped to one or more URLs.
But for more advanced content that appears all over the place, it is nearly impossible to map this to the right URLs.
An alternative approach is to tag content, and ban objects based on these tags.
Consider the following HTTP response:
HTTP/1.1 200 OK
Cache-Control: public, s-maxage=60
Tags: tag1 tag2 tag3
...
This response will be stored in cache for a minute, but if you want to
get the corresponding object out of cache earlier, you can issue a ban
that matches a tag from the Tags header.
The following ban expression would remove every object from cache that
has tag1 in its Tags header:
obj.http.tags ~ tag1
You can also include this in your VCL code:
vcl 4.1;
acl purge {
	"localhost";
	"192.168.55.0"/24;
}
sub vcl_recv {
	if (req.method == "PURGE") {
		if (!client.ip ~ purge) {
			return(synth(405));
		}
		if(!req.http.x-ban-tag) {
			return(synth(400,"x-ban-tag missing"));
		}
		ban("obj.http.tags ~ " + req.http.x-ban-tag);
		return (synth(200,"Ban added"));
	}
}
The ban could then be triggered with the following HTTP request:
PURGE / HTTP/1.1
Host: example.com
X-Ban-Tag: tag1
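One thing to keep in mind is that the tag is used as a regular expression, so tag1 also matches a header that only contains tag10. The Python sketch below shows the effect and one possible way to anchor the pattern on whitespace or on the start and end of the header value. The helper name tag_pattern is an assumption, and Python's re module is used as a stand-in for the regular expression engine in Varnish, which should behave the same way for a pattern this simple.

```python
import re


def tag_pattern(tag):
    """Match the tag only as a whole, space-separated word."""
    return r"(^|\s)" + re.escape(tag) + r"(\s|$)"


header = "tag10 tag2"

# The bare tag also matches inside "tag10".
print(re.search("tag1", header) is not None)

# The anchored pattern does not.
print(re.search(tag_pattern("tag1"), header) is not None)
```

The resulting pattern contains no literal spaces, so it can be passed as the value of the x-ban-tag header without changing how the ban expression is split into field, operator and argument.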
Just like purges, you can call bans using command line HTTP clients:
# HTTPie
http PURGE "www.example.com" "X-Invalidate-Pattern:^/contact"
# curl
curl -X PURGE -H "X-Invalidate-Pattern:^/contact" "www.example.com"
But as we’ve seen earlier, there are other command line tools in place to trigger bans:
varnishadm ban "obj.http.Content-Type ~ ^image/"
And for WordPress, Drupal, Magento, and many others, there are community-maintained plugins available that perform bans in Varnish.
But not all of these frameworks implement their cache purging logic using an HTTP endpoint. Magento, for example, connects to the Varnish command line over a TCP socket.
We’ll talk about the Varnish CLI socket protocol in chapter 7.
If you factor in the scope of bans, and enforce request-based bans to be lurker friendly, it does seem like a great solution. For most people it is.
However, banning is architected in such a way that it can become a very CPU-intensive process.
Because every object (n) has to be matched against all remaining ban
expressions (m), the complexity is n * m. This means that if you
have a lot of objects and a lot of bans, a lot of computations need to
happen.
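The n * m relationship can be illustrated with a small counting experiment in Python. This is a toy model in which bans are plain predicate functions and evaluation for an object stops at the first matching ban; the function name is made up for this sketch.

```python
def count_evaluations(num_objects, ban_predicates):
    """Count how many ban evaluations are needed for all objects."""
    evaluations = 0
    for obj_id in range(num_objects):
        for predicate in ban_predicates:
            evaluations += 1
            if predicate(obj_id):
                break  # the object is banned, no further bans are checked
    return evaluations


never_matches = lambda obj_id: False
always_matches = lambda obj_id: True

# Worst case: no ban ever matches, so every object meets every ban.
print(count_evaluations(1000, [never_matches] * 50))

# Best case: the first ban matches every object.
print(count_evaluations(1000, [always_matches] + [never_matches] * 49))
```

With 1000 objects and 50 bans, the worst case costs 50,000 evaluations, whereas a ban that matches right away reduces this to one evaluation per object.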
For asynchronous bans, the burden is on the ban lurker thread. But for synchronous bans, the worker thread is responsible for processing request-related items on the ban list. In that case, the computationally heavy logic might slow down the request.
Potential performance issues related to bans also depend on the quality of the regular expression that is used: the more complicated the regex, the longer it takes to process.
Header matching will also have an impact because the ban lurker or the worker thread needs to cycle through all headers until the matching header is found.
This adds extra complexity, and the more headers a request or response has, the more work that needs to happen.
Long story short: the unparalleled flexibility of bans comes at a cost. Perhaps like all things in life.