Which VMODs are shipped with Varnish Cache?

VMODs can be built and installed separately, but Varnish also ships with a number of VMODs out of the box.

Varnish Cache has a set of in-tree VMODs that are part of the source code. This means that these VMODs are included in the standard installation.

It’s quite easy to spot them. When you go to https://github.com/varnishcache/varnish-cache, you’ll see them in the vmod folder:

VMOD name	Description
`vmod_blob`	Utilities for encoding and decoding BLOB data in VCL
`vmod_cookie`	Inspect, modify, and delete client-side cookies
`vmod_directors`	Directors group multiple backends as one, and load balance backend requests across these backends using a variety of load-balancing algorithms
`vmod_proxy`	Retrieve TLS information from connections made using the PROXY protocol
`vmod_purge`	Perform hard and soft purges
`vmod_std`	A library of basic utility functions to perform conversions, interact with files, perform custom logging, etc.
`vmod_unix`	Get the user, the group, the gid, and the uid from connections made over Unix domain sockets (UDS)
`vmod_vtc`	A utility module for `varnishtest`

This list comes from the master branch of this Git repository. It represents the current state of the open source project. As mentioned, Varnish Software maintains a 6.0 LTS version of Varnish Cache. In this version all VMODs from the open source project are included, except vmod_cookie, which has been replaced by vmod_cookieplus.

vmod_blob

BLOB is short for Binary Large Object. It’s a data type that is used for the hash keys and for the response body.

Here’s an example where we transfer req.hash, a BLOB that represents the hash key of the request, into a string value:

vcl 4.1;

import blob;

sub vcl_deliver {
	set resp.http.x-hash = blob.encode(encoding=BASE64,blob=req.hash);
}

The blob.encode() function is used for the conversion. The name of the function indicates that an encoding format is required. We use base64, which is a common encoding format suitable for use in HTTP header fields.

blob.encode() has three arguments, but we only set two. The second argument has been omitted. It is the case argument that defaults to DEFAULT. But because we’re using named arguments, it is perfectly fine to omit arguments.
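As a small sketch of the difference, the same call can also be written with positional arguments, in which case the case argument has to be spelled out. This assumes the argument order described above (encoding, case, blob); the x-hash-positional header name is only used for illustration:

```vcl
vcl 4.1;

import blob;

sub vcl_deliver {
	# Named arguments: case is omitted and falls back to DEFAULT.
	set resp.http.x-hash = blob.encode(encoding=BASE64, blob=req.hash);

	# Positional arguments: every argument before blob must be given.
	# x-hash-positional is a hypothetical header for this sketch.
	set resp.http.x-hash-positional = blob.encode(BASE64, DEFAULT, req.hash);
}
```

Both headers should carry the same value, since DEFAULT is what the named form falls back to.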

When we request http://localhost/, the corresponding x-hash header is the following:

x-hash: 3k0f0yRKtKt7akzkyNsTGSDOJAZOQowTwKWhu5+kIu0=

vmod_blob has plenty of other functions and methods. In the example above we used blob.encode(); there’s also a blob.decode(), which converts a string into a blob. All these functions and methods can be found at https://varnish-cache.org/docs/6.0/reference/vmod_generated.html#vmod-blob.
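To illustrate how blob.decode() and blob.encode() fit together, here is a minimal sketch that decodes a base64 string into a BLOB and re-encodes it as lowercase hex. The X-Token and X-Token-Hex header names are assumptions made for this example and don't appear elsewhere in this chapter:

```vcl
vcl 4.1;

import blob;

sub vcl_recv {
	# X-Token is a hypothetical request header carrying base64 data.
	if (req.http.X-Token) {
		# Decode the base64 string into a BLOB, then encode
		# that BLOB again as a lowercase hexadecimal string.
		set req.http.X-Token-Hex = blob.encode(
			encoding=HEX,
			case=LOWER,
			blob=blob.decode(decoding=BASE64, encoded=req.http.X-Token)
		);
	}
}
```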

vmod_cookie

vmod_cookie is only available as of Varnish Cache 6.4, and it facilitates interaction with the Cookie header. You’ve already seen a couple of examples where this VMOD was used to remove and to get cookies.

Imagine the following Cookie header:

Cookie: language=en; accept_cookie_policy=true; _ga=GA1.2.1915485056.1587105100; _gid=GA1.2.71561942.1601365566; _gat=1

Here’s what we want to do:

If accept_cookie_policy is not set, redirect to the homepage
If /fr is called, set the language cookie to fr
Remove the tracking cookies

Here’s the VCL code to achieve this:

vcl 4.1;

import cookie;

sub vcl_recv {
	cookie.parse(req.http.Cookie);
	if(!cookie.isset("accept_cookie_policy")) {
		return(synth(301,"/"));
	}
	if (req.url ~ "^/fr/?" && cookie.get("language") != "fr") {
		cookie.set("language","fr");
	}
	cookie.filter_re("^_g[a-z]{1,2}$");
	set req.http.Cookie = cookie.get_string();
}

sub vcl_synth {
	if (resp.status == 301) {
		set resp.http.location = resp.reason;
		set resp.reason = "Moved";
		return (deliver);
	}
}

The return(synth(301,"/")) in conjunction with the vcl_synth logic allows you to create a custom HTTP 301 redirect.
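The tracking cookies in the example are removed with a deny-list regular expression. An allow-list is a possible alternative: the following sketch uses cookie.keep() to retain only the cookies the application is assumed to need, which in this scenario are language and accept_cookie_policy:

```vcl
vcl 4.1;

import cookie;

sub vcl_recv {
	cookie.parse(req.http.Cookie);
	# Keep only the listed cookies. Everything else, including
	# the _ga, _gid and _gat tracking cookies, is dropped.
	cookie.keep("language,accept_cookie_policy");
	set req.http.Cookie = cookie.get_string();
}
```

The allow-list variant has the advantage that newly introduced tracking cookies are removed without having to update the regular expression.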

The rest of the API and more vmod_cookie examples can be found here: http://varnish-cache.org/docs/trunk/reference/vmod_cookie.html

vmod_directors

vmod_directors is a load-balancing VMOD. It groups multiple backends and uses a distribution algorithm to balance requests to the backends it contains.

We’ll briefly cover two load balancing examples using this VMOD, but in chapter 7 there will be a dedicated section about load balancing.

You already know the next example because we covered it in the VMOD initialization section:

vcl 4.1;

import directors;

backend backend1 {
	.host = "backend1.example.com";
	.port = "80";
}

backend backend2 {
	.host = "backend2.example.com";
	.port = "80";
}

sub vcl_init {
	new vdir = directors.round_robin();
	vdir.add_backend(backend1);
	vdir.add_backend(backend2);
}

sub vcl_recv {
	set req.backend_hint = vdir.backend();
}

We initialize the director in vcl_init, where we choose the round-robin distribution algorithm to balance load across backend1 and backend2.

For the second example, we’re going to take the same VCL, but instead of a round-robin distribution, we’re going for a random distribution. The changes aren’t that big though:

vcl 4.1;

import directors;

backend backend1 {
	.host = "backend1.example.com";
	.port = "80";
}

backend backend2 {
	.host = "backend2.example.com";
	.port = "80";
}

sub vcl_init {
	new vdir = directors.random();
	vdir.add_backend(backend1,10);
	vdir.add_backend(backend2,20);
}

sub vcl_recv {
	set req.backend_hint = vdir.backend();
}

The example above uses a random distribution of the load, but not with equal weighting:

backend1 will receive about 33% of all the requests
backend2 will receive about 67% of all the requests

And that’s because of the weight arguments that were added to each backend of the director. The equation for random is as follows: 100 * (weight / (sum(all_added_weights))).
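As a worked sketch of that equation, consider a hypothetical setup with three backends and the weights 1, 1 and 2. The sum of all weights is 4, so the expected shares are 100 * (1 / 4) = 25%, 25%, and 100 * (2 / 4) = 50%. The backend3 definition is an assumption added for this example:

```vcl
vcl 4.1;

import directors;

backend backend1 {
	.host = "backend1.example.com";
	.port = "80";
}

backend backend2 {
	.host = "backend2.example.com";
	.port = "80";
}

# backend3 is hypothetical and only exists for this sketch.
backend backend3 {
	.host = "backend3.example.com";
	.port = "80";
}

sub vcl_init {
	new vdir = directors.random();
	vdir.add_backend(backend1, 1);	# 100 * (1 / 4) = 25%
	vdir.add_backend(backend2, 1);	# 100 * (1 / 4) = 25%
	vdir.add_backend(backend3, 2);	# 100 * (2 / 4) = 50%
}

sub vcl_recv {
	set req.backend_hint = vdir.backend();
}
```

Because the distribution is random, these percentages are expected averages over many requests rather than exact counts.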

The rest of the API, and more director examples can be found here: https://varnish-cache.org/docs/6.0/reference/vmod_generated.html#vmod-directors

vmod_proxy

vmod_proxy is used to extract client and TLS information from connections that are made to Varnish using the PROXY protocol.

Imagine the following Varnish runtime parameters:

$ varnishd -a:80 -a:8443,PROXY -f /etc/varnish/default.vcl

Here’s what this means:

Varnish accepts regular HTTP connections on port 80.
Varnish also accepts connections on port 8443, which are made using the PROXY protocol.
The VCL file is located at /etc/varnish/default.vcl.

Assuming the PROXY connection was initiated by a TLS proxy, we can use vmod_proxy to extract TLS information that is transported by the PROXY protocol.

Here’s a VCL example that extracts some of the information into custom response headers:

vcl 4.1;

import proxy;

sub vcl_deliver {
	set resp.http.alpn = proxy.alpn();
	set resp.http.authority = proxy.authority();
	set resp.http.ssl = proxy.is_ssl();
	set resp.http.ssl-version = proxy.ssl_version();
	set resp.http.ssl-cipher = proxy.ssl_cipher();
}

And this is some example output, containing the custom headers:

alpn: h2
authority: example.com
ssl: true
ssl-version: TLSv1.3
ssl-cipher: TLS_AES_256_GCM_SHA384

This is what we learn about the SSL/TLS connection through these values:

A successful TLS/SSL connection was made.
The communication protocol is HTTP/2.
SNI determined that example.com is the authoritative hostname.
The SSL/TLS version we’re using is TLSv1.3.
The SSL/TLS cipher is an Advanced Encryption Standard with a 256-bit key in Galois/Counter mode. The hashing was done using a 384-bit Secure Hash Algorithm.
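Beyond exposing these values as response headers, the same information can drive request handling. The following sketch, which is one possible use rather than a recommended setup, passes the protocol on to the backend. The choice of the X-Forwarded-Proto header is an assumption about what the backend expects:

```vcl
vcl 4.1;

import proxy;

sub vcl_recv {
	# Tell the backend whether the client connection to the
	# TLS proxy in front of Varnish was encrypted.
	if (proxy.is_ssl()) {
		set req.http.X-Forwarded-Proto = "https";
	} else {
		set req.http.X-Forwarded-Proto = "http";
	}
}
```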

vmod_std

vmod_std is the standard VMOD that holds a collection of utility functions that are commonly used in everyday scenarios. Although these could have been native VCL functions, they were put in a VMOD nevertheless.

vmod_std performs a variety of tasks, but its functions can be grouped as follows:

String manipulation
Type conversions
Logging
File access
Environment variables
Data extraction from complex types

Only a couple of examples for this VMOD have been added, but the full list of functions can be consulted here: https://varnish-cache.org/docs/6.0/reference/vmod_generated.html#varnish-standard-module.
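Type conversions are in the list above, but don't get a dedicated example in this section. Here is a minimal sketch that uses std.duration() and std.integer(), each with a fallback value for input that cannot be converted. The X-Cache-TTL and X-Retry-Count headers are hypothetical names chosen for this example:

```vcl
vcl 4.1;

import std;

sub vcl_recv {
	# X-Retry-Count is a hypothetical request header. If it is
	# missing or not numeric, the fallback value 0 is used.
	if (std.integer(req.http.X-Retry-Count, 0) > 3) {
		return (synth(429, "Too Many Requests"));
	}
}

sub vcl_backend_response {
	# X-Cache-TTL is a hypothetical backend header such as "1h"
	# or "30s". If it is missing or cannot be parsed, the TTL
	# falls back to 120 seconds.
	set beresp.ttl = std.duration(beresp.http.X-Cache-TTL, 120s);
}
```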

Logging

Let’s start with an example that focuses on logging, but that uses other functions as utilities:

vcl 4.1;

import std;

sub vcl_recv {
	if (std.port(server.ip) == 443) {
		std.log("Client connected over TLS/SSL: " + server.ip);     
		std.syslog(6,"Client connected over TLS/SSL: " + server.ip);
		std.timestamp("After std.syslog");
	}
}

std.log() will add an item to the Varnish Shared Memory Log (VSL) and tag it with a VCL_Log tag.
std.timestamp() will also add an item to the VSL, but will use a Timestamp tag, and will drop in a timestamp for measurement purposes.
std.syslog() will add a log item to the syslog.

Here’s the VSL output that was captured using varnishlog. You can clearly see the std.log() and std.timestamp() string values in there:

    -   VCL_Log        Client connected over TLS/SSL: 127.0.0.1
    -   Timestamp      After std.syslog: 1601382665.510435 0.000147 0.000140

When you look at the syslog, you’ll see the log line that was triggered by std.syslog():

    Sep 29 14:47:05 server varnishd[1260]: Client connected over TLS/SSL: 127.0.0.1

String manipulation

Let’s immediately throw in an example where we combine a few string manipulation functions:

vcl 4.1;

import std;

sub vcl_recv {
	set req.url = std.querysort(req.url);
	set req.url = std.tolower(req.url);
	set req.http.User-Agent = std.toupper(req.http.User-Agent);
}

So imagine sending the following request to Varnish:

HEAD /?B=2&A=1 HTTP/1.1
Host: localhost
User-Agent: curl/7.64.0

Here’s what’s happening behind the scenes, based on a specific varnishlog command:

$ varnishlog -C -g request -i requrl -I reqheader:user-agent
*   << Request  >> 23
-   ReqURL         /?B=2&A=1
-   ReqHeader      User-Agent: curl/7.64.0
-   ReqURL         /?A=1&B=2
-   ReqURL         /?a=1&b=2
-   ReqHeader      user-agent: CURL/7.64.0

The input for the URL is /?B=2&A=1
The input for the User-Agent header is curl/7.64.0
The URL’s query string arguments are sorted alphabetically, which results in /?A=1&B=2
The URL is converted to lowercase, which results in /?a=1&b=2
The User-Agent header is converted to uppercase, which results in CURL/7.64.0
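Lowercasing the entire URL may not be what you want when the backend treats paths or query string values as case-sensitive. As a more conservative sketch, the following variant only lowercases the Host header, which is case-insensitive in HTTP, and leaves the case of the URL untouched:

```vcl
vcl 4.1;

import std;

sub vcl_recv {
	# Hostnames are case-insensitive, so this is safe to normalize.
	set req.http.Host = std.tolower(req.http.Host);
	# Sort the query string arguments, but keep their case.
	set req.url = std.querysort(req.url);
}
```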

Environment variables

The std.getenv() function can retrieve the values of environment variables.

The following example features an environment variable named VARNISH_DEBUG_MODE. If it is set to 1, debug mode is enabled, and a custom X-Varnish-Debug header is set:

vcl 4.1;

import std;

sub vcl_deliver {
	if(std.getenv("VARNISH_DEBUG_MODE") == "1") {
		if(obj.hits > 0) {
			set resp.http.X-Varnish-Debug = "HIT";
		} else {
			set resp.http.X-Varnish-Debug = "MISS";        
		}
	}
}

You can set environment variables for your systemd service by running systemctl edit varnish and adding Environment="MYVAR=myvalue" under the [Service] section. The service has to be restarted before varnishd sees the new value.

Reading a file

vmod_std has a function called std.fileread(), which will read a file from disk and return the string value.

We’re not going to be too original with the VCL example. In one of the previous sections, we talked about setting a custom HTML template for vcl_synth. Let’s take that example again:

vcl 4.1;

import std;

sub vcl_synth {
	set resp.http.Content-Type = "text/html; charset=utf-8";
	set resp.http.Retry-After = "5";
	set resp.body = regsuball(std.fileread("/etc/varnish/synth.html"),
	"<<REASON>>",resp.reason);
	return (deliver);
}

Whenever return(synth()) is called, the contents from /etc/varnish/synth.html are used as a template, and the <<REASON>> placeholder is replaced with the actual reason phrase that was set in synth().

You could also make this conditional by using std.file_exists():

vcl 4.1;

import std;

sub vcl_synth {
	if(std.file_exists("/etc/varnish/synth.html")) {
		set resp.http.Content-Type = "text/html; charset=utf-8";
		set resp.http.Retry-After = "5";
		set resp.body = regsuball(std.fileread("/etc/varnish/synth.html"),
		"<<REASON>>",resp.reason);
		return (deliver);
	}
}
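In the conditional version, nothing is customized when the template file is missing. One possible refinement, sketched here under the assumption that a minimal inline page is an acceptable fallback, adds an else branch:

```vcl
vcl 4.1;

import std;

sub vcl_synth {
	set resp.http.Content-Type = "text/html; charset=utf-8";
	set resp.http.Retry-After = "5";
	if (std.file_exists("/etc/varnish/synth.html")) {
		# Use the template from disk when it is available.
		set resp.body = regsuball(std.fileread("/etc/varnish/synth.html"),
		"<<REASON>>", resp.reason);
	} else {
		# Fall back to a minimal inline page for this sketch.
		set resp.body = "<html><body><h1>" + resp.status + " " +
		resp.reason + "</h1></body></html>";
	}
	return (deliver);
}
```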

Server ports

The IP type in VCL that is returned by variables like client.ip doesn’t just contain the string version of the IP address. It also contains the port that was used.

But when IP output is cast into a string, the port information is not returned. The std.port() function extracts the port from the IP and returns it as an integer.

Here’s an example:

vcl 4.1;

import std;

sub vcl_recv {
	if(std.port(server.ip) != 443) {
		set req.http.Location = "https://" + req.http.host + req.url;
		return(synth(301,"Moved"));
	}
}

sub vcl_synth {
	if (resp.status == 301) {
		set resp.http.Location = req.http.Location;
		return (deliver);
	}
}

This example checks whether the port that was used to connect to Varnish is 443, which is the port that is used for HTTPS traffic. If a different port was used, the request is redirected to the HTTPS equivalent of the URL.
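To make the difference between the string version of an IP and its port visible, here is a small sketch that exposes both as response headers. The X-Client-Address and X-Client-Port header names are assumptions made for this example:

```vcl
vcl 4.1;

import std;

sub vcl_deliver {
	# Casting the IP type to a string only yields the address.
	set resp.http.X-Client-Address = client.ip;
	# std.port() extracts the port that is also stored in the IP type.
	set resp.http.X-Client-Port = std.port(client.ip);
}
```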

vmod_unix

If a connection to Varnish is made over UNIX domain sockets, vmod_unix can be used to figure out the following details about the UDS connection:

The username of the peer process owner
The group name of the peer process owner
The user id of the peer process owner
The group id of the peer process owner

In the list, we refer to the peer process owner: this is the user that executes the process that represents the client-side of the communication. Because connections over UDS are done locally, the client side isn’t represented by an actual client, but another proxy.

A good example of this is Hitch: Hitch is a TLS proxy that is put in front of Varnish to terminate the TLS connection. For performance reasons, we can make Hitch connect to Varnish over UDS.

Because the peer process doesn’t use TCP/IP to communicate with Varnish, we cannot restrict access based on the client IP address. However, file system permissions can be used to restrict access.

Here’s how vmod_unix can be used to restrict access to Varnish:

vcl 4.1;

import unix;

sub vcl_recv {
	# Return "403 Forbidden" if the connected peer is
	# not running as the user "trusteduser".
	if (unix.user() != "trusteduser") {
		return(synth(403) );
	}

	# Require the connected peer to run in the group
	# "trustedgroup".
	if (unix.group() != "trustedgroup") {
		return(synth(403) );
	}

	# Require the connected peer to run under a specific numeric
	# user id.
	if (unix.uid() != 4711) {
		return(synth(403) );
	}
	
	# Require the connected peer to run under a numeric group id.
	if (unix.gid() != 815) {
		return(synth(403) );
	}
}

The unix.user() function is used to retrieve the username of the user that is running the peer process. The example above restricts access if the username is not trusteduser.

You can also use the unix.uid() function to achieve the same goal, based on the user id, instead of the username. In the example above, we restrict access to Varnish if the user id is not 4711.

And for groups, the workflow is very similar: unix.group() can be used to retrieve the group name, and unix.gid() can be used to retrieve the group id. Based on the values these functions return, access can be granted or restricted.
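When Varnish listens on both TCP and UDS, you may only want to run these checks for connections that arrive over the socket. The following sketch assumes that varnishd was started with a named UDS listener, for example -a uds=/var/run/varnish.sock,PROXY, and that the peer runs as a user called hitch. Both the listener name and the username are hypothetical:

```vcl
vcl 4.1;

import unix;

sub vcl_recv {
	# local.socket holds the name of the -a listener that accepted
	# the connection. "uds" is the hypothetical listener name here.
	if (local.socket == "uds" && unix.user() != "hitch") {
		return (synth(403));
	}
}
```

Because the listener name is checked first, unix.user() is only evaluated for connections that were accepted on the UDS listener.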