Varnish High Availability

Controlling replication

Knowing what to replicate can be tricky, so VHA only uses a small set of rules to identify what’s worth replicating and keep things clean. To be replicated, a client request:

  • must trigger a backend fetch.
  • must place the object in cache, but not in Transient storage.
  • must not be flagged with the forbid_replication() instruction.
  • must not have an x-vha-done header containing the name of the local cluster.

If a request qualifies for replication, VHA replicates the original request as it was received (as seen in varnishlog, before the first VCL_call line). Any modification made through VCL is therefore ignored.
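
To illustrate the last point, here is a minimal sketch, assuming a hypothetical /static/ URL prefix and a VCL that strips cookies there. The change only affects the request handled locally. According to the rule above, the replicated request would still carry the Cookie header exactly as the client sent it.

```vcl
sub vcl_recv {
    # Hypothetical normalization: drop cookies for static assets.
    if (req.url ~ "^/static/") {
        # Applies to the local request only. The replicated request is the
        # original one, so this modification is not part of it.
        unset req.http.Cookie;
    }
}
```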

Avoiding Transient storage

The Transient storage is a special area dedicated to short-lived objects that will soon expire from the cache. VHA’s bet is that such objects are not worth replicating because they’ll probably expire before someone requests them again.

By default, the shortlived parameter is set to 10 seconds: any object whose TTL+Grace+Keep period is shorter than that goes into Transient. This default may not suit every workload, notably live streaming, where video chunks and manifest files expire quickly and may therefore land in Transient and never be replicated.

To change it, simply add -p shortlived=X to your varnishd command line, with X being the desired duration in seconds.
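
Since the rule looks at the sum TTL+Grace+Keep, another option is to adjust these values per object in VCL instead of changing the parameter globally. The following is a minimal sketch under stated assumptions: the .m3u8 URL pattern and all durations are hypothetical, and shortlived is left at its default of 10 seconds.

```vcl
sub vcl_backend_response {
    # Hypothetical rule for live-streaming manifests.
    if (bereq.url ~ "\.m3u8$") {
        set beresp.ttl = 2s;
        set beresp.grace = 2s;
        # TTL + Grace + Keep = 2s + 2s + 10s = 14s, which is above the
        # 10-second shortlived threshold, so the object avoids Transient.
        set beresp.keep = 10s;
    }
}
```

Keep is used here rather than a longer TTL or Grace because it only retains the object for conditional revalidation and does not extend how long it can be served.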

Preventing replication

Conversely, you can forbid VHA from replicating a request. To do so, call the forbid_replication() function from vmod_vha:

import vha;

sub vcl_deliver {
    if (...) {
        vha.forbid_replication();
    }
}
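
As a concrete illustration of this pattern, here is a minimal sketch with a condition filled in. The /private/ URL prefix is a hypothetical example and does not come from VHA itself; any condition valid in vcl_deliver could take its place.

```vcl
import vha;

sub vcl_deliver {
    # Hypothetical condition: never replicate anything under /private/.
    if (req.url ~ "^/private/") {
        vha.forbid_replication();
    }
}
```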

Adding headers to replication requests

Because VHA uses headers to avoid replication loops, it doesn’t allow users to freely modify replication requests. However, you may need to add extra information, such as the original port or IP address, since these are lost during replication. VHA lets you inject this information using vmod_vha’s add_header() function:

import vha;

sub vcl_recv {
	vha.add_header("x-ip", client.ip);
}

Replicating bodies

The generated VCL automatically replicates cached request bodies, so you only need to call std.cache_req_body() from vmod_std in vcl_recv.
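
A minimal sketch of that call is shown below. The POST check and the 100KB limit are illustrative assumptions, so pick a size that matches the largest body you expect to buffer. Whether such requests end up cached at all still depends on the rest of your VCL.

```vcl
import std;

sub vcl_recv {
    # Hypothetical policy: only buffer bodies for POST requests.
    if (req.method == "POST") {
        # Buffer the request body, up to an illustrative limit of 100KB.
        std.cache_req_body(100KB);
    }
}
```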