Knowing what to replicate can be tricky, so VHA uses only a small set of rules to identify what’s worth replicating and keep things clean. To be replicated, a client request must, among other things:
- not have received a forbid_replication() instruction
- not carry an x-vha-done header containing the name of the local cluster
If a request is deemed good for replication, the original request is used (as seen in varnishlog, before the first VCL_call line), which means that any modification made through VCL is ignored.
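To make the consequence concrete, here is a minimal sketch, meant to be merged into an existing VCL file. The query-string stripping and the x-tenant header are hypothetical illustrations, not part of VHA.

```vcl
sub vcl_recv {
    # Both changes below apply to the local transaction only.
    # Hypothetical normalization: drop the query string.
    set req.url = regsub(req.url, "\?.*$", "");
    # Hypothetical header set through VCL.
    set req.http.x-tenant = "example";
}
```

Going by the rule above, the replicated request would still carry the URL as the client sent it, query string included, and no x-tenant header.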
The Transient storage is a special area dedicated to short-lived objects, which will disappear from the cache very soon. VHA’s bet is that such objects are not worth replicating because they’ll probably expire before someone requests them again.
By default, the shortlived parameter is set to 10 seconds: any object whose TTL+Grace+Keep period is shorter than that goes into Transient. This default may not fit every workload, notably live streaming, where video chunks and manifest files expire quickly and would therefore end up in Transient and go unreplicated.
To change it, add -p shortlived=X to your varnishd command line, with X being the desired duration in seconds.
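Since the comparison is made on TTL+Grace+Keep, another way to keep an object out of Transient is to give it a longer total lifetime. The sketch below assumes the default 10-second threshold; the URL pattern and the durations are hypothetical and should be adapted to your traffic.

```vcl
sub vcl_backend_response {
    # Hypothetical match on HLS manifests and video chunks.
    if (bereq.url ~ "\.(m3u8|ts)$") {
        set beresp.ttl = 4s;
        # 4s + 10s + 0s = 14s, which is above the 10-second default.
        set beresp.grace = 10s;
        set beresp.keep = 0s;
    }
}
```

Note that a longer grace period also changes how long stale content may be served, so this is a trade-off rather than a free fix.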
Conversely, it is also possible to forbid VHA from replicating a request. To do this, call the forbid_replication() function from vmod_vha:
    import vha;

    sub vcl_deliver {
        if (resp.http.no-replication) {
            vha.forbid_replication();
        }
    }
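The same call can be driven by other conditions. As a sketch, assuming a hypothetical /private/ URL prefix whose responses should stay on the node that fetched them:

```vcl
import vha;

sub vcl_deliver {
    # "/private/" is a hypothetical prefix used for illustration.
    if (req.url ~ "^/private/") {
        vha.forbid_replication();
    }
}
```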
Because VHA uses headers to avoid loops in replication, it doesn’t allow users to blindly modify the replication requests. However, you may need to add some extra information, such as the original port or IP, because it gets lost during replication. VHA allows you to inject information using vmod_vha’s add_header() function:
    import vha;

    sub vcl_recv {
        vha.add_header("x-ip", client.ip);
    }
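Several headers can be injected the same way. The sketch below also forwards the port the request arrived on. The x-port header name is hypothetical, and passing the integer returned by std.port() directly assumes add_header() converts its value to a string, as it does for client.ip above.

```vcl
import std;
import vha;

sub vcl_recv {
    # Client address, as in the example above.
    vha.add_header("x-ip", client.ip);
    # Port the request was received on; "x-port" is a hypothetical name.
    vha.add_header("x-port", std.port(server.ip));
}
```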
The generated VCL automatically replicates cached request bodies, so you just need to call std.cache_req_body() from vmod_std in vcl_recv.
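A minimal sketch of that call follows. The 100KB limit, the choice of methods, and the 413 response are assumptions to adjust for your own setup.

```vcl
import std;

sub vcl_recv {
    if (req.method == "POST" || req.method == "PUT") {
        # Buffer the request body, up to an assumed 100KB limit.
        if (!std.cache_req_body(100KB)) {
            # The body was larger than the limit or could not be read.
            return (synth(413));
        }
    }
}
```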