Here we will go through some of the typical use cases for VCS. There are endless opportunities for tracking all aspects of website behavior. These examples will hopefully give you a good idea of what is possible and provide you with a bit of inspiration.
Most of these examples can be implemented with a few lines of VCL code in your Varnish setup. They work with both the vanilla Varnish Cache release and the Varnish Enterprise release.
By default, vcs-agent installs with the -d parameter enabled.
This configuration automatically generates a key for each URL, HOST, and a global ALL key.
Manually tagging a request with a key is done in VCL, by writing an std.log() line prefixed with the string "vcs-key:". The default key configuration is equivalent to the following VCL, used as an example:
sub vcl_deliver {
    std.log("vcs-key:ALL");
    std.log("vcs-key:HOST/" + req.http.Host);
    std.log("vcs-key:URL/" + req.http.Host + req.url);
}
In the above example, all requests will be tagged with the following keys:
- Host header (example.com)
- Host + URL (example.com/foo)
- vcs-key ALL
For std.log() you will also need to include the std VMOD, with an import std; directive in your VCL.
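The default key scheme can also be mirrored outside VCL, for instance when building key names to query later. The following is a minimal Python sketch of that scheme; default_keys is a hypothetical helper for illustration and is not part of VCS.

```python
def default_keys(host: str, url: str) -> list[str]:
    """Return the three keys the default configuration is described as generating.

    Hypothetical helper: it only mirrors the string concatenation in the VCL above.
    """
    return ["ALL", f"HOST/{host}", f"URL/{host}{url}"]


if __name__ == "__main__":
    for key in default_keys("example.com", "/foo"):
        print(key)
```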
VCS has a flat namespace. Every key is created in this namespace. So, in order to add a bit of organization to your VCS setup we recommend you split the namespace into various sub-namespaces.
To split a namespace we recommend you use a separator. We recommend you use / and we’ll be using it in our examples here.
The reason for splitting the namespace is to create queries against VCS that give you some subset of the data in VCS. Let’s say that you use VCS to track the number of views on your website. If you prepend those keys with VIEWS you can query VCS to give you a top list of the views by asking it to show you the top list of every key beginning with VIEWS.
Then you might have another query that gives you the top list of cache misses, using the prefix MISSES, and similar queries for other logical groups.
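To illustrate how a separator turns the flat namespace into selectable groups, here is a small Python sketch that picks out one sub-namespace from a list of key names. The key names and the keys_in_namespace helper are hypothetical; VCS itself does the selection on the server side.

```python
def keys_in_namespace(keys: list[str], namespace: str, sep: str = "/") -> list[str]:
    """Return the keys that live under the given sub-namespace.

    Matching on namespace + separator avoids picking up unrelated
    prefixes such as "VIEWSX/..." when asking for "VIEWS".
    """
    prefix = namespace + sep
    return [key for key in keys if key.startswith(prefix)]


if __name__ == "__main__":
    sample = ["VIEWS/home", "MISSES/home", "VIEWS/about", "VIEWSX/other"]
    print(keys_in_namespace(sample, "VIEWS"))
```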
To omit certain URLs (or requests) from the default keys, remove the -d parameter from vcs-agent’s systemd configuration.
Next, add the following VCL to generate the VCS keys, using an if statement to skip the requests you do not want to send to VCS:
sub vcl_deliver {
    if (req.url != "/healthcheck") {
        std.log("vcs-key:ALL");
        std.log("vcs-key:HOST/" + req.http.Host);
        std.log("vcs-key:URL/" + req.http.Host + req.url);
    }
}
In the above example, requests for /healthcheck will not be sent to VCS.
Using histograms is a good way to get more detailed data from each bucket. To specify a histogram to be generated for a certain stat you use the -H option.
vcs -H reqbytes:0,10,1000,10000
The above example will produce a histogram for stat reqbytes with limits 0,10,1000,10000. The output from VCS will have a corresponding histogram added to each bucket like this:
{
  "allowlist/login.css": [
    {
      "timestamp": "2021-08-30T17:58:00+00",
      "n_req": 17,
      "n_req_uniq": "NaN",
      "n_miss": 0,
      ...
      "resp_5xx": 0,
      "histograms": [
        {
          "type": "reqbytes",
          "limits": [0, 10, 1000, 10000],
          "counts": [0, 4, 12, 1],
          "sum": 37645,
          "count": 17
        }
      ]
    },
    {
      "timestamp": "2021-08-30T17:57:30+00",
      "n_req": 3,
      "n_req_uniq": "NaN",
      "n_miss": 0,
      ...
      "resp_5xx": 0,
      "histograms": [
        {
          "type": "reqbytes",
          "limits": [0, 10, 1000, 10000],
          "counts": [0, 3, 0, 0],
          "sum": 314,
          "count": 3
        }
      ]
    },
    {
      "timestamp": "2021-08-30T17:57:00+00",
      ...
    }
  ]
}
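A histogram entry like the ones above can be post-processed on the client. The Python sketch below computes the average value per request from one entry, assuming that "sum" is the total of the observed values and "count" is the number of observations (the field meanings are inferred from the sample, not stated by the source). The histogram_mean helper is hypothetical.

```python
import json

# One histogram entry copied from the sample output above.
SAMPLE = """
{
  "type": "reqbytes",
  "limits": [0, 10, 1000, 10000],
  "counts": [0, 4, 12, 1],
  "sum": 37645,
  "count": 17
}
"""


def histogram_mean(hist: dict) -> float:
    """Average observed value, assuming sum/count carry that meaning."""
    return hist["sum"] / hist["count"] if hist["count"] else 0.0


if __name__ == "__main__":
    hist = json.loads(SAMPLE)
    print(f'{hist["type"]}: mean {histogram_mean(hist):.2f}')
```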
You can specify more than one histogram, and you can use different formats. You can also specify the histogram with a label. Histograms that are specified with a label will only be generated for keys that include a matching label string in their name.
$ vcs -H reqbytes:0,10,1000,10000 -H lbl1:restarts:A,1,4 -H lbl2:n_bodybytes:G,50,100,500 -H berespbytes:A,1,3,A,150,500,A,10,500
Let’s break down the above example.
First we have the explicitly specified sequence from the previous example:
-H reqbytes:0,10,1000,10000
Next we have a histogram with a label that is specified as an arithmetic sequence:
-H lbl1:restarts:A,1,4
This will produce a histogram with ranges that go from 0 and increase by 1 up to, but not exceeding, 4. Adding a label means that this histogram will only be generated for keys that have the specified label included as a label string in the key name.
Next comes a histogram specified as a geometric sequence:
-H lbl2:n_bodybytes:G,50,100,500
A geometric specification produces a sequence where each limit increases by 100 percent of the previous value (that is, doubles), starting at 50 and going up to, but not exceeding, 500.
And finally we have a sequence with a repeated arithmetic sequence:
-H berespbytes:A,1,3,A,150,500,A,10,500
This syntax is perfectly valid and each new sequence will continue where the previous one ended.
The resulting vcs output from the above specification (when specifying both labels) will be:
{
  "allowlist/login.css:lbl1,lbl2": [
    {
      "timestamp": "2021-08-30T17:58:00+00",
      "n_req": 17,
      "n_req_uniq": "NaN",
      "n_miss": 0,
      ...
      "resp_5xx": 0,
      "histograms": [
        {
          "type": "reqbytes",
          "limits": [0, 10, 1000, 10000],
          "counts": [0, 4, 12, 1],
          "sum": 37645,
          "count": 17
        },
        {
          "type": "lbl1:restarts",
          "limits": [0, 1, 2, 3, 4],
          "counts": [4, 0, 0, 0, 0],
          "sum": 5,
          "count": 17
        },
        {
          "type": "lbl2:n_bodybytes",
          "counts": [2, 13, 1, 0, 1],
          "limits": [0, 50, 100, 200, 400],
          "sum": 1903,
          "count": 17
        },
        {
          "type": "berespbytes",
          "counts": [0, 1, 8, 1, 5, 0, 0, 0, 0, 0, 2],
          "limits": [0, 1, 2, 3, 153, 303, 453, 463, 473, 483, 493],
          "sum": 19915,
          "count": 17
        }
      ]
    },
    {
      "timestamp": "2021-08-30T17:57:30+00",
      ...
    }
  ]
}
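The way the -H specifications expand into the "limits" arrays shown above can be sketched in Python. This is not the VCS implementation: the rules are inferred from the sample output, assuming that A takes a step and an upper bound, that G takes a first limit, a percentage increase and an upper bound, and that generated sequences start with an implicit 0. How G behaves when it follows another sequence is not documented here, so the sketch simply appends its values.

```python
def expand_limits(spec: str) -> list[int]:
    """Expand the limits part of a -H specification (assumed semantics)."""
    tokens = spec.split(",")
    if tokens[0] not in ("A", "G"):
        # Explicit list of limits, used as given.
        return [int(token) for token in tokens]

    limits = [0]  # generated sequences begin at 0 in the sample output
    i = 0
    while i < len(tokens):
        kind = tokens[i]
        if kind == "A":
            step, top = int(tokens[i + 1]), int(tokens[i + 2])
            if step <= 0:
                raise ValueError("step must be positive")
            # Continue from where the previous sequence ended.
            while limits[-1] + step <= top:
                limits.append(limits[-1] + step)
            i += 3
        elif kind == "G":
            value, percent, top = (int(token) for token in tokens[i + 1:i + 4])
            if value <= 0 or percent <= 0:
                raise ValueError("start and percent must be positive")
            while value <= top:
                limits.append(value)
                value += value * percent // 100
            i += 4
        else:
            raise ValueError(f"unexpected token {kind!r}")
    return limits


if __name__ == "__main__":
    for spec in ("0,10,1000,10000", "A,1,4", "G,50,100,500",
                 "A,1,3,A,150,500,A,10,500"):
        print(spec, "->", expand_limits(spec))
```

Under these assumptions the sketch reproduces the four "limits" arrays in the sample output, which is a useful sanity check before choosing your own specification.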
As indicated above, you can use labels to control which histograms to produce for which vcs keys. Specifying a label name for a histogram on the VCS command line is one part of this; the other is to add the label to the VCS log line in your VCL. This is done by adding a : after the regular key name, followed by the label. To specify more than one label, separate them with a comma (,).
This example shows how you can apply a label to a key to control what histograms
should be generated by VCS for that key:
sub vcl_deliver {
    if (req.http.user-agent ~ "mobile") {
        std.log("vcs-key:MOBILE/" + req.url + ":lbl1,lbl2");
    } else {
        std.log("vcs-key:" + req.url + ":lbl2");
    }
}
This snippet would give you one extra histogram if the request comes from a mobile browser, but skip that for any other requests. By using labels you can choose the type of histogram that makes sense for a particular key without using up resources by generating this data for all keys.
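The label convention can be sketched in Python as follows. The sketch assumes the behavior described for vcs: only the text after the last : is examined, labels are comma-separated, and anything there that matches no configured label is ignored. active_labels is a hypothetical helper, not a VCS function.

```python
def active_labels(logged_key: str, known_labels: set[str]) -> list[str]:
    """Return the configured labels that a logged key carries."""
    _, sep, tail = logged_key.rpartition(":")
    if not sep:
        return []  # no ':' in the key at all
    # Keep only candidates that match a label given on the command line.
    return [label for label in tail.split(",") if label in known_labels]


if __name__ == "__main__":
    known = {"lbl1", "lbl2"}
    print(active_labels("MOBILE/index.html:lbl1,lbl2", known))
    print(active_labels("URL/example.com:8080/foo", known))
```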
Note:
- A key name may itself contain a : character. That’s OK: only the part after the last occurrence of : will be considered when vcs looks for label specifications.
- Any text after the last : that doesn’t match any labels will be ignored.
To track which URLs have the slowest response times, we can make use of VCS’ ability to provide a sorted list of response times for the keys it is tracking. Simply issuing a request for:
/all/top_ttfb
will produce a list of the keys associated with the 10 slowest requests. To further get a breakdown of this, for example to get the actual URLs, we can make use of the default keys and combine this with VCS’ regex matching capabilities:
/match/^URL/top_ttfb
The abbreviation ttfb stands for time to first byte: the time from when Varnish first started handling the request until it started transmitting the first byte to the client.
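The two query paths above can be assembled programmatically. The Python sketch below only builds the URLs; it assumes VCS is reachable at http://localhost:6555 (host and port are assumptions, adjust them to your setup) and that the regular expression should be percent-encoded in the path (also an assumption). top_ttfb_url is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import quote

VCS_BASE = "http://localhost:6555"  # assumed address of the VCS server


def top_ttfb_url(pattern: Optional[str] = None, base: str = VCS_BASE) -> str:
    """Build the top_ttfb query URL for all keys or for keys matching a regex."""
    if pattern is None:
        return f"{base}/all/top_ttfb"
    # Percent-encode characters such as '^' that are not URL-safe.
    return f"{base}/match/{quote(pattern, safe='')}/top_ttfb"


if __name__ == "__main__":
    print(top_ttfb_url())
    print(top_ttfb_url("^URL"))
```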
For a news site there are a few specific things you might want to track. A CMS typically has unique article IDs, each identifying one article. Logging the article IDs into VCS gives you easy real-time access to what stories are being read right now. We have customers that embed this information on their websites to generate the "what is hot right now" lists we often see on news sites.
Logging the article ID and not just the URL makes the list ignore different presentations of the same article, so the list is about the articles themselves. It also removes the need to normalize the URL in any way, so query strings that annotate links will not pollute the list itself.
If your CMS can produce an x-artid header you should be all set.
In vcl_deliver you would need to add the following:
sub vcl_deliver {
    std.log("vcs-key:ARTICLE_ID/" + resp.http.x-artid);
}
You can expand on the setup in several ways. One might for instance also want to measure the social impact of each article by looking at the referrer header (if set).
In vcl_deliver add the following:
sub vcl_deliver {
    if (req.http.referer) {
        std.log("vcs-key:ARTREF/" + resp.http.x-artid + "/" + req.http.referer);
    }
}
You might also want to expand it further by looking at the user agent and adding a separate time series for mobile views. In vcl_deliver:
sub vcl_deliver {
    if (req.http.user-agent ~ "mobile") {
        std.log("vcs-key:MOBILE/" + resp.http.x-artid);
    }
}
Many websites want to measure conversions. A conversion might be a user clicking a link to sign up or putting an item in the shopping basket. Another use case would be a paid content site, where the conversion happens when the user clicks through to the sign-up page while reading a specific article.
The first step is to identify the conversion taking place, typically done by looking at the request URL, maybe in combination with the HTTP method used.
In this example our article page might be /news/art/23245. On that page there is a link pointing to /signup. To register the conversion in VCS with the article as the main key we would need the following VCL in vcl_deliver:
sub vcl_deliver {
    if (req.url == "/signup") {
        set req.http.artid = regsub(...);
        std.log("vcs-key:CONVERSION/SIGNUP/" + req.http.artid);
    }
}
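The VCL above leaves the regsub() arguments elided. As an illustration of the idea only, the Python sketch below derives the conversion key under the assumption that the article ID is taken from the Referer header and that article URLs look like /news/art/<digits>. Neither assumption is confirmed by the VCL, and conversion_key is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed article URL shape, based on the example page /news/art/23245.
ARTICLE_RE = re.compile(r"/news/art/(\d+)")


def conversion_key(url: str, referer: str) -> Optional[str]:
    """Return the conversion key for a signup request, or None if not applicable."""
    if url != "/signup":
        return None
    match = ARTICLE_RE.search(referer)
    if match is None:
        return None  # the signup did not originate from an article page
    return f"CONVERSION/SIGNUP/{match.group(1)}"


if __name__ == "__main__":
    print(conversion_key("/signup", "https://example.com/news/art/23245"))
```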
For a more in-depth discussion on using VCS to track conversions, and a how-to on doing A/B testing with Varnish and VCS, please see this blog post: https://info.varnish-software.com/blog/live-ab-testing-varnish-and-vcs
If you are streaming HLS/HDS/Smooth/DASH through Varnish you might want to count the number of users on each Varnish server. This might be useful for statistical reasons but might also be used for directing traffic to your various Varnish Cache clusters.
The tricky part is to uniquely identify a user. In order to do this you need some sort of session cookie to be present on the client. All HTTP video clients are supposed to support cookies. If there is a cookie already present we can probably use it; if not, we have to generate a random one.
We recommend using the cookie VMOD when working with cookies. It will make the VCL much more readable. The following VCL sets a cookie if there is none present.
In vcl_deliver:
import cookie;
import std;

sub vcl_deliver {
    # Look for an existing session ID in the request cookies.
    cookie.parse(req.http.cookie);
    set req.http.X-vcsid = cookie.get("_vcsid");
    if (req.http.X-vcsid == "") {
        # No session cookie yet: generate a random ID and set it on the response.
        set req.http.X-vcsid = std.random(1, 10000000) + "." + std.random(1, 10000000);
        set resp.http.Set-Cookie = "_vcsid=" + req.http.X-vcsid + "; HttpOnly; Path=/";
    }
    # Tag the request with the session ID, host and URL.
    std.log("vcs-key:SESSION/" + req.http.X-vcsid + "/" + req.http.Host + req.url);
}
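Once requests are tagged this way, counting users comes down to counting distinct session IDs among the SESSION keys. The Python sketch below shows that step on a plain list of key names; the sample keys and the count_sessions helper are hypothetical, and how you obtain the key list from VCS is left out.

```python
def count_sessions(keys: list[str]) -> int:
    """Count distinct session IDs in keys shaped like SESSION/<id>/<host><url>."""
    session_ids = set()
    for key in keys:
        parts = key.split("/", 2)
        # Skip keys from other namespaces or without a session ID segment.
        if len(parts) < 3 or parts[0] != "SESSION":
            continue
        session_ids.add(parts[1])
    return len(session_ids)


if __name__ == "__main__":
    sample = [
        "SESSION/11.22/example.com/stream/a.m3u8",
        "SESSION/11.22/example.com/stream/seg1.ts",
        "SESSION/33.44/example.com/stream/a.m3u8",
        "ALL",
    ]
    print(count_sessions(sample))
```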
There is a blog post on the matter that discusses this in some detail: https://info.varnish-software.com/blog/getting-live-statistics-varnish-hlshds
In an e-commerce setting VCS can be used to give stats about how various SKUs behave. A typical use case would be running statistics on which SKUs receive what traffic. In addition there are various other aspects that VCS can help gather data on:
In vcl_deliver:
sub vcl_deliver {
    if (req.url ~ "/sku/\d+") {
        set req.http.sku = regsub(...);
        std.log("vcs-key:VIEWSKU/" + req.http.sku);
        if (req.http.referer ~ "facebook.com|twitter.com") {
            std.log("vcs-key:SOCIAL/" + req.http.sku);
        }
        if (req.http.referer ~ "yahoo.com|google.com") {
            std.log("vcs-key:ORGANIC/" + req.http.sku);
        }
        if (req.url ~ "/ajax/put/\d+") {
            std.log("vcs-key:PUTBASKET/" + req.http.sku);
        }
    }
}
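The referrer grouping in the VCL above can be sketched in Python to show which keys a single SKU view would produce. The SKU value is passed in directly because the VCL elides how it is extracted, and referer_keys is a hypothetical helper. Unlike the VCL patterns, the sketch escapes the dots so that they match only a literal ".".

```python
import re

SOCIAL_RE = re.compile(r"facebook\.com|twitter\.com")
ORGANIC_RE = re.compile(r"yahoo\.com|google\.com")


def referer_keys(sku: str, referer: str) -> list[str]:
    """Return the keys logged for one SKU view, grouped by traffic source."""
    keys = [f"VIEWSKU/{sku}"]
    if SOCIAL_RE.search(referer):
        keys.append(f"SOCIAL/{sku}")
    if ORGANIC_RE.search(referer):
        keys.append(f"ORGANIC/{sku}")
    return keys


if __name__ == "__main__":
    print(referer_keys("1234", "https://www.facebook.com/some/post"))
    print(referer_keys("1234", "https://www.google.com/search?q=shoes"))
```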