Varnish Controller has support for reporting Varnish statistics. The statistics gathered are the same as those the varnishstat tool produces. The parts that should be gathered can be configured per agent.
vmod-accounting will be used for gathering VCLGroup-specific statistics if the Varnish server supports vmod-accounting (>=6.0.8r2). It is automatically activated if an applicable version is in use. It can be turned off with a flag (-accounting=false) to the agent.
Varnish Controller supports persisted aggregated statistics. These are reported as an average over the given aggregation period. The following aggregation periods exist: 1m, 10m, 1h, 1d and 1mo.
Currently reported statistics are also available, which are a snapshot of the current values from varnishstat. The current values are updated as often as configured for each agent (-stats-interval).
The statistics are stored in the database for as long as configured in brainz. Brainz removes old statistics based on the following configuration:
-keep-stats-1m (default 1 hour)
-keep-stats-10m (default 1 day)
-keep-stats-1h (default 1 week)
-keep-stats-1d (default 1 year)
-keep-stats-1mo (default 10 years)
The amount of statistics gathered depends heavily on the number of agents, VCLGroups and configured counters. Database usage depends on the amount of statistics gathered and on how long statistics are kept in the database, as configured by the flags described above.
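To get a feel for how the retention defaults translate into stored data, the following rough sketch counts how many time slots each aggregation period keeps per counter. It assumes one stored value per counter per slot, a 30-day month and a 365-day year; none of this is confirmed by the documentation, and the function names are hypothetical.

```python
from datetime import timedelta

# Aggregation period -> (slot length, default retention), taken from the
# -keep-stats-* defaults. The 30-day month and 365-day year are assumptions
# made only for this estimate.
RETENTION = {
    "1m": (timedelta(minutes=1), timedelta(hours=1)),
    "10m": (timedelta(minutes=10), timedelta(days=1)),
    "1h": (timedelta(hours=1), timedelta(weeks=1)),
    "1d": (timedelta(days=1), timedelta(days=365)),
    "1mo": (timedelta(days=30), timedelta(days=3650)),
}


def slots_per_counter(retention=RETENTION):
    """Return how many time slots each aggregation period keeps per counter."""
    return {name: keep // slot for name, (slot, keep) in retention.items()}


def total_slots(agents, counters_per_agent, retention=RETENTION):
    """Estimate stored slots for agent-level counters only (assumed model)."""
    per_counter = sum(slots_per_counter(retention).values())
    return agents * counters_per_agent * per_counter
```

With these assumptions, `total_slots(10, 50)` estimates the slots kept for 10 agents reporting 50 counters each. VCLGroup and accounting counters would add to that figure.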
Three counter types are supported for persistent statistics: boolean (q), counter (c) and gauge (g).
Counters are stored in a differential way in the persistent database, meaning that each time frame (e.g. a 1m slot) stores the increase during that time period, for example the number of incoming client requests.
For boolean values, the average of the values for a certain time period is calculated. A value of 0.5 for a 10min time period means that 0 (false) was reported for half the period and 1 (true) for the other half. The average does not reveal at which times within the period each value was reported.
For gauge values, an average is reported for the given time period.
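The three aggregation rules can be illustrated with a minimal sketch. It is not how brainz is implemented; the function names and sample values are hypothetical, and counter resets (for example after a Varnish restart) are ignored for simplicity.

```python
from statistics import mean


def aggregate_counter(samples):
    """Differential storage: the slot holds the increase of a monotonically
    growing counter between the first and last sample in the slot."""
    return samples[-1] - samples[0]


def aggregate_average(samples):
    """Boolean (0/1) and gauge values: the slot holds the plain average."""
    return mean(samples)


# Hypothetical samples taken within one aggregation slot.
client_req = [1000, 1040, 1100]   # counter (c): cumulative request count
healthy = [0, 0, 1, 1]            # boolean (q): false for half the slot
threads = [10, 20, 30]            # gauge (g)

requests_in_slot = aggregate_counter(client_req)
healthy_ratio = aggregate_average(healthy)
average_threads = aggregate_average(threads)
```

Here `healthy_ratio` is 0.5, matching the boolean example above: the value shows that the flag was true for half of the samples but not when.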
When traffic routers are set up with Varnish Controller, they sample statistics. These counters cannot be retrieved via varnishstat; they are specific to the Varnish Controller system. The router statistics are sampled per router, domain and agent.
Requests that come in and are forwarded towards a given agent (endpoint) by the router are counted on that particular agent, router and domain that was hit.
The statistics can be retrieved for the particular agents via the agent statistics and also via router statistics.
Examples fetching statistics using the CLI:
# Counters for the given agent(1) related to routing
vcli agent stats -f name=Router_\* -f agent_id=1
# Counters for router(1)
vcli router stats -f router_id=1
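The per-router, per-domain, per-agent counting described above can be pictured with a small tally. This is only an illustration of the keying; the function names, domains and request data are hypothetical and do not reflect the router's internal implementation.

```python
from collections import Counter


def tally(requests):
    """Count forwarded requests per (router_id, domain, agent_id) triple."""
    return Counter(requests)


def by_agent(counts, agent_id):
    """Sum the counts for one agent across all routers and domains."""
    return sum(n for (_, _, agent), n in counts.items() if agent == agent_id)


def by_router(counts, router_id):
    """Sum the counts for one router across all domains and agents."""
    return sum(n for (router, _, _), n in counts.items() if router == router_id)


# Hypothetical forwarded requests: (router_id, domain, agent_id).
forwarded = [
    (1, "example.com", 1),
    (1, "example.com", 1),
    (1, "example.com", 2),
    (2, "example.org", 1),
]
counts = tally(forwarded)
```

The same tally answers both views: `by_agent(counts, 1)` corresponds to looking at agent statistics, and `by_router(counts, 1)` to looking at router statistics.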
Counters are defined per agent and have a default set configured. Which counters to report can be changed per VCLGroup/Accounting/Agent using the following flags to the agent (each takes a comma-separated list of counter names):
-agent-stats-filter (Counters such as MAIN.client_req)
-vclgroup-stats-filter (Backend counters for VCLGroups)
-accounting-stats-filter (VMOD Accounting specific counters)
Historical statistics can be configured in the UI dashboard or retrieved via API/CLI.
Some CLI examples:
# Fetch last reported values for all agents
vcli agent stats
# Fetch statistics for agent with id 1 and show aggregated statistics of 1 minute intervals.
vcli agent stats -fagent_id=1 -a 1m
# Fetch statistics for agent with id 1 and show aggregated statistics of 1 hour intervals.
# Sort by counter name descending order.
vcli agent stats -fagent_id=1 -a 1h -fsort=name:desc
# Fetch statistics for a VCLGroup for 1 minute intervals.
vcli vg stats -fvcl_group_id=1 -a 1m
It is possible to create custom counters via vmod_accounting when using the Varnish Controller. If the VCLGroup is deployed using a shared deployment, the agent generates a so-called “root vcl” that creates an accounting namespace for the deployed VCLGroup(s). Therefore, an accounting namespace should not be created in the VCL itself. The namespace is called vg_<vclgroup_id> and the default accounting key that the agent samples statistics for is the total key. To keep sampling all statistics for a VCLGroup, keep this total key when adding custom keys (see example below).
Custom keys to sample statistics for can be added as in the example below:
vcl 4.1;
import accounting;
backend default none;
sub vcl_recv {
    if (req.http.User-Agent ~ "^curl") {
        accounting.add_keys("browser_curl");
    } else {
        accounting.add_keys("browser_other");
    }
    return (synth(200, "OK"));
}
Then configure each agent that should report these statistics with the configuration parameter -accounting-stats-keys (VARNISH_CONTROLLER_ACCOUNTING_STATS_KEYS).
Example:
# Report total for the VCLGroup and the custom keys
VARNISH_CONTROLLER_ACCOUNTING_STATS_KEYS="browser_curl,browser_other,total"
# Only report custom keys
VARNISH_CONTROLLER_ACCOUNTING_STATS_KEYS="browser_curl,browser_other"
These statistics will be available on the VCLGroup and can be seen via the GUI dashboard or vcli.
Viewing these statistics using vcli:
# To list all client_req for the browser counter keys
vcli vg stats -f name=browser_\*client_req\*
This is in preview and may be changed in the future!
The statistics endpoints (using API) have support for outputting the statistics in Prometheus format.
API Example:
$ curl -H "Authorization: bearer <access_token>" "http://localhost:8002/api/v1/vclgroups/stats?format=prom"
# HELP varnish_controller_bereq_bodybytes Request body bytes
# TYPE varnish_controller_bereq_bodybytes untyped
varnish_controller_bereq_bodybytes{agent_id="1",agent_name="server2", vcl_group_id="2", vcl_group_name="test2"} 10
varnish_controller_bereq_bodybytes{agent_id="1",agent_name="server2", vcl_group_id="1", vcl_group_name="test1"} 20
...
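Output in this format can be consumed by any Prometheus-compatible tool. For ad hoc inspection, a minimal sketch of a parser is shown here. It handles only simple sample lines like the ones in the example output, not the full exposition format, and the function name is hypothetical.

```python
import re

# metric_name{label="value", ...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+\d+)?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples for each sample line in text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a sample, e.g. "..."
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


EXAMPLE = '''
# HELP varnish_controller_bereq_bodybytes Request body bytes
# TYPE varnish_controller_bereq_bodybytes untyped
varnish_controller_bereq_bodybytes{agent_id="1",agent_name="server2", vcl_group_id="2", vcl_group_name="test2"} 10
varnish_controller_bereq_bodybytes{agent_id="1",agent_name="server2", vcl_group_id="1", vcl_group_name="test1"} 20
...
'''

samples = parse_samples(EXAMPLE)
total_bodybytes = sum(value for _, _, value in samples)
```

Each parsed sample keeps its labels, so values can be grouped by `vcl_group_name` or `agent_name` as needed.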
As of version 5, the OAuth token request is supported, which enables periodic pulling of the statistics endpoints with automatic authentication. This can be used with, for example, Prometheus or Grafana, as seen in the example below. Replace the following:
controller_username with the username of a Varnish Controller user.
controller_password with the password of the same Varnish Controller user.
controller_organization with the organization name that is known in the Varnish Controller and belongs to that user. If you want to authenticate as the system administrator, you can remove the query parameter ?org=.
127.0.0.1:8002 with the IP or host where the API-GW of the controller is running.
http with https if you are using HTTPS.
Example Prometheus configuration:
scrape_configs:
  - job_name: "varnish-stats"
    scrape_interval: 15s
    scrape_timeout: 10s
    oauth2:
      client_id: "controller_username"
      client_secret: "controller_password"
      token_url: "http://127.0.0.1:8002/api/v1/auth/oauth/token?org=controller_organization"
    static_configs:
      - targets: ["127.0.0.1:8002"]
    metrics_path: "/api/v1/vclgroups/stats"
    params:
      format:
        - "prom"
    scheme: "http"
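The substitution rules for the token URL can be expressed as a small helper. This is a sketch only: the helper name is hypothetical, and the path is copied from the example configuration rather than from an API reference.

```python
from urllib.parse import urlencode, urlunsplit

# Path taken from the example Prometheus configuration.
TOKEN_PATH = "/api/v1/auth/oauth/token"


def token_url(host, organization=None, scheme="http"):
    """Build the token URL; omit the org parameter for the system administrator."""
    query = urlencode({"org": organization}) if organization else ""
    return urlunsplit((scheme, host, TOKEN_PATH, query, ""))


org_url = token_url("127.0.0.1:8002", "controller_organization")
admin_url = token_url("127.0.0.1:8002")
https_url = token_url("controller.example.com", "acme", scheme="https")
```

Leaving out `organization` drops the `?org=` query parameter entirely, which matches the system administrator case described in the replacement list.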
bitmap is not supported in the persistent statistics.
vmod-goto counters are currently not supported.
See the agent configuration for how to configure statistics gathering per agent.