There are multiple tools and metrics available to monitor a Varnish installation. This tutorial provides information on important counters that assist in monitoring vital aspects of a Varnish installation.
There are various commands that can be used to access additional information.
Run the varnishlog command and write the raw log to a file:
$ varnishlog -d -g raw -w tmp/varnishlog.raw
Slow client responses:
$ varnishlog -d -g request -q "Timestamp:Resp[2] > 1.0"
Slow backend responses:
$ varnishlog -d -g request -q "Timestamp:Beresp[2] > 1.0"
Requests in waiting list:
$ varnishlog -d -g request -q "Timestamp:Waitinglist[2] > 0.0"
Server errors, very slow responses, or logged errors (rate limited to 100 transactions per minute):
$ varnishlog -d -g request -q "RespStatus ~ '^5' or Timestamp:Resp[2] > 10.0 or Error" -R 100/1m
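The slow-response queries differ only in the timestamp label and the threshold, so they can be generated by a small helper. The following is a minimal sketch: slow_query is a hypothetical function name, not part of Varnish. It assumes the VSL query field index [2], which selects the second field of a Timestamp record (the time elapsed since the start of the task).

```shell
# slow_query LABEL SECONDS
# Print a VSL query matching transactions whose Timestamp record with the
# given label (for example Resp or Beresp) exceeds SECONDS.
# Hypothetical helper for illustration; it only builds the query string.
slow_query() {
    printf 'Timestamp:%s[2] > %s\n' "$1" "$2"
}
```

On a host with a running varnishd, it could be used as follows:
$ varnishlog -d -g request -q "$(slow_query Beresp 0.5)"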
The varnishstat utility collects and displays the counters and metrics of a Varnish instance since startup time.
Run varnishstat once and exit:
$ varnishstat -1
varnishstat also accepts filters which can be applied as follows:
$ varnishstat -1 -f 'filter'
More information on varnishstat and how to use it is available here.
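As an illustration of filtering, the output of varnishstat -1 can be reduced to two counters and turned into a cache hit ratio. This is a sketch under assumptions: hit_ratio is a hypothetical helper, the counter names MAIN.cache_hit and MAIN.cache_miss are taken from the varnish-counters man page, and each output line is assumed to start with the counter name followed by its value. Passes and other request types are ignored.

```shell
# hit_ratio: read `varnishstat -1` output on stdin and print the cache hit
# ratio as a percentage, or "n/a" when no hits or misses have been counted.
# Hypothetical helper; assumes the counter name is in column 1 and its
# value in column 2.
hit_ratio() {
    awk '
        $1 == "MAIN.cache_hit"  { hit = $2 }
        $1 == "MAIN.cache_miss" { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }
            printf "%.1f\n", 100 * hit / total
        }
    '
}
```

For example:
$ varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss | hit_ratio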
The varnishadm utility establishes a CLI connection to varnishd (the Varnish daemon). The following commands are useful for troubleshooting a Varnish instance via varnishadm.
Collect Varnish parameters:
$ varnishadm -- param.show
Collect ban list:
$ varnishadm -- ban.list
Collect the last panic, if any:
$ varnishadm -- panic.show
Collect backend health:
$ varnishadm -- backend.list -p
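The backend list can also be checked from a script. The sketch below assumes the plain-text output of backend.list without the -p flag, where the first line is a header and each following line starts with the backend name. The exact columns vary between Varnish versions, so the match is a simple case-insensitive search for "sick". sick_backends is a hypothetical helper name.

```shell
# sick_backends: read `varnishadm backend.list` output on stdin and print
# the name (first column) of every backend whose line mentions "sick".
# Hypothetical helper; the header line is skipped.
sick_backends() {
    awk 'NR > 1 && tolower($0) ~ /sick/ { print $1 }'
}
```

For example:
$ varnishadm -- backend.list | sick_backends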
The following descriptions cover the counters that are most important to monitor:
Number of parsable client requests received.
Number of cache hits.
Number of cache misses.
Number of times more threads were needed but the limit of a thread pool had been reached.
Number of HTTP objects (headers + body, if present) in the cache.
Number of objects forcefully evicted from storage to make room for a new object.
Number of all bans in the system, including bans superseded by newer bans and bans already checked by the ban-lurker.
Number of backend content fetches that failed.
Number of sessions queued because no thread was immediately available. Consider increasing the thread_pool_min parameter.
Number of times sessions were dropped because varnishd hit the maximum thread queue length. Consider increasing the thread_queue_limit parameter to drop fewer sessions.
Number of objects mailed to expiry thread for handling.
Number of objects received by expiry thread for handling.
Total number of threads being used by Varnish.
Number of least recently used (LRU) objects thrown out to make room for new objects. If this is zero, there is no reason to enlarge your cache. Otherwise, your cache is evicting objects due to space constraints. In this case, consider increasing the size of your cache.
Number of inserts that timed out.
Number of LRU nuked objects.
Number of LRU move operations.
Stored objects cache hits.
Stored objects cache misses.
Number of YKeys registered.
Number of objects purged with YKey.
Number of bytes allocated from the storage.
Number of bytes left in the storage.
These counters are a good starting point for monitoring the health and performance of Varnish. Many additional counters are also available.
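A simple check can flag the counters that usually indicate trouble when they are non-zero. The sketch below is one possible approach, not an official tool: check_counters is a hypothetical helper, and the counter names (MAIN.sess_queued, MAIN.sess_dropped, MAIN.threads_limited, MAIN.fetch_failed, MAIN.n_lru_nuked) are assumed from the varnish-counters man page to correspond to the descriptions above. The values are cumulative since startup, so a production monitor would compare successive samples rather than absolute values.

```shell
# check_counters: read `varnishstat -1` output on stdin and print a warning
# for every watched counter with a non-zero value.
# Exit status is 1 if any warning was printed, 0 otherwise.
# Hypothetical helper; counter names are assumed from varnish-counters.
check_counters() {
    awk '
        BEGIN {
            hint["MAIN.sess_queued"]     = "consider increasing thread_pool_min"
            hint["MAIN.sess_dropped"]    = "consider increasing thread_queue_limit"
            hint["MAIN.threads_limited"] = "thread pool limit was reached"
            hint["MAIN.fetch_failed"]    = "backend fetches failed"
            hint["MAIN.n_lru_nuked"]     = "cache is evicting objects for space"
        }
        ($1 in hint) && ($2 + 0 > 0) {
            printf "WARN %s=%s: %s\n", $1, $2, hint[$1]
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

For example:
$ varnishstat -1 | check_counters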
More detailed information on the predefined counters can be found in the varnish-counters man page:
$ man varnish-counters