Search
Varnish Controller

Debugging

Introduction

This tutorial explains how to debug Varnish Controller using built in tools to identify issues and troubleshoot problems.

This guide expects a full Varnish Controller environment to be running

Built in Tools

Varnish Controller have a few built in tools that can be used to debug the components in the system. These built in tools are using the REST API and can be reached either via the REST API directly or using the VCLI.

Debug Command

The vcli have support for a debug command that can be issued towards agents and routers.

From version 4.0.0 of the controller, the debug output contains last logged errors and warnings. These log entries are also logged to stdout (syslog/systemd etc), but only the last (configurable) log entries are saved in this debug output. These log entries can give a lot of information if something has gone wrong.

Agent Debug

# Debug a specific agent by ID
vcli agent debug <agent_id>

# Debug one or more agents using filtering
# Wildcards might have to be escaped depending on shell (\*server\*)
vcli agent debug -f name='*server*'

The output of this command is a JSON structure containing a lot of information from the agent(s), such as:

  • Agent Version and revision
  • Agent Memory usage
  • Agent runtime profiling data
  • NATS information
  • Agent Uptime
  • Brainz UID
  • Deployed VCLs and root VCL
  • Agent Last log outputs (errors and warnings)
  • Agent State
  • Agent Configuration

Router Debug

The same ‘debug’ call exists for the routers as well.

# Debug a specific router by ID
vcli router debug <router_id>

# Debug one or more routers using filtering
# Wildcards might have to be escaped depending on shell (\*router\*)
vcli router debug -f name='*router*'

The output is a JSON structure containing information for the router such as:

  • Version and revision
  • Router Memory usage
  • Router runtime profiling data
  • NATS information
  • Router Uptime
  • Router Last log outouts (errors and warnings)
  • Routing Endpoints known to the router
  • Router State
  • Router configuration

System Debug

System debug is a feature used by the controller to bundle debug information from all available components in the Varnish Controller system. The command always takes 5sec to complete. During this time brainz will broadcast messages to all components in the system and wait for responses. The answers will be collected and reported back. The response will contain all components including brainz, agents, routers and api-gw processes.

Since the system is fully scalable, brainz has no actual control of how many components are running. Hence, it will only gather information from the ones responding to the broadcast request. It could miss components that are currently not reachable via NATS.

# Print system debug JSON to stdout 
vcli debug

# Save JSON output to file 
vcli debug --file=/tmp/debug.json 

# Upload the output to Varnish Software filebin for help in support cases.
vcli debug -u <ticket_id>

The system debug contains a lot more information than the individual agent/router debug outputs.

  • Debug for all (responding) agent instances
  • Debug for all (responding) router instances
  • Debug for all (responding) api-gw instances
  • Debug for all (responding) brainz instances
  • Database information (size, tables, rows, statistics)
  • License information
  • Last APILog entries
  • API-GW timing information for all API endpoints (last call + average)

When debugging, interesting parts is especially the database information, API-GW timing information (if any API call takes long time).

VCLI varnishadm Command

Varnish Controller supports to execute arbitrary commands towards all varnish servers and return back the result. This can be useful to gather information from varnish itself from one, all or a set of servers.

For example fetching the last panic using vcli:

# Show panic for agent with ID 1
vcli command panic.show -f id=1

# Show panic for agents with name matching *server*
vcli command panic.show -f name='*server*'

# Show panic for all varnish servers 
vcli command panic.show

# Example output:
$ vcli command panic.show -fname='*server*'
+---------+-------+--------+--------------------------------+
| AgentID | Error | Status |             Result             |
+---------+-------+--------+--------------------------------+
|       1 |       |    300 | Child has not panicked or      |
|         |       |        | panic has been cleared         |
|       2 |       |    300 | Child has not panicked or      |
|         |       |        | panic has been cleared         |
|       3 |       |    300 | Child has not panicked or      |
|         |       |        | panic has been cleared         |
+---------+-------+--------+--------------------------------+

The same vcli command can be used for any command that varnishadm supports (e.g. vcl.list to show VCLs loaded into varnish). Note that making configuration changes of varnish parameters using the vcli command will not be persistant after a varnish restart.

API Logs

Varnish controller saves its own logs to the database. These logs mostly contains errors/information originating from API requests. These logs can give some information on failed actions and API calls.

How long time log entries are saved can be configured with brainz (-keep-api-logs, VARNISH_CONTROLLER_KEEP_API_LOGS and takes a duration like 3h30m).

To view and filter on these logs, using the vcli:

# Show all log entries with severity = error 
vcli apilogs ls -f severity=3 

# Show all logs entries with msg containing 'denied'
vcli apilogs ls -f msg=*denied*

# Show all session entries with msg containing 'magnus' to view logins for a user.
apilogs ls -f key_type=session -f msg='*magnus*'

Component Logs

All components log to stdout which ends up in either syslog, container logs or systemd journal depending on how the processes are run.

All components (except NATS) have a set of log levels, info, warning, error and debug. For debugging a component, it’s recommended to add -log debug (VARNISH_CONTROLLER_LOG=debug) to increase logging. The components will output a lot of debug information, so make sure to disable this after debugging.

Some examples of viewing these logs on different systems, tailing the log in real-time:

# Systemd to view logs on the agent
journalctl -u varnish-controller-agent -f

# For a docker container 
docker logs <container_id> -f 

# Docker-compose
docker-compose logs -t --tail=100 <container_name> -f

# Syslog  (/var/log/messages in some distributions)
tail -f /var/log/syslog |grep -i "varnish-controller-"

Debug single agent/router

From version 5.0 it is possible to retrieve debug information for a single agent or router from the UI.

Note: You will need to be logged in as a system administrator or own the resource via private tokens to retrieve the debug information

image alt >

Navigate to an agent or router that you can access the debug view from, and click the debug tab.

Retrieving a Controller gather

From version 4.1.0 it is possible to retrieve the Controller gather from the UI.

Note: You will need to be logged in as a system administrator to retrieve all the debug information

Go in the main menu at the left of the page to the bottom and click on Help And Support. Click at the top of the page on Controller-gather, next click on the Run Controller Gather button.

image alt >

When the data is retrieved it is displayed on the page. At the bottom of the data you see a button, click on the Download File button to download the Controller gather.

image alt >