Varnish Helm Chart

Setting up Varnish Enterprise and Varnish Controller for Prometheus monitoring

Introduction

Varnish Enterprise and Varnish Controller fully support exporting metrics to Prometheus. This can be done either through vmod_stat on each Varnish Enterprise instance itself, or by using the aggregated metrics exported by Varnish Controller.

Setting up Prometheus scraping through vmod_stat

vmod_stat exposes Varnish Enterprise metrics through each individual Varnish Enterprise instance. In the most basic configuration, if Prometheus is already set up to scrape metrics from Pods matching annotations, this can be done by configuring server.vclConfig and server.podAnnotations as follows:

---
server:
  vclConfig: |
    vcl 4.1;

    import stat;

    backend default {
      .host = "www.example.com";
      .port = "80";
    }

    sub vcl_recv {
      if (req.url == "/metrics") {
        return (pass);
      }
    }

    sub vcl_backend_fetch {
      if (bereq.url == "/metrics") {
        set bereq.backend = stat.backend_prometheus();
      }
    }

  podAnnotations: |
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "{{ .Values.server.http.port }}"
    prometheus.io/scheme: "http"

Adjust server.podAnnotations as needed. In particular, prometheus.io/path must match the path configured in server.vclConfig. Once the Varnish Enterprise Helm Chart is deployed, the metrics will be available under the varnish_* prefix.
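
To verify that the metrics endpoint is working, you can port-forward one of the Varnish Enterprise Pods and request the metrics path directly. This is only a quick sketch: the Pod name and HTTP port are placeholders and should be adjusted to match your deployment and server.http.port.

# Forward the Varnish Enterprise HTTP port to localhost (Pod name and port are placeholders)
kubectl port-forward --namespace <namespace> <varnish-enterprise-pod> 8080:<http-port>

# In a separate terminal, fetch the metrics served by vmod_stat
curl -s http://localhost:8080/metrics | grep '^varnish_'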

Setting up Prometheus scraping through Varnish Controller

Varnish Controller exposes aggregated metrics of the Varnish Enterprise cluster and its VCLGroups in the format that Prometheus expects at /api/v1/vclgroups/stats?format=prom. As this endpoint requires authentication, Prometheus must be configured with proper credentials prior to scraping. Normally, an admin account is required to access aggregated metrics from all Varnish Enterprise servers and VCLGroups.
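
To verify the credentials and the endpoint outside of Prometheus, a token can be requested and the endpoint queried manually. This is only a rough sketch: it assumes the token endpoint accepts a standard OAuth2 client_credentials grant with the credentials in the request body (the same grant used by the Prometheus oauth2 configuration shown later in this section), and <apigw-host> and <admin-password> are placeholders.

# Request an access token using the admin credentials (client_credentials grant)
TOKEN=$(curl -s -X POST "http://<apigw-host>/api/v1/auth/oauth/token" \
  -d "grant_type=client_credentials&client_id=admin&client_secret=<admin-password>" \
  | jq -r '.access_token')

# Fetch the aggregated metrics in Prometheus format
curl -s -H "Authorization: Bearer $TOKEN" "http://<apigw-host>/api/v1/vclgroups/stats?format=prom"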

To scrape aggregated metrics through Varnish Controller, Varnish Enterprise must be deployed in a single namespace, as Kubernetes namespace information is currently not collected by Varnish Controller.

Using prometheus-community/prometheus Helm Chart

To configure metrics scraping through Varnish Controller with the prometheus-community/prometheus Helm Chart, first create a prometheus-varnish-controller-credentials secret storing the password of an admin user in the same namespace as the Prometheus installation (for example, the prometheus namespace):

kubectl create secret generic prometheus-varnish-controller-credentials --namespace prometheus --from-literal=password=<admin-password>

Replace <admin-password> with your Varnish Controller admin password. In a typical installation with the Varnish Controller Helm Chart, where passwords are auto-generated, this password can be obtained with:

export VARNISH_PASSWORD=$(kubectl get secret --namespace varnish varnish-controller-credentials -o jsonpath="{.data.varnish-admin-password}" | base64 --decode)
echo $VARNISH_PASSWORD
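
The two steps can also be combined to copy the auto-generated admin password straight into a secret in the Prometheus namespace. This assumes a default Varnish Controller Helm Chart installation in the varnish namespace:

kubectl create secret generic prometheus-varnish-controller-credentials --namespace prometheus \
  --from-literal=password="$(kubectl get secret --namespace varnish varnish-controller-credentials \
    -o jsonpath='{.data.varnish-admin-password}' | base64 --decode)"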

Then, in the prometheus-community/prometheus Helm Chart’s values.yaml, configure server.extraVolumeMounts, server.extraVolumes, and extraScrapeConfigs accordingly:

# in values.yaml of prometheus-community/prometheus
---
server:
  extraVolumeMounts:
    - name: prometheus-varnish-controller-credentials
      readOnly: true
      mountPath: "/etc/secrets/varnish-controller"

  extraVolumes:
    - name: prometheus-varnish-controller-credentials
      secret:
        secretName: prometheus-varnish-controller-credentials

extraScrapeConfigs: |
  - job_name: "varnish-stats"
    scrape_interval: 1m
    scrape_timeout: 10s
    oauth2:
      client_id: "admin"
      client_secret_file: "/etc/secrets/varnish-controller/password"
      token_url: "http://<apigw-host>/api/v1/auth/oauth/token"
    static_configs:
      - targets: ["<apigw-host>"]
        labels:
          namespace: "<namespace>"
    metrics_path: "/api/v1/vclgroups/stats"
    params:
      format:
        - "prom"
      "agents.state":
        - "1"
    scheme: "http"

Replace <apigw-host> with the hostname and port of the Varnish Controller APIGW. Within the Kubernetes cluster, this can be set to the internal service DNS name in the format <service-name>.<namespace>.svc.cluster.local. For example, if the Varnish Controller Helm Chart was installed with the default apigw.service.port (8080) in the varnish namespace, <apigw-host> would become varnish-controller-apigw.varnish.svc.cluster.local:8080.
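
To confirm the exact service name and port to use for <apigw-host>, list the Services created by the Varnish Controller Helm Chart (assuming the varnish namespace):

kubectl get service --namespace varnish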

In extraScrapeConfigs, Prometheus is configured to apply a static namespace label to the scraped metrics. As these metrics are aggregated from all Varnish Controller Agents, the namespace label should be set to the namespace of the Varnish Enterprise deployment so that the metrics can be used with prometheus-adapter for autoscaling. Aggregating metrics through Varnish Controller from Varnish Enterprise deployed across multiple namespaces for use with prometheus-adapter is currently unsupported. Finally, format=prom selects the Prometheus metrics format, and agents.state=1 restricts the data to active agents only.
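
If these metrics are to be consumed by prometheus-adapter, the namespace label applied above is what associates them with the Varnish Enterprise namespace. The following is only a rough sketch of an external metrics rule, assuming the prometheus-community/prometheus-adapter Helm Chart; the actual varnish_controller_* series names depend on the counters exported by Varnish Controller.

# in values.yaml of prometheus-community/prometheus-adapter (sketch)
---
rules:
  external:
    - seriesQuery: '{__name__=~"varnish_controller_.+",namespace!=""}'
      resources:
        overrides:
          namespace: { resource: "namespace" }
      name:
        matches: "(.*)"
        as: "${1}"
      metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'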

Once the prometheus-community/prometheus Helm Chart is deployed, the Varnish Controller metrics can be found under the varnish_controller_* prefix.
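
To confirm that the scrape job is working, the Prometheus server can be port-forwarded and queried through its HTTP API. The service name and port below assume a default prometheus-community/prometheus installation released as prometheus:

# Forward the Prometheus server Service to localhost
kubectl port-forward --namespace prometheus svc/prometheus-server 9090:80

# In a separate terminal, list the scraped Varnish Controller series
curl -s -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query={__name__=~"varnish_controller_.*"}'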