Varnish Controller

Known Issues

This page lists known issues with Varnish Controller. These can be identified bugs that are being worked on, together with potential workarounds for them. Unless stated otherwise, these issues apply to all versions.

Cannot migrate version v400

Note: This issue has been addressed in Version 5.3.1.

Varnish Controller uses SQL scripts to migrate from version to version. Release 5.3.0 contains a migration bug that causes a duplicate key error, triggered by a data discrepancy that was created with the release of Version 4.0.0. Only deployments that were running a Varnish Controller version prior to 4.0.0 may receive the following start-up error when upgrading Brainz to 5.3.0:

failed to migrate DB: cannot migrate version v400: pq: duplicate key value violates unique constraint "varnish_stat_uniq"

The changes in Version 4.0.0 cause the varnish_stat table to contain duplicates of all rows that were in the table at that time. To resolve this error, delete the duplicate rows that are no longer in use. While this may be a lot of rows, they are no longer of any use to Varnish Controller.

DELETE FROM varnish_stat WHERE router_id IS NULL AND domain_id IS NULL;
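Because the number of affected rows can be large, it may be useful to check the count before deleting. The following is a minimal sketch, assuming direct access to the Varnish Controller database (for example through psql) and a recent backup. It wraps the same DELETE in a transaction so that the change can still be rolled back.

```sql
-- Sketch only: preview the rows first, then delete them inside a transaction.
BEGIN;

-- How many rows would be removed?
SELECT count(*) AS rows_to_delete
FROM varnish_stat
WHERE router_id IS NULL AND domain_id IS NULL;

-- Same statement as above.
DELETE FROM varnish_stat
WHERE router_id IS NULL AND domain_id IS NULL;

-- Use COMMIT to keep the change, or replace it with ROLLBACK to undo it.
COMMIT;
```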

Max sequence ID reached for varnish_stat table

Note: This issue has been addressed in Version 5.1.0.

Varnish Controller uses PostgreSQL UPSERT logic. A query looks like INSERT INTO varnish_stat ON CONFLICT DO UPDATE data. These queries may cause issues after Varnish Controller has been running for some time. Depending on the number of agents, Varnish statistic counters, routers, and domains, you may get this error:

ERROR:  nextval: reached maximum value of sequence "varnish_stat_id_seq" (2147483647)

The error explains that the maximum value of the ID sequence has been reached, but when you look in your varnish_stat table you will see that it contains far fewer rows than that. This is due to the way PostgreSQL handles INSERT ... ON CONFLICT ....: it increments the ID sequence even when the ON CONFLICT part is reached. The ID is incremented without any data being inserted in order to prevent concurrency issues. When processing the Varnish statistics we run many of these queries, and eventually Varnish Controller runs out of IDs. The issue has been addressed in Version 5.1.0. To keep an affected system collecting statistics, we have written a query that can be run directly against your database to reset the ID sequence.
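The sequence behaviour described above can be reproduced on any PostgreSQL instance. The sketch below uses a hypothetical scratch table, upsert_demo, which is not part of the Varnish Controller schema. It only illustrates that every INSERT ... ON CONFLICT statement takes a new sequence value, including the ones that end up updating an existing row.

```sql
-- Hypothetical scratch table for illustration; dropped automatically at session end.
CREATE TEMP TABLE upsert_demo (
  id    serial PRIMARY KEY,
  name  text UNIQUE NOT NULL,
  value bigint NOT NULL
);

-- First statement inserts a row; the next two hit the conflict and only update it.
INSERT INTO upsert_demo (name, value) VALUES ('counter', 1)
ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value;

INSERT INTO upsert_demo (name, value) VALUES ('counter', 2)
ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value;

INSERT INTO upsert_demo (name, value) VALUES ('counter', 3)
ON CONFLICT (name) DO UPDATE SET value = EXCLUDED.value;

-- The table holds a single row...
SELECT count(*) AS row_count, max(id) AS max_id FROM upsert_demo;

-- ...while the sequence has advanced once per statement.
SELECT last_value FROM upsert_demo_id_seq;
```

The same comparison (row count versus the sequence's last_value) can be made against varnish_stat and varnish_stat_id_seq to see how far the two have drifted apart.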

Workaround

Running the following query resets all the IDs of the varnish_stat table and instructs the varnish_stat ID sequence to start counting from the real row count again. This SQL script can be run multiple times without any problems.

BEGIN;
-- Use the same advisory lock as we do when inserting statistics to prevent concurrency
-- This is a transactional advisory lock which is automatically unlocked upon rollback or commit
SELECT pg_advisory_xact_lock(4011117430);

DO $$
  DECLARE
    statrec record;
    i int = 0;
  BEGIN
    -- Loop over all varnish_stat records
    for statrec in
      SELECT * FROM varnish_stat ORDER BY id
    loop
      -- PL/pgSQL loops have no built-in index, so we keep our own counter, which is also the new ID
      i := i+1;

      -- First update the values tables, as those will not be updated automatically when we update the varnish_stat table
      IF statrec.id != i THEN
        UPDATE varnish_stat_values_1m SET varnish_stat_id = i WHERE varnish_stat_id = statrec.id;
        UPDATE varnish_stat_values_10m SET varnish_stat_id = i WHERE varnish_stat_id = statrec.id;
        UPDATE varnish_stat_values_1h SET varnish_stat_id = i WHERE varnish_stat_id = statrec.id;
        UPDATE varnish_stat_values_1d SET varnish_stat_id = i WHERE varnish_stat_id = statrec.id;
        UPDATE varnish_stat_values_1mo SET varnish_stat_id = i WHERE varnish_stat_id = statrec.id;
        UPDATE varnish_stat SET id = i WHERE id = statrec.id;
        RAISE NOTICE 'Updated counter: % from ID % to ID %', statrec.name, statrec.id, i;
      END IF;
    end loop;

    -- After we have updated the records we reset the sequence to start from the last ID number
    -- we add 1 as the last value of i is the last ID of the counter we used. So if the last counter has ID
    -- 100 we want the sequence to restart at 101.
    i := i+1;
    RAISE NOTICE 'Restart varnish_stat_id_seq sequence from %', i;
    execute 'ALTER SEQUENCE varnish_stat_id_seq RESTART WITH ' || i;
  END;
$$;

-- Transactional lock is automatically unlocked after commit or rollback

COMMIT;
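After the script has committed, a quick read-only check can confirm that the renumbering took effect. This is a sketch that assumes only the table and sequence names used in the script above.

```sql
-- After renumbering, the row count and the highest ID should be equal.
SELECT count(*) AS row_count, max(id) AS max_id FROM varnish_stat;

-- The sequence should sit just above the row count.
-- is_called = false means the next ID handed out is last_value itself.
SELECT last_value, is_called FROM varnish_stat_id_seq;
```

If new statistics have been inserted since the script ran, last_value will already have moved past the restart value, which is expected.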

Same Domain for Multiple VCLGroups (Request Routing)

Only one VCLGroup per domain can be deployed to the router for requests to be routed correctly. The router only knows the domain of an incoming request and currently cannot distinguish between different VCLGroups with the same domain, so it does not know which one the request should be routed to.

Workaround

To route to different sets of servers, such as Edge first and then Storage as a fallback, create a second VCLGroup with a different domain, say example2.com, and define it as an External Route. Then use that External Route as a fallback in the Routing Rule of the first VCLGroup. This creates an extra HTTP 302 redirect, but only when the first VCLGroup has no available servers.

- VCLGroup1
    - Domain: example.com
    - RoutingRule: rr1
        - Lookup-order: <decision>,external
        - ExternalRoutes: ext1
            - URL: http://example2.com
- VCLGroup2
    - Domain: example2.com
    - RoutingRule: rr2
        - Lookup-order: <decision>