Varnish Cache Plus

Content Transformation (xbody)

Varnish 6.0

Description

vmod-xbody is a streaming regular expression engine that allows Varnish to manipulate and capture response body text. xbody supports full PCRE regular expressions, capture groups, and backreferences.

xbody supports two modes: regsub and capture.

regsub performs a regular expression substitution. Any number of regsub substitutions can be done on a response body.

capture finds a pattern and then puts its substitution, or value, into a JSON object, which can then be queried using VCL or included directly in an Edgestash template. Any number of capture calls can be made on a response body.

When regsub and capture are combined, Varnish can convert a dynamic and personalized response into a JSON object and an Edgestash template, allowing otherwise uncacheable content to be cached and accelerated. Please see Edgestash for more information on JSON-based templating.

Because xbody is a streaming regex parser, anchors (^ and $) embedded within a pattern are not supported. An anchor can only be used as the first (^) or last ($) character of your regular expression.
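
As a rough sketch of the anchor rule above, the following fragment shows where an anchor may and may not appear. The patterns and replacement strings are made up for illustration, and a complete VCL file would also need a backend definition.

```vcl
vcl 4.0;

import xbody;

sub vcl_backend_response
{
    // Supported: ^ is the first character of the pattern
    xbody.regsub("^\s+", "");

    // Supported: $ is the last character of the pattern
    xbody.regsub("\s+$", "");

    // Not supported: anchors embedded inside the pattern
    // xbody.regsub("(^foo|bar$)", "baz");
}
```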

For assistance in creating PCRE compatible regular expressions, please use regex101.com.

Example VCL

Example regsub and capture

Input (backend): This is a test!

import std;
import xbody;

sub vcl_backend_response
{
    xbody.regsub("test", "new string");
    xbody.regsub("\sis", " isn't");
    xbody.regsub("string", "cat");
    xbody.capture("name", "new (\w+)", "\1");
}

sub vcl_deliver
{
    std.log("Name: " + xbody.get("name"));
    std.log("Capture JSON: " + xbody.get_all());
}

Output (cache): This isn't a new cat!

Captures (xbody.get_all()):

{
  "name": "cat"
}

Change All Domain References

vcl 4.0;

import xbody;

sub vcl_backend_response
{
    // Escape the dots so they only match a literal "."
    xbody.regsub("www\.example\.com", "test.example.com");
    xbody.regsub("admin@example\.com", "dev@test.example.com");
}

JavaScript CSP Nonce

import crypto;
import edgestash;
import xbody;

sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "html") {
        xbody.regsub("<script>", "<script nonce={{nonce}}>");
        edgestash.parse_response();
    }
}

sub vcl_deliver {
    if (edgestash.is_edgestash()) {
        set req.http.X-nonce = crypto.hex_encode(crypto.urandom(16));
        set resp.http.Content-Security-Policy = "script-src 'nonce-" +
            req.http.X-nonce + "'";
        edgestash.add_json({"
            {
              "nonce":""} + req.http.X-nonce + {""
            }
        "});
        edgestash.execute();
    }
}

Personalization Caching

Cache pages with personalized content:

Hello [name]!

Content ...

vcl 4.0;

import cookieplus;
import edgestash;
import kvstore;
import xbody;

sub vcl_init
{
    // User session (JSON) cache
    new session_json = kvstore.init();
}

sub vcl_recv
{
    unset req.http.X-do-capture;
    unset req.http.X-session;

    if (req.method != "HEAD" && req.method != "GET") {
        return (pass);
    }

    // We do not have session JSON available for this request
    if (!session_json.get(cookieplus.get("session"))) {
        return (pass);
    }

    return (hash);
}

sub vcl_pass
{
    set req.http.X-do-capture = "true";
}

sub vcl_backend_response
{
    if (beresp.http.Content-Type ~ "text") {
        if (bereq.http.X-do-capture) {
            // Do a JSON capture
            xbody.capture("name", "Hello (\S+)", "\1");
        } else {
            // Do a template
            xbody.regsub("Hello \S+", "Hello {{name}}");
            edgestash.parse_response();
        }
    }
}

sub vcl_deliver
{
    // Get the session id
    if (cookieplus.get("session")) {
        set req.http.X-session = cookieplus.get("session");
    } else if (cookieplus.setcookie_get("session")) {
        set req.http.X-session = cookieplus.setcookie_get("session");
    }

    // Store the session JSON
    if (req.http.X-session && xbody.get("name")) {
        session_json.set(req.http.X-session, xbody.get_all(), 24h);
    }

    // Assemble the page
    if (edgestash.is_edgestash()) {
        edgestash.add_json(session_json.get(req.http.X-session));
        edgestash.execute();
    }
}

VHA Support

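Because vmod-xbody changes content before it goes into cache, its regsub and capture operations must be executed only once when running Varnish in a distributed VHA cluster. On nodes that receive the object through VHA replication, use xbody.set() to apply the capture values from the originating node instead of rerunning the operations.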

vcl 4.0;

import xbody;

include "vha40.vcl";

sub vcl_backend_response
{
    if (bereq.http.vha-origin) {
        xbody.set(beresp.http.X-body);
    }
    unset beresp.http.X-body;
}

sub vcl_deliver
{
    if (req.http.vha-fetch) {
        set resp.http.X-body = xbody.get_all();
    }
}

Functions

regsub

VOID regsub(STRING pattern, STRING substitution, STRING mode)

  • Description

    Perform a regex substitution on the response body. Can only be used in vcl_backend_response.

  • Return value

    None

    • pattern

      PCRE regular expression to find.

    • substitution

      Substitution text. Backreferences are allowed (\1 through \9).

    • mode

      Pattern mode. Defaults to g. Possible values: [none], g.
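
A minimal sketch of backreferences in a substitution, using only the regsub signature documented above. The date pattern is hypothetical. The mode argument is passed explicitly as g here; this page does not show how the [none] mode is written, so it is left out.

```vcl
vcl 4.0;

import xbody;

sub vcl_backend_response
{
    // Reorder YYYY-MM-DD dates into DD/MM/YYYY using \1 through \3.
    // "g" is the documented default mode; it is spelled out for clarity.
    xbody.regsub("(\d{4})-(\d{2})-(\d{2})", "\3/\2/\1", "g");
}
```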

capture

VOID capture(STRING name, STRING pattern, STRING value, STRING mode)

  • Description

    Perform a regex capture on the response body. If pattern is found, value is set as the value of name. Can only be used in vcl_backend_response.

  • Return value

    None

    • name

      The name under which the value is stored. Please see xbody.get().

    • pattern

      PCRE regular expression to find.

    • value

      Value to use if pattern is found. Backreferences are allowed (\1 through \9).

    • mode

      Pattern mode. Defaults to no mode. Global mode (g) is not supported with captures.
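
The two ways of filling in value could be sketched as follows: once with a backreference and once with a fixed string. The capture names title and has_form and both patterns are hypothetical.

```vcl
vcl 4.0;

import std;
import xbody;

sub vcl_backend_response
{
    // Store the first capture group under the name "title"
    xbody.capture("title", "<title>([^<]*)</title>", "\1");

    // Store a fixed value when the pattern is found
    xbody.capture("has_form", "<form", "true");
}

sub vcl_deliver
{
    std.log("Title: " + xbody.get("title"));
    std.log("Has form: " + xbody.get("has_form", "false"));
}
```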

get

STRING get(STRING name, STRING default)

  • Description

    Get a capture value by name. Can only be used in vcl_deliver.

  • Return value

    The value of name if it exists, otherwise default.

    • name

      The name of the capture. Please see xbody.capture().

    • default

      The default value if name is not found. Defaults to an empty string.
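
A short sketch of the default argument, assuming a capture called name as in the examples earlier on this page. The X-Name response header is a hypothetical name chosen for illustration.

```vcl
vcl 4.0;

import xbody;

sub vcl_backend_response
{
    xbody.capture("name", "Hello (\S+)", "\1");
}

sub vcl_deliver
{
    // Falls back to "guest" when no "name" capture exists
    set resp.http.X-Name = xbody.get("name", "guest");
}
```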

get_all

STRING get_all()

  • Description

    Get all capture values as a JSON object string. Can only be used in vcl_deliver.

  • Return value

    A JSON object representing all capture values. If no capture values exist, an empty JSON object is returned. This is used to send all captures directly to Edgestash or for VHA replication.

set

VOID set(STRING json)

  • Description

    Set capture values. When used, regex and capture operations are skipped. This is used to replicate xbody operations via VHA. Can only be used in vcl_backend_response.

  • Return value

    None

    • json

      Capture values from xbody.get_all(). Must be a valid JSON object.