vmod-xbody provides access to request and response bodies. On the response side, xbody is a streaming regular expression engine that allows Varnish to manipulate, capture, hash, and log body text. xbody supports full PCRE regular expressions, capture groups, and backreferences.
xbody supports two important modes: regsub and capture.
regsub performs a regular expression substitution. Any number of regsub substitutions can be done on a response body.
capture finds a pattern and then puts its substitution, or value, into a JSON object, which can then be queried from VCL or included directly in an Edgestash template. Any number of capture calls can be done on a response body.
When regsub and capture are combined, Varnish can convert a dynamic and personalized response into a JSON object and an Edgestash template, allowing otherwise uncacheable content to be cached and accelerated. Please see Edgestash for more information on JSON-based templating.
Note: Because xbody is a streaming regex parser, embedded use of anchors (^ and $) is not supported. Anchors can only be used as the first and last characters of your regular expression.
For assistance in creating PCRE-compatible regular expressions, please use regex101.com.
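As a minimal sketch of the anchor rule above, the following VCL uses ^ and $ only as the first and last characters of a pattern. The inserted comment text is an illustrative assumption.

```vcl
import xbody;

sub vcl_backend_response {
    # Supported: ^ is the first character of the pattern,
    # so the text is inserted at the start of the body.
    xbody.regsub("^", "<!-- processed by xbody -->");

    # Supported: $ is the last character of the pattern,
    # so the text is appended at the end of the body.
    xbody.regsub("$", "<!-- end of body -->");

    # Not supported: an anchor embedded inside the pattern,
    # for example "(foo|^bar)" or "foo$|bar".
}
```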
For the legacy module, see vmod_bodyaccess.
Input (backend): This is a test!
import std;
import xbody;

sub vcl_backend_response
{
    xbody.regsub("test", "new string");
    xbody.regsub("\sis", " isn't");
    xbody.regsub("string", "cat");
    xbody.capture("name", "new (\w+)", "\1");
}

sub vcl_deliver
{
    std.log("Name: " + xbody.get("name"));
    std.log("Capture JSON: " + xbody.get_all());
}
Output (cache): This isn't a new cat!
Captures (xbody.get_all()):
{
    "name": "cat"
}
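import xbody;

sub vcl_backend_response
{
    # Rewrite hostnames and email addresses in the response body.
    # Dots are escaped so that they only match a literal ".".
    xbody.regsub("www\.example\.com", "test.example.com");
    xbody.regsub("admin@example\.com", "dev@test.example.com");
}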
import crypto;
import edgestash;
import xbody;

sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "html") {
        # Turn every <script> tag into an Edgestash template placeholder
        xbody.regsub("<script>", "<script nonce={{nonce}}>");
        edgestash.parse_response();
    }
}

sub vcl_deliver {
    if (edgestash.is_edgestash()) {
        # Generate a fresh nonce per delivery and advertise it in the CSP header
        set req.http.X-nonce = crypto.hex_encode(crypto.urandom(16));
        set resp.http.Content-Security-Policy = "script-src 'nonce-" +
            req.http.X-nonce + "'";
        edgestash.add_json({"
            {
                "nonce":""} + req.http.X-nonce + {""
            }
        "});
        edgestash.execute();
    }
}
Convert a JSON response into JSONP using a callback query parameter.
import urlplus;
import xbody;

sub vcl_backend_response {
    # Insert the JSONP callback
    if (urlplus.query_get("callback") && beresp.http.Content-Type ~ "json") {
        xbody.regsub("^", urlplus.query_get("callback") + "(");
        xbody.regsub("$", ");");
        set beresp.http.Content-Type = regsub(beresp.http.Content-Type, "json", "javascript");
    }
}
Cache PHP pages with personalized content. Personalized content is wrapped in an element with an edgestash-name attribute. These attributes will be templated and delivered per session.
<span edgestash-name="greeting">
Hello [name]!
</span>
<span edgestash-name="number-items">
5
</span>
Content ...
import cookieplus;
import edgestash;
import xbody;
import ykey;

sub vcl_recv
{
    # Purge session data
    if (req.method == "POST") {
        # Broadcaster can be used to send this purge to a cluster
        ykey.purge(cookieplus.get("PHPSESSID"));
    }
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    return (hash);
}

sub vcl_hash
{
    # Add the session to xbody data
    if (req.http.host ~ "^xbody\.") {
        hash_data(cookieplus.get("PHPSESSID", "guest"));
    }
}

sub vcl_pass
{
    # When you purge, you may be forced to pass
    if (req.http.host ~ "^xbody\.") {
        return (restart);
    }
}

sub vcl_backend_fetch
{
    # Remove xbody host marker
    unset bereq.http.xbody;
    if (bereq.http.Host ~ "^xbody\.") {
        set bereq.http.Host = regsub(bereq.http.Host, "^xbody\.", "");
        set bereq.http.xbody = "true";
    }
}

sub vcl_backend_response
{
    if (beresp.http.Content-Type ~ "text") {
        if (bereq.http.xbody) {
            # Extract xbody data, add a ykey session key
            xbody.capture("\2", {"(<[^>]*edgestash-name="?([^"\s]+)"?[^>]+>)([^>]*)<"}, "\3", "g");
            ykey.add_key(cookieplus.get("PHPSESSID", "guest"));
            set beresp.ttl = 1d;
        } else {
            # Edgestash template
            xbody.regsub({"(<[^>]*edgestash-name="?([^"\s]+)"?[^>]+>)([^>]*)<"}, "\1{{\2}}<");
            edgestash.parse_response();
        }
    }
}

sub vcl_deliver
{
    if (edgestash.is_edgestash()) {
        # Use the homepage as the xbody data source
        edgestash.add_json_url("/", json_host="xbody." + req.http.host, xbody=true);
        edgestash.execute();
    }
}
VOID regsub(STRING pattern, STRING substitution, STRING mode, INT max)
Perform a regex substitution on the response body. Can only be used in vcl_backend_response.
Arguments
STRING pattern - PCRE regular expression to find.
STRING substitution - Substitution text. Backreferences are allowed (\1 through \9).
STRING mode - optional. Pattern mode. Defaults to g. Possible values: [none], g.
INT max - optional. Maximum number of substitutions. Defaults to 0 (no limit).
Returns
Nothing.
VOID capture(STRING name, STRING pattern, STRING value, STRING mode, INT max)
Perform a regex capture on the response body. If pattern is found, value is set as the value of name. Use xbody.get() to read the value in vcl_deliver. Can only be used in vcl_backend_response. When used, streaming is disabled so that the captured value can be safely read in vcl_deliver.
Arguments
STRING name - JSON variable name. Backreferences are allowed (\1 through \9).
STRING pattern - PCRE regular expression to find.
STRING value - Value to use if pattern is found. Backreferences are allowed (\1 through \9).
STRING mode - optional. Pattern mode. Defaults to no mode. Global mode, g, repeats the capture and produces a JSON array value of all captures found.
INT max - optional. Maximum number of captures. Defaults to 0 (no limit).
Returns
Nothing.
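The difference between a single capture and a global capture can be sketched as follows. The capture names title and links and both patterns are hypothetical, and mode and max are assumed to be positional as in the signature above.

```vcl
import std;
import xbody;

sub vcl_backend_response {
    # Single capture (no mode): the page title.
    xbody.capture("title", "<title>([^<]*)</title>", "\1");

    # Global capture: every href value, limited to 10 captures.
    # The long string {"..."} lets the pattern contain double quotes.
    xbody.capture("links", {"href="([^"]*)""}, "\1", "g", 10);
}

sub vcl_deliver {
    # Log the captured JSON object for inspection in varnishlog.
    std.log("Captures: " + xbody.get_all());
}
```

Going by the description of global mode, links would be expected to hold a JSON array while title holds a single value.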
STRING get(STRING name, STRING default)
Get a capture value by name. Can only be used in vcl_deliver, or in vcl_synth in combination with xbody.synth().
Arguments
STRING name - The name of the capture. Please see xbody.capture().
STRING default - optional. The default value if name is not found. Defaults to an empty string.
Returns
The value of name if it exists, otherwise default.
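A short sketch of reading a capture with a fallback value. The capture name lang, the pattern, and the X-Page-Lang header are assumptions made for illustration.

```vcl
import xbody;

sub vcl_backend_response {
    # Capture the lang attribute of the <html> element, if present.
    xbody.capture("lang", {"<html[^>]*lang="([^"]+)""}, "\1");
}

sub vcl_deliver {
    # Falls back to "unknown" when the pattern was never found.
    set resp.http.X-Page-Lang = xbody.get("lang", "unknown");
}
```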
STRING get_all()
Get all capture values as a JSON object string. Can only be used in vcl_deliver, or in vcl_synth in combination with xbody.synth().
Arguments
None.
Returns
A JSON object representing all capture values. If no capture values exist, an empty JSON object is returned. This is used to send all captures directly to Edgestash or for VHA replication.
STRING get_req_body()
Get the contents of the request body. Can only be used in vcl_recv. std.cache_req_body() must be called prior to this call.
Arguments
None.
Returns
A string of the request body.
BLOB get_req_body_hash(ENUM {md5, sha1, sha224, sha256, sha384, sha512} algorithm)
Generate and return the hash of the request body. Can only be used in vcl_recv.
Arguments
ENUM algorithm - The algorithm to hash the body with.
Returns
A blob containing the hash of the request body.
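The two request body functions might be combined as in the sketch below. It assumes that vmod blob is available to turn the BLOB into a hex string, and that caching the body first is also appropriate before hashing it (the text above only states this requirement for get_req_body()). The 10KB limit, the header name, and the matched text are illustrative.

```vcl
import blob;
import std;
import xbody;

sub vcl_recv {
    # Buffer up to 10KB of the request body so it can be inspected.
    if (req.method == "POST" && std.cache_req_body(10KB)) {
        # Expose the SHA-256 of the body as a hex string.
        set req.http.X-Body-SHA256 =
            blob.encode(HEX, LOWER, xbody.get_req_body_hash(sha256));

        # Reject requests whose body contains a given marker.
        if (xbody.get_req_body() ~ "forbidden-marker") {
            return (synth(403));
        }
    }
}
```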
VOID hash_body(ENUM {md5, sha1, sha224, sha256, sha384, sha512} algorithm, STRING name)
Generate the hash of the response body. Can only be used in vcl_backend_response.
Arguments
ENUM algorithm - The algorithm to hash the body with.
STRING name - optional. The name of the hash, used when storing multiple hashes. If not used, this is the default hash.
Returns
Nothing.
BLOB get_hash(STRING name)
Get the hash generated with hash_body. Can only be used in vcl_deliver.
Arguments
STRING name - optional. The name of the hash, used when generating multiple hashes. If not used, the default hash is returned.
Returns
A blob of the binary encoded hash.
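A minimal sketch of hash_body and get_hash used together, with one default hash and one named hash. It assumes vmod blob is available for hex encoding. The hash name "md5sum" and the response header names are hypothetical.

```vcl
import blob;
import xbody;

sub vcl_backend_response {
    # Default (unnamed) hash of the response body.
    xbody.hash_body(sha256);

    # A second, named hash stored alongside the default one.
    xbody.hash_body(md5, "md5sum");
}

sub vcl_deliver {
    # get_hash() returns a BLOB, so encode it before use in a header.
    set resp.http.X-Body-SHA256 = blob.encode(HEX, LOWER, xbody.get_hash());
    set resp.http.X-Body-MD5 = blob.encode(HEX, LOWER, xbody.get_hash("md5sum"));
}
```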
VOID log_body(BYTES max)
Log the response body to the VSL (varnishlog). The Body VSL tag is used for this operation. Body logging can span multiple tag lines, depending on the maximum VSL record length.
Arguments
BYTES max - optional. The maximum number of bytes to log. After logging this many bytes, a truncation message is appended: __XBODY_TRUNCATED_%MISSING_BYTES%. Defaults to 4KB.
Returns
Nothing.
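As a sketch, body logging could be limited to error responses. This assumes log_body is called from vcl_backend_response like the other response body functions, which the text above does not state explicitly. The 1KB limit is illustrative.

```vcl
import xbody;

sub vcl_backend_response {
    # Log at most the first 1KB of the body for backend errors only.
    if (beresp.status >= 500) {
        xbody.log_body(1KB);
    }
}
```

The logged data should then appear under the Body tag in varnishlog.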
VOID synth()
Allow captured JSON data to be accessible in vcl_synth.
Arguments
None.
Returns
Nothing.
VOID set(STRING json)
Set capture values. When used, regex and capture operations are skipped. This is used to replicate xbody operations via VHA. Can only be used in vcl_backend_response.
Arguments
STRING json - Capture values from xbody.get_all(). Must be a valid JSON object.
Returns
Nothing.
VOID reset()
Reset the internal state.
Arguments
None.
Returns
Nothing.