Synthetic responses

Throughout the book, the primary focus has been caching content from the origin. For non-cacheable content we provided ways to bypass the cache.

As you have already seen, this chapter is about caching otherwise uncacheable content and about offloading uncacheable logic to the edge.

But instead of serving content from the origin, we can cut out the origin and produce the content ourselves. We do this by serving synthetic HTTP responses.

This is already a familiar concept by now, as return(synth()) and vcl_synth have been covered a number of times.

If we have access to the stateful data, or we can compute the data ourselves, we can produce synthetic HTTP responses without accessing the origin server. We can do this for all content or for select endpoints.

We can produce HTML and basically act as a web server. We can also produce JSON or XML and become a RESTful API application.

In previous sections of this chapter, we talked about file system access, about access to HTTP services, and about access to Memcached and Redis. We can query these data sources and use synthetic output to visualize the data.

Synthetic output and no backend

If you return synthetic output, you don’t really need to define a backend, and backend default none; will ensure varnishd doesn’t complain when you load a VCL that doesn’t have a real backend.

You can use return(synth(200,"OK")) to return a synthetic response. Any request that doesn’t return synthetic output will return an HTTP 503 Backend fetch failed error.
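As a minimal sketch, a complete VCL file that never contacts an origin could look like this. Every request is answered from vcl_recv, and the built-in vcl_synth produces the response body:

vcl 4.1;

# No real backend is defined, because Varnish produces every response itself.
backend default none;

sub vcl_recv {
	# Answer every request with a synthetic 200 response.
	return(synth(200, "OK"));
}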

The synth() function is very limited in its capabilities: it only accepts a status code and some text, which are rendered in a pretty horrible HTML template:

<!DOCTYPE html>
<html>
  <head>
	<title>200 OK</title>
  </head>
  <body>
	<h1>Error 200 OK</h1>
	<p>OK</p>
	<h3>Guru Meditation:</h3>
	<p>XID: 32770</p>
	<hr>
	<p>Varnish cache server</p>
  </body>
</html>

This is not really user-friendly unless we modify what vcl_synth returns.
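The simplest way to do that is to override vcl_synth and set the body yourself. The following sketch replaces the built-in HTML template with a plain-text body that only contains the reason phrase. The choice of text/plain and the "Hello from the edge" text are just illustrations:

vcl 4.1;

backend default none;

sub vcl_recv {
	return(synth(200, "Hello from the edge"));
}

sub vcl_synth {
	# Replace the built-in HTML template with a plain-text body.
	set resp.http.Content-Type = "text/plain; charset=utf-8";
	set resp.body = resp.reason;
	# Returning here prevents the built-in vcl_synth from running.
	return(deliver);
}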

Loading an HTML template

Using std.fileread() you can load an HTML file from disk, which serves as the template. Via regsuball(), the <<REASON>> placeholder can be replaced with the value of resp.reason.

Here’s an example:

vcl 4.1;

import std;

backend default none;

sub vcl_recv {
	return(synth(200,"Something cool"));
}

sub vcl_synth {
	if(req.url == "/") {
		set resp.http.Content-Type = "text/html";
		set resp.body = regsuball(
			std.fileread("/etc/varnish/index.html"),
			"<<REASON>>",
			resp.reason);
	} else {
		set resp.status = 404;
		set resp.http.Content-Type = "text/plain";
		set resp.body = "Not found";
	}
	return(deliver);
}

If you only have a couple of files to serve, this will do, and it is surprisingly powerful. Don’t forget that std.fileread() caches the contents of a file once it has been read. This means that the file system is not accessed for every request.
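Serving a couple of files can be sketched by mapping each URL to its own file. The /about URL and the /etc/varnish/about.html file are hypothetical and only serve as an illustration:

vcl 4.1;

import std;

backend default none;

sub vcl_recv {
	return(synth(200, "OK"));
}

sub vcl_synth {
	set resp.http.Content-Type = "text/html";
	if(req.url == "/") {
		set resp.body = std.fileread("/etc/varnish/index.html");
	} elseif(req.url == "/about") {
		# Hypothetical second page, stored next to index.html
		set resp.body = std.fileread("/etc/varnish/about.html");
	} else {
		set resp.status = 404;
		set resp.http.Content-Type = "text/plain";
		set resp.body = "Not found";
	}
	return(deliver);
}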

Creating a simple API

The previous example showed some of the possibilities of synthetic responses by overriding vcl_synth. Let’s spice it up a bit and return dynamic content.

The following example features a very small RESTful API that returns a JSON object containing the username and the number of items in the shopping cart for a session that was established.

The session is identified by a sessionId cookie, and the session information is stored in Redis.

vmod_redis is an open source VMOD by Carlos Abalde. It is not packaged with Varnish Cache or Varnish Enterprise, but you can download the source code via https://github.com/carlosabalde/libvmod-redis.

The session information is stored in a Redis hash, which has multiple fields. The following Redis CLI command returns that hash for session ID 123:

127.0.0.1:6379> hgetall 123
1) "username"
2) "JohnSmith"
3) "items-in-cart"
4) "5"

You can see that the username for session 123 is JohnSmith. John has 5 items in his shopping cart.

We can create a RESTful API that consumes this data and returns it in vcl_synth:

vcl 4.1;

import redis;

backend default none;

sub vcl_init {
	new redis_client = redis.db(
		location="redis:6379",
		shared_connections=false,
		max_connections=1);
}

sub vcl_recv {
	return(synth(200));
}

sub vcl_synth {
	if(req.url == "/api/session") {
		# Extract the value of the sessionId cookie
		set resp.http.sessionId = regsub(req.http.Cookie,"^.*;?\s*sessionId\s*=\s*([0-9a-zA-Z]+)\s*;?.*","\1");
		redis_client.command("HMGET");
		redis_client.push(resp.http.sessionId);
		redis_client.push("username");
		redis_client.push("items-in-cart");
		redis_client.execute();
		set resp.http.Content-Type = "application/json";
		set resp.body = {"{
			"username": ""} + redis_client.get_array_reply_value(0) + {"",
			"items-in-cart": ""} + redis_client.get_array_reply_value(1) + {""
		}"};
		return(deliver);
	}
}

Because the session information is stored in a Redis hash, an HMGET is required to retrieve multiple fields. The resulting command would be HMGET 123 username items-in-cart.

In VCL we can use redis_client.get_array_reply_value() to retrieve the value of individual fields based on an index because Redis returns the output as an array.
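The example above assumes that the session exists. For an unknown session ID, HMGET returns a nil element for every requested field. The following sketch of vcl_synth returns a 404 in that case. It assumes the reply_is_array() and array_reply_is_nil() methods described in the vmod_redis documentation, and the wording of the error message is made up. The vcl_init and vcl_recv subroutines stay the same as in the example above:

sub vcl_synth {
	if(req.url == "/api/session") {
		set resp.http.sessionId = regsub(req.http.Cookie,"^.*;?\s*sessionId\s*=\s*([0-9a-zA-Z]+)\s*;?.*","\1");
		redis_client.command("HMGET");
		redis_client.push(resp.http.sessionId);
		redis_client.push("username");
		redis_client.push("items-in-cart");
		redis_client.execute();
		set resp.http.Content-Type = "application/json";
		# No array reply, or a nil username: treat the session as unknown.
		if(!redis_client.reply_is_array() || redis_client.array_reply_is_nil(0)) {
			set resp.status = 404;
			# The space before the closing brace keeps the JSON from
			# ending the long string too early.
			set resp.body = {"{"error": "unknown session" }"};
			return(deliver);
		}
		set resp.body = {"{
			"username": ""} + redis_client.get_array_reply_value(0) + {"",
			"items-in-cart": ""} + redis_client.get_array_reply_value(1) + {""
		}"};
		return(deliver);
	}
}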

When we call the /api/session endpoint using the right cookie value, the output will be the following:

$ curl -H"Cookie: sessionId=123" localhost/api/session
{
  "username": "JohnSmith",
  "items-in-cart": "5"
}

Synthetic backends

The previous examples in this section all leveraged the vcl_synth subroutine to return synthetic output. Although this works fine, the output is not cached, and neither ESI nor Gzip compression is supported.

When edge logic depends on external services, large traffic spikes may overload these services and result in latency.

Varnish Enterprise offers synthetic backends through vmod_synthbackend: synthetic objects will be inserted at the beginning of the fetch pipeline, which gives them the same behavior as regular objects.

The API for vmod_synthbackend has a couple of functions:

synthbackend.mirror() will mirror the request information and will return the request body as the response body.
synthbackend.from_blob() will create a response body using BLOB data.
synthbackend.from_string() will create a response body using string data.
synthbackend.none() will return a null backend.
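To give an idea of how these functions are used, here is a minimal sketch that turns a fixed string into a cacheable object. It requires Varnish Enterprise, and the body text, content type, and one-minute TTL are arbitrary choices:

vcl 4.1;

import synthbackend;

backend default none;

sub vcl_backend_fetch {
	# The string becomes the response body of a synthetic backend response.
	set bereq.backend = synthbackend.from_string("Hello from the edge");
}

sub vcl_backend_response {
	# The synthetic response passes through the regular fetch pipeline,
	# so it can be given headers and a TTL like any other object.
	set beresp.http.Content-Type = "text/plain";
	set beresp.ttl = 1m;
}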

The function we’re mainly interested in is synthbackend.from_string(). The following example is based on the previous one where Redis is used to return session information in a RESTful API.

Instead of sending a command to Redis for every request, the following example will cache the output and will create a cache variation per session.

Here’s the code:

vcl 4.1;

import redis;
import synthbackend;
import ykey;

backend default {
	.host = "backend.example.com";
}

sub vcl_init {
	new redis_client = redis.db(
		location="redis:6379",
		shared_connections=false,
		max_connections=1);
}

sub vcl_recv {
	if(req.url == "/api/session") {
		# Extract the value of the sessionId cookie
		set req.http.sessionId = regsub(req.http.Cookie,"^.*;?\s*sessionId\s*=\s*([0-9a-zA-Z]+)\s*;?.*","\1");
		if(req.method == "PURGE") {
			ykey.purge(req.http.sessionId);
		}
		return(hash);
	}
}

sub vcl_hash {
	hash_data(req.http.sessionId);
}

sub vcl_backend_fetch {
	if(bereq.url == "/api/session") {
		redis_client.command("HMGET");
		redis_client.push(bereq.http.sessionId);
		redis_client.push("username");
		redis_client.push("items-in-cart");
		redis_client.execute();
		set bereq.backend = synthbackend.from_string({"{
			"username": ""} + redis_client.get_array_reply_value(0) + {"",
			"items-in-cart": ""} + redis_client.get_array_reply_value(1) + {""
		}"});
	} else {
		set bereq.backend = default;
	}
}

sub vcl_backend_response {
	if(bereq.url == "/api/session") {
		set beresp.http.Content-Type = "application/json";
		set beresp.ttl = 3h;
		ykey.add_key(bereq.http.sessionId);
	}
}

Let’s break this example down, and explain what is going on:

Requests for /api/session are cacheable, even though cookies are used. The sessionId cookie is extracted and stored in req.http.sessionId for later use.

If PURGE requests are received for /api/session, vmod_ykey will evict objects from cache that match the session ID.

When we look up requests for /api/session, we ensure the session ID is used as a cache variation.

And when requests for /api/session cause a cache miss, a synthetic object is inserted into the cache via synthbackend.from_string(). The string contains a JSON object that is composed by fetching the session information from Redis.

Backend requests for other endpoints are sent to the default backend, which is not a synthetic one.

When synthetic responses are received for /api/session, the Content-Type: application/json response header is set and the TTL for the object is set to three hours.

When such a response is received, the session ID is registered as a key in vmod_ykey.

The following curl request will still output the personalized JSON response:

$ curl -H "Cookie: sessionId=123" localhost/api/session
{
  "username": "JohnSmith",
  "items-in-cart": "5"
}

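The only difference is that the value is now cached per session for three hours. If the object needs to be updated at any point, a PURGE request can be sent. Because the VCL continues with a regular lookup after purging, the response contains the freshly generated object, as illustrated below: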

$ curl -XPURGE -H"Cookie: sessionId=123" localhost/api/session
{
  "username": "JohnSmith",
  "items-in-cart": "5"
}
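The VCL example above accepts PURGE requests from any client. In a real setup you would probably want to restrict who can purge. The following fragment is a sketch of such a restriction. The purgers ACL name and its content are assumptions, and the fragment is meant to be added to the VCL example above, before the existing vcl_recv subroutine:

# Hypothetical ACL: only these clients are allowed to purge
acl purgers {
	"localhost";
}

sub vcl_recv {
	# Subroutines with the same name are concatenated in order of appearance,
	# so this check runs before the vcl_recv logic that follows it.
	if(req.method == "PURGE" && client.ip !~ purgers) {
		return(synth(405, "Method Not Allowed"));
	}
}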