To interact with stateful data that can drive a personalized caching experience, we have already used the file system and API calls.
Although both are valid candidates as the source of truth, they have their limits:
unless the data is readily available in files, or the data APIs can keep up with Varnish, we need to find another solution.
Having direct access to a database may be the better solution. The term
database can refer to many implementations. Some databases may be
accessible via a RESTful API, which can be leveraged using
vmod_http.
In this section, we’re going to cover four types of databases: SQLite, the key-value store of vmod_kvstore, Memcached, and Redis.
For the record: SQLite is a library that implements a serverless, self-contained relational database system. Varnish Enterprise contains a VMOD that interacts with SQLite. We already featured this VMOD in chapters 2 and 5.
In chapter 5 I showed you an example where sessions were stored in the database, and a cookie value was used to retrieve the username of a logged-in user.
This time, we’ll use SQLite to store caching policies about specific pages.
Here are the commands you need to create and populate the database:
sqlite3 sqlite.db <<EOF
CREATE TABLE pages (
cache BOOLEAN NOT NULL,
url TEXT NOT NULL,
host TEXT NOT NULL,
PRIMARY KEY (url, host)
);
INSERT INTO pages (cache,url,host) VALUES
(0,'/checkout','example.com'),
(1,'/','example.com'),
(1,'/products','example.com'),
(0,'/cart','example.com');
EOF
Once the database has been put in place, we can match the URL and hostname of a page to determine its caching behavior. When the page is not found, the built-in VCL behavior is used.
Here’s the VCL:
vcl 4.1;
import sqlite3;
sub vcl_init {
sqlite3.open("/etc/varnish/sqlite.db", "|;");
}
sub vcl_fini {
sqlite3.close();
}
sub vcl_recv {
set req.http.cache = sqlite3.exec("SELECT `cache` FROM `pages` WHERE url='"
+ sqlite3.escape(req.url) + "' AND host='"
+ sqlite3.escape(req.http.host) + "'");
if(req.http.cache == "1") {
return(hash);
} elseif (req.http.cache == "0") {
return(pass);
}
}
The output of sqlite3.exec holds the value of the cache database field
for the row that matches the URL and hostname of the request.
If there’s a matching row in the database, and the cache field is 1,
the page is cacheable and return(hash) is called. If cache is 0,
return(pass) is called.
If there’s no matching row, we’re not returning anything, which means the built-in VCL behavior applies.
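The three outcomes described above can be sketched outside of Varnish. The following Python snippet is only an illustration, not part of the VCL setup: it uses Python's standard sqlite3 module with an in-memory database, and the function name caching_decision and the return labels are hypothetical names chosen for this sketch.

```python
import sqlite3

def caching_decision(conn, url, host):
    """Mirror the VCL logic: 'hash', 'pass', or 'builtin' when no rule applies."""
    row = conn.execute(
        "SELECT cache FROM pages WHERE url = ? AND host = ?", (url, host)
    ).fetchone()
    if row is not None and row[0] == 1:
        return "hash"     # cacheable page: equivalent of return(hash)
    if row is not None and row[0] == 0:
        return "pass"     # uncacheable page: equivalent of return(pass)
    return "builtin"      # no matching row: built-in VCL behavior applies

# In-memory database with the same schema and rows as the example above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE pages (
    cache BOOLEAN NOT NULL,
    url TEXT NOT NULL,
    host TEXT NOT NULL,
    PRIMARY KEY (url, host))""")
conn.executemany(
    "INSERT INTO pages (cache, url, host) VALUES (?, ?, ?)",
    [(0, "/checkout", "example.com"), (1, "/", "example.com"),
     (1, "/products", "example.com"), (0, "/cart", "example.com")],
)

for path in ("/products", "/cart", "/about"):
    print(path, caching_decision(conn, path, "example.com"))
```

The sketch uses bound parameters where the VCL relies on sqlite3.escape(); both serve the same purpose of keeping request data from altering the query.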
SQLite is a very lightweight database system and performs quite well for read-only access. As soon as you start writing to the database in VCL, latency will increase because write operations lock the database file.
Can vmod_kvstore be considered a database? The examples we used
throughout the book would suggest otherwise: the key-value store is
populated in VCL, and a restart removes all content.
However, there is a very basic level of persistence available that can
be triggered via the .init_file() function.
Here’s the vmod_kvstore implementation of the SQLite example, but
backed by a file:
vcl 4.1;
import kvstore;
sub vcl_init {
new pages = kvstore.init();
pages.init_file("/etc/varnish/pages.store",",");
}
sub vcl_recv {
set req.http.cache = pages.get(req.http.host+req.url,"");
if(req.http.cache == "1") {
return(hash);
} elseif (req.http.cache == "0") {
return(pass);
}
}
The following command can be used to populate the pages.store file
that contains the same rules as the SQLite database:
$ cat <<EOF > /etc/varnish/pages.store
> example.com/,1
> example.com/products,1
> example.com/cart,0
> example.com/checkout,0
> EOF
The pages.init_file("/etc/varnish/pages.store",",") function can be
called in other places in your VCL when a resynchronization is
required.
This persisted kvstore example will perform better than SQLite, but does not offer the flexibility of the SQL language.
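To make the file format and the lookup key concrete, here is a small Python sketch of the same idea. It is not how vmod_kvstore is implemented: the helper names load_store and lookup are hypothetical, and the sketch assumes each line is split on the first occurrence of the delimiter.

```python
PAGES_STORE = """\
example.com/,1
example.com/products,1
example.com/cart,0
example.com/checkout,0
"""

def load_store(text, delimiter=","):
    """Parse 'key<delimiter>value' lines into a dictionary."""
    store = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: the first delimiter separates the key from the value.
        key, _, value = line.partition(delimiter)
        store[key] = value
    return store

def lookup(store, host, url, default=""):
    """Same key layout as the VCL: hostname immediately followed by the URL."""
    return store.get(host + url, default)

store = load_store(PAGES_STORE)
print(lookup(store, "example.com", "/products"))
print(lookup(store, "example.com", "/cart"))
print(repr(lookup(store, "example.com", "/about")))
```

Because the key is a plain concatenation of hostname and URL, every rule has to be listed explicitly. A pattern-based rule would require the SQL approach or additional VCL logic.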
Memcached is a distributed key-value store that has client implementations in many programming languages. It is extremely fast and scalable, but offers no persistence layer. Technically, Memcached can be viewed as a simple cache that is accessible over the network.
vmod_memcached is an open source VMOD that provides access to a
Memcached setup. It is available via
https://github.com/varnish/libvmod-memcached, but is also packaged
with Varnish Enterprise.
Let’s revisit the basic authentication example from earlier in this
chapter. We featured this example to show the power of vmod_http.
Let’s strip out the HTTP calls and replace them with Memcached
calls.
Here’s the code:
vcl 4.1;
import crypto;
import memcached;
sub vcl_init {
memcached.servers("--SERVER=192.168.98.101");
memcached.error_string("error");
}
sub vcl_recv {
if (req.http.Authorization !~ "^Basic ([a-zA-Z0-9+/=]+)$") {
return(synth(401,"Authentication required"));
}
set req.http.base64 = regsub(req.http.Authorization,"^Basic ([a-zA-Z0-9+/=]+)$","\1");
set req.http.usernamepassword = crypto.string(crypto.base64_decode(req.http.base64));
set req.http.username = regsub(req.http.usernamepassword,"^([^:]+):(.+)$","\1");
set req.http.password = regsub(req.http.usernamepassword,"^([^:]+):(.+)$","\2");
set req.http.memcached = memcached.get(req.http.username);
if (req.http.memcached == "error") {
return(synth(403));
}
if (req.http.password != req.http.memcached) {
return(synth(401,"Authentication required"));
}
unset req.http.Authorization;
unset req.http.base64;
unset req.http.usernamepassword;
unset req.http.username;
unset req.http.password;
unset req.http.memcached;
}
The Memcached server is accessible via 192.168.98.101 on the
standard 11211 port and contains login credentials. Varnish uses
these credentials to grant or deny access to the platform.
Varnish decodes the Authorization header using the
crypto.base64_decode() function. Via regular expressions, the
username and password are extracted.
The Memcached key is the username, and the corresponding value is the password. If a Memcached lookup results in an error, this means the user was not found. In that case we return an HTTP 403 response.
If the passwords don’t match, we return an HTTP 401 response, which gives the client the opportunity to try logging in again.
Once authentication is successful, the Authorization header is
stripped off to ensure the built-in VCL can consider the request
cacheable.
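The decision flow of this example can be sketched in Python. This is an illustration under stated assumptions rather than the VMOD's behavior: a plain dictionary stands in for Memcached, the function name authenticate is hypothetical, and the status code 200 is used here only to signal that the request may continue. The character class covers the full standard Base64 alphabet, including + and /.

```python
import base64
import binascii
import re

BASIC_RE = re.compile(r"^Basic ([a-zA-Z0-9+/=]+)$")

def authenticate(authorization, credentials):
    """Return 401, 403, or 200 (meaning: continue processing the request)."""
    match = BASIC_RE.match(authorization or "")
    if not match:
        return 401  # missing or malformed Authorization header
    try:
        decoded = base64.b64decode(match.group(1), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return 401  # not valid Base64 or not valid text
    # The username ends at the first colon; the rest is the password.
    username, _, password = decoded.partition(":")
    stored = credentials.get(username)  # stand-in for the Memcached lookup
    if stored is None:
        return 403  # lookup failed: unknown user
    if password != stored:
        return 401  # wrong password: the client may try again
    return 200

users = {"alice": "secret"}  # hypothetical credentials for this sketch
header = "Basic " + base64.b64encode(b"alice:secret").decode("ascii")
print(authenticate(header, users))
```

Storing plain-text passwords, as this sketch and the VCL example do, keeps the illustration short. A production setup would more likely store and compare password hashes.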
Memcached can also be used to store session information, or as a way to store projected results from relational databases.
Redis is also a distributed key-value store, like Memcached. It can be considered the successor of Memcached and offers a lot more features. To some extent we can say that Redis is steadily becoming the industry standard.
Unlike Memcached, Redis offers multiple data types and specific commands to interact with them in an atomic way. Redis also offers persistence, replication, security, and many more operational features.
The fun thing about Redis is that it supports Lua scripting, which allows you to script certain behavior.
There is an open source VMOD available for Redis, which you can get via https://github.com/carlosabalde/libvmod-redis. It has a very extensive API.
Let’s feature an example where Redis can be used to provide a personalized caching experience.
Remember the shopping cart example from earlier in this chapter? We used the file system to access the session file, and we extracted the right key from the serialized session data.
It’s easy to replicate this example and use Redis instead. However, this example will store the product and session data in a more intuitive way: the session ID is the Redis key, and its value is a list that contains one product ID for every item in the cart.
So whenever someone adds a product to their shopping cart, an
RPUSH $sessionId $productId command is sent to Redis. And whenever
the quantity of a product in the cart is decreased, an
LREM $sessionId 1 $productId is used. When a complete product is
removed from the shopping cart, an LREM $sessionId 0 $productId
command is sent to Redis.
Computing the number of items in the shopping cart can be done using the following Redis command:
LLEN $sessionId
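To make the list semantics concrete, the following Python sketch models the three commands with an in-memory dictionary. It does not talk to Redis: the class name CartStore and the session and product IDs are hypothetical, and only the zero and positive count variants of LREM that the text relies on are modeled.

```python
class CartStore:
    """In-memory model of the Redis list commands used for the shopping cart."""

    def __init__(self):
        self._lists = {}

    def rpush(self, key, value):
        """Append a value to the list at key and return the new length."""
        self._lists.setdefault(key, []).append(value)
        return len(self._lists[key])

    def lrem(self, key, count, value):
        """Remove `count` occurrences from the head; count == 0 removes all."""
        if count < 0:
            raise ValueError("negative counts are not modeled in this sketch")
        kept, removed = [], 0
        for item in self._lists.get(key, []):
            if item == value and (count == 0 or removed < count):
                removed += 1
            else:
                kept.append(item)
        if kept:
            self._lists[key] = kept
        else:
            self._lists.pop(key, None)  # an empty list means the key is gone
        return removed

    def llen(self, key):
        """Return the list length; a missing key counts as an empty list."""
        return len(self._lists.get(key, []))

cart = CartStore()
session_id = "abc123"           # hypothetical session ID
cart.rpush(session_id, "42")    # add product 42
cart.rpush(session_id, "42")    # add product 42 again (quantity 2)
cart.rpush(session_id, "7")     # add product 7
cart.lrem(session_id, 1, "42")  # decrease the quantity of product 42
print(cart.llen(session_id))
```

Storing one list element per unit is what makes the item count a single LLEN call, at the cost of longer lists for large quantities.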
If we have access to Redis from VCL, there are many ways we can offload this stateful logic from the origin, but in this example we’ll limit it to counting the shopping cart items.
Here’s the VCL code:
vcl 4.1;
import redis;
import cookieplus;
import xbody;
import edgestash;
sub vcl_init {
new sessions = redis.db(
location="192.168.98.102:6379",
shared_connections=false,
max_connections=1);
}
sub vcl_recv {
cookieplus.keep("PHPSESSID");
cookieplus.write();
if(req.url ~ "^/add/to/cart/[0-9]+$" || req.url ~ "^/remove/from/cart/[0-9]+$") {
return(pass);
}
if(req.url == "/") {
return(hash);
}
}
sub vcl_backend_response {
if(bereq.url == "/") {
unset beresp.http.cache-control;
set beresp.ttl = 3600s;
xbody.regsub({"(<span id="items-in-cart" [^>]+>)(\w*)(</span>)"},
{"\1{{items-in-cart}}\3"});
edgestash.parse_response();
}
}
sub vcl_deliver {
sessions.command("LLEN");
sessions.push(cookieplus.get("PHPSESSID"));
sessions.execute();
if(edgestash.is_edgestash() && sessions.reply_is_integer()) {
edgestash.add_json({"{ "items-in-cart": ""}
+ sessions.get_integer_reply()
+ {"" }"});
edgestash.execute();
}
}
Let’s talk through this one:
In vcl_init we initialize a Redis client object called sessions.
In vcl_recv we strip off all cookies except PHPSESSID.
In vcl_recv we don’t allow /add/to/cart/$productId and /remove/from/cart/$productId to be served from cache.
In vcl_recv we explicitly cache the homepage, despite the PHPSESSID cookie being present.
In vcl_backend_response we use xbody.regsub() to replace the items in cart counter with a {{items-in-cart}} Edgestash placeholder.
In vcl_deliver we execute an LLEN Redis command to get the number of items in the shopping cart.
In vcl_deliver we parse the LLEN Redis value in the items-in-cart placeholder.
Instead of temporarily storing the value via vmod_kvstore, we directly
connect to Redis at delivery time. Although Redis scales really
well, there might be some operational concerns. Please keep in mind that
your Redis server should be properly tuned if you receive a lot of
incoming requests.
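The two-step substitution in this example, first turning the counter into a placeholder and then filling it in at delivery time, can be sketched in Python. This only mimics the idea and is not how vmod_xbody or vmod_edgestash work internally: the function names to_template and render and the sample HTML are hypothetical, and the rendering step handles nothing beyond simple {{name}} placeholders.

```python
import re

# Same pattern as the xbody.regsub() call in the VCL example.
SPAN_RE = re.compile(r'(<span id="items-in-cart" [^>]+>)(\w*)(</span>)')
PLACEHOLDER_RE = re.compile(r"\{\{([\w-]+)\}\}")

def to_template(html):
    """Replace the counter value with a {{items-in-cart}} placeholder."""
    return SPAN_RE.sub(r"\1{{items-in-cart}}\3", html)

def render(template, values):
    """Fill in each {{name}} placeholder from the values dictionary."""
    return PLACEHOLDER_RE.sub(
        lambda m: str(values.get(m.group(1), "")), template
    )

# Hypothetical homepage fragment as the origin might return it.
html = '<p>Cart: <span id="items-in-cart" class="badge">0</span></p>'

template = to_template(html)                    # done once, stored in cache
page = render(template, {"items-in-cart": 3})   # done per request
print(template)
print(page)
```

The template is created once per cached object, while the rendering step runs for every request, which is why the per-request lookup of the item count has to be fast.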