The Global Server Load Balancer (GSLB) in a CDN is responsible for distributing HTTP clients globally, directing them to specific locations (POPs) or to specific cache nodes, depending on the implementation and architecture.
Varnish is agnostic to the GSLB function and can be integrated with different types, such as DNS-based, network-routing-based (anycast) and HTTP 302 redirect-based solutions.
In this tutorial we will set up a simple self-hosted GSLB that distributes clients to the closest healthy cache node using DNS. The following diagram shows what our architecture looks like.
Benefits: DNS-based load balancing is simple to operate and works with any HTTP client, since every client already resolves hostnames before connecting.
Challenges: resolvers and clients may cache DNS answers beyond the TTL, so shifting traffic away from a node is not instantaneous.
Accuracy: the routing decision is based on the location of the client's DNS resolver (or its EDNS Client Subnet, if provided), not necessarily the client itself.
In order to replicate this tutorial, you will need:
In this tutorial, we have the following resources available:
Five data centers (US west, US east, EU west, AS west and AS east).
Six cache nodes with Varnish spread over these five data centers running CentOS 7.
cache01.us-west.example.com at 192.168.1.10
cache01.us-east.example.com at 192.168.2.10
cache01.eu-west.example.com at 192.168.3.10
cache02.eu-west.example.com at 192.168.3.11
cache01.as-west.example.com at 192.168.4.10
cache01.as-east.example.com at 192.168.5.10
Two DNS nodes with PowerDNS spread over the US east and EU west data centers running CentOS 7.
ns01.us-east.example.com at 192.168.2.5
ns01.eu-west.example.com at 192.168.3.5
One origin in EU west:
origin.example.com at 192.168.3.2
We pretend to own the domain example.com and will use the subdomain cdn.example.com for the CDN.
The IP addresses listed above are used as examples. In a real environment they would be in publicly available and routable IP networks.
Follow the getting started tutorial to install Varnish on each of the caching nodes.
Deploy a VCL configuration to the caching nodes with the origin as the backend. Specify also a URL that can be used for health probes from the DNS server. Example:
vcl 4.1;

import std;

# origin.example.com
backend origin {
    .host = "192.168.3.2";
    .port = "443";
    .ssl = true;
}

sub vcl_recv {
    # URL to be used for health probes
    if (req.url == "/varnish-status") {
        if (std.file_exists("/etc/varnish/maintenance")) {
            # If the file exists, the cache node is in maintenance mode
            # and will be excluded automatically in PowerDNS.
            return (synth(503, "Maintenance"));
        } else {
            return (synth(200, "OK"));
        }
    }
    set req.backend_hint = origin;
}
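Once the VCL is deployed, the probe endpoint can be checked from one of the DNS nodes with a small shell loop. This is a sketch: the IP list is the set of example cache nodes from this tutorial, and `--resolve` is used so the request carries the cdn.example.com hostname, mirroring what the PowerDNS probes will do.

```shell
# Check the health endpoint on every cache node (example IPs from this tutorial).
nodes="192.168.1.10 192.168.2.10 192.168.3.10 192.168.3.11 192.168.4.10 192.168.5.10"
for ip in $nodes; do
    # --resolve forces curl to connect to $ip while still sending the
    # cdn.example.com hostname and SNI, like the PowerDNS probes do.
    status=$(curl -sk --max-time 2 -o /dev/null -w '%{http_code}' \
        --resolve "cdn.example.com:443:$ip" https://cdn.example.com/varnish-status)
    echo "$ip -> $status"
done
```

A healthy node should report 200, and a node in maintenance mode 503.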
This step of the tutorial covers PowerDNS Authoritative Server version 4.4 on CentOS 7. For other versions of PowerDNS or other platforms than CentOS 7, please refer to the PowerDNS install documentation.
On the DNS nodes, create the file /etc/yum.repos.d/powerdns.repo with the following contents (please refer to the PowerDNS repositories for the most up-to-date information):
[powerdns-auth-44]
name=PowerDNS repository for PowerDNS Authoritative Server - version 4.4.X
baseurl=http://repo.powerdns.com/centos/$basearch/$releasever/auth-44
gpgkey=https://repo.powerdns.com/FD380FBB-pub.asc
gpgcheck=1
enabled=1
priority=90
includepkgs=pdns*
Enable the Extra Packages for Enterprise Linux (EPEL) repository by installing the epel-release package:
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Install PowerDNS and its backend to handle IP geolocation lookups:
sudo yum install pdns pdns-backend-geoip
Register at MaxMind and download the GeoLite2 database with City granularity. Copy this MMDB file to /etc/pdns/GeoLite2-City.mmdb on the DNS nodes and make it world-readable:
sudo chmod 644 /etc/pdns/GeoLite2-City.mmdb
MaxMind provides updated databases at regular intervals. It is recommended to automate the process of updating the database, but it is outside the scope of this tutorial.
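One way to automate the update is a weekly cron job built on MaxMind's official geoipupdate tool. The following is a sketch: it assumes geoipupdate is installed and configured with your account ID and license key in /etc/GeoIP.conf, and that it writes to its default output directory; the script name is hypothetical.

```
#!/bin/sh
# /etc/cron.weekly/update-geoip (hypothetical path)
# Fetch the latest databases; geoipupdate writes to /usr/share/GeoIP by default.
geoipupdate
# Copy the City database to the location pdns.conf points at.
cp /usr/share/GeoIP/GeoLite2-City.mmdb /etc/pdns/GeoLite2-City.mmdb
chmod 644 /etc/pdns/GeoLite2-City.mmdb
# Restart PowerDNS so the geoip backend picks up the new database.
systemctl restart pdns
```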
Put the following in /etc/pdns/pdns.conf:
bind-config=/etc/pdns/named.conf
cache-ttl=0
consistent-backends=yes
daemon=no
edns-subnet-processing=yes
enable-lua-records=yes
geoip-database-files=/etc/pdns/GeoLite2-City.mmdb
launch=bind,geoip
query-cache-ttl=0
query-logging=yes
setgid=pdns
setuid=pdns
Create /etc/pdns/named.conf and add the zone that will be managed by the DNS server:
zone "cdn.example.com" in {
type native;
file "/etc/pdns/cdn.example.com.zone";
};
Create /etc/pdns/cdn.example.com.zone and add the SOA, A and AAAA records:
$ORIGIN cdn.example.com.
@ IN SOA ns01.eu-west.example.com. hostmaster.example.com. 2 7200 3600 86400 60
@ 30 IN LUA A ("ifurlup('https://cdn.example.com/varnish-status', {'192.168.1.10', '192.168.2.10', '192.168.3.10', '192.168.3.11', '192.168.4.10', '192.168.5.10'}, {selector='pickclosest'})")
; A similar AAAA record can be made for IPv6 support
;@ 30 IN LUA AAAA ("ifurlup('https://cdn.example.com/varnish-status', {'2001:0db8:85a3::8a2e:0370:7334', '2001:0db8:85a3::8a2e:0370:7335', ...}, {selector='pickclosest'})")
The ifurlup part of the configuration enables health probing from PowerDNS to Varnish. Every 5 seconds, the probes send the following HTTP request to each of the IP addresses specified:
GET /varnish-status HTTP/1.1
User-Agent: PowerDNS Authoritative Server
Host: cdn.example.com
Accept: */*
The cache nodes are considered healthy if they respond with 200 OK within the default timeout of 2 seconds.
From the list of healthy IP addresses, selector='pickclosest' picks the address closest to the client's IP subnet (if the client's DNS resolver supports EDNS Client Subnet, RFC 7871) or, failing that, closest to the IP address of the client's DNS resolver.
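Clients from different regions can be simulated by sending an EDNS Client Subnet option with dig. This is an illustration only: the subnet and name-server address are taken from this tutorial's example network, and the answer depends on what the GeoLite2 database says about those addresses.

```
$ dig +subnet=192.168.5.0/24 @192.168.3.5 cdn.example.com A +short
```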
Configure PowerDNS to start automatically at boot:
sudo systemctl enable pdns
Restart PowerDNS manually (or reboot the DNS nodes):
sudo systemctl restart pdns
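Before delegating the subdomain, a quick smoke test against the local server may be worthwhile; the answer should be the address of one healthy cache node, and which one depends on where the query is made from:

```
$ dig @127.0.0.1 cdn.example.com A +short
```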
The subdomain cdn.example.com is delegated to the PowerDNS nodes using NS records. Add the following DNS records to the example.com zone configuration:
cdn.example.com. 3600 IN NS ns01.us-east.example.com.
cdn.example.com. 3600 IN NS ns01.eu-west.example.com.
The environment can now be tested. Verify first that DNS lookups work as expected. The IP address should correspond to the cache node that is both healthy and has the shortest distance from the client.
$ dig cdn.example.com
[...]
;; QUESTION SECTION:
;cdn.example.com. IN A
;; ANSWER SECTION:
cdn.example.com. 30 IN A 192.168.3.11
[...]
Enable tracing to get information about the delegation path all the way from the root name servers down to the CDN nodes: dig +trace cdn.example.com.
Verify that a client is able to get DNS responses if one of the DNS servers is unavailable. Unavailability of a single DNS server may increase the latency of DNS responses, but should not affect the availability of the DNS service.
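One way to check this is to query each DNS node directly; each should answer on its own. The addresses below are this tutorial's example name servers:

```
$ dig @192.168.2.5 cdn.example.com +short
$ dig @192.168.3.5 cdn.example.com +short
```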
Given the VCL example above, individual cache nodes can be drained in a non-disruptive way by creating the file /etc/varnish/maintenance on a cache node. If this file exists, the DNS servers stop sending clients to that node; existing clients move to other nodes as soon as the DNS entry expires and is refreshed. Simply remove the file to bring the cache node back into active duty.
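For example, draining cache02.eu-west could look like this (a sketch; the file path is the one checked with std.file_exists() in the VCL above, and the TTL is the 30 seconds set on the LUA record):

```
$ ssh cache02.eu-west.example.com
$ sudo touch /etc/varnish/maintenance   # probes now get 503; node is drained
$ # ... wait at least the record TTL (30 s) for clients to move off ...
$ sudo rm /etc/varnish/maintenance      # node is back in rotation
```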
Health monitoring of the DNS nodes can be done using the check_dns plugin for Nagios and the built-in Prometheus interface from PowerDNS. For more information about managing PowerDNS, please refer to the PowerDNS documentation.
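As an illustration, a Nagios check for one of the DNS nodes could look like the following hypothetical invocation; check_dns ships with the standard Nagios plugins, and the plugin path and server address are this tutorial's examples:

```
$ /usr/lib64/nagios/plugins/check_dns -H cdn.example.com -s 192.168.2.5
```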
This tutorial shows how a simple, self-hosted DNS-based GSLB can be set up to balance clients between multiple data centers, taking both the location of the client and the health of the cache nodes into account.
For more advanced use cases, this setup can be extended in several ways. Examples of next steps are: