DNS based GSLB (Global Server Load Balancer) Tutorial

Introduction

The Global Server Load Balancer (GSLB) in a CDN is responsible for the distribution of HTTP clients globally. It distributes clients to specific locations (POPs) or to specific cache nodes, depending on the implementation and architecture.

Varnish is agnostic to the GSLB function and can be integrated with different types of GSLB, such as DNS-based, network routing-based (anycast) and HTTP 302 redirect-based solutions.

In this tutorial we will set up a simple self-hosted GSLB that distributes clients to the closest healthy cache node using DNS. The following diagram shows what our architecture looks like.

[Figure: World (architecture diagram)]

Benefits and challenges with a DNS based GSLB

Benefits:

  • DNS is network agnostic. It does not require any special networking equipment or routing protocol knowledge and can be used across cloud providers and on-prem deployments in different networks.
  • DNS is transparent to the HTTP layer, which means that it is possible to use a DNS based GSLB with any HTTP use case.
  • The setup is simple and requires few components/dependencies.

Challenges:

  • DNS propagation time is potentially long and can become a disruptive factor during unforeseen downtime. It also needs to be considered when doing maintenance in the CDN.
  • There is a myriad of different DNS implementations on the client side, and their behaviour is not always aligned.

Accuracy:

  • IP geolocation is based on the client’s IP subnet if the client’s DNS server supports relaying this information. Otherwise it falls back to the IP address of the client’s DNS server, which in most cases is reasonably close to the client. For global distribution this is usually sufficient, but for distribution between multiple locations within a single country it may not be.
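
The choice of lookup address described above can be sketched as follows. This is a minimal illustration and not PowerDNS code: the function name `geolocation_source` and the example addresses are assumptions made for this sketch.

```python
import ipaddress
from typing import Optional


def geolocation_source(resolver_ip: str, client_subnet: Optional[str] = None) -> str:
    """Return the address a geolocation lookup would be keyed on.

    client_subnet is the subnet relayed by the client's DNS server
    (for example "203.0.113.0/24"), or None if nothing was relayed.
    """
    if client_subnet:
        # Prefer the client's own subnet when the DNS server relays it.
        network = ipaddress.ip_network(client_subnet, strict=False)
        return str(network.network_address)
    # Fallback: only the address of the client's DNS server is known.
    return str(ipaddress.ip_address(resolver_ip))


print(geolocation_source("198.51.100.53", "203.0.113.0/24"))
print(geolocation_source("198.51.100.53"))
```

The first call is keyed on the client's subnet. The second call has no subnet to work with, so the location of the DNS server stands in for the location of the client.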

Call flow

[Figure: Call flow diagram]

Components involved

Prerequisites

In order to replicate this tutorial, you will need:

  • Several nodes for Varnish in different locations with public IP addresses. These will act as the caching nodes in the CDN. Use one of the supported platforms for Varnish.
  • A minimum of two nodes for PowerDNS in different locations with public IP addresses. These will act as DNS servers in the CDN. More nodes can be added for more redundancy and performance. Use one of the supported platforms for PowerDNS.
  • A subdomain to be used for the CDN.

In this tutorial, we have the following resources available:

  • Five data centers (US west, US east, EU west, AS west and AS east).
  • Six cache nodes with Varnish spread over these five data centers running CentOS 7.

    • cache01.us-west.example.com at 192.168.1.10
    • cache01.us-east.example.com at 192.168.2.10
    • cache01.eu-west.example.com at 192.168.3.10
    • cache02.eu-west.example.com at 192.168.3.11
    • cache01.as-west.example.com at 192.168.4.10
    • cache01.as-east.example.com at 192.168.5.10
  • Two DNS nodes with PowerDNS spread over the US east and EU west data centers running CentOS 7.

    • ns01.us-east.example.com at 192.168.2.5
    • ns01.eu-west.example.com at 192.168.3.5
  • One origin in EU west:

    • origin.example.com at 192.168.3.2
  • We pretend to own the domain example.com and will use the subdomain cdn.example.com for the CDN.

The IP addresses listed above are used as examples. In a real environment they would be in publicly available and routable IP networks.

Step 1 - Prepare the caching nodes with Varnish

  1. Follow the getting started tutorial to install Varnish on each of the caching nodes.
  2. Deploy a VCL configuration to the caching nodes with the origin as the backend. Also specify a URL that the DNS servers can use for health probes. Example:

    vcl 4.1;
    import std;
    
    # origin.example.com
    backend origin {
        .host = "192.168.3.2";
        .port = "443";
        .ssl = true;
    }
    
    sub vcl_recv {
        # URL to be used for health probes
        if (req.url == "/varnish-status") {
            if (std.file_exists("/etc/varnish/maintenance")) {
                # If the file exists, the cache node is in maintenance mode
                # and will be excluded automatically in PowerDNS.
                return(synth(503, "Maintenance"));
            } else {
                return(synth(200, "OK"));
            }
        }
    
        set req.backend_hint = origin;
    }
    

Step 2 - Install PowerDNS

This step of the tutorial covers PowerDNS Authoritative Server version 4.4 on CentOS 7. For other versions of PowerDNS or other platforms than CentOS 7, please refer to the PowerDNS install documentation.

On the DNS nodes, create the file /etc/yum.repos.d/powerdns.repo with the following contents (please refer to the PowerDNS repositories for the most up to date information):

[powerdns-auth-44]
name=PowerDNS repository for PowerDNS Authoritative Server - version 4.4.X
baseurl=http://repo.powerdns.com/centos/$basearch/$releasever/auth-44
gpgkey=https://repo.powerdns.com/FD380FBB-pub.asc
gpgcheck=1
enabled=1
priority=90
includepkg=pdns*

Enable the Extra Packages for Enterprise Linux (EPEL) repository by installing the epel-release package:

sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Install PowerDNS and its backend to handle IP geolocation lookups:

sudo yum install pdns pdns-backend-geoip

Step 3 - Fetch the IP geolocation database

Register at MaxMind and download the GeoLite2 database with City granularity. Copy this MMDB file to /etc/pdns/GeoLite2-City.mmdb on the DNS nodes and make it world-readable:

sudo chmod 644 /etc/pdns/GeoLite2-City.mmdb

MaxMind provides updated databases at regular intervals. It is recommended to automate the process of updating the database, but it is outside the scope of this tutorial.

Step 4 - Configure PowerDNS

Put the following in /etc/pdns/pdns.conf:

bind-config=/etc/pdns/named.conf
cache-ttl=0
consistent-backends=yes
daemon=no
edns-subnet-processing=yes
enable-lua-records=yes
geoip-database-files=/etc/pdns/GeoLite2-City.mmdb
launch=bind,geoip
query-cache-ttl=0
query-logging=yes
setgid=pdns
setuid=pdns

Create /etc/pdns/named.conf and add the zone that will be managed by the DNS server:

zone "cdn.example.com" in {
    type native;
    file "/etc/pdns/cdn.example.com.zone";
};

Create /etc/pdns/cdn.example.com.zone and add the SOA, A and AAAA records:

$ORIGIN cdn.example.com.
@ IN SOA ns01.eu-west.example.com. hostmaster.example.com. 2 7200 3600 86400 60
@ 30 IN LUA A ("ifurlup('https://cdn.example.com/varnish-status', {'192.168.1.10', '192.168.2.10', '192.168.3.10', '192.168.3.11', '192.168.4.10', '192.168.5.10'}, {selector='pickclosest'})")

; A similar AAAA record can be made for IPv6 support
;@ 30 IN LUA AAAA ("ifurlup('https://cdn.example.com/varnish-status', {'2001:0db8:85a3::8a2e:0370:7334', '2001:0db8:85a3::8a2e:0370:7335', ...}, {selector='pickclosest'})")

The ifurlup function in this record enables health probing from PowerDNS to Varnish. Every 5 seconds, PowerDNS sends the following HTTP request to each of the specified IP addresses:

GET https://cdn.example.com/varnish-status HTTP/1.1
User-Agent: PowerDNS Authoritative Server
Host: cdn.example.com
Accept: */*

A cache node is considered healthy if it responds with 200 OK within the default timeout of 2 seconds.
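
When debugging, it can help to reproduce a similar probe by hand. The following is a rough sketch, not the PowerDNS implementation: the function name `probe` and its parameters are hypothetical, the certificate verification against the CDN hostname is a choice made for this sketch, and the timeout applies to each socket operation rather than to the request as a whole.

```python
import http.client
import socket
import ssl


def probe(ip, host, path="/varnish-status", port=443, timeout=2.0, tls=True):
    """Return True only if the node at `ip` answers the request with 200."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Connect to the node's IP address rather than resolving `host`.
        sock = socket.create_connection((ip, port), timeout=timeout)
        if tls:
            # Validate the certificate against the CDN hostname, not the IP.
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
        conn.sock = sock
        conn.request("GET", path, headers={"Host": host, "Accept": "*/*"})
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and TLS errors count as unhealthy.
        return False
    finally:
        conn.close()
```

For example, `probe("192.168.3.10", "cdn.example.com")` would check the first cache node in EU west. A node in maintenance mode answers 503, so the function returns False for it.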

From the list of healthy IP addresses, selector='pickclosest' picks the IP address that is closest to the IP subnet of the client (if the client’s DNS server supports EDNS Client Subnet, RFC 7871) or, failing that, closest to the IP address of the client’s DNS server.
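
The idea behind a closest-node selection can be illustrated with a small sketch. It assumes that "closest" means the shortest great-circle distance, which may differ from what PowerDNS computes internally. The coordinates, the `NODE_LOCATIONS` table and the function names are hypothetical. In the real setup the locations come from the GeoIP database.

```python
import math

# Hypothetical (latitude, longitude) pairs for the example cache nodes.
NODE_LOCATIONS = {
    "192.168.1.10": (37.77, -122.42),  # us-west (assumed: San Francisco)
    "192.168.2.10": (40.71, -74.01),   # us-east (assumed: New York)
    "192.168.3.10": (53.35, -6.26),    # eu-west (assumed: Dublin)
    "192.168.3.11": (53.35, -6.26),    # eu-west (assumed: Dublin)
    "192.168.4.10": (19.08, 72.88),    # as-west (assumed: Mumbai)
    "192.168.5.10": (35.68, 139.69),   # as-east (assumed: Tokyo)
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_closest(client_location, healthy_ips):
    """Return the healthy IP whose assumed location is nearest the client."""
    return min(
        healthy_ips,
        key=lambda ip: great_circle_km(client_location, NODE_LOCATIONS[ip]),
    )


paris = (48.86, 2.35)
print(pick_closest(paris, list(NODE_LOCATIONS)))
```

Only healthy addresses are passed in, so a client in Europe is sent to another region when both EU west nodes fail their probes.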

Configure PowerDNS to start automatically at boot:

sudo systemctl enable pdns

Restart PowerDNS manually (or reboot the DNS nodes):

sudo systemctl restart pdns

Step 5 - Delegation of cdn.example.com

The subdomain cdn.example.com is delegated to the PowerDNS nodes using NS records. Add the following DNS records to the example.com zone configuration:

cdn.example.com.  3600  IN NS  ns01.us-east.example.com.
cdn.example.com.  3600  IN NS  ns01.eu-west.example.com.

Step 6 - Testing

The environment can now be tested. Verify first that DNS lookups work as expected. The IP address should correspond to the cache node that is both healthy and has the shortest distance from the client.

$ dig cdn.example.com
[...]
;; QUESTION SECTION:
;cdn.example.com.        IN    A

;; ANSWER SECTION:
cdn.example.com.    30    IN    A    192.168.3.11
[...]

Enable tracing to get information about the delegation path all the way from the root name servers down to the CDN nodes: dig +trace cdn.example.com.

Verify that a client is able to get DNS responses if one of the DNS servers is unavailable. The unavailability of a single DNS server may affect the latency of the DNS responses, but should not affect the availability of the DNS service.

Operations and monitoring

Given the VCL example above, individual cache nodes can be drained in a non-disruptive way by creating the file /etc/varnish/maintenance on a cache node. While this file exists on a cache node, the DNS servers will stop sending clients to this node. Existing clients will go to other nodes as soon as the DNS entry expires and is refreshed. Simply remove the file to return the cache node to active duty.
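
As a rough planning aid, the time before the last client stops resolving to a drained node can be estimated from the values used in this tutorial (a probe every 5 seconds, a record TTL of 30 seconds, a probe timeout of 2 seconds). The function below is a hypothetical sketch. It assumes that resolvers honour the TTL, and it does not account for HTTP connections that are already open.

```python
def worst_case_redirect_seconds(record_ttl=30, probe_interval=5, probe_timeout=0):
    """Upper bound, in seconds, on how long clients may still get the old answer."""
    # Up to one probe interval (plus the timeout, if the node does not
    # answer at all) until PowerDNS notices, then up to one TTL until
    # the last cached DNS answer expires.
    return probe_interval + probe_timeout + record_ttl


print(worst_case_redirect_seconds())                 # planned drain: 35
print(worst_case_redirect_seconds(probe_timeout=2))  # unresponsive node: 37
```

Some client-side DNS implementations cache answers for longer than the TTL, so real drain times can exceed this estimate.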

Health monitoring of the DNS nodes can be done using the check_dns plugin for Nagios and the built-in Prometheus interface from PowerDNS. For more information about managing PowerDNS, please refer to the PowerDNS documentation.

Conclusion

This tutorial shows how a simple and self-hosted DNS based GSLB can be set up to balance clients between multiple data centers. It takes the location of the client and the health of the cache nodes into account.

For more advanced use cases, this setup can be extended in several ways. Examples of next steps are:

  • exposing the DNS servers using anycast,
  • putting the cache nodes behind layer 4 load balancers (with DR/DSR) to provide a single point of entry in each data center,
  • and/or taking the utilization of each cache node into account.