Request routing

You can build a CDN with as many PoPs as you want, but you need a way to route clients to the right PoP.

There are various ways to do this, and DNS is a popular one. The authoritative DNS server for the hostname performs a geoIP lookup and returns the IP address of the matching PoP. When the DNS request carries EDNS Client Subnet information, the lookup is based on the client network address; otherwise it falls back to the recursive resolver’s IP address.

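Resolving nameservers can pass information about users to the authoritative server using EDNS Client Subnet. The subnet is a truncated prefix of the client’s IP address, for example the first 24 bits of an IPv4 address, which is enough to indicate roughly where a user is located without revealing the full address. Not all resolvers forward this information.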
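As a rough illustration of what a resolver forwards, the sketch below truncates a client address to a network prefix and picks the address an authoritative server would use for its geoIP lookup. The function names are hypothetical, and the default prefix lengths (24 bits for IPv4, 56 bits for IPv6) are assumed as commonly used values; individual resolvers may choose differently.

```python
import ipaddress
from typing import Optional


def ecs_prefix(client_ip: str, v4_bits: int = 24, v6_bits: int = 56) -> str:
    """Return the truncated network a resolver might send instead of the full client IP."""
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{bits}", strict=False))


def lookup_address(resolver_ip: str, ecs_subnet: Optional[str]) -> str:
    """Pick what the authoritative server feeds into its geoIP lookup."""
    # Use the client subnet when the resolver forwarded one,
    # otherwise fall back to the resolver's own address.
    return ecs_subnet if ecs_subnet is not None else resolver_ip


print(ecs_prefix("203.0.113.77"))
print(lookup_address("198.51.100.53", None))
```

The fallback case is where DNS-based localization is at its crudest: a user on a distant public resolver is located where the resolver is, not where the user is.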

Another way is Anycast, where the network’s own routing selects a PoP based on the shortest network route.

Certain use cases also warrant using HTTP to route requests to the right PoP: a discovery service uses geoIP to localize the client and then issues an HTTP 301 redirect to the right PoP. The upside of HTTP is that the discovery service sees the IP address of the requesting client rather than that of a resolver, so the geoIP matching is more fine-grained.
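A minimal sketch of such a discovery service could look like the following. Everything in it is assumed for illustration: the geoIP table is a toy lookup over documentation address ranges, and the PoP hostnames are hypothetical.

```python
import ipaddress
from typing import Dict, Tuple

# Hypothetical geoIP table: documentation prefixes mapped to continent codes.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "EU",
    ipaddress.ip_network("198.51.100.0/24"): "NA",
}

# Hypothetical PoP hostnames per continent, plus a fallback PoP.
POP_HOSTS = {"EU": "eu.cdn.example.com", "NA": "na.cdn.example.com"}
DEFAULT_POP = "global.cdn.example.com"


def continent_for(client_ip: str) -> str:
    """Stand-in for a real geoIP database lookup."""
    addr = ipaddress.ip_address(client_ip)
    for network, continent in GEO_TABLE.items():
        if addr in network:
            return continent
    return "UNKNOWN"


def redirect(client_ip: str, path: str) -> Tuple[int, Dict[str, str]]:
    """Build the status code and headers that send the client to its PoP."""
    host = POP_HOSTS.get(continent_for(client_ip), DEFAULT_POP)
    return 301, {"Location": f"https://{host}{path}"}


print(redirect("192.0.2.10", "/video/intro.mp4"))
```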

It is possible to combine these methods and first perform a crude localization through DNS, and then let Anycast find the closest server for that IP address.

In a more practical sense, we will cover four request routing implementations:

  • PowerDNS
  • AWS Route 53
  • Anycast
  • Varnish Traffic Router

PowerDNS

PowerDNS is an open source DNS server that is quite easy to install and manage.

Geolocation is provided by its geoip backend. If the request is made with EDNS Client Subnet, the client network address is part of the DNS request and is used for the lookup. Without it, the address of the requesting resolver is matched against the geoIP database instead. The DNS response is the IP address of a PoP in our CDN.

In terms of configuration, you can add the following settings to /etc/powerdns/pdns.conf:

launch=geoip
geoip-database-files=/usr/share/GeoIP/GeoIP.dat,/usr/share/GeoIP/GeoIPv6.dat
geoip-zones-file=/etc/powerdns/zone

This configuration enables the geoip backend, loads the geoIP databases, and sets the location of the zone file.

The zone file, located in /etc/powerdns/zone, contains information about the domain and its records and could look like this:

- domain: geo.example.com
  ttl: 60
  records:
    geo.example.com:
       - soa: ns.example.com. hostmaster.example.com. 1 7200 3600 86400 60
       - ns:  ns.example.com.
    eu.geo.example.com:
       - a: 192.168.1.2
    na.geo.example.com:
       - a: 192.168.1.3
    sa.geo.example.com:
       - a: 192.168.1.4
    af.geo.example.com:
       - a: 192.168.1.5
    as.geo.example.com:
       - a: 192.168.1.6
    "*.geo.example.com":
       - a: 192.168.1.7
  services:
     www.geo.example.com: '%cn.geo.example.com'

This zone file provides DNS information for the geo.example.com domain. It defines a number of address records, each linked to a specific IP address:

  • eu.geo.example.com points to IP address 192.168.1.2 and represents our European PoP.
  • na.geo.example.com points to IP address 192.168.1.3 and represents our North American PoP.
  • sa.geo.example.com points to IP address 192.168.1.4 and represents our South American PoP.
  • af.geo.example.com points to IP address 192.168.1.5 and represents our African PoP.
  • as.geo.example.com points to IP address 192.168.1.6 and represents our Asian PoP.
  • *.geo.example.com points to IP address 192.168.1.7 and catches DNS requests for unmatched continents, or when the continent information could not be retrieved from the client IP address.

And finally, there is a service definition for www.geo.example.com that is exposed as a CNAME record. It points to %cn.geo.example.com. The %cn placeholder is replaced with the continent code of the client.

If the resulting hostname matches an address record, the IP address of that record is returned. Hostnames that don’t match any address record are caught by the *.geo.example.com record.
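The placeholder expansion and wildcard fallback described above can be mimicked in a few lines. This is only a simulation of the behavior using the records from the example zone file, not PowerDNS code, and the "unknown" label used when no continent could be determined is an assumption made for this sketch.

```python
from typing import Optional, Tuple

# Address records copied from the example zone file.
A_RECORDS = {
    "eu.geo.example.com": "192.168.1.2",
    "na.geo.example.com": "192.168.1.3",
    "sa.geo.example.com": "192.168.1.4",
    "af.geo.example.com": "192.168.1.5",
    "as.geo.example.com": "192.168.1.6",
}
WILDCARD_IP = "192.168.1.7"               # "*.geo.example.com"
SERVICE_TEMPLATE = "%cn.geo.example.com"  # www.geo.example.com service


def resolve_www(continent_code: Optional[str]) -> Tuple[str, str]:
    """Return the expanded hostname and the IP address that gets served."""
    # Assumption: an undetermined continent expands to a label
    # that matches no explicit record.
    label = continent_code or "unknown"
    target = SERVICE_TEMPLATE.replace("%cn", label)
    # Any hostname without its own record falls through to the wildcard.
    return target, A_RECORDS.get(target, WILDCARD_IP)


print(resolve_www("eu"))
print(resolve_www("oc"))
```

A client located in Oceania has no dedicated record in this zone, so it ends up on the wildcard address.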

Because DNS resolution is distributed, it scales well: your system’s DNS resolvers will perform all the heavy lifting. DNS requests to our PowerDNS server will only be made when the value cached by your DNS resolver expires.

As you can see, DNS uses caching techniques just like Varnish. Every record has a TTL that resolvers should respect. However, the authoritative server has no way to enforce this TTL, and no way to forcefully invalidate the resolvers’ caches.
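To make the caching behavior concrete, here is a small sketch of a resolver-side cache that only contacts the authoritative server once the TTL has passed. The class and its parameters are hypothetical; real resolvers are considerably more involved.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy resolver cache: asks upstream only when an entry has expired."""

    def __init__(self, upstream: Callable[[str], Tuple[str, int]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream  # returns (address, ttl_in_seconds)
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh: no request to the authoritative server
        address, ttl = self._upstream(name)
        self._entries[name] = (address, now + ttl)
        return address
```

Nothing in this design lets the authoritative server reach into the cache, which is why a changed record only becomes visible after the cached entry expires.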

As a result, changes in the zone file can take longer than the TTL suggests, possibly a couple of hours, before they are propagated globally.

AWS Route 53

Route 53 is a cloud-based DNS service by Amazon Web Services (AWS). The technology is very similar to the PowerDNS example you just saw: Route 53 identifies the client IP address for incoming DNS requests and matches the requested hostname to an IP address that is associated with a specific geographic region.

Route 53 can match continents, countries, and US states.

The following screenshot shows how to configure a DNS record with geolocation routing:

[Screenshot: AWS Route 53]
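The same kind of records can also be described programmatically. The sketch below only builds the change batch as a plain dictionary, following the record set structure of the Route 53 API; the hostname, set identifiers, and IP addresses are placeholders borrowed from the PowerDNS example, and nothing is sent to AWS.

```python
from typing import Dict


def geo_record(name: str, set_id: str, ip: str, geo: Dict[str, str], ttl: int = 60) -> Dict:
    """Describe one geolocation-routed A record as a Route 53 change."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,  # must be unique per geolocation record
            "GeoLocation": geo,
            "TTL": ttl,
            "ResourceRecords": [{"Value": ip}],
        },
    }


HOST = "www.geo.example.com."  # placeholder hostname

change_batch = {
    "Changes": [
        # Continent match
        geo_record(HOST, "europe", "192.168.1.2", {"ContinentCode": "EU"}),
        # US state match
        geo_record(HOST, "us-california", "192.168.1.3",
                   {"CountryCode": "US", "SubdivisionCode": "CA"}),
        # Default record for locations that match nothing else
        geo_record(HOST, "default", "192.168.1.7", {"CountryCode": "*"}),
    ]
}
```

With boto3, a dictionary like this could be passed as the ChangeBatch argument of the Route 53 client’s change_resource_record_sets call, together with the ID of your own hosted zone.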

The IP address that is returned represents the closest CDN PoP the user should connect to. If the PoP nodes are also hosted in the AWS cloud, Route 53 has some additional request routing capabilities.

Anycast

Anycast is a network-routing technique that maps a single IP address to multiple endpoints and lets routers decide which endpoint is selected.

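Endpoint selection follows the routing decisions that routers already make for any other traffic: each router forwards packets along its preferred route to the address, which usually means the shortest network path rather than the shortest geographic distance.

Anycast may even select a PoP that is geographically a lot further away because the network path to it is shorter, which often also means lower latency. Geolocation does not have this intelligence.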
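The difference with geolocation can be shown with a toy model. It assumes, as a deliberate simplification, that route selection comes down to the shortest path measured in autonomous systems (AS) traversed; real routers apply additional criteria and policies. The PoP names, AS numbers, and distances are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    pop: str
    as_path: Tuple[int, ...]  # networks traversed to reach the PoP
    distance_km: int          # geographic distance, ignored by routing


def best_route(routes: List[Route]) -> Route:
    """Simplified selection: the shortest AS path wins."""
    return min(routes, key=lambda route: len(route.as_path))


def geo_closest(routes: List[Route]) -> Route:
    """What a purely geographic decision would pick instead."""
    return min(routes, key=lambda route: route.distance_km)


routes = [
    Route("amsterdam", (64496, 64497, 64498), 300),
    Route("new-york", (64496, 64510), 5900),
]
print(best_route(routes).pop)
print(geo_closest(routes).pop)
```

In this example, routing prefers the distant PoP because fewer networks sit in between, while a geographic lookup would have picked the nearby one.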

Anycast addressing is typically implemented using the Border Gateway Protocol (BGP), the routing protocol that announces routes between networks: every PoP announces the same IP address range, and routers pick their preferred route to it. This is not a layer-7 implementation; however, it can be leveraged by layer-7 protocols, such as HTTP and DNS.

When routing traffic to a CDN PoP, the Anycast IP address can belong to a load balancer or an edge-tier node, which clients then connect to directly over HTTP.

It is also possible to use Anycast to send requests to the nearest DNS server. Each DNS server can then return finer-grained geolocation results that differ based on which DNS server was selected.

Varnish Traffic Router

Varnish Software is also building a traffic router. The system is designed to be fully compatible with its Varnish Enterprise offering and approaches the routing aspect from two different angles:

  • DNS
  • HTTP redirects

One big differentiator from the other request routing solutions is that the Varnish Traffic Router keeps track of PoP and endpoint utilization and health, including bandwidth consumption and request rate. It takes the load of the individual endpoints and PoPs into account when routing traffic. An endpoint or PoP that is unhealthy or overloaded will not get any traffic sent to it. There is also support for CIDR routing.
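The idea of health-aware and load-aware routing can be sketched as follows. This is not the Varnish Traffic Router’s actual algorithm, which the text does not describe; the data model, the 90% utilization threshold, and the fallback to other PoPs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    pop: str
    healthy: bool
    requests_per_sec: float
    capacity_per_sec: float

    @property
    def utilization(self) -> float:
        return self.requests_per_sec / self.capacity_per_sec


def pick_endpoint(endpoints: List[Endpoint], preferred_pop: str,
                  max_utilization: float = 0.9) -> Optional[Endpoint]:
    """Pick the least-loaded usable endpoint, preferring the client's PoP."""
    # Unhealthy or overloaded endpoints never receive traffic.
    usable = [e for e in endpoints if e.healthy and e.utilization < max_utilization]
    local = [e for e in usable if e.pop == preferred_pop]
    candidates = local or usable  # fall back to other PoPs if needed
    return min(candidates, key=lambda e: e.utilization) if candidates else None


endpoints = [
    Endpoint("eu-1", "eu", False, 100, 1000),  # unhealthy
    Endpoint("eu-2", "eu", True, 950, 1000),   # overloaded
    Endpoint("eu-3", "eu", True, 400, 1000),
    Endpoint("na-1", "na", True, 100, 1000),
]
print(pick_endpoint(endpoints, "eu").name)
```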

To avoid reinventing the wheel, Varnish Traffic Router doesn’t implement a custom DNS server but leverages PowerDNS instead.

The Varnish Traffic Router uses PowerDNS to handle all DNS protocol specifics and acts as a remote backend for it. This means that PowerDNS calls an HTTP endpoint to retrieve zone information. This endpoint is a specific listening port on the Varnish Traffic Router.
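To give an idea of what such an exchange could look like, the sketch below answers a lookup in the JSON shape used by the PowerDNS remote backend, as far as I understand its documentation. The field names should be verified against the PowerDNS version you run, and the handler itself is hypothetical: it says nothing about how the Varnish Traffic Router is actually implemented.

```python
from typing import Callable, Dict


def handle_lookup(request: Dict, pick_pop_ip: Callable[[str], str], ttl: int = 60) -> Dict:
    """Answer a remote-backend style lookup with the PoP chosen for the client."""
    params = request.get("parameters", {})
    if request.get("method") != "lookup" or params.get("qtype") not in ("A", "ANY"):
        return {"result": False}  # nothing to answer for this request
    # Assumption: "real-remote" carries the EDNS-derived client address when
    # available and "remote" the resolver address; drop any "/prefix" suffix.
    client = (params.get("real-remote") or params["remote"]).split("/")[0]
    return {
        "result": [{
            "qtype": "A",
            "qname": params["qname"],
            "content": pick_pop_ip(client),  # routing logic lives in the router
            "ttl": ttl,
        }]
    }


request = {
    "method": "lookup",
    "parameters": {"qtype": "ANY", "qname": "www.geo.example.com.", "remote": "192.0.2.24"},
}
print(handle_lookup(request, lambda client_ip: "192.168.1.2"))
```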

The logic, the rules, and the geolocation are all handled inside the traffic router.

This logic can also be exposed for incoming HTTP requests: when a client requests an HTTP resource on the traffic router, it decides based on the client IP address which node in a specific PoP is going to be selected. The result is an HTTP 301 redirect to that node.

For websites, HTTP redirection is not ideal for SEO reasons. But for assets like images, video streams, and other static files, this is a viable solution. Video is the primary use case here.

At the time of writing this book, Varnish Traffic Router is still in development and has not been released. However, chances are that by the time you are reading this, the Varnish Traffic Router will be released, and its management integrated into the Varnish Controller.
