Historically, Varnish has been associated with web acceleration: it was invented to speed up websites, and the majority of Varnish users have a web acceleration use case.
However, Varnish is not solely built for websites: it is an HTTP accelerator, and there are far more HTTP use cases than just websites.
Accelerating APIs is a good example of an alternative use case for Varnish: APIs interpret HTTP requests and return HTTP responses, but they do not return HTML output. In most cases a REST API returns JSON or XML.
One could say that accelerating a REST API is more straightforward than speeding up a website, because REST APIs inherently respect the best practices of HTTP caching: responses are typically stateless, and they tend to carry explicit Cache-Control headers.
API authentication is a more complicated matter: as soon as an Authorization header appears, caching usually goes out the window. Just like cookies, auth headers are a mechanism to keep track of state. They imply that the data is for your eyes only and hence cannot be cached.
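Varnish's built-in VCL reflects this behavior: by default, requests that carry a Cookie or Authorization header bypass the cache. A simplified excerpt of that default logic looks like this:

```vcl
sub vcl_recv {
    if (req.http.Authorization || req.http.Cookie) {
        /* Stateful requests are not cacheable by default */
        return (pass);
    }
}
```

Because this logic is built in, you don't have to write it yourself; you only write VCL when you want to deviate from it.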
Of course Varnish has an elegant way to work around these limitations, but we’ll talk about state and authentication on the edge at a later stage in the book.
Let’s rewind for a minute, and focus on website acceleration.
Generating the HTML markup for a dynamic website is often done using server-side languages and runtimes like PHP, Python, ASP.NET, Ruby, or Node.js. These have the ability to interact with databases and APIs. Although they seem quite fast, they are prone to heavy load as soon as concurrency increases.
The concurrency aspect is very significant: yes, the application logic that generates HTML output consumes CPU, RAM, and disk I/O. But most of the time is spent waiting for external resources, such as databases or APIs. While your application is waiting, those resources cannot be freed, and the connection between the client and the server remains open for the duration of the request.
On a small scale, this has no impact on the user experience, but at a larger scale, more memory and CPU are used, and a lot more time is spent waiting for results to be returned by databases. Eventually you'll run out of available connections and memory, and your CPU usage may spike.
The stability of your entire website is in jeopardy. These are all problems that weren’t tangible at small scale, but the risk was always there.
The bottom line is that code has an impact on the performance of the server. This impact is amplified by the concurrency of visits to your website. By putting Varnish in front of your web server, the impact of the code is mitigated: by caching the HTML output in Varnish, user requests can immediately be satisfied without having to execute the application logic.
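As an illustration, here is a minimal VCL sketch of this idea: caching generated HTML for an hour so that repeated requests never reach the application logic. The backend address is hypothetical, and the one-hour TTL is an arbitrary example value.

```vcl
vcl 4.1;

# Hypothetical origin server running the application logic
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Cache generated HTML for an hour, regardless of what
    # the origin's caching headers say
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.ttl = 1h;
    }
}
```

During that hour, only the first request executes the application code; every subsequent request is served straight from the cache.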
The resource consumption and stability aspect also applies to APIs of course, and to any HTTP platform that requires a lot of computation to generate output.
You’ll also notice that websites consist of more than text formatted in HTML markup: they also contain images, stylesheets, JavaScript files, fonts, and downloadable documents.
All these files and documents need to be accelerated too. Some of them are a lot bigger in size. Although modern web servers don’t need a lot of CPU power or memory to serve them, there are bottlenecks along the way that require a reverse caching proxy like Varnish.
As mentioned before: web servers only have a limited number of available connections. At large scale you quickly run out of available connections. Using Varnish will mitigate that risk.
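As a sketch, such static files can be cached aggressively based on their file extension. The extension list and the one-day TTL below are illustrative choices, not a prescription:

```vcl
sub vcl_backend_response {
    # Static assets rarely change: cache them for a day
    # (extension list and TTL are example values)
    if (bereq.url ~ "\.(png|jpg|gif|css|js|woff2|pdf)$") {
        set beresp.ttl = 1d;
    }
}
```

With rules like this, the web server only serves each static file once per day per Varnish node, freeing up its connections for requests that genuinely need the origin.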
Another aspect is the geographical distance between the user and the server, and the latency issues that come into play: transmitting images and other large files to the other side of the world will increase latency. That’s just physics: light can only travel so fast through fiber-optic cables.
Having servers close to your users will reduce that latency, which has a positive impact on the quality of experience. By putting Varnish servers in different locations, you can efficiently reduce latency, but you also horizontally scale out the capacity of your web platform, both in terms of server load and bandwidth.
Let’s talk about bandwidth for a minute. At scale, the first problem you’ll encounter is a lack of server resources. With Varnish, you’ll be able to handle a lot more concurrent users, which will expose the next hurdle: a lack of bandwidth.
Your web/HTTP platform might have limited network throughput. Your network may be throttled. Maybe you operate at such a scale that you don’t have sufficient network resources at your disposal.
In those cases it also makes sense to distribute your Varnish servers across various locations: not just to reduce latency, but also to be on multiple networks that have the required capacity.
This use case may sound familiar, and it is exactly the problem that a content delivery network (CDN) tries to tackle: by placing caching nodes in different points of presence (PoPs), latency is reduced, network traffic to a single server is reduced, and excessive server load is tackled as well.
Varnish can serve as a private content delivery network (private CDN), accelerating content close to the consumer. Even caching large volumes of content is not a problem: a multi-tier Varnish architecture, with edge nodes serving hot content and storage nodes sharding the rest of the content across multiple machines, allows you to cache petabytes of data in a horizontally scalable way.
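To make the tiering idea concrete, here is a minimal sketch of an edge node's configuration: on a cache miss, it fetches from a storage-tier Varnish node rather than from the origin. The host name is hypothetical.

```vcl
vcl 4.1;

# Storage-tier Varnish node holding the long tail of content
# (host name is hypothetical)
backend storage_tier {
    .host = "storage-1.cdn.internal";
    .port = "80";
}

sub vcl_backend_fetch {
    # Edge misses are resolved by the storage tier,
    # which in turn shields the origin
    set bereq.backend = storage_tier;
}
```

Stacking caches this way means the origin only sees traffic that missed both tiers, which is what makes petabyte-scale caching feasible.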
Varnish Enterprise even has a purpose-built stevedore that combines memory and optimized disk storage to build your own private CDN. It’s called the Massive Storage Engine and it is covered in depth in chapter 7.
The why and the how of private CDNs is further explained in chapter 9.
A more unexpected use case for Varnish is the acceleration of online video streaming platforms, or OTT platforms as we call them. More than 80% of the internet’s bandwidth is used to serve video. That is a staggering number, and video comes with its own unique content delivery challenges.
Online video is not distributed using traditional broadcast networks, but over the top (OTT), meaning that a third-party network, in this case the internet, is used to deliver the content to viewers. The distribution of this type of video also uses HTTP as its go-to protocol, and once again, Varnish is a perfect fit to accelerate OTT video.
Accelerating video has many similarities with running a private CDN: reducing latency, offloading bandwidth, and scaling horizontally all apply here as well.
Although video streaming acceleration may seem like a carbon copy of a regular private CDN, there are some unique challenges.
A lot of it has to do with how online video is packaged. OTT video, both live and on demand, is chopped up into segments. Each segment represents, on average, six seconds of video, which means a video player has to fetch the next segment every six seconds. For 4K video, a six-second segment requires transmitting between 10 MB and 20 MB of data. Audio is often delivered as a separate stream, and the same applies to subtitles.
A single 4K stream consumes at least 6 GB per hour: a 10 MB segment every six seconds adds up to 600 segments, or roughly 6 GB, per hour. Not only does this pose an enormous bandwidth challenge; low latency is also important for the continuity of the video stream, and of course the quality of experience. The slightest delay would force the video player to rebuffer the content.
Some of Varnish’s features are ideal for caching live video streams, guaranteeing low latency. For video on demand (VoD), the enterprise product has the storage capabilities as well as a module to prefetch the next video segment.
Varnish for OTT video is discussed in depth in chapter 10.
Varnish operates on the edge, which is the outer tier of your web platform. It is responsible for handling requests from the outside world.
From an operational point of view, the outside world comes with a lot of risk. Ensuring the stability of your platform is key, and a reverse caching proxy like Varnish has an important role in maintaining that stability.
We already talked about performance. We also talked about scalability, which is maintaining performance and stability at large scale. These are some of the risks that we try to mitigate.
Another important aspect of risk mitigation is security: it’s not always a large number of concurrent visitors that jeopardizes stability; it’s also about what these visitors do when they’re on your platform.
Websites, APIs, and content delivery solutions all consist of many pieces of software. Often this software has layer upon layer of components, third-party libraries, and tons of business logic that is written in-house. Keeping all that software secure is a massive undertaking. From the operating system to the web server, from encryption libraries to the component that allows your code to interact with the database: more than 90% of the code is written and maintained by third parties.
Although many organizations have the discipline to install security updates as soon as they become available, it’s not always clear what needs to be patched. Hackers and cybercriminals are a lot more aware of the vulnerabilities out there, and they’re not afraid to exploit them.
There are many VCL code snippets available that try to detect malicious access to web platforms, from SQL injection to cross-site scripting (XSS) attempts. In this context, Varnish assumes the role of a web application firewall (WAF), blocking malicious requests. Although these VCL snippets work to some extent, they are hard to maintain, and hardly as effective as well-respected WAF projects like ModSecurity.
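To illustrate what such a snippet looks like, and why it falls short, here is a deliberately naive example that rejects requests matching a couple of attack patterns. The patterns are illustrative only: lists like this are trivially bypassed, which is exactly why maintained rule sets exist.

```vcl
sub vcl_recv {
    # Naive illustration only: hand-written pattern lists are hard
    # to maintain and easy to evade, unlike curated WAF rule sets
    if (req.url ~ "(?i)(union[%+ ]select|<script)") {
        return (synth(403, "Forbidden"));
    }
}
```

A real attacker only needs a slightly different encoding of the same payload to slip past these two patterns, whereas a project like ModSecurity ships hundreds of continuously updated rules.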
Varnish Enterprise has a WAF add-on module that wraps around ModSecurity. It allows for all traffic to be inspected by ModSecurity and is configurable using VCL. Suspicious requests are blocked and never reach your origin.
The Varnish WAF supports all ModSecurity features and the full rule set, including the OWASP Core Rule Set, which protects against common attack classes such as SQL injection and cross-site scripting.
When a security vulnerability is detected and reported in the list of Common Vulnerabilities and Exposures (CVE), the expectation is that the vulnerability is fixed at the source. Unfortunately, software maintainers aren’t always quick enough to respond in time. Luckily, ModSecurity proactively releases new rules to protect your origin against so-called zero-day attacks.
The Varnish WAF and security in general will be covered in much more detail in chapter 7.