There is a difference between the internet and the World Wide Web: the internet offers us a multitude of protocols to interact with computers over a global network. The World Wide Web is a specific application of the internet that depends on the HTTP protocol and its siblings. This protocol has always been the engine behind web pages and hypermedia, but HTTP has grown and can do so much more now.
Although traditional client-server interactions over HTTP using a web browser are still very common, it’s the fact that machines can communicate with each other over HTTP that took the protocol to the next level.
APIs, service-oriented architectures, remote procedure calls, SOAP, REST, microservices. Over the years, we’ve seen many buzzwords that describe this kind of machine-to-machine communication.
Why develop a custom protocol? Why re-invent the wheel, when HTTP is so accessible?
HTTP is actually an implementation of the Representational State Transfer (REST) architectural style for distributed hypermedia systems.
We know, it’s quite the mouthful. REST was defined in Roy Thomas Fielding’s Ph.D. dissertation, and many of the strengths of HTTP are described in the chapter that introduces it.
HTTP is a pretty simple stateless protocol that is request-response based. There is a notion of resources that can reflect entities of the application’s business logic. Resources can be identified through URLs, and can be represented in various document and file formats.
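For example, a product resource might be identified by a URL and served in whatever representation the client asks for. The URL, host, and payload below are made up for illustration:

GET /products/123 HTTP/1.1
Host: example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{"id": 123, "name": "Sample product"}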
The type of action that is performed through an HTTP request is very explicit: because of request methods, the intent is always clear. A GET request is used for data retrieval, a POST request is all about data insertion. And there are many more valid HTTP request methods, each with its own purpose.
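A hypothetical insertion shows how the method carries the intent; the resource names and values are illustrative:

POST /products HTTP/1.1
Host: example.com
Content-Type: application/json

{"name": "New product"}

HTTP/1.1 201 Created
Location: /products/124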
The real power of HTTP lies in its metadata, which is exposed through request and response headers. This metadata can be processed by clients, servers, or proxies to improve the overall experience.
Some of the metadata is considered hypermedia: it helps users navigate through resources and presents resources in the desired format. In essence, it is what makes the World Wide Web so interactive and so impactful.
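A response like the following, with illustrative header values, lets clients and intermediaries make decisions about caching, compression, and rendering without inspecting the body:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: public, max-age=3600
Vary: Accept-Encoding
Age: 42
Via: 1.1 varnish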
What makes Roy Thomas Fielding’s dissertation so relevant to our use case, which is web acceleration and content delivery, can be summarized in the following quote:
REST provides a set of architectural constraints that, when applied as a whole, emphasizes scalability of component interactions, generality of interfaces, independent deployment of components, and intermediary components to reduce interaction latency, enforce security, and encapsulate legacy systems.
The entire purpose is to make applications scale and to reduce latency. The fact that caching is explicitly mentioned as one of the features of REST makes it a first-class citizen. It is also by design that so-called intermediary components exist.
Varnish is such an intermediary component, and Varnish reduces interaction latency through caching. A tool like Varnish, among others, facilitates the scalability of HTTP, and as a consequence, the scalability of the web.
HTTP is not a perfect protocol. Although it probably tackled most of the issues it was designed for, times change, and use cases evolve.
In the nineties, I knew that HTTP was what powered the interaction between my Netscape browser and some web server. Today, in 2021, I know that HTTP is used by my cell phone when it connects to my car to see if the windows are closed. Now that’s an example of evolving use cases.
HTTP is now used in situations it wasn’t designed for. And although it’s doing an okay job, there are some limitations:
- Caching is supported (through the Cache-Control header), but there is no conventional mechanism for explicit cache purging; a sketch of one common workaround follows this list.
- The Connection header is used by clients to decide whether or not the connection should be closed after processing an HTTP request. Although it’s a popular mechanism, proper use is at the discretion of the client and server.
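Because purging was never standardized, HTTP caches each define their own convention. A widespread one, which Varnish can be configured to accept through VCL, is a custom PURGE method; the method name and URL below are a convention, not part of the HTTP specification:

PURGE /products/123 HTTP/1.1
Host: example.com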
And there are many more limitations. Despite these limitations, and the fact that HTTP has outgrown its original purpose, we still trust HTTP as a protocol with a low barrier to entry.
Although the previous section portrayed the limitations of HTTP, we should emphasize that it has evolved over the years, and continues to do so.
In 1991, Tim Berners-Lee released HTTP as a single-line protocol, which bootstrapped the World Wide Web. Back then, HTTP didn’t have a version number.
It wasn’t until HTTP/1.0 was released back in 1996 that its predecessor received the HTTP/0.9 version number. HTTP/1.0 was already a multi-line protocol and featured headers. In fact, it already looked a lot like the HTTP we know today.
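An HTTP/0.9 request was a single line, without a version number or headers:

GET /index.html

An HTTP/1.0 request carried both; the header values here are placeholders:

GET /index.html HTTP/1.0
User-Agent: Mozilla/1.0
Accept: text/html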
In 1997 HTTP/1.1 was released. As the version number indicates, it
added new features, but it wasn’t a complete overhaul. Here are a couple
of notable feature additions that were part of the HTTP/1.1 release:
- The Host header became a required request header
- Connection: keep-alive
- The OPTIONS method
- Etag and If-None-Match headers (see the revalidation sketch below)
- The Cache-Control header
- The Vary header
- Transfer-Encoding: chunked
- The Content-Encoding header

There are of course many more features in that release, but this overview shows that significant improvements were made. Even in the mid-nineties, when the web was still in its infancy, people had the impression that HTTP was used beyond its scope. HTTP had to improve, which it did, and still does to this day.
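To make the conditional request mechanism concrete, here is a revalidation exchange; the tag value and URL are illustrative. The server first hands out a validator:

HTTP/1.1 200 OK
Etag: "abc123"
Content-Type: text/html

On the next visit, the client presents the validator, and the server can skip the body if nothing changed:

GET /index.html HTTP/1.1
Host: example.com
If-None-Match: "abc123"

HTTP/1.1 304 Not Modified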
In 2015, HTTP/2 was officially released as the new major version of
the protocol. HTTP/2 was inspired by SPDY, a protocol invented by
Google to improve transport performance and reduce latency.
In HTTP/1.1, only one request at a time could be sent over a single connection, and that request had to complete before the next one could be handled. This is referred to as head-of-line blocking. In order to benefit from concurrency, multiple TCP connections had to be opened.
This obviously has a major impact on performance and latency, and is
further amplified by the fact that modern websites require an increasing
amount of resources: JavaScript files, CSS files, web fonts, AJAX calls,
and much more.
HTTP/2 tackles these inefficiencies by enabling full request and
response multiplexing. This means that one TCP connection can exchange
and process multiple HTTP requests and responses at the same time.
The protocol became binary. Messages, both requests and responses, were fragmented into frames. Headers and payload were stored in different frames, and correlated frames made up a message. These frames and messages were sent over the wire on one or multiple streams, but all within the same connection.
This shift in the way message transport was approached resulted in fewer TCP connections per transaction, less head-of-line blocking, and lower latency. And fewer connections means fewer TLS handshakes, reducing overhead even further.
Another benefit of HTTP/2 is the fact that headers can also be compressed, using the HPACK algorithm. This feature was long overdue because payload compression was already quite common.
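To check which protocol version a server negotiates, curl can report it; this assumes a curl build with HTTP/2 support:

curl -s -o /dev/null -w "%{http_version}\n" --http2 https://example.com/

The command prints 2 when the exchange happened over HTTP/2, and 1.1 when the server fell back.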
With HTTP/2 we significantly reduced latency by multiplexing requests
and responses over a single TCP connection. This solves head-of-line
blocking from an HTTP point of view, but not necessarily from a TCP
point of view.
When packet loss occurs, there is head-of-line blocking, but it’s at the TCP level. Even if the packet loss only occurs on a single request or response, all the other messages are blocked. TCP has no notion of what is going on in higher-level protocols, such as HTTP.
HTTP/3 aims to solve this issue by no longer relying on TCP, but by using a different transport protocol: QUIC.
QUIC looks a lot like TCP, but is built on top of UDP. UDP has no loss recovery mechanisms in place and is a so-called fire-and-forget protocol. Because UDP imposes no ordering or retransmission of its own, QUIC can recover lost packets per stream and multiplex without the risk of head-of-line blocking. Potential packet loss will only affect the stream it occurs on, and will not block other transactions.
QUIC does implement a low-overhead form of handshaking that doesn’t rely on the underlying protocol. As a matter of fact, TLS negotiation is part of the QUIC handshake. This heavily reduces extra roundtrips compared to TLS on top of TCP.
This new QUIC protocol is a very good match for HTTP, and moves a lot of the transport logic from the transport layer into user space. This allows for HTTP to be a lot smarter when it comes to transport and message exchange, and makes the underlying transport protocol a lot more robust.
What initially was called HTTP over QUIC officially became HTTP/3 in 2018. The HPACK header compression used in HTTP/2 turned out to be incompatible with QUIC, which resulted in the need to bump the major version of HTTP; HTTP/3 uses QPACK instead.
Most web browsers offer HTTP/3 support, but on the web server front, it is still early days. LiteSpeed and Caddy are web servers that support it, but there is no support for it in Apache, and Nginx only has a tech preview of HTTP/3 available.
Varnish supports HTTP/1.1 and HTTP/2. Requests that are sent using HTTP/0.9 or HTTP/1.0 will result in an HTTP/1.1 response.
As you will see in the next sections, Varnish will leverage many HTTP features to decide whether or not a response will be stored in cache, for how long it will be stored in cache, how it is stored in cache, and how the content will be delivered to the client.
Here’s a quick preview of Varnish default behavior with regard to HTTP:
- Varnish only caches GET or HEAD requests.
- Varnish respects the Cache-Control header and uses its values to decide whether or not to cache and for how long.
- The Expires header is also supported and is processed when there’s no Cache-Control header in the response.
- Varnish supports the Etag and Last-Modified response headers and compares them to the If-None-Match and If-Modified-Since request headers for conditional requests.
- The stale-while-revalidate attribute from the Cache-Control header is used by Varnish to determine how long stale content should be served while Varnish is revalidating the content.
- For range requests, Varnish compares the If-Range request header to the values of either an Etag header or a Last-Modified header.

This is just default behavior. Custom behavior can be defined in VCL and can be used to leverage other parts of HTTP that are not implemented by default.
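As a minimal sketch of such custom behavior, the following VCL caches image responses for a day, regardless of what the backend’s caching headers would yield; the backend address is a placeholder:

vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Cache image responses for one day, overriding
    # the TTL derived from the backend's headers.
    if (beresp.http.Content-Type ~ "^image/") {
        set beresp.ttl = 1d;
    }
}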
Getting back to HTTP/2: Varnish supports it, but you need to add the
following feature flag to enable support for H2:
-p feature=+http2
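In context, the parameter is passed when starting varnishd; the listening address and VCL path below are illustrative:

varnishd -a :80 -f /etc/varnish/default.vcl -p feature=+http2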
Because the browser community enforced HTTPS for H2, you need to make sure your TLS proxy has H2 as a valid ALPN protocol.
If you use Hitch to terminate your TLS connection, you can add the following value to your Hitch configuration file:
alpn-protos = "h2, http/1.1"
If you use a recent version of Varnish Enterprise, you can enable native TLS support, which will handle the ALPN part for you.
HTTP/3 is on the roadmap for both Varnish Cache and Varnish
Enterprise, but the implementation is only in the planning stage for
now, with no estimated time of delivery as the protocol itself hasn’t
been finalized yet.
The changes needed to support HTTP/3 are substantial, and such changes
will always warrant an increase of the major version number.
Basically, it will take at least until Varnish 7 for HTTP/3 to be supported.