vsthrottle

Description

The vsthrottle VMOD rate-limits traffic on a single Varnish server. It offers a simple interface for throttling traffic on a per-key basis to a specific request rate.

Keys can be specified from any VCL string, e.g. based on client.ip, a specific cookie value, an API token, etc.
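
As a minimal sketch of a cookie-based key, assuming a hypothetical session cookie named "sid" (the cookie name, the "sid:" prefix and the limits are illustrative and not part of the VMOD):

sub vcl_recv {
  # Throttle per session only when the hypothetical "sid" cookie is present.
  # Without this guard, regsub() would return the whole Cookie header
  # unchanged whenever the pattern does not match.
  if (req.http.Cookie ~ "(^|;\s*)sid=") {
    if (vsthrottle.is_denied(
        "sid:" + regsub(req.http.Cookie, "^(?:.*;\s*)?sid=([^;]*).*$", "\1"),
        100, 60s)) {
      return (synth(429, "Too Many Requests"));
    }
  }
}

Prefixing the key ("sid:") keeps these buckets apart from buckets keyed on other values, such as client.identity.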

The request rate is specified as the number of requests permitted over a period. To keep things simple, this is passed as two separate parameters, ‘limit’ and ‘period’.

If the optional duration ‘block’ is specified, access is denied altogether for that length of time after the rate limit is reached. This makes it possible to turn away a particularly troublesome source of traffic for a while, rather than letting it back in as soon as its rate slips back under the threshold.

This VMOD implements a token bucket algorithm. State associated with the token bucket for each key is stored in-memory using BSD’s red-black tree implementation.

Token bucket algorithm: http://en.wikipedia.org/wiki/Token_bucket

Memory usage is around 100 bytes per tracked key. When keying on client.ip, for example, each unique client IP consumes about 100 bytes, so one million distinct IPs would need roughly 100 MB.

Note that records accumulate over time. The vsthrottle VMOD performs garbage collection: any key that has not seen activity within its configured rate limit period is cleaned up. For instance, with a rate of 50 requests every 2 minutes, only the unique IP addresses seen during the previous two minutes are retained.

Example

vcl 4.0;
import vsthrottle;

backend default { .host = "192.0.2.11"; .port = "8080"; }

sub vcl_recv {
  # Varnish will set client.identity for you based on client IP.

  if (vsthrottle.is_denied(client.identity, 15, 10s, 30s)) {
    # Client has exceeded 15 reqs per 10s.
    # When this happens, block altogether for the next 30s.
    return (synth(429, "Too Many Requests"));
  }

  # There is a quota per API key that must be fulfilled.
  if (vsthrottle.is_denied("apikey:" + req.http.Key, 30, 60s)) {
    return (synth(429, "Too Many Requests"));
  }

  # Only allow a few POST/PUTs per client.
  if (req.method == "POST" || req.method == "PUT") {
    if (vsthrottle.is_denied("rw" + client.identity, 2, 10s)) {
      return (synth(429, "Too Many Requests"));
    }
  }
}

API

is_denied

BOOL is_denied(STRING key, INT limit, DURATION period, DURATION block = 0)

Rate-limits the traffic for a specific key to a maximum of ‘limit’ requests per ‘period’. Returns true when the request should be denied. If ‘block’ is greater than 0s (the default is 0s), every request for ‘key’ is denied for that length of time after the threshold is hit.

Note: A token bucket is uniquely identified by the 4-tuple of its key, limit, period and block, so using the same key in multiple places with different rules will create multiple token buckets.

Example

sub vcl_recv {
  if (vsthrottle.is_denied(client.identity, 15, 10s)) {
    # Client has exceeded 15 reqs per 10s
    return (synth(429, "Too Many Requests"));
  }
  # ...
}
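
Because a bucket is identified by all four arguments, the same key can carry several independent rules. The following is a minimal sketch of that; the limits are illustrative assumptions, not recommendations:

sub vcl_recv {
  # Bucket 1: short burst limit for this client.
  if (vsthrottle.is_denied(client.identity, 10, 1s)) {
    return (synth(429, "Too Many Requests"));
  }
  # Bucket 2: sustained limit for the same key. It is a separate bucket
  # because limit and period differ, and it is only reached (and charged)
  # when the burst check above has passed.
  if (vsthrottle.is_denied(client.identity, 1000, 1h)) {
    return (synth(429, "Too Many Requests"));
  }
}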

Arguments:

key accepts type STRING
limit accepts type INT
period accepts type DURATION
block accepts type DURATION (optional, default value 0)

Type: Function

Returns: Bool

return_token

VOID return_token(STRING key, INT limit, DURATION period, DURATION block = 0)

Increment (by one) the number of tokens in the specified bucket. is_denied() decrements the bucket by one token and return_token() adds it back. Using these two, you can effectively make a token bucket act like a limit on concurrent requests instead of a limit on requests per unit of time.

Note: This function doesn’t enforce anything; it merely credits a token to the appropriate bucket.

Warning: If streaming is enabled (beresp.do_stream = true) as it is by default now, sub vcl_deliver() is called before the response is sent to the client (who may download it slowly). Thus you may credit the token back too early if you use return_token() in sub vcl_backend_response().

Example

sub vcl_recv {
  if (vsthrottle.is_denied(client.identity, 20, 20s)) {
    # Client has more than 20 concurrent requests
    return (synth(429, "Too Many Requests In Flight"));
  }
  # ...
}

sub vcl_deliver {
  vsthrottle.return_token(client.identity, 20, 20s);
}
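
When only some requests are limited, the token should be credited back under the same condition that consumed it. A minimal sketch, assuming a hypothetical "/download/" URL prefix and illustrative limits:

sub vcl_recv {
  # Limit concurrent downloads per client; other URLs are not throttled.
  if (req.url ~ "^/download/") {
    if (vsthrottle.is_denied("dl:" + client.identity, 4, 60s)) {
      return (synth(429, "Too Many Requests In Flight"));
    }
  }
}

sub vcl_deliver {
  # Credit the token back only for requests that took one in vcl_recv.
  # Key, limit and period must match the is_denied() call exactly.
  if (req.url ~ "^/download/") {
    vsthrottle.return_token("dl:" + client.identity, 4, 60s);
  }
}

This sketch assumes req.url is not rewritten between vcl_recv and vcl_deliver; if it is, the two conditions no longer agree and tokens are credited to the wrong requests.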

Arguments:

key accepts type STRING
limit accepts type INT
period accepts type DURATION
block accepts type DURATION (optional, default value 0)

Type: Function

Returns: None

remaining

INT remaining(STRING key, INT limit, DURATION period, DURATION block = 0)

Get the current number of tokens for a given token bucket. This can be used to create a response header to inform clients of their current quota.

sub vcl_deliver {
  set resp.http.X-RateLimit-Remaining = vsthrottle.remaining(client.identity, 15, 10s);
}
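
Since the bucket is identified by key, limit, period and block together, remaining() has to be called with the same arguments as the is_denied() call it reports on. A minimal sketch, in which the X-RateLimit-Limit header is an illustrative addition and not something the VMOD sets:

sub vcl_recv {
  if (vsthrottle.is_denied(client.identity, 15, 10s, 30s)) {
    return (synth(429, "Too Many Requests"));
  }
}

sub vcl_deliver {
  # Same key, limit, period and block as in vcl_recv, so this reads
  # the bucket that is_denied() charged.
  set resp.http.X-RateLimit-Limit = "15";
  set resp.http.X-RateLimit-Remaining =
      vsthrottle.remaining(client.identity, 15, 10s, 30s);
}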

Arguments:

key accepts type STRING
limit accepts type INT
period accepts type DURATION
block accepts type DURATION (optional, default value 0)

Type: Function

Returns: Int

blocked

DURATION blocked(STRING key, INT limit, DURATION period, DURATION block)

If the token bucket identified by the four parameters has been blocked through the ‘block’ parameter of is_denied(), return the time remaining in the block. If it is not blocked, return 0s. This can be used to inform clients how long they will be locked out.

sub vcl_deliver {
  set resp.http.Retry-After =
      vsthrottle.blocked(client.identity, 15, 10s, 30s);
}
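
Requests rejected with return (synth(...)) are handled in vcl_synth rather than vcl_deliver, so a blocked client only sees the header if it is set there. A minimal sketch under that assumption, reusing the limits from the example above:

sub vcl_recv {
  if (vsthrottle.is_denied(client.identity, 15, 10s, 30s)) {
    return (synth(429, "Too Many Requests"));
  }
}

sub vcl_synth {
  # Only annotate the throttling response. The four arguments match
  # the is_denied() call so that the same bucket is inspected.
  if (resp.status == 429) {
    set resp.http.Retry-After =
        vsthrottle.blocked(client.identity, 15, 10s, 30s);
  }
}

The sketch leaves the string rendering of the DURATION to Varnish. Retry-After formally expects whole seconds, so the value may need adjusting where strict formatting matters.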

Arguments:

key accepts type STRING
limit accepts type INT
period accepts type DURATION
block accepts type DURATION

Type: Function

Returns: Duration

Availability

The vsthrottle VMOD is available in Varnish Enterprise version 6.0.0r0 and later.