The tcp vmod contains functions to control TCP congestion control algorithms,
set pacing (rate limiting) and perform logging of protocol-related information.
import std;
import tcp;

sub vcl_recv
{
    # Limit all clients to 1000 KB/s.
    tcp.set_socket_pace(1000);
}
import std;
import tcp;

sub vcl_recv
{
    set req.http.X-Tcp = tcp.congestion_algorithm("bbr");
}
Here, the X-Tcp header field is set to 0 if changing the congestion control algorithm succeeds. Otherwise, it is set to -1, indicating an error.
See the tcp.congestion_algorithm() function for more information about congestion control algorithms.
INT congestion_algorithm(STRING algorithm)
Set the client socket congestion control algorithm to algorithm. Returns 0 on success, and -1 on error.
sub vcl_recv {
    set req.http.x-tcp = tcp.congestion_algorithm("cubic");
}
To see your available algorithms:
# sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = reno cubic bbr
The bbr congestion control algorithm requires kernel version 4.9.0 or later. See: https://www.vultr.com/docs/how-to-deploy-google-bbr-on-centos-7
Arguments:
algorithm accepts type STRING
Type: Function
Returns: Int
Restricted to: client
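As a minimal sketch of acting on the return value described above, the following VCL logs a message when the algorithm cannot be set. The backend address and the log text are placeholders chosen for illustration and are not part of the vmod.

```vcl
vcl 4.1;

import std;
import tcp;

# Placeholder backend, assumed for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # congestion_algorithm() returns 0 on success and -1 on error.
    if (tcp.congestion_algorithm("bbr") != 0) {
        std.log("Could not switch the client socket to bbr");
    }
}
```

If the requested algorithm is not listed by the sysctl shown earlier, the call is expected to take the error branch.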
VOID dump_info()
Write the contents of the TCP_INFO data structure into varnishlog.
sub vcl_recv {
    tcp.dump_info();
}
The varnishlog output could look like this:
VCL_Log tcpi: snd_mss=1448 rcv_mss=536 lost=0 retrans=0
VCL_Log tcpi2: pmtu=1500 rtt=12042 rttvar=6021 snd_cwnd=10 advmss=1448 reordering=3
This function is provided for backward compatibility. Refer to log_info()
for the meaning of the different values and for a better way of getting
the kernel TCP info metrics.
Arguments: None
Type: Function
Returns: None
Restricted to: client
VOID log_info(STRING record_prefix = "tcpinfo", STRING fields = "snd_mss,rcv_mss,segs_out,total_retrans,delta:segs_out,delta:total_retrans,pmtu,rtt,rttvar", ENUM {text, column, json} format = text)
This function produces a log record when processing of the client request
ends. The record contains the values of the requested fields from the TCP
information reported by the kernel.
The following fields can be reported, as calculated by the Linux kernel:
advmss : advertised maximum segment size
data_segs_in : number of payload packets received from the client
data_segs_out : number of payload packets sent to the client
lost : number of currently queued packets marked lost
min_rtt : minimum estimated round trip time observed (in microseconds)
notsent_bytes : bytes ready to be sent to the client which have not been sent yet
pmtu : number of bytes which can be transmitted in a single packet
rcv_mss : maximum segment size of received packets from client
rcv_rtt : estimated round trip time of the client (in microseconds)
rcv_ssthresh : maximum receive capacity advertised to the client
retrans : number of currently queued packets being actively retransmitted
reordering : maximum number of duplicate acknowledgements before retransmitting
rtt : estimated round trip time (in microseconds)
rttvar : estimated mean deviation of the round trip time (variance)
segs_in : number of packets received from the client
segs_out : number of packets sent to the client
snd_cwnd : maximum number of packets that can be waiting for client acknowledgement (congestion window)
snd_mss : maximum segment size for transmitting to the client
snd_ssthresh : number of packets in the slow start threshold for transmitting to the client
total_retrans : number of packets retransmitted
Each item in the list can be prefixed with delta:, meaning that
the output is the difference between the value at the end of processing
and the value at the time this function was invoked from VCL.
The record_prefix is used as a prefix in the output log and can later be
used as a record selection criterion in a VSL query.
If format is set to column, only the values are written to the output log.
If format is set to json, the key-value pairs are written as a JSON
document in which all values are JSON numbers.
Note that the JSON format is subject to change in the future.
Arguments:
record_prefix accepts type STRING with a default value of tcpinfo (optional)
fields accepts type STRING with a default value of snd_mss,rcv_mss,segs_out,total_retrans,delta:segs_out,delta:total_retrans,pmtu,rtt,rttvar (optional)
format is an ENUM that accepts values of text, column, and json, with a default value of text (optional)
Type: Function
Returns: None
Restricted to: client
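The parameters above can be combined as in the following sketch, which requests a reduced set of fields in JSON format. The prefix "tcpdbg" and the choice of fields are arbitrary examples, and the backend definition is a placeholder; the arguments are passed positionally in the order given by the signature.

```vcl
vcl 4.1;

import tcp;

# Placeholder backend, assumed for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Arguments in signature order: record_prefix, fields, format.
    # delta:total_retrans reports retransmissions that occurred between
    # this call and the end of request processing.
    tcp.log_info("tcpdbg", "rtt,rttvar,delta:total_retrans", json);
}
```

Because the record is written when request processing ends, calling the function early in vcl_recv makes the delta: fields cover as much of the request as possible.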
REAL get_estimated_rtt()
Get the estimated round-trip time for the client socket, measured in milliseconds.
sub vcl_recv
{
    if (tcp.get_estimated_rtt() > 300) {
        std.log("Client is far away!");
    }
}
Arguments: None
Type: Function
Returns: Real
Restricted to: client
VOID set_socket_pace(INT, ENUM {sess, req} scope = sess)
Socket pacing is a Linux method for rate limiting TCP connections in a network-friendly way.
This function controls TCP rate limiting for the client connection, where pace is measured
in KB/s. The outgoing network interface must be configured with a supported scheduler, such as fq.
sub vcl_recv
{
    # Set the maximum bandwidth to 1000 KB/s for this client.
    # This only takes effect if the current network scheduler supports pacing.
    # The function returns nothing, so there is no return value to check.
    tcp.set_socket_pace(1000);
}
Servers utilizing rate limiting must change their network scheduler. This can be changed with a sysctl setting:
net.core.default_qdisc=fq
See: https://wiki.mikejung.biz/Sysctl_tweaks
The scope parameter has two options, req and sess.
Note that this is a no-op for HTTP/2 clients when used with req scope.
Arguments:
scope is an ENUM that accepts values of sess and req, with a default value of sess (optional)
Type: Function
Returns: None
Restricted to: client
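The following sketch passes the scope argument explicitly. It assumes that req restricts pacing to the current request, which the text above does not spell out. The /downloads/ URL prefix, the 500 KB/s figure, and the backend definition are hypothetical.

```vcl
vcl 4.1;

import tcp;

# Placeholder backend, assumed for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Pace only requests under a hypothetical download path.
    # As noted above, req scope is a no-op for HTTP/2 clients.
    if (req.url ~ "^/downloads/") {
        tcp.set_socket_pace(500, req);
    }
}
```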
INT get_socket_pace()
Get the socket pace.
Arguments: None
Type: Function
Returns: Int
Restricted to: client
INT get_quick_ack()
Get the current setting of the TCP_QUICKACK socket option of the client socket.
Arguments: None
Type: Function
Returns: Int
Restricted to: client
VOID set_quick_ack(INT quickack)
Set the TCP_QUICKACK socket option of the client socket. When quickack is 1, a TCP ACK is sent immediately instead of waiting for TCP to select an appropriate time to send one. This is useful to work around a client that sends small messages without using TCP_NODELAY. TCP may override this setting at a later time or delay ACKs for other reasons. This function may only be called from the client side.
Arguments:
quickack accepts type INT
Type: Function
Returns: None
Restricted to: client
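A minimal sketch of using set_quick_ack() together with get_quick_ack() follows. The X-Quick-Ack header name and the backend definition are hypothetical and serve only to make the value visible in the request.

```vcl
vcl 4.1;

import tcp;

# Placeholder backend, assumed for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Ask the kernel to send ACKs immediately on the client socket.
    tcp.set_quick_ack(1);

    # Record the value the socket option currently holds. TCP may
    # override the setting, so this is not guaranteed to stay 1.
    set req.http.X-Quick-Ack = tcp.get_quick_ack();
}
```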
The tcp VMOD is available in Varnish Enterprise version 6.0.0r0 and later.