This VMOD provides utility functions and an object for the VCL data type BLOB, which may contain arbitrary data of any length.
Examples
sub vcl_init {
# Create blob objects from encodings such as base64 or hex.
new myblob = blob.blob(BASE64, "Zm9vYmFy");
new yourblob = blob.blob(encoded="666F6F", decoding=HEX);
}
sub vcl_deliver {
# The .get() method retrieves the BLOB from an object.
set resp.http.MyBlob-As-Hex
= blob.encode(blob=myblob.get(), encoding=HEX);
# The .encode() method efficiently retrieves an encoding.
set resp.http.YourBlob-As-Base64 = yourblob.encode(BASE64);
# decode() and encode() functions convert blobs to text and
# vice versa at runtime.
set resp.http.Base64-Encoded
= blob.encode(BASE64,
blob=blob.decode(HEX,
encoded=req.http.Hex-Encoded));
}
sub vcl_recv {
# transcode() converts from one encoding to another.
# case=UPPER specifies upper-case hex digits A-F.
set req.http.Hex-Encoded
= blob.transcode(decoding=BASE64, encoding=HEX,
case=UPPER, encoded="YmF6");
# transcode() from URL to IDENTITY effects a URL decode.
set req.url = blob.transcode(encoded=req.url, decoding=URL);
# transcode() from IDENTITY to URL effects a URL encode.
set req.http.url_urlcoded
= blob.transcode(encoded=req.url, encoding=URL);
}
Binary-to-text encoding schemes are specified by ENUMs in the VMOD’s constructor, methods and functions. Decodings convert a (possibly concatenated) string into a blob, while encodings convert a blob into a string.
ENUM values for an encoding scheme can be one of:
IDENTITY
BASE64
BASE64URL
BASE64URLNOPAD
HEX
URL
Empty strings are decoded into a “null blob” (of length 0), and conversely a null blob is encoded as the empty string.
For encodings with HEX
or URL
, you may also specify a case
ENUM with one of the values LOWER
, UPPER
or DEFAULT
to
produce a string with lower- or uppercase hex digits (in [a-f]
or
[A-F]
). The default value for case
is DEFAULT
, which for
HEX
and URL
means the same as LOWER
.
The case
ENUM is not relevant for decodings; HEX
or URL
strings to be decoded as BLOBs may have hex digits in either case, or
in mixed case.
The case
ENUM MUST be set to DEFAULT
for the other encodings
(BASE64* and IDENTITY). You cannot, for example, produce an uppercase
string by using the IDENTITY scheme with case=UPPER
. To change the
case of a string, use the toupper
or tolower
functions from
vmod_std(3).
The simplest encoding converts between the BLOB and STRING data types, leaving the contents byte-identical.
Note that a BLOB may contain a null byte at any position before its end; if such a BLOB is decoded with IDENTITY, the resulting STRING will have a null byte at that position. Since VCL strings, like C strings, are represented with a terminating null byte, the string will be truncated, appearing to contain less data than the original blob. For example
# Decode from the hex encoding for "foo\0bar".
# The header will be seen as "foo".
set resp.http.Trunced-Foo1
= blob.encode(IDENTITY, blob=blob.decode(HEX,
encoded="666f6f00626172"));
IDENTITY is the default encoding and decoding. So the above can also be written as
# Decode from the hex encoding for "foo\0bar".
# The header will be seen as "foo".
set resp.http.Trunced-Foo2
= blob.encode(blob=blob.decode(HEX, encoded="666f6f00626172"));
The case
ENUM MUST be set to DEFAULT
for IDENTITY
encodings.
The base64 encoding schemes use 4 characters to encode 3 bytes. There are no newlines or maximal line lengths – whitespace is not permitted.
The BASE64
encoding uses the alphanumeric characters, +
and
/
; and encoded strings are padded with the =
character so that
their length is always a multiple of four.
The BASE64URL
encoding also uses the alphanumeric characters, but
-
and _
instead of +
and /
, so that an encoded string
can be used safely in a URL. This scheme also uses the padding
character =
.
The BASE64URLNOPAD
encoding uses the same alphabet as
BASE6URL
, but leaves out the padding. Thus the length of an
encoding with this scheme is not necessarily a multiple of four.
The case
ENUM MUST be set to DEFAULT
for for all of the
BASE64*
encodings.
The HEX
encoding scheme converts hex strings into blobs and vice
versa. For encodings, you may use the case
ENUM to specify upper-
or lowercase hex digits A
through f
(default DEFAULT
,
which means the same as LOWER
). A prefix such as 0x
is not
used for an encoding and is illegal for a decoding.
If a hex string to be decoded has an odd number of digits, it is
decoded as if a 0
is prepended to it; that is, the first digit is
interpreted as representing the least significant nibble of the first
byte. For example
# The concatenated string is "abcdef0", and is decoded as "0abcdef0".
set resp.http.First = "abc";
set resp.http.Second = "def0";
set resp.http.Hex-Decoded
= blob.encode(HEX, blob=blob.decode(HEX,
encoded=resp.http.First + resp.http.Second));
The URL
decoding replaces any %<2-hex-digits>
substrings with
the binary value of the hexadecimal number after the % sign.
The URL
encoding implements “percent encoding” as per RFC3986. The
case
ENUM determines the case of the hex digits, but does not
affect alphabetic characters that are not percent-encoded.
BLOB decode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} decoding = IDENTITY, INT length = 0, STRANDS encoded)
Returns the BLOB derived from the string encoded
according to the
scheme specified by decoding
.
If length
> 0, only decode the first length
characters of the
encoded string. If length
<= 0 or greater than the length of the
string, then decode the entire string. The default value of length
is 0.
decoding
defaults to IDENTITY.
Example
blob.decode(BASE64, encoded="Zm9vYmFyYmF6");
# same with named parameters
blob.decode(encoded="Zm9vYmFyYmF6", decoding=BASE64);
# convert string to blob
blob.decode(encoded="foo");
Arguments:
length
accepts type INT with a default value of 0
optional
encoded
accepts type STRANDS
decoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
Type: Function
Returns: Blob
STRING encode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} encoding = IDENTITY, ENUM {LOWER, UPPER, DEFAULT} case = DEFAULT, BLOB blob)
Returns a string representation of the BLOB blob
as specified by
encoding
. case
determines the case of hex digits for the
HEX
and URL
encodings, and is ignored for the other encodings.
encoding
defaults to IDENTITY, and case
defaults to DEFAULT.
DEFAULT is interpreted as LOWER for the HEX and URL encodings, and is
the required value for the other encodings.
Example
set resp.http.encode1
= blob.encode(HEX,
blob=blob.decode(BASE64, encoded="Zm9vYmFyYmF6"));
# same with named parameters
set resp.http.encode2
= blob.encode(blob=blob.decode(encoded="Zm9vYmFyYmF6",
decoding=BASE64),
encoding=HEX);
# convert blob to string
set resp.http.encode3
= blob.encode(blob=blob.decode(encoded="foo"));
Arguments:
blob
accepts type BLOB
encoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
case
is an ENUM that accepts values of LOWER
, UPPER
, and DEFAULT
with a default value of DEFAULT
optional
Type: Function
Returns: String
STRING transcode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} decoding = IDENTITY, ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} encoding = IDENTITY, ENUM {LOWER, UPPER, DEFAULT} case = DEFAULT, INT length = 0, STRANDS encoded)
Translates from one encoding to another, by first decoding the string
encoded
according to the scheme decoding
, and then returning
the encoding of the resulting blob according to the scheme
encoding
. case
determines the case of hex digits for the
HEX
and URL
encodings, and is ignored for other encodings.
As with decode()
: If length
> 0, only decode the first
length
characters of the encoded string, otherwise decode the
entire string. The default value of length
is 0.
decoding
and encoding
default to IDENTITY, and case
defaults to DEFAULT. DEFAULT is interpreted as LOWER for the HEX and
URL encodings, and is the required value for the other encodings.
Example
set resp.http.Hex2Base64-1
= blob.transcode(HEX, BASE64, encoded="666f6f");
# same with named parameters
set resp.http.Hex2Base64-2
= blob.transcode(encoded="666f6f",
encoding=BASE64, decoding=HEX);
# URL decode -- recall that IDENTITY is the default encoding.
set resp.http.urldecoded
= blob.transcode(encoded="foo%20bar", decoding=URL);
# URL encode
set resp.http.urlencoded
= blob.transcode(encoded="foo bar", encoding=URL);
Arguments:
length
accepts type INT with a default value of 0
optional
encoded
accepts type STRANDS
decoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
encoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
case
is an ENUM that accepts values of LOWER
, UPPER
, and DEFAULT
with a default value of DEFAULT
optional
Type: Function
Returns: String
BOOL same(BLOB, BLOB)
Returns true if and only if the two BLOB arguments are the same object, i.e. they specify exactly the same region of memory, or both are empty.
If the BLOBs are both empty (length is 0 and/or the internal pointer
is NULL), then same()
returns true
. If any non-empty BLOB
is compared to an empty BLOB, then same()
returns false
.
Arguments: None
Type: Function
Returns: Bool
BOOL equal(BLOB, BLOB)
Returns true if and only if the two BLOB arguments have equal contents (possibly in different memory regions).
As with same()
: If the BLOBs are both empty, then equal()
returns true
. If any non-empty BLOB is compared to an empty BLOB,
then equal()
returns false
.
Arguments: None
Type: Function
Returns: Bool
INT length(BLOB)
Returns the length of the BLOB.
Arguments: None
Type: Function
Returns: Int
BLOB sub(BLOB, BYTES length, BYTES offset = 0)
Returns a new BLOB formed from length
bytes of the BLOB argument
starting at offset
bytes from the start of its memory region. The
default value of offset
is 0B.
sub()
fails and returns NULL if the BLOB argument is empty, or if
offset + length
requires more bytes than are available in the
BLOB.
Arguments:
length
accepts type BYTES
offset
accepts type BYTES with a default value of 0
optional
Type: Function
Returns: Blob
OBJECT blob(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} decoding = IDENTITY, STRANDS encoded)
Creates an object that contains the BLOB derived from the string
encoded
according to the scheme decoding
.
Example
new theblob1 = blob.blob(BASE64, encoded="YmxvYg==");
# same with named arguments
new theblob2 = blob.blob(encoded="YmxvYg==", decoding=BASE64);
# string as a blob
new stringblob = blob.blob(encoded="bazz");
Arguments:
encoded
accepts type STRANDS
decoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
Type: Object
Returns: Object.
BLOB .get()
Returns the BLOB created by the constructor.
Example
set resp.http.The-Blob1 =
blob.encode(blob=theblob1.get());
set resp.http.The-Blob2 =
blob.encode(blob=theblob2.get());
set resp.http.The-Stringblob =
blob.encode(blob=stringblob.get());
Arguments: None
Type: Method
Returns: Blob
STRING .encode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX, URL} encoding = IDENTITY, ENUM {LOWER, UPPER, DEFAULT} case = DEFAULT)
Returns an encoding of BLOB created by the constructor, according to
the scheme encoding
. case
determines the case of hex digits
for the HEX
and URL
encodings, and MUST be set to DEFAULT
for the other encodings.
Example
# blob as text
set resp.http.The-Blob = theblob1.encode();
# blob as base64
set resp.http.The-Blob-b64 = theblob1.encode(BASE64);
For any blob
object, encoding ENC
and case CASE
, encodings
via the .encode()
method and the encode()
function are equal
# Always true:
blob.encode(ENC, CASE, blob.get()) == blob.encode(ENC, CASE)
But the object method is more efficient – the encoding is computed
once and cached (with allocation in heap memory), and the cached
encoding is retrieved on every subsequent call. The encode()
function computes the encoding on every call, allocating space for the
string in Varnish workspaces.
So if the data in a BLOB are fixed at VCL initialization time, so that
its encodings will always be the same, it is better to create a
blob
object. The VMOD’s functions should be used for data that are
not known until runtime.
The encoders, decoders and sub()
may fail if there is insufficient
space to create the new blob or string. Decoders may also fail if the
encoded string is an illegal format for the decoding scheme. Encoders
will fail for the IDENTITY
and BASE64*
encoding schemes if the
case
ENUM is not set to DEFAULT
.
If any of the VMOD’s methods, functions or constructor fail, then VCL
failure is invoked, just as if return(fail)
had been called in the
VCL source. This means that:
If the blob
object constructor fails, or if any methods or
functions fail during sub vcl_init
, then the VCL program will fail
to load, and the VCC compiler will emit an error message.
If a method or function fails in any other VCL subroutine besides
sub vcl_synth
, then control is directed to sub vcl_synth
. The
response status is set to 503 with the reason string "VCL failed"
, and an error message will be written to the Varnish log
using the tag sub vcl_Error
.
If the failure occurs during sub vcl_synth
, then sub vcl_synth
is
aborted. The response line 503 VCL failed
is returned, and
the sub vcl_Error
message is written to the log.
The VMOD allocates memory in various ways for new blobs and
strings. The blob
object and its methods allocate memory from the
heap, and hence they are only limited by available virtual memory.
The encode()
, decode()
and transcode()
functions allocate
Varnish workspace, as does sub()
for the newly created BLOB. If
these functions are failing, as indicated by “out of space” messages
in the Varnish log (with the sub vcl_Error
tag), then you will need to
increase the varnishd parameters workspace_client
and/or
workspace_backend
.
The transcode()
function also allocates space on the stack for a
temporary BLOB. If this function causes stack overflow, you may need
to increase the varnishd parameter thread_pool_stack
.
Arguments:
encoding
is an ENUM that accepts values of IDENTITY
, BASE64
, BASE64URL
, BASE64URLNOPAD
, HEX
, and URL
with a default value of IDENTITY
optional
case
is an ENUM that accepts values of LOWER
, UPPER
, and DEFAULT
with a default value of DEFAULT
optional
Type: Method
Returns: String