vmod-urlplus is a URL/query string normalization, parsing and manipulation VMOD. This VMOD allows
you to get, add, delete, and keep URL segments and query parameters. You can also sort query
parameters. vmod-urlplus keeps an internal representation of the URL up until it is written out.
Keep only the page_id query parameter.
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Keep page_id
    urlplus.query_keep("page_id");
    // Sort query string and write URL out to req.url
    // Any query that is not kept is not written to req.url
    urlplus.write();
}
Remove all Google Analytics query parameters.
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Remove all Google Analytics parameters (names starting with utm_)
    urlplus.query_delete_regex("^utm_");
    // Sort query string and write URL out to req.url
    urlplus.write();
}
Remove the random query parameter.
vcl 4.0;
import urlplus;
sub vcl_recv
{
    urlplus.query_delete("random");
    // Write URL out to req.url without sorting
    urlplus.write(sort_query = false);
}
Normalize, sort and write query string to req.url.
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Write URL out to req.url
    urlplus.write();
}
Specify TTL by file type.
vcl 4.0;
import urlplus;
sub vcl_backend_response
{
    // Give files with these extensions a TTL of 1 day.
    // The pattern is anchored so that only whole extensions match.
    if (urlplus.get_extension() ~ "^(gif|jpg|jpeg|bmp|png|tiff|tif|img)$") {
        set beresp.ttl = 1d;
    }
}
All functions by default will parse and normalize the URL from either req.url or bereq.url
(depending on the context), unless otherwise specified using parse(). All functions keep an
internal state of the URL until it is written out using write().
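As a minimal sketch of the difference between the default source and an explicit parse(), the following assumes a hypothetical request header, X-Original-URL, that carries the string to normalize. Both header names are illustrative and are not part of the VMOD.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // X-Original-URL and X-Normalized-URL are hypothetical header names
    if (req.http.X-Original-URL) {
        // Parse this string instead of the default req.url
        urlplus.parse(req.http.X-Original-URL);
        // Read the normalized result without writing it to req.url
        set req.http.X-Normalized-URL = urlplus.as_string();
        // Reset the internal state once the result has been read
        urlplus.reset();
    }
}
```

Because as_string() is used rather than write(), req.url itself is left untouched in this sketch.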
The functions are split into three areas of focus: generic, URL, and query string. The generic
functions are parse(), reset(), write(), as_string(), get_basename(), get_filename(),
get_extension(), get_dirname(), tolower() and toupper(). Functions with a prefix of url operate
only on the URL (for example: some/test/url), and functions with a prefix of query operate only
on the query string (for example: ?name=value&name2=value2).
VOID parse(STRING url)
Description
Parse and normalize the URL and query string. Any internal state is reset.
Return value
None.
url
The string to parse from.
VOID reset()
Description
Reset the internal state.
Return value
None.
VOID write(BOOL sort_query = TRUE, ENUM {NONE, KEY, KEY_VALUE} query_unique = NONE,
ENUM {FROM_INPUT, TRUE, FALSE} leading_slash = FROM_INPUT,
ENUM {FROM_INPUT, TRUE, FALSE} trailing_slash = FROM_INPUT)
Description
Write the combined normalized URL and query string back out to req.url or bereq.url (depending
on the context). If in keep_mode, only kept URL segments and queries will be written.
Return value
None.
sort_query
Sort the query string prior to writing. Defaults to TRUE.
query_unique
Remove duplicate values in the query string by either key (KEY) or key and value (KEY_VALUE).
The first occurrence is kept and each subsequent occurrence is removed. This is done before
sorting. Defaults to NONE.
leading_slash
Should the URL contain a leading slash? If TRUE, a slash will be at the beginning of the URL.
If FALSE, no slash will be at the beginning of the URL. If FROM_INPUT, a slash will be at the
beginning of the URL if a leading slash was in the URL when it was parsed. Defaults to FROM_INPUT.
trailing_slash
Should the URL contain a trailing slash? If TRUE, a slash will be at the end of the URL.
If FALSE, no slash will be at the end of the URL. If FROM_INPUT, a slash will be at the
end of the URL if a trailing slash was in the URL when it was parsed. Defaults to FROM_INPUT.
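The write() options can be combined in a single call. The sketch below is illustrative only. It assumes that enum values are passed as bare identifiers, which may differ between Varnish versions.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Drop repeated keys (the first occurrence is kept), sort what remains,
    // and force a leading slash but no trailing slash.
    // Assumption: enum arguments are written as bare identifiers.
    urlplus.write(query_unique = KEY, leading_slash = TRUE, trailing_slash = FALSE);
}
```

Going by the parameter descriptions, a query string such as q=a&q=b&lang=en would be expected to come out as lang=en&q=a, since duplicates are removed before sorting.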
STRING as_string(BOOL sort_query = TRUE, ENUM {NONE, KEY, KEY_VALUE} query_unique = NONE,
ENUM {FROM_INPUT, TRUE, FALSE} leading_slash = FROM_INPUT,
ENUM {FROM_INPUT, TRUE, FALSE} trailing_slash = FROM_INPUT)
Description
Return a string of the combined normalized URL and query string. If in keep_mode, only kept
URL segments and queries will be returned.
Return value
A string of the combined normalized URL and query string.
sort_query
Sort the query string prior to returning. Defaults to TRUE.
query_unique
Remove duplicate values in the query string by either key (KEY) or key and value (KEY_VALUE).
The first occurrence is kept and each subsequent occurrence is removed. This is done before
sorting. Defaults to NONE.
leading_slash
Should the URL contain a leading slash? If TRUE, a slash will be at the beginning of the URL.
If FALSE, no slash will be at the beginning of the URL. If FROM_INPUT, a slash will be at the
beginning of the URL if a leading slash was in the URL when it was parsed. Defaults to FROM_INPUT.
trailing_slash
Should the URL contain a trailing slash? If TRUE, a slash will be at the end of the URL.
If FALSE, no slash will be at the end of the URL. If FROM_INPUT, a slash will be at the
end of the URL if a trailing slash was in the URL when it was parsed. Defaults to FROM_INPUT.
STRING get_basename()
Description
Get the basename of the URL.
Return value
Given /foo/bar.baz, return bar.baz. If no extension is present, return the last URL segment.
For example, /foo/bar/baz would return baz.
STRING get_filename()
Description
Get the filename of the URL.
Return value
Given /foo/bar.baz, return bar. If no extension is present, return NULL.
STRING get_extension()
Description
Get the extension of the URL.
Return value
Given /foo/bar.baz, return baz. If no extension is present, return NULL.
STRING get_dirname()
Description
Get the directory name of the URL.
Return value
Given /foo/bar/bar.baz, return /foo/bar. If one or fewer URL segments are present, return /.
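The four path accessors can be compared side by side. This sketch copies each result into a request header. The X-Url-* header names are hypothetical, and the values in the comments are what the descriptions above imply for a request to /foo/bar/bar.baz.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Hypothetical debugging headers for a request to /foo/bar/bar.baz
    set req.http.X-Url-Basename = urlplus.get_basename();   // expected: bar.baz
    set req.http.X-Url-Filename = urlplus.get_filename();   // expected: bar
    set req.http.X-Url-Extension = urlplus.get_extension(); // expected: baz
    set req.http.X-Url-Dirname = urlplus.get_dirname();     // expected: /foo/bar
}
```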
VOID tolower(ENUM {ALL, URL, QUERY} convert = "ALL")
Description
Convert the URL to lowercase.
Return value
None.
convert
Pick the part of the URL to convert to lowercase. If set to URL, only the URL will be
converted (for example some/test/url?NAME=Value). If set to QUERY, only the query string will
be converted (for example SOME/Test/url?name=value). When set to ALL, both URL and query string
will be converted (for example some/test/url?name=value). Defaults to ALL.
VOID toupper(ENUM {ALL, URL, QUERY} convert = "ALL")
Description
Convert the URL to uppercase.
Return value
None.
convert
Pick the part of the URL to convert to uppercase. If set to URL, only the URL will be
converted (for example SOME/TEST/URL?name=Value). If set to QUERY, only the query string will
be converted (for example SOME/Test/url?NAME=VALUE). When set to ALL, both URL and query string
will be converted (for example SOME/TEST/URL?NAME=VALUE). Defaults to ALL.
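A common use of the case functions is to normalize the path while leaving query values alone. The sketch below assumes the enum value can be passed as a bare identifier. The signature above shows the default in quotes, so the exact syntax may depend on the Varnish version.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Lowercase only the URL part; the query string keeps its original case.
    // Assumption: the enum value is written as a bare identifier.
    urlplus.tolower(URL);
    urlplus.write();
}
```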
STRING query_get(STRING name, STRING def = NULL)
Description
Get the value of query name.
Return value
The value for query name. If not found, def is returned.
name
The name of the query parameter. The first query to match is used.
def
The return value if name is not found. Defaults to NULL.
STRING query_get_regex(STRING regex, STRING def = NULL)
Description
Get the value of the query with name matching regex.
Return value
The value for the query with name matching regex. If not found, def is returned.
regex
The regular expression used to match on name. The first query to match is used.
This value is static and cannot change between calls.
def
The return value if no query name matches regex. Defaults to NULL.
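Both lookup functions accept a fallback value. In this sketch, the parameter name lang, the fallback strings, and the X-* header names are all hypothetical.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Value of the first "lang" parameter, or "en" when it is absent
    set req.http.X-Lang = urlplus.query_get("lang", "en");
    // Value of the first parameter whose name starts with "utm_",
    // or "none" when no name matches
    set req.http.X-Campaign = urlplus.query_get_regex("^utm_", "none");
}
```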
VOID query_add(STRING name, STRING value = NULL, BOOL keep = TRUE)
Description
Add a query pair of name, value with parameter keep.
Return value
None.
name
The name of the query.
value
The value of the query. Defaults to NULL.
keep
Indicate if the query should be kept (TRUE) or not (FALSE). If in keep_mode and keep is TRUE,
the query is always kept when writing. Defaults to TRUE.
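The keep parameter matters once keep_mode is active. The following sketch uses hypothetical parameter names (id and device) to show a kept query being added after query_keep() has enabled keep_mode.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Enable keep_mode: only "id" survives from the incoming query string
    urlplus.query_keep("id");
    // Add a new pair that is marked as kept, so it is written as well
    urlplus.query_add("device", "mobile", keep = true);
    urlplus.write();
}
```

Passing keep = false here would, per the description above, add the pair without marking it as kept.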
VOID query_delete(STRING name, BOOL delete_keep = FALSE)
Description
Delete all queries that match name.
Return value
None.
name
The name of the query to delete. All queries which match name are deleted.
delete_keep
If set to TRUE, delete kept queries. Defaults to FALSE.
VOID query_delete_regex(STRING regex, BOOL delete_keep = FALSE)
Description
Delete all queries with name matching regex.
Return value
None.
regex
The regular expression used to match on name. All queries which match regex are deleted.
This value is static and cannot change between calls.
delete_keep
If set to TRUE, delete kept queries. Defaults to FALSE.
VOID query_keep(STRING name)
Description
Set keep to TRUE for all queries with name. This enables keep_mode. When writing or returning
as a string, only kept queries are written/returned.
Return value
None.
name
The name of the query to keep. All queries which match name are kept.
VOID query_keep_regex(STRING regex)
Description
Set keep to TRUE for all queries with name matching regex. This enables keep_mode. When writing
or returning as a string, only kept queries are written/returned.
Return value
None.
regex
The regular expression used to match on name. All queries which match regex are kept.
This value is static and cannot change between calls.
INT query_count()
Description
Return the number of queries stored. If keep_mode is enabled, only kept queries will be counted.
Return value
The number of query pairs.
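One possible use of query_count() is to reject requests with an unusually long query string. The limit of 20 and the response text are arbitrary values chosen for illustration.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // 20 is an arbitrary, illustrative limit
    if (urlplus.query_count() > 20) {
        return (synth(400, "Too many query parameters"));
    }
}
```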
STRING query_as_string(BOOL sort_query = TRUE, ENUM {NONE, KEY, KEY_VALUE} query_unique = NONE)
Description
Return a string of queries in format name=value&name2. If keep_mode is enabled, only kept
queries will be returned.
Return value
The string of the query pairs.
sort_query
Sort the query string prior to returning. Defaults to TRUE.
query_unique
Remove duplicate values in the query string by either key (KEY) or key and value (KEY_VALUE).
The first occurrence is kept and each subsequent occurrence is removed. This is done before
sorting. Defaults to NONE.
STRING url_get(INT start_range = 0, INT end_range = -1,
ENUM {FROM_INPUT, TRUE, FALSE} leading_slash = FROM_INPUT,
ENUM {FROM_INPUT, TRUE, FALSE} trailing_slash = FROM_INPUT)
Description
Get the URL as a string.
Return value
A string of the normalized URL.
start_range
A base 0 integer indicating the starting index of the URL to return.
Defaults to 0, meaning the first index of the URL.
end_range
A base 0 integer indicating the ending index of the URL to return.
Defaults to -1, meaning the last index of the URL.
leading_slash
Should the URL contain a leading slash? If TRUE, a slash will be at the beginning of the URL.
If FALSE, no slash will be at the beginning of the URL. If FROM_INPUT, a slash will be at the
beginning of the URL if a leading slash was in the URL when it was parsed. Defaults to FROM_INPUT.
trailing_slash
Should the URL contain a trailing slash? If TRUE, a slash will be at the end of the URL.
If FALSE, no slash will be at the end of the URL. If FROM_INPUT, a slash will be at the
end of the URL if a trailing slash was in the URL when it was parsed. Defaults to FROM_INPUT.
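The range parameters make it possible to read a single segment. This sketch assumes that end_range is inclusive, which is stated for url_delete_range() but not for url_get(), and that enum values are passed as bare identifiers. The header name is hypothetical.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // With both indices set to 0, this should return only the first URL
    // segment, assuming end_range is inclusive. Slashes are suppressed so
    // the header would hold the bare segment name.
    set req.http.X-First-Segment =
        urlplus.url_get(0, 0, leading_slash = FALSE, trailing_slash = FALSE);
}
```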
VOID url_add(STRING name, BOOL keep = FALSE, INT position = -1)
Description
Add a URL segment to the URL.
Return value
None.
name
The URL segment to add.
keep
Indicate if the URL segment should be kept (TRUE) or not (FALSE). If in keep_mode and keep is
TRUE, the URL segment is always kept when writing. Defaults to FALSE.
position
A base 0 integer indicating where the URL segment will be added. Adding to position = 0 will
add to the front of the list. Adding to position = 1 will add to the second spot and so on.
Defaults to -1, which will add the URL segment to the end.
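The position parameter can be used to prefix every request with a fixed segment. The segment name v2 and the sample path are hypothetical.

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Insert "v2" at the front of the segment list, so a request for
    // /users/42 would be expected to become /v2/users/42
    urlplus.url_add("v2", position = 0);
    urlplus.write();
}
```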
VOID url_delete(STRING name, BOOL delete_keep = FALSE)
Description
Delete all URL segments with name.
Return value
None.
name
The name of the URL segment to delete.
delete_keep
If set to TRUE, delete kept URL segments. Defaults to FALSE.
VOID url_delete_range(INT start_range, INT end_range, BOOL delete_keep = FALSE)
Description
Delete all URL segments between the indices start_range and end_range.
Return value
None.
start_range
An inclusive base 0 integer indicating the starting index of the range.
end_range
An inclusive base 0 integer indicating the ending index of the range. -1 indicates the last index.
delete_keep
If set to TRUE, delete kept URL segments. Defaults to FALSE.
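Since both indices are inclusive, passing the same index twice removes exactly one segment. A minimal sketch that strips the first segment, with a hypothetical sample path in the comment:

```vcl
vcl 4.0;
import urlplus;
sub vcl_recv
{
    // Remove the first segment only (indices 0 through 0, inclusive),
    // so /static/img/logo.png would be expected to become /img/logo.png
    urlplus.url_delete_range(0, 0);
    urlplus.write();
}
```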
VOID url_delete_regex(STRING regex, BOOL delete_keep = FALSE)
Description
Delete all instances of URL segments matching regex.
Return value
None.
regex
The regular expression used to match on name. All URL segments which match regex are deleted.
This value is static and cannot change between calls.
delete_keep
If set to TRUE, delete kept URL segments. Defaults to FALSE.
VOID url_keep(STRING name)
Description
Set keep to TRUE for all URL segments of name. This initiates keep_mode.
Return value
None.
name
The URL segment to be kept.
VOID url_keep_regex(STRING regex)
Description
Set keep to TRUE for all URL segments matching regex. This initiates keep_mode.
Return value
None.
regex
The regular expression used to match on name. All URL segments which match regex are kept.
This value is static and cannot change between calls.
INT url_count()
Description
Return the number of URL segments stored. If keep_mode is enabled, only kept URL segments will
be counted.
Return value
The number of URL segments.
STRING url_as_string(ENUM {FROM_INPUT, TRUE, FALSE} leading_slash = FROM_INPUT,
ENUM {FROM_INPUT, TRUE, FALSE} trailing_slash = FROM_INPUT)
Description
Return a string of the normalized URL in format some/test/url. If keep_mode is enabled, only
kept URL segments will be returned.
Return value
A string of the URL.
leading_slash
Should the URL contain a leading slash? If TRUE, a slash will be at the beginning of the URL.
If FALSE, no slash will be at the beginning of the URL. If FROM_INPUT, a slash will be at the
beginning of the URL if a leading slash was in the URL when it was parsed. Defaults to FROM_INPUT.
trailing_slash
Should the URL contain a trailing slash? If TRUE, a slash will be at the end of the URL.
If FALSE, no slash will be at the end of the URL. If FROM_INPUT, a slash will be at the
end of the URL if a trailing slash was in the URL when it was parsed. Defaults to FROM_INPUT.