The Massive Storage Engine version 4 (MSE4) is an advanced
stevedore implementation for Varnish Enterprise.
The stevedore is the component in Varnish that handles storing the cached objects and their metadata. It is also responsible for keeping track of which objects in the cache are most relevant, and which to evict when there is a need to make room for new content.
MSE4 adds several advanced features compared to the standard stevedores that ship with both open source Varnish Cache and Varnish Enterprise. Some of the highlights are:
Memory based caches using a compacted object structure
MSE4 has a more compact object storage structure, which reduces storage overhead. This is most noticeable for small objects.
Memory Governor
MSE4 features a mechanism that will automatically adjust the size of the cache according to the process memory consumption. This makes it easy to set up the right cache limits, and ensures the best utilization of the available memory even in shifting load conditions.
Large caches using disks to cache objects
MSE4 can use disks as backing for object data, enabling cache sizes that are much larger than the available system memory. Content that is frequently used will be kept in memory for fast access, while less frequently used content will be read from disk instead of fetching from the backend.
Persisted caches
MSE4 will persist the disk backed objects, keeping the content in the cache between planned and unplanned restarts of the Varnish daemon.
Safe runtime disk failures
MSE4 will safely fail a disk that reports errors at runtime, flagging the disk as offline and no longer using it. This is possible without restarting the Varnish daemon. Cached content that resides on the failed disk will be removed from the cache.
Disks can also be manually taken offline by an administrator.
Runtime disk reinitialization
Failed disks can be reinitialized and put back into service after a failure without having to restart the cache. This enables an administrator to hot plug a disk and put it back into service without disruption of the service.
Cache payload checksumming
In MSE4 all of the cache content can be checksummed. This makes it possible to verify the data after reading it from disk, but before delivering it to the requesting client. This protects against random bit-flips on the storage devices.
Content Categories and resource provisioning
MSE4 features an advanced content categorization feature that enables content to be classified into categories. This enables real time statistics on how much cache space different types of content take up in the cache. The categories can also have resource usage limits applied to them, for both memory and disk space, which makes it possible to dedicate the cache resources towards specific types of content.
Fair content eviction strategy
When evicting content to make room for fresh content in the cache, the task that does the job of making space is guaranteed a first go at the newly freed space. This ensures that fetch tasks do not fail due to other simultaneous tasks stealing the space from under them.
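The guarantee described under "Fair content eviction strategy" can be illustrated with a toy model. The following is a minimal sketch in Python and not MSE4's implementation: the class and method names (FairCache, make_room, store) are hypothetical, sizes are plain byte counts, and eviction is simple oldest-first. It only shows the idea that space freed by a task is reserved for that task until it is used.

```python
from collections import OrderedDict


class FairCache:
    """Toy cache: space freed by eviction is reserved for the evicting task.

    Hypothetical illustration only; names and policy are assumptions.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = OrderedDict()  # key -> size, oldest entry first
        self.reserved = {}            # task id -> bytes held for that task

    def unclaimed(self):
        # Bytes that are neither occupied by objects nor reserved for a task.
        used = sum(self.objects.values())
        held = sum(self.reserved.values())
        return self.capacity - used - held

    def make_room(self, task, size):
        """Reserve `size` bytes for `task`, evicting old objects if needed."""
        if size > self.capacity:
            return False
        while self.unclaimed() < size and self.objects:
            self.objects.popitem(last=False)  # evict the oldest object
        if self.unclaimed() < size:
            return False  # remaining space is reserved by other tasks
        self.reserved[task] = self.reserved.get(task, 0) + size
        return True

    def store(self, task, key, size):
        """Store an object, drawing only on the task's own reservation."""
        if self.reserved.get(task, 0) < size:
            return False
        self.reserved[task] -= size
        if self.reserved[task] == 0:
            del self.reserved[task]
        self.objects[key] = size
        return True


cache = FairCache(100)
cache.make_room("warmup", 100)
cache.store("warmup", "a", 50)
cache.store("warmup", "b", 50)

# Task A evicts both objects to reserve 60 bytes.
a_ok = cache.make_room("A", 60)
# Task B arrives before A has stored anything; it cannot take A's space.
b_ok = cache.make_room("B", 60)
a_stored = cache.store("A", "x", 60)
```

In this sketch task B fails to get room because only 40 bytes are unclaimed, while task A's later store succeeds from its reservation. Without the reservation step, B could have consumed the space A had just freed.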
This manual page describes the fourth version of the Massive Storage Engine in Varnish Enterprise.
The previous version (version 3) was released in Varnish Enterprise
version 6.0.1r3. This version is referred to in documentation and
configuration as just the Massive Storage Engine, or MSE for short.
To ease the transition from the previous version, both version 3 and
version 4 are shipped with Varnish Enterprise. Which version is enabled
depends on the configuration options given to the Varnish daemon. The
existing version is just named MSE as before, and no additional action
or configuration changes are needed to keep using that version.
The new version 4 of the Massive Storage Engine has been consistently
namespaced as MSE4 in the configuration, documentation and any
identifiers visible e.g. in the Varnish counters. This makes it
immediately apparent which version is targeted and being used. The two
versions are furthermore mutually exclusive, and it is not possible
to run the Varnish daemon with both versions enabled at the same
time. Trying to do so will produce an argument error when the daemon is
started.
While many of the concepts and features of MSE4 will be recognizable to those who have prior experience with earlier versions of the Massive Storage Engine, MSE4 is a ground-up reimplementation of the stevedore. Some of the changes worth being aware of are:
LMDB has been switched out in favour of a custom database
In MSE the LMDB database is used to provide structure and transactionality for the persisted cache. This enables safe keeping and ordering of updates to which objects are currently in the cache. However, the small data sizes needed by MSE (which are much smaller than the 4k page size used by LMDB), together with database table fragmentation over time, have shown the database performance of LMDB to be a major limiting factor.
MSE4 features a brand new, in-house developed custom database for keeping the cache metadata. It is tailored to the specific needs of the Varnish cache rather than the general database use case, making it possible to create a much more IO efficient and fragmentation-proof solution. This has significantly increased the performance for setups where the set of objects in the cache changes frequently.
Safe runtime disk failures and reinitialization
In MSE any IO error reported by the disks would always result in the Varnish daemon doing a panic, which is a runtime exception causing the cache process to terminate.
MSE4 implements safe runtime disk failures, as described above.
Book and store resizing and maintenance
MSE featured only rudimentary support for resizing the storage files used to keep persisted cache content. Specifically, only increasing the size of the files was allowed.
In MSE4 there is a rich feature set to perform offline maintenance of both books and stores.
Cache payload checksumming
In MSE there was no checksumming and verification of the integrity of the cached content. This meant that, while unlikely, if the disk content were to change, Varnish would be unaware and continue to serve the content.
In MSE4 all of the content can be checksummed and verified. This feature is on by default.
Mandatory use of Memory Governor
In MSE, the Memory Governor was an optional feature. It was released after MSE itself, and had to be specifically enabled in the configuration.
In MSE4 the use of the Memory Governor is mandated, and there
is no configuration option to turn it off. Having that feature always
enabled has made it possible to implement advanced features like the
Content Categories.
Another consequence is that, for the Memory Governor to function properly, there can be only a single stevedore instance configured on the Varnish server, and MSE4 mandates that it is that single stevedore instance. The use cases where one would previously have had to resort to multiple stevedore instances, e.g. to limit the cache space for certain content, are covered by the rich feature set of the Content Categories.
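To illustrate how per-category limits can stand in for separate stevedore instances, the following is a minimal sketch in Python. It is a conceptual model only and does not reflect MSE4's configuration syntax or internals: the CategoryCache class, its methods and the category names are hypothetical, and a single byte budget per category stands in for the separate memory and disk limits. The actual configuration is described in the MSE4 manual pages.

```python
from collections import OrderedDict


class CategoryCache:
    """Toy single cache with a per-category space limit and usage statistics.

    Hypothetical illustration only; names and policy are assumptions.
    """

    def __init__(self, limits):
        # limits: category name -> maximum bytes that category may occupy
        self.limits = dict(limits)
        self.objects = {name: OrderedDict() for name in limits}

    def usage(self, category):
        # Space currently taken up by one category.
        return sum(self.objects[category].values())

    def insert(self, category, key, size):
        """Insert an object, evicting only from the same category if needed."""
        limit = self.limits[category]
        if size > limit:
            return False  # the object can never fit within this category
        objs = self.objects[category]
        objs.pop(key, None)  # replacing an object frees its old space first
        while self.usage(category) + size > limit:
            objs.popitem(last=False)  # evict the oldest object in the category
        objs[key] = size
        return True


cache = CategoryCache({"video": 100, "images": 50})
cache.insert("video", "v1", 60)
cache.insert("video", "v2", 60)    # evicts v1 to stay within the video limit
cache.insert("images", "i1", 40)   # unaffected by pressure in "video"
too_big = cache.insert("images", "poster", 80)
```

Here pressure in the "video" category evicts only video objects, so the "images" category keeps its content, and usage() gives a per-category view of occupied space. This is the kind of isolation that previously required multiple stevedore instances.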
The full documentation for MSE4 is provided as manpages shipped with the
Varnish Enterprise packages. Execute man mse4 or
man mse4-getting-started after installing the packages.