Persistence lets cached content survive planned and unplanned restarts, keeping performance stable over time.

Cache persistence has long been in high demand and is more important than ever as the volume of data handled keeps growing rapidly. In both planned and unplanned outages, being able to persist cached data and restart almost instantly without losing it is key.

In a normal cache, some data can be sacrificed to enhance performance. When Varnish® is started with a storage backend, that backend is also called a silo. Persistence splits the silo into segments and adds objects to one segment at a time. When a segment is full, it is sealed and synced to disk, and a new segment is opened. If Varnish shuts down unexpectedly, it discards all data in the open segment; data in the sealed, read-only segments is retained. This keeps the number of synchronous operations, usually a limiting factor in performance, low.
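The segment lifecycle described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how Varnish or MSE is implemented: the `Silo` and `Segment` classes, the fixed byte capacity, and the `crash_and_restart` method are all hypothetical names chosen for illustration, and "syncing to disk" is represented by a counter rather than real I/O.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Segment:
    """One slice of the silo (hypothetical model, not a Varnish structure)."""
    capacity: int                       # bytes this segment may hold
    objects: dict = field(default_factory=dict)
    used: int = 0
    sealed: bool = False                # True once the segment has been "synced"


class Silo:
    """Toy silo: objects go into one open segment; full segments are sealed."""

    def __init__(self, segment_capacity: int):
        self.segment_capacity = segment_capacity
        self.sealed_segments = []       # read-only segments that survive a crash
        self.open_segment = Segment(segment_capacity)
        self.sync_count = 0             # one synchronous flush per sealed segment

    def store(self, key: str, body: bytes) -> None:
        if len(body) > self.segment_capacity:
            raise ValueError("object larger than a segment")
        # Seal the open segment when the next object would not fit.
        if self.open_segment.used + len(body) > self.segment_capacity:
            self._seal()
        # Append-only accounting: a re-stored key still consumes new space.
        self.open_segment.objects[key] = body
        self.open_segment.used += len(body)

    def _seal(self) -> None:
        # Stand-in for "sealed and synced to disk": mark read-only, count the sync.
        self.open_segment.sealed = True
        self.sync_count += 1
        self.sealed_segments.append(self.open_segment)
        self.open_segment = Segment(self.segment_capacity)

    def crash_and_restart(self) -> None:
        # Unexpected shutdown: the open segment is discarded, sealed ones remain.
        self.open_segment = Segment(self.segment_capacity)

    def lookup(self, key: str) -> Optional[bytes]:
        if key in self.open_segment.objects:
            return self.open_segment.objects[key]
        # Search newest sealed segment first so the latest copy wins.
        for segment in reversed(self.sealed_segments):
            if key in segment.objects:
                return segment.objects[key]
        return None


silo = Silo(segment_capacity=10)
silo.store("a", b"12345")
silo.store("b", b"12345")               # fills the first segment exactly
silo.store("c", b"123")                 # does not fit: first segment is sealed
silo.crash_and_restart()                # "c" was in the open segment and is lost
```

In this model, three stores cost a single synchronous flush, and a crash loses only the object that had not yet reached a sealed segment. That is the trade-off the paragraph describes: a small amount of recent data is sacrificed so that syncs happen per segment rather than per object.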

Content has now outgrown the limits imposed by physical memory. The Massive Storage Engine (MSE) with persistence came about when sites began to hold terabytes of content and the need for persistence was already known and growing. Maintaining both a large cache and high performance makes persistence essential.

Persistence features

  • The ability to persist cached objects through a same-version Varnish restart
  • Ideal for live video and video-on-demand setups. It supports cut-through forwarding and streaming, and handles range and conditional requests
  • Weak filesystem transaction semantics. A regular response can be sent to the client before content is confirmed to be stored on disk
  • Purge and ban support, as long as purges and bans are persisted before the client request is acknowledged
  • Very little overhead compared to consistency gains

Persistence benefits

  • Time savings: Avoid the time-consuming wait for the cached data/content to repopulate. Repopulating a cache after a restart takes considerable time, especially when large objects are involved.
  • Consistency: Keeps fragmentation to a minimum for as long as storage runs. Because Varnish can fetch content from the backend again at any time, we can sacrifice some of the objects in the cache to achieve less fragmentation and higher performance.
  • Massive data capacity: MSE offers capacity for 100+ terabytes of storage on each node.
  • Protection and performance: Varnish servers can be rebooted without adding strain on backends, avoiding backend failures due to a large number of backend requests.

Why persistence?

Companies that typically benefit from persistence are CDNs, businesses that deliver or stream video, and anyone who needs to handle large, multi-terabyte data sets. Keeping these data sets available across restarts saves time, cost and effort, and maintains a consistent end-user experience in the event of an outage or restart.
