Clustering vs VHA
This document compares Clustering with VHA (Varnish High Availability) across five dimensions: Pull/Push, Resiliency, Latency, Set Up, and Scaling.
By default, in a Varnish setup with independent caches, the origin server may receive one request per cache node even for the same cacheable object. Clustering (Cluster.vcl) solves this problem through prescriptive sharding, ensuring only one origin request per object cluster-wide. VHA, in contrast, achieves consistency via preemptive cache replication across nodes. The sections below give the details, but in nearly all production cases Cluster.vcl is the preferred solution.
1. Pull/Push
Clustering (Pull-based)
- Guarantees cluster-wide request coalescing: only one request goes to the origin.
- Reduces inter-node traffic and cache bloat caused by rarely accessed objects.
VHA (Push-based)
- Does not guarantee a single origin request per object, which makes it unsuitable for live streaming or low-TTL use cases.
- Every cacheable request is broadcast to all nodes, even if the object is never requested again.
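The effect of sharding on origin load can be illustrated with a minimal Python sketch. It compares the independent-cache baseline described in the introduction with a sharded cluster. Rendezvous hashing is used here as a stand-in sharding function chosen for illustration; this document does not state which algorithm Cluster.vcl uses. The node names, the URL, and the helper functions are hypothetical, and TTL expiry is ignored.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["cache-a", "cache-b", "cache-c"]


def primary_node(url: str, nodes: list[str]) -> str:
    """Pick the node with the highest hash score for this URL.

    This is rendezvous hashing, an assumed stand-in for the
    cluster's real sharding function.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=score)


def origin_fetches_independent(requests: list[tuple[str, str]]) -> int:
    """Independent caches: every (node, url) pair misses once."""
    return len(set(requests))


def origin_fetches_sharded(requests: list[tuple[str, str]], nodes: list[str]) -> int:
    """Sharded: misses are routed to the URL's primary node,
    so the origin sees one fetch per distinct URL."""
    return len({(primary_node(url, nodes), url) for _, url in requests})


# The same object is requested once on every node.
requests = [(node, "/video/seg1.ts") for node in NODES]

print(origin_fetches_independent(requests))     # 3
print(origin_fetches_sharded(requests, NODES))  # 1
```

With three nodes, the independent setup sends three requests to the origin for one object, while the sharded setup sends one, because every node agrees on the same primary for a given URL.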
2. Resiliency
Clustering
- Efficiently resilient.
- Moderately popular objects quickly replicate to target nodes.
- On node failure, missing objects are replicated from the primary node until the replication target is met.
- Different objects may have different replication targets.
VHA
- Inefficiently resilient.
- Every node caches every object.
- Can tolerate the loss of all but one node and still retain all objects.
- Limited node failure tolerance: beyond a certain point, the surviving nodes become overwhelmed.
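The Clustering behavior on node failure can be sketched in Python as follows. This is a minimal illustration, assuming that the replica set for an object is the top-ranked nodes under rendezvous hashing; this document does not describe how Cluster.vcl actually selects replicas. The node names, the URL, and the `replica_set` helper are hypothetical.

```python
import hashlib


def ranked_nodes(url: str, nodes: list[str]) -> list[str]:
    """Order nodes by descending hash score for this URL (assumed scheme)."""
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)


def replica_set(url: str, nodes: list[str], target: int) -> list[str]:
    """The first node is the primary; the rest hold replicas."""
    return ranked_nodes(url, nodes)[:target]


nodes = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical names
url = "/img/logo.png"

before = replica_set(url, nodes, target=2)

# Simulate the loss of the non-primary replica.
failed = before[1]
survivors = [n for n in nodes if n != failed]
after = replica_set(url, survivors, target=2)

# The primary is unchanged, and one new node is chosen to restore
# the replication target; it would fill its copy from the primary.
print(before[0] == after[0])  # True
print(len(after))             # 2
print(failed in after)        # False
```

Because each object has its own `target`, a hot object could use a larger value than a long-tail object without changing the selection logic.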
3. Latency
Clustering
- Self-routing may introduce slightly higher latency:
  - Cache misses often go through an extra hop to the primary node.
  - Can be mitigated by increasing replication targets for hot content.
- Requires low inter-node communication latency.
VHA
- Can have lower latency in some cases:
  - Broadcasts pre-warm other nodes, so no extra hop is needed.
  - Nodes can be warmed before receiving traffic.
- But if multiple nodes miss simultaneously, each must fetch from the origin, increasing latency.
4. Set Up
Clustering
- Simple and straightforward.
- No identity management required.
- Entire configuration resides in VCL via a VMOD.
- Nodes defined as static or dynamic backends.
VHA
- Complex and error-prone.
- Mistakes are easy to make, hard to debug, and can go unnoticed.
- Discovery and broadcaster are separate services.
- Nodes defined in a separate file.
5. Scaling
Clustering
- Scales efficiently.
- Horizontally scales memory and disk capacity.
- Hot objects can be replicated to all nodes to reduce inter-node traffic.
- Long-tail objects are selectively replicated to reduce storage bloat.
- Uses efficient Varnish backend fetches for replication.
VHA
- Poor scalability.
- Does not horizontally scale memory or disk.
- Broadcasts every cacheable request to all nodes, increasing traffic and bloat.
- Relies on the less efficient vmod_http for broadcasts.
- High CPU usage.
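The storage side of the scaling difference comes down to simple arithmetic, sketched below in Python. This is an idealized model, assuming equal node sizes, a uniform replication target, and no overhead; the function names and the example figures (4 nodes of 500 GB each, replication target 2) are hypothetical.

```python
def unique_capacity_full_replication(node_capacity_gb: float, node_count: int) -> float:
    """VHA-style: every node caches every object, so adding nodes
    does not add room for more distinct objects."""
    return node_capacity_gb


def unique_capacity_sharded(node_capacity_gb: float, node_count: int,
                            replication_target: int) -> float:
    """Clustering-style: total storage is shared, divided by the
    number of copies kept of each object."""
    return node_capacity_gb * node_count / replication_target


# Hypothetical cluster: 4 nodes, 500 GB each, 2 copies per object.
print(unique_capacity_full_replication(500, 4))  # 500
print(unique_capacity_sharded(500, 4, 2))        # 1000.0
```

Under these assumptions, doubling the node count doubles the sharded cluster's capacity for distinct objects, while the fully replicated setup stays at the size of a single node.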