Warming up Varnish Cache with varnishreplay


Image: (c) 2011 jasonwoodhead23. Used under CC BY 2.0.

Varnish Cache 2.0 included a tool, varnishreplay, that hasn't seen much use but can be very handy should you ever need it. It replays the traffic recorded in one Varnish instance's log onto another instance.

Here I'll try to replay the traffic from www.varnish-cache.org onto a varnish instance installed on my local machine.

$ ssh www.varnish-cache.org varnishlog -w - | varnishreplay -a localhost:6081 -r -

Let's go through what it does. The first part logs into the server, grabs the raw varnishlog output and writes it to standard output. Then we pipe that into a varnishreplay instance that sends the requests to my local Varnish running on localhost, port 6081.

Now I can see actual traffic from varnish-cache.org being replayed on my local setup. I've made varnish-cache.org a backend to my varnish and I can play around with the VCL and parameters to see how it affects hit rate and such.
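For reference, here is a rough sketch of what "making varnish-cache.org a backend" could look like. The file path is a hypothetical choice of mine, and the VCL is just a minimal backend definition, not the exact configuration used for this post.

```shell
#!/bin/sh
# Sketch only: write a minimal VCL file that uses the live site as backend.
# VCL_FILE is a hypothetical path; pick whatever suits your setup.
VCL_FILE=/tmp/replay-test.vcl

cat > "$VCL_FILE" <<'EOF'
backend default {
    .host = "www.varnish-cache.org";
    .port = "80";
}
EOF

# The local Varnish would then be started with something like:
#   varnishd -f /tmp/replay-test.vcl -a :6081
echo "wrote $VCL_FILE"
```

From there you can edit the VCL, reload it and replay the same traffic again to compare hit rates.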

It could also be used to keep a standby cache warm by sending some of the traffic to the spare datacenter. Or you could record some of your traffic during normal operation and play it back whenever you restart a cache, before letting it rejoin the cluster.
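The record-then-replay idea could look roughly like the sketch below. The log file path is a hypothetical choice, and the script only prints the two commands unless you explicitly ask it to replay, so treat it as a starting point rather than a finished warm-up script.

```shell
#!/bin/sh
# Sketch: record traffic once, replay it after a cache restart.
# LOG_FILE and TARGET are illustrative values, not from a real setup.
LOG_FILE=/tmp/varnish-traffic.log
TARGET=localhost:6081

# Step 1 (on the live cache): write the raw log to a file.
# Stop it with Ctrl-C once you have recorded enough traffic.
RECORD_CMD="varnishlog -w $LOG_FILE"

# Step 2 (on the freshly restarted cache): replay the recorded log.
REPLAY_CMD="varnishreplay -a $TARGET -r $LOG_FILE"

# Print both commands by default; set RUN=replay to actually replay.
echo "record: $RECORD_CMD"
echo "replay: $REPLAY_CMD"
if [ "${RUN:-}" = "replay" ] && command -v varnishreplay >/dev/null 2>&1; then
    $REPLAY_CMD
fi
```

Printing first makes it easy to double check the target address before any requests are sent.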

Finally, a word of warning. varnishreplay is not the most mature tool in our toolbox. It hasn't seen that much use over the years. I think there are some limitations on how accurately it replays traffic. Bug reports are very welcome should you find any.



I always wanted to try it but never got around to that. Also, docs are not exactly great for it.

I think it should also be possible to replay only some types of requests, or in general filter some requests in/out - I realize this can be done in varnishlog but I think varnishreplay is the right place to decide what part of traffic hits the server...

I wonder what it does with POST requests? Body is obviously not there... Even some GETs can be dangerous (can delete an article in CMS etc...). I wouldn't dare try replaying anything in production, but someone might get burned by that...