Caching RPM, Python, and PHP with Varnish Enterprise
This tutorial uses vmod_goto, which is not available in the open-source version of Varnish.
This tutorial will demonstrate how to set up Varnish Enterprise to cache RPM, Python, and PHP packages. By the end of this tutorial you should be able to use one machine and one VCL configuration for all three of these package types.
The general flow of this process is identifying where your local machine stores the location for retrieving these packages, changing this to point to your Varnish Enterprise instance, and then setting that original stored location as a backend for Varnish.
This process is not limited to these package types, and we have webinars on doing this for Artifactory, Debian, NPM, Go, and private Docker registries.
If you are interested in getting help with other types, please reach out to your Account Manager.
For the sake of space, some of the outputs or logs in this tutorial have been shortened, but the commands should all be clear.
To verify Varnish is caching as you test this, it may be helpful to open a second window and run:
sudo varnishncsa -b -q "BereqMethod eq 'GET'"
This will show you all backend fetches made by Varnish. After the initial fetch, you should see that removing packages and reinstalling them does not add to this command’s output, because they are being served from cache. With that said, let’s get started with RPM.
RPM
First, let’s assume you already have Varnish Enterprise installed. With that done, when you run yum update, you should see all needed package updates, including those for our RPM distribution, in my case AlmaLinux.
Upgraded:
almalinux-gpg-keys-9.5-2.el9.x86_64 almalinux-release-9.5-2.el9.x86_64
almalinux-repos-9.5-2.el9.x86_64
What we want to do now is change where our machine looks for these packages so that it points to our Varnish Enterprise instance. Since we’re running this test on a DigitalOcean instance, we will use localhost:6081, but if you are implementing this yourself, change that as needed.
We can see where the repositories are defined by running:
ls /etc/yum.repos.d/
This is the output:
almalinux-appstream.repo almalinux-highavailability.repo almalinux-rt.repo epel-cisco-openh264.repo varnish-plus-60.repo
almalinux-baseos.repo almalinux-nfv.repo almalinux-sap.repo epel-testing.repo
almalinux-crb.repo almalinux-plus.repo almalinux-saphana.repo epel.repo
almalinux-extras.repo almalinux-resilientstorage.repo droplet-agent.repo varnish-enterprise-6.0.repo
We can take a look at these files with cat and notice that all of our almalinux-{something}.repo files have a line or lines like:
mirrorlist=https://mirrors.almalinux.org/mirrorlist/$releasever/baseos-source
What we want to do is edit this line each time it appears in the desired files so that it points to Varnish, in our case http://localhost:6081, while keeping the path after the host name unchanged.
For example, /etc/yum.repos.d/almalinux-baseos.repo should be changed to look like:
[baseos]
name=AlmaLinux $releasever - BaseOS
mirrorlist=http://localhost:6081/mirrorlist/$releasever/baseos
# baseurl=https://repo.almalinux.org/almalinux/$releasever/BaseOS/$basearch/os/
enabled=1
gpgcheck=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-AlmaLinux-9
metadata_expire=86400
enabled_metadata=1
[baseos-debuginfo]
name=AlmaLinux $releasever - BaseOS - Debug
mirrorlist=http://localhost:6081/mirrorlist/$releasever/baseos-debug
# baseurl=https://repo.almalinux.org/vault/$releasever/BaseOS/debug/$basearch/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-AlmaLinux-9
metadata_expire=86400
enabled_metadata=0
[baseos-source]
name=AlmaLinux $releasever - BaseOS - Source
mirrorlist=http://localhost:6081/mirrorlist/$releasever/baseos-source
# baseurl=https://repo.almalinux.org/vault/$releasever/BaseOS/Source/
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-AlmaLinux-9
metadata_expire=86400
enabled_metadata=0
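Editing every repo file by hand gets tedious, so the same change can be sketched as a small shell function. This is only an illustration: rewrite_mirrorlist is a hypothetical helper name, and the default http://localhost:6081 matches the test setup used in this tutorial.

```shell
# Hypothetical helper: point every AlmaLinux mirrorlist= line in a .repo file
# at Varnish. Usage: rewrite_mirrorlist <repo-file> [varnish-base-url]
rewrite_mirrorlist() {
    local repo_file="$1"
    local varnish_url="${2:-http://localhost:6081}"
    # Replace only the scheme and host of the mirrorlist URL; the path
    # (/mirrorlist/$releasever/...) and the commented baseurl stay untouched.
    # -i.bak edits in place and keeps a backup copy next to the original.
    sed -i.bak \
        "s|^mirrorlist=https://mirrors.almalinux.org/|mirrorlist=${varnish_url}/|" \
        "$repo_file"
}

# Example (run as root):
#   for f in /etc/yum.repos.d/almalinux-*.repo; do rewrite_mirrorlist "$f"; done
```

The .bak copies make it easy to restore the original files if you want to stop routing through Varnish.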
Finally, we want to edit our default.vcl file and set up our backend to pull from https://mirrors.almalinux.org/ like so:
vcl 4.1;

import goto;

backend default none;

sub vcl_init {
    new rpm = goto.dns_director("https://mirrors.almalinux.org/", ip_version = ipv4);
}

sub vcl_recv {
    unset req.http.cache-control;
    unset req.http.pragma;
}

sub vcl_backend_fetch {
    if (bereq.http.User-Agent ~ "libdnf") {
        set bereq.backend = rpm.backend();
    }
    unset bereq.http.host;
}

sub vcl_backend_response {
    # No Last-Modified header? Just use the current time
    if (!beresp.http.last-modified) {
        set beresp.http.last-modified = now;
    }
    if (beresp.status == 200) {
        set beresp.ttl = 1h;
        set beresp.grace = 1s;
        set beresp.keep = 1y;
    } else {
        set beresp.ttl = 5s;
        set beresp.grace = 0s;
    }
}
After restarting Varnish, we can run yum update a few times and see with sudo varnishlog -d that Varnish is serving the response from cache:
* << Request >> 65540
- ReqMethod GET
- ReqURL /mirrorlist/9/baseos
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost:6081
- ReqHeader User-Agent: libdnf (AlmaLinux 9.5; generic; Linux.x86_64)
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- Hit 32771 3096.861751 1.000000 31536000.000000
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader X-Varnish: 65540 32771
- RespHeader Age: 503
- RespHeader Via: 1.1 varnish (Varnish/6.0)
- VCL_call DELIVER
- VCL_return deliver
- End
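Reading through a long log to spot hits gets tiring, so a quick summary can help. The following is a minimal sketch that counts VCL_call HIT and VCL_call MISS records in varnishlog output read from standard input; count_hits_and_misses is a hypothetical helper name, not a Varnish tool.

```shell
# Hypothetical helper: summarise cache hits and misses from varnishlog
# output on stdin by counting the VCL_call HIT / VCL_call MISS records.
count_hits_and_misses() {
    awk '
        $2 == "VCL_call" && $3 == "HIT"  { hits++ }
        $2 == "VCL_call" && $3 == "MISS" { misses++ }
        END { printf "hits=%d misses=%d\n", hits, misses }
    '
}

# Example:
#   sudo varnishlog -d | count_hits_and_misses
```

After a first yum update the miss count should be non-zero; repeating the command should only increase the hit count while the objects are still within their TTL.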
Python
Python packages are installed with the pip package manager. We need to edit our VCL a bit so that we can also cache these packages:
vcl 4.1;

import goto;

backend default none;

sub vcl_init {
    new rpm = goto.dns_director("https://mirrors.almalinux.org/", ip_version = ipv4);
    new python = goto.dns_director("https://pypi.org/simple", ip_version = ipv4);
}

sub vcl_recv {
    unset req.http.cache-control;
    unset req.http.pragma;
}

sub vcl_backend_fetch {
    if (bereq.http.User-Agent ~ "libdnf") {
        set bereq.backend = rpm.backend();
    }
    if (bereq.http.User-Agent ~ "^pip/") {
        set bereq.backend = python.backend();
    }
    unset bereq.http.host;
}

sub vcl_backend_response {
    # No Last-Modified header? Just use the current time
    if (!beresp.http.last-modified) {
        set beresp.http.last-modified = now;
    }
    if (beresp.status == 200) {
        set beresp.ttl = 1h;
        set beresp.grace = 1s;
        set beresp.keep = 1y;
    } else {
        set beresp.ttl = 5s;
        set beresp.grace = 0s;
    }
}
With that done, we have to reload our VCL through varnishreload before we can go ahead and install python3 and pip:
sudo varnishreload
sudo yum install python3 python3-pip
And then we can check the versions with:
$ python3 --version
Python 3.9.21
$ pip --version
pip 21.3.1 from /usr/lib/python3.9/site-packages/pip (python 3.9)
Now we want to configure pip to route through Varnish. We can do so temporarily by passing --index-url on the command line. First, uninstall a package in case it was already installed (note that this output has been shortened):
$ pip uninstall numpy
Found existing installation: numpy 2.0.2
Uninstalling numpy-2.0.2:
Would remove:
/usr/local/bin/f2py
/usr/local/bin/numpy-config
/usr/local/lib64/python3.9/site-packages/numpy-2.0.2.dist-info/*
Proceed (Y/n)?
Successfully uninstalled numpy-2.0.2
Then reinstall it through Varnish like so:
pip install numpy --index-url http://localhost:6081/simple --verbose
If we uninstall and reinstall for a second time, we can see the cache hit in the logs like so:
* << Request >> 32776
- ReqMethod GET
- ReqURL /simple/numpy/
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost:6081
- ReqHeader User-Agent: pip/21.3.1 {"ci":null,"cpu":"x86_64","distro":{"id":"Teal Serval","libc":{"lib":"glibc","version":"2.34"},"name":"AlmaLinux","version":"9.5"},"implementation":{"name":"CPython","version":"3.9.21"},"installer":{"name":"pip","version":"21.3.1"},"openssl_version":"OpenSSL 3.2.2 4 Jun 2024","python":"3.9.21","setuptools_version":"53.0.0","system":{"name":"Linux","release":"5.14.0-284.11.1.el9_2.x86_64"}}
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Hit 32771 523.957584 10.000000 0.000000
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader X-Varnish: 32776 32771
- RespHeader Age: 76
- RespHeader Via: 1.1 varnish (Varnish/6.0)
- VCL_call DELIVER
- VCL_return deliver
- End
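The RespHeader X-Varnish line above carries two transaction IDs, which is how Varnish marks a response delivered from cache; a miss carries only one. As a rough sketch, the same check can be done on the client side by inspecting response headers. The helper name is_cache_hit is hypothetical, and the curl example assumes curl is installed and Varnish listens on localhost:6081.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed
# (exit status 0) when X-Varnish carries two transaction IDs, i.e. a cache hit.
is_cache_hit() {
    tr -d '\r' | awk '
        tolower($1) == "x-varnish:" { found = (NF == 3) }
        END { exit (found ? 0 : 1) }
    '
}

# Example. The pip-like User-Agent is there so the VCL routes the request
# to the python backend; run it twice, the second response should be a hit:
#   curl -s -o /dev/null -D - -A 'pip/21.3.1' http://localhost:6081/simple/numpy/ \
#       | is_cache_hit && echo HIT
```

Note that curl sends different request headers than pip does, so if the origin varies its response on those headers, the curl request may be cached as a separate variant.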
To make our machine permanently use Varnish, we need to create a directory and a configuration file as shown below:
mkdir ~/.pip
nano ~/.pip/pip.conf
And put the following content in ~/.pip/pip.conf:
[global]
index-url = http://localhost:6081/simple
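If you would rather script this step than open an editor, the file can be written non-interactively. The following is a minimal sketch; write_pip_conf is a hypothetical helper name, and its defaults simply mirror the directory and URL used in this tutorial.

```shell
# Hypothetical helper: write a pip.conf that points pip at Varnish.
# Usage: write_pip_conf [config-dir] [index-url]
write_pip_conf() {
    local conf_dir="${1:-$HOME/.pip}"
    local index_url="${2:-http://localhost:6081/simple}"
    # -p avoids an error when the directory already exists
    mkdir -p "$conf_dir"
    printf '[global]\nindex-url = %s\n' "$index_url" > "$conf_dir/pip.conf"
}

# Example: write_pip_conf   # creates ~/.pip/pip.conf with the defaults above
```

Be aware that this overwrites any existing pip.conf in that directory, so back it up first if you already have one.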
After we uninstall numpy again, we can reinstall without forcing it to use Varnish like before:
$ pip install numpy
Looking in indexes: http://localhost:6081/simple
Collecting numpy
Installing collected packages: numpy
Successfully installed numpy-2.0.2
We will now see another HIT in sudo varnishlog -d.
PHP
PHP packages are installed with the Composer package manager. But before we can install Composer, we need to make sure we have the right dependencies installed:
sudo yum install php php-cli php-common php-mysqlnd php-fpm -y
To verify the install, you can check the PHP version by running php -v:
$ php -v
PHP 8.0.30 (cli) (built: Aug 3 2023 17:13:08) ( NTS gcc x86_64 )
Copyright (c) The PHP Group
Zend Engine v4.0.30, Copyright (c) Zend Technologies
with Zend OPcache v8.0.30, Copyright (c), by Zend Technologies
Then install Composer:
php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
php composer-setup.php
sudo mv composer.phar /usr/local/bin/composer
Let’s make a directory for a project and initialize one:
mkdir php-test
cd php-test/
composer init
composer init will prompt you to answer a handful of questions for your project. We will skip these questions in this tutorial. Once the questions have been answered, you can still edit the generated composer.json file as needed with your text editor of choice. Running cat composer.json gives the following output:
{
"name": "brian/php-test",
"description": "test",
"type": "project",
"minimum-stability": "stable",
"require": {
"guzzlehttp/guzzle": "^7.9"
}
}
With that done, we want to update the VCL to pull from the backend shown by composer config --list --global:
[repositories.packagist.org.type] composer
[repositories.packagist.org.url] https://repo.packagist.org
The VCL should now look like:
vcl 4.1;

import goto;

backend default none;

sub vcl_init {
    new rpm = goto.dns_director("https://mirrors.almalinux.org/", ip_version = ipv4);
    new python = goto.dns_director("https://pypi.org/simple", ip_version = ipv4);
    new php = goto.dns_director("https://repo.packagist.org", ip_version = ipv4);
}

sub vcl_recv {
    unset req.http.cache-control;
    unset req.http.pragma;
}

sub vcl_backend_fetch {
    # Pick exactly one backend per request. The else-if chain keeps libdnf
    # requests from falling through to the php backend.
    if (bereq.http.User-Agent ~ "libdnf") {
        set bereq.backend = rpm.backend();
    } else if (bereq.http.User-Agent ~ "^pip/") {
        set bereq.backend = python.backend();
    } else {
        set bereq.backend = php.backend();
    }
    unset bereq.http.host;
}

sub vcl_backend_response {
    # No Last-Modified header? Just use the current time
    if (!beresp.http.last-modified) {
        set beresp.http.last-modified = now;
    }
    if (beresp.status == 200) {
        set beresp.ttl = 1h;
        set beresp.grace = 1s;
        set beresp.keep = 1y;
    } else {
        set beresp.ttl = 5s;
        set beresp.grace = 0s;
    }
}
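The intent of vcl_backend_fetch is that each User-Agent maps to exactly one backend: libdnf to rpm, pip to python, and everything else (including Composer) to php. As a rough way to reason about that mapping outside Varnish, it can be mirrored in a few lines of shell. This is only a sketch; pick_backend is a hypothetical helper and is not part of the VCL.

```shell
# Hypothetical mirror of the intended VCL routing: map a User-Agent string
# to the name of the backend that should serve it.
pick_backend() {
    case "$1" in
        *libdnf*) echo rpm ;;     # VCL: User-Agent ~ "libdnf"
        pip/*)    echo python ;;  # VCL: User-Agent ~ "^pip/"
        *)        echo php ;;     # everything else, e.g. Composer
    esac
}

# Example:
#   pick_backend 'Composer/2.8.8 (Linux; PHP 8.0.30; cURL 7.76.1)'
```

Because the last branch is a catch-all, any client that is neither libdnf nor pip ends up on the Packagist backend, so you may prefer an explicit match on the Composer User-Agent in a shared setup.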
After adjusting default.vcl and reloading the VCL with sudo varnishreload, you can point the global Composer repository setting at localhost using:
composer config --global repositories.packagist.org '{"type":"composer", "url":"http://localhost:6081"}'
When you run composer config --list --global again, you’ll get the following output:
[repositories.packagist.org.type] composer
[repositories.packagist.org.url] http://localhost:6081
If you need to allow http instead of https, you can do so with:
composer config --global secure-http false
With our machine configured and the VCL in place, we can pull the required packages with:
composer require guzzlehttp/guzzle
We should see our packages being delivered. We can also run sudo varnishlog -d to see the requests coming through Varnish.
To see cache hits, we can clear the objects from the project, and then pull them again:
$ rm -rf vendor/ composer.lock
$ composer clear-cache
Cache directory does not exist (cache-vcs-dir):
$ composer require guzzlehttp/guzzle
Now when we run sudo varnishlog -d, we should see cache hits like so:
* << Request >> 229505
- Begin req 229495 rxreq
- ReqMethod GET
- ReqURL /p2/guzzlehttp/streams.json
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost:6081
- ReqHeader Accept: */*
- ReqHeader Accept-Encoding: deflate, gzip, br
- ReqHeader Connection: keep-alive
- ReqHeader User-Agent: Composer/2.8.8 (Linux; 5.14.0-284.11.1.el9_2.x86_64; PHP 8.0.30; cURL 7.76.1)
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- Hit 294980 3101.351442 1.000000 31536000.000000
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.1
- RespStatus 200
- RespHeader X-Varnish: 229505 294980
- RespHeader Age: 498
- RespHeader Via: 1.1 varnish (Varnish/6.0)
- VCL_call DELIVER
- VCL_return deliver
- End
For more questions or assistance, please reach out to your Account Manager.