Varnish sample code for HDS and HLS failover


Requirements
Prerequisite knowledge: This article assumes that you have basic knowledge of Adobe Media Server and know how to install and run the Varnish proxy.


In a real deployment scenario, playback should be as robust as possible in the face of server-side problems. As our architecture stands today, playback can suffer from two problems regardless of the amount of backend redundancy that is deployed. Those problems are liveness, where a packager advertises a stale view of live, and dropout, where a packager has gaps in its fragment list.
HDS/HLS failover is the server-side solution to the liveness and dropout problems in HDS and HLS.
This article describes a basic failover setup with redundant packagers and Varnish as a reverse proxy. It then provides a step-by-step guide to writing a basic Varnish configuration script for the failover solution.
Failover Setup – Adding redundancy
A typical server-side setup for failover looks like this:
Figure 1. An example server side setup for failover
In this setup, two packagers are hosted at and respectively. The encoder sends an identical ATC-enabled RTMP feed to both the packagers. A reverse proxy acts on behalf of, sourcing its HTTP content from one of the two packagers.
We add redundancy into our system by introducing two additional elements. First, we add an additional packager in order to improve load distribution and fault tolerance. Then we add a reverse proxy between our packagers and our CDN. Reverse proxies often serve multiple roles in a system, as a cache, a load balancer, and a fault tolerance mechanism.
Configuring a Reverse Proxy
There are many different reverse proxy products to choose from, each with their own benefits and drawbacks.
Ideally, the reverse proxy you select should support the following functionality:
- 503 failover: A proxy configuration technique in which, if one packager returns a 503 response, the proxy retries the request with the next packager in the pool. If all packagers return a 503, the proxy should return a cacheable error (cached for a short time, say half a fragment interval).
- Backend selection based on URL (for HLS): HLS requires that the M3U8 file only grow in size; an M3U8 can only have entries appended to its tail in subsequent refreshes. When a reverse proxy sits in front of multiple packagers, natural latency between the packagers can cause minor fluctuations between subsequent M3U8 requests. It is therefore highly recommended that you enable a URL hashing scheme for routing M3U8 requests. URL hashing gives each URL an affinity for a particular packager, so the M3U8 is consistently served by the same packager and only grows between subsequent refreshes.
- Response caching: Since the HDS and HLS Apache modules do not cache requests, and packagers are typically more resource constrained than proxies, transferring the burden of serving requests from the packagers to the proxies (via a cache) usually results in better overall system scalability.
In HDS, our primary tool for controlling caching behaviour is tuning the HttpStreamingF4MMaxAge, HttpStreamingBootstrapMaxAge, and HttpStreamingFragMaxAge configuration parameters located in httpd.conf. These parameters modify the HTTP Cache-Control max-age and Expires headers that our packager emits for manifest, bootstrap, and fragment requests respectively. Intermediate HTTP caches use these headers to determine how long they can cache assets.
Our goal is to fine-tune max-age: it should be long enough that intermediate caches still absorb most of the load, but short enough that they don't serve stale data.
Caching requirements for different assets:

- Fragment requests may be cached for a long period of time. Once a fragment has been generated, it will remain the same forever.
- Bootstrap requests should only be cached for a short period of time since the bootstrap file is updated every time a new fragment is available. Setting HttpStreamingBootstrapMaxAge to half a fragment duration is recommended. This ensures that the intermediate caches periodically refetch the bootstrap, and clients are made aware of new fragments.
- Manifest requests may be cached for a long time. The manifest typically remains unchanged for the duration of the event.
As with HDS, for HLS we must always configure the HTTP Cache-Control max-age in order to gain maximum scalability. The lifetimes can be tuned via the HLSM3U8MaxAge and HLSTSSegmentMaxAge parameters, which control the M3U8 and TS file lifetimes respectively.
TS files are the HLS analog of fragments. They do not change once generated, so they may be configured with a long max-age.
M3U8 files are the HLS analog of the bootstrap. They should be configured with a short max-age. A good value is typically half a fragment interval.
NOTE: Cache lifetimes for objects may change in future releases.
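The caching guidance above can be summarised in a small Python sketch. It is illustrative only: the parameter names come from this article, but the helper name `recommended_max_ages` and the one-hour value used for "a long time" are assumptions, not Adobe recommendations.

```python
# Illustrative sketch: derive max-age values (in seconds) from the fragment duration.
LONG_MAX_AGE = 3600  # assumption: "a long time" is taken to be one hour


def recommended_max_ages(fragment_duration_s: float) -> dict:
    """Map each cache-lifetime parameter named in the article to a value."""
    # Bootstrap and M3U8 change with every new fragment: half a fragment interval,
    # but never less than one second.
    short = max(1, int(fragment_duration_s / 2))
    return {
        "HttpStreamingF4MMaxAge": LONG_MAX_AGE,   # manifest rarely changes
        "HttpStreamingBootstrapMaxAge": short,    # updated per fragment
        "HttpStreamingFragMaxAge": LONG_MAX_AGE,  # fragments never change
        "HLSM3U8MaxAge": short,                   # HLS analog of the bootstrap
        "HLSTSSegmentMaxAge": LONG_MAX_AGE,       # HLS analog of fragments
    }


if __name__ == "__main__":
    for name, value in recommended_max_ages(4).items():
        print(name, value)
```

With a 4-second fragment duration, the bootstrap and M3U8 lifetimes come out at 2 seconds, matching the "half a fragment interval" guidance.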
Using Varnish Proxy
Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. It sits in front of servers that speak HTTP.
One of the key features of Varnish Cache, in addition to its performance, is the flexibility of its configuration language, Varnish Configuration Language (VCL).
Varnish Configuration Language
VCL is a small domain-specific language used to define request handling and document caching policies for the Varnish HTTP accelerator.
VCL enables you to write policies on how incoming requests should be handled. In such a policy you can decide what content to serve, where to get the content from, and how the request or response should be altered.
In our setup, we use VCL to configure the available back-end servers, the algorithm that selects a back-end whenever a request for a bootstrap, fragment, m3u8 or TS file is received, and the response to be sent back to the client.
The subsequent code examples in this article describe how to configure Varnish for our failover setup.
Writing basic VCL for failover
You can start by editing the default VCL file that comes packaged with Varnish (default.vcl) or start with a new VCL file. The sample code adobe_sample.vcl is attached.
The rest of this article refers to the attached sample VCL while describing each subroutine and configuration.
Configuring backend servers
Varnish has a concept of "backend" or "origin" servers. A backend server is the server providing the content Varnish will accelerate.
At the beginning of the VCL add the following:

# sections you should customize to match your configuration
# are marked with the word "CUSTOMIZE"
# define the available back-ends.
# CUSTOMIZE this section to point to your packagers.
# in this example, we have 2 packagers hosted at
# and

backend b1 {
    .host =;
    .probe = {
        .url = "/";
        .timeout = 150 ms;
        .interval = 10s;
        .window = 6;
        .threshold = 5;
    }
}

backend b2 {
    .host =;
    .probe = {
        .url = "/";
        .timeout = 150 ms;
        .interval = 10s;
        .window = 6;
        .threshold = 5;
    }
}

This adds two back-end servers, b1 and b2. Add one entry for each server in your setup.
Backend probes
Back-ends can be probed to determine whether they should be considered healthy. This back-end health polling establishes a "sick" or "healthy" state for each back-end.
Polling is configured by adding a .probe member to the back-end definition in VCL, as in the example above.
The options and their definitions are:
- url: The URL that Varnish requests.
- interval: How often Varnish polls.
- timeout: How long Varnish waits for the probe to complete.
- window: Varnish maintains a sliding window of poll results. The window size determines how many of the latest polls to consider when deciding whether the back-end is healthy.
- threshold: How many of the polls in the window must be good for the back-end to be declared healthy.

The configuration in the above code will cause Varnish to send a request to http:/// every 10 seconds. When deciding whether to use a server or not, it will look at the last 6 probes it has sent and require that at least 5 of them were good.
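The window and threshold rule can be pictured with a toy Python model. This is a simplified sketch rather than Varnish's actual implementation; the class name `ProbeWindow` is hypothetical, and start-up behaviour (before the window has filled) is deliberately ignored.

```python
from collections import deque


class ProbeWindow:
    """Toy model of a sliding window of probe results (not Varnish internals)."""

    def __init__(self, window: int = 6, threshold: int = 5):
        self.threshold = threshold
        # deque(maxlen=...) silently drops the oldest result once the window is full
        self.results = deque(maxlen=window)

    def record(self, ok: bool) -> None:
        """Store the outcome of one probe."""
        self.results.append(ok)

    @property
    def healthy(self) -> bool:
        """Healthy when at least `threshold` of the retained probes were good."""
        return sum(self.results) >= self.threshold


if __name__ == "__main__":
    probe = ProbeWindow(window=6, threshold=5)
    for outcome in [True] * 6 + [False, False]:
        probe.record(outcome)
        print(outcome, probe.healthy)
```

With window 6 and threshold 5, a single failed probe is tolerated, while a second failure inside the same window marks the back-end as sick.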


You can also combine several back-ends into a group. These groups are called directors. In our setup we use two directors: hash and round-robin. Add the following to the VCL after configuring the back-end servers.

# define the hash director that is used for m3u8 (HLS) requests.
# CUSTOMIZE this section if you have additional packagers

director hls hash {
    {.backend = b1; .weight = 1;}
    {.backend = b2; .weight = 1;}
}

# define the round_robin director that is used for HDS requests.
# CUSTOMIZE this section if you have additional packagers
director hds round-robin {
    {.backend = b1;}
    {.backend = b2;}
}

Round-robin director: With this director Varnish distributes the incoming requests across backend servers on a round-robin basis.
Hash director:  With the hash director Varnish will pick a backend server based on the URL hash value. The hash director uses the hash data, which means that the same URL will go to the same back-end server every time. 
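The difference between the two directors can be sketched in Python. This is only a model of the selection behaviour described above: the SHA-256-modulo scheme and the function names are assumptions for illustration, and Varnish's real hash director (which also honours weights and back-end health) works differently internally.

```python
import hashlib
from itertools import cycle

BACKENDS = ["b1", "b2"]  # names taken from the VCL sample


def make_round_robin(backends):
    """Return a picker that hands out back-ends in turn, ignoring the URL."""
    rotation = cycle(backends)
    return lambda url: next(rotation)


def hash_pick(url, backends=BACKENDS):
    """Pick a back-end from a hash of the URL, so the same URL always maps to the same one."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]


if __name__ == "__main__":
    rr = make_round_robin(BACKENDS)
    url = "/hls-live/example/stream.m3u8"  # hypothetical URL
    print([rr(url) for _ in range(4)])         # alternates between b1 and b2
    print({hash_pick(url) for _ in range(4)})  # a single back-end every time
```

The round-robin picker spreads identical requests over both packagers, while the hash picker keeps a given m3u8 URL on one packager.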
Customizing Varnish Subroutines:
Typically a VCL file is divided into subroutines. The different subroutines are executed at different times. Two of the most important of these are: vcl_recv and vcl_fetch.
vcl_recv  is called at the beginning of a request, after the complete request has been received and parsed. Its purpose is to decide whether or not to serve the request, how to do it, and, if applicable, which back-end server to use. 
vcl_fetch is called after a response has been successfully retrieved from the back-end server. Normal tasks here are to alter the response headers, or try alternate back-end servers in case the request failed.
Start off with a basic vcl_recv subroutine.

Add the following:

## Called when a client request is received
sub vcl_recv {
    set req.backend = hds;

This code selects the hds director, which is round-robin. With this director, Varnish selects back-end servers in round-robin fashion for each incoming request.
Next, for m3u8 requests we select back-end servers based on URL hashing, so that the m3u8 for a stream is always served by the same packager; that is, m3u8 requests are sticky to a packager. The check req.url ~ "\.m3u8$" && req.url ~ "hls-live" ensures that we do this only for m3u8 requests for live streams.
Add the following in the sub-routine:

## Use URL hashing direction scheme for .m3u8s in order to mitigate liveness issues on packager failure and restart

    if (req.request == "GET" && req.url ~ "\.m3u8$" && req.url ~ "hls-live") {
        set req.backend = hls;
        if (req.restarts > 0) {
            set req.url = req.url + "?restart=" + req.restarts;
        }
    }

If a back-end server fails to serve the request, we append a query string to the request URL on restart, thereby changing its hash so that the hash director can select another server.
The following is then added to aid logging and set grace.

## Add a unique header containing the client address for the purposes of logging

    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }

    # grace settings; note this is also set in vcl_fetch
    set req.grace = 80s;

Grace in the scope of Varnish means delivering otherwise expired objects when circumstances call for it. Grace applies when:
  • A request is already pending for some specific content (old content is delivered while the fetch of new content is in progress).
  • No healthy back-end is available.
Setting req.grace to 80s means clients can be served content that is no more than 80 seconds past its TTL.
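The grace rule described above can be written as a small decision function. This is a simplified model of the two conditions listed, not Varnish's actual logic; the function name and its parameters are hypothetical.

```python
def can_serve_from_cache(seconds_past_ttl: float, grace_s: float,
                         fetch_in_progress: bool, backend_healthy: bool) -> bool:
    """Decide whether a cached object may be delivered under the grace rules above."""
    if seconds_past_ttl <= 0:
        return True  # object is still within its TTL, no grace needed
    within_grace = seconds_past_ttl <= grace_s
    # An expired object is only delivered while a refresh is pending
    # or when no healthy back-end can supply a new copy.
    return within_grace and (fetch_in_progress or not backend_healthy)


if __name__ == "__main__":
    # 30 seconds past TTL with an 80-second grace and a sick back-end
    print(can_serve_from_cache(30, 80, fetch_in_progress=False, backend_healthy=False))
```

In this model, an object 90 seconds past its TTL is never served with an 80-second grace, even if every back-end is down.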
In vcl_recv you can end processing with one of the following terminating statements:
- pass: bypass the cache, executing the rest of the Varnish processing as normal, but without looking up the content in cache or storing it to cache.
- pipe: pipe the request, telling Varnish to shuffle bytes between the selected back-end and the connected client without looking at the content. Because Varnish no longer tries to map the content to a request, any subsequent request sent over the same keep-alive connection will also be piped, and will not appear in any log.
- lookup: look up the request in cache, possibly entering the data in cache if it is not already present.
- error: generate a synthetic response from Varnish, typically an error message, a redirect message, or a response to a health check from a load balancer.
The following makes sure that bootstraps and fragments are always served from the Varnish cache.

    ## always cache bootstraps:
    if (req.request == "GET" && req.url ~ "\.(bootstrap)") {
        return (lookup);
    }

    ## always cache fragments:
    if (req.request == "GET" && req.url ~ "(\wSeg[0-9]*-Frag[0-9]*)") {
        return (lookup);
    }

    ## do not cache these rules:
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }

When you return lookup from vcl_recv, you tell Varnish to deliver content from cache even if the request otherwise indicates that it should be passed.
Handling compression and Accept-Encoding headers
Each browser sends information to the server about which compression mechanisms it supports. All modern browsers support "gzip" compression, but they inform the server of this in different ways.
For example, the header of modern browsers will report Accept-Encoding in the following ways:
Firefox, IE: gzip, deflate
Chrome: gzip,deflate,sdch
Opera: deflate, gzip, x-gzip, identity, *;q=0
In addition to the headers sent by the browser, Varnish must also pay attention to the headers sent by Apache, which usually include lines like this:
Vary: Accept-Encoding
This means that Varnish will store a different cache for every version of the "Accept-Encoding" header it receives from different browsers! That means you'll be maintaining separate cached copies of your web site for each different browser, and in some cases for different versions of the same browser. This is a huge waste of space, since every browser actually supports "gzip", but just reports it differently. To prevent this confusion and increase the efficiency of the cache, we include the following snippet in vcl_recv:

# Handle compression correctly. Different browsers send different
# "Accept-Encoding" headers, even though they mostly all support the same
# compression mechanisms. By consolidating these compression headers into
# a consistent format, we can reduce the size of the cache and get more
# hits.

    if (req.http.Accept-Encoding) {
        if (req.url ~ "\.(ts|bootstrap)$" ||
            req.url ~ "(\wSeg[0-9]*-Frag[0-9]*)") {
            # No point in compressing these
            remove req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate" &&
                 req.http.user-agent !~ "MSIE") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            # unknown algorithm
            remove req.http.Accept-Encoding;
        }
    }

    ## if it passes all these tests, do a lookup anyway;
    return (lookup);
}
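To see the effect of this normalisation, the same decision logic can be mirrored in Python and run against the browser headers listed earlier. This is a sketch for experimentation only; the function name and the example URLs are hypothetical, and the regular expressions are copied from the VCL above.

```python
import re
from typing import Optional

# Same patterns as the VCL: TS segments, bootstraps and HDS fragments are not compressed.
NO_COMPRESS = re.compile(r"\.(ts|bootstrap)$|\wSeg[0-9]*-Frag[0-9]*")


def normalize_accept_encoding(url: str, accept_encoding: Optional[str],
                              user_agent: str = "") -> Optional[str]:
    """Return the consolidated Accept-Encoding value, or None if the header is removed."""
    if not accept_encoding:
        return None
    if NO_COMPRESS.search(url):
        return None  # no point in compressing media payloads
    if "gzip" in accept_encoding:
        return "gzip"
    if "deflate" in accept_encoding and "MSIE" not in user_agent:
        return "deflate"
    return None  # unknown algorithm


if __name__ == "__main__":
    for header in ["gzip, deflate", "gzip,deflate,sdch",
                   "deflate, gzip, x-gzip, identity, *;q=0"]:
        print(normalize_accept_encoding("/hls-live/example/stream.m3u8", header))
```

All three browser variants collapse to the single value "gzip", so the cache stores one variant per object instead of one per browser family.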


The following snippet defines the vcl_fetch subroutine:

## Called when the requested object has been retrieved from the backend, or the request to the backend has failed
sub vcl_fetch {
    # fail-over to the next server when we receive a 503.
    # CUSTOMIZE the value of "503" for your deployment.
    # It should be the same as the unavailable Response code
    # set in httpd.conf at the server
    if (beresp.status == 503) {
        # CUSTOMIZE the value "1" for your deployment.
        # It should be the number of packagers - 1.
        if (req.restarts < 1) {
            return (restart);
        } else {
            # all servers failed, generate a cache-able 404 error.
            # NOTE: we have chosen not to use the varnish
            # "error " mechanism to generate our
            # error since those errors are not cached by varnish.
            # instead, we will transform our 503 error into a 404 error.

            set beresp.status = 404;
            set beresp.response = "Not found.";

            # CUSTOMIZE ttl to be 1/2 a fragment interval.
            set beresp.ttl = 2s;

            # CUSTOMIZE Cache-Control to be 1/2 a fragment interval.
            set beresp.http.Cache-Control = "max-age=2";

            unset beresp.http.expires;

            # deliver now so the short ttl set above is not overridden below
            return (deliver);
        }
    }

    # If the backend is unreachable, hold content for 10 mins
    set beresp.grace = 600s;

    if (req.backend.healthy) {
        # Blanket cache everything for 1 fragment interval.
        # CUSTOMIZE the value "4" for your deployment as the configured
        # fragment duration.
        set beresp.ttl = 4s;
    } else {
        # The backend is sick set ttl to 10 mins to serve stale content
        set beresp.ttl = 600s;
    }

    ## Do not cache 50x errors
    if (beresp.status >= 500) {
        set beresp.ttl = 0s;
    }

    set beresp.http.X-Cacheable = "YES";
}


Apart from setting the Time To Live values for various file types, two return actions deserve a mention here: restart and deliver.
Returning deliver in vcl_fetch tells Varnish to cache, if possible. If you chose to pass the request in an earlier VCL function (e.g.: vcl_recv), you will still execute the logic of vcl_fetch, but the object will not enter the cache even if you supply a cache time.
Returning restart will increment the restart counter by 1 and start the VCL processing again from the top of vcl_recv.
In vcl_fetch we implement the 503 failover solution.
503 failover is a proxy configuration technique that provides a simple failover solution. The technique works as follows:
  1. When the proxy receives a 503 HTTP error response from a back-end, we return restart from vcl_fetch. The proxy then attempts the request with the next back-end in the pool.
  2. As soon as any back-end returns a valid response, the proxy sends that response back to the player.
  3. The restart counter increments every time a back-end fails to serve the request (returns a 503 response). We check the restart counter and, if all the back-ends fail to serve the request, the proxy returns a cacheable error (such as a 404). We also set the max-age of this error to a short value (e.g. half a fragment interval) to prevent unnecessary load on the back-ends.
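The three steps above can be traced with a short Python simulation. It is a sketch under stated assumptions, not how Varnish is implemented: back-ends are modelled as callables that return an HTTP status, the back-end tried on each restart is simply the next one in the list, and the function name and default TTLs (taken from the sample values 4s and 2s) are illustrative.

```python
from typing import Callable, List, Tuple


def fetch_with_failover(
    backends: List[Callable[[], int]],
    ok_ttl_s: int = 4,
    error_ttl_s: int = 2,
) -> Tuple[int, int, int]:
    """Return (status, ttl_seconds, restarts) for one client request."""
    max_restarts = len(backends) - 1  # "number of packagers - 1"
    restarts = 0
    while True:
        status = backends[restarts]()  # try the next back-end in the pool
        if status == 503 and restarts < max_restarts:
            restarts += 1  # models returning restart from vcl_fetch
            continue
        if status == 503:
            # every back-end failed: turn the 503 into a short-lived, cacheable 404
            return 404, error_ttl_s, restarts
        # other 50x errors are passed through but not cached
        ttl = 0 if status >= 500 else ok_ttl_s
        return status, ttl, restarts


if __name__ == "__main__":
    ok = lambda: 200
    down = lambda: 503
    print(fetch_with_failover([down, ok]))    # second packager answers
    print(fetch_with_failover([down, down]))  # cacheable 404 after one restart
```

The short TTL on the synthesized 404 means repeated requests for a missing fragment are answered from cache for half a fragment interval instead of reaching the packagers each time.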
Also, beresp.ttl defines how long an object is kept, and beresp.grace defines how long past the beresp.ttl time Varnish will keep an object.
Note: To provide fine-grained caching control, Adobe recommends different TTLs depending on asset type and stream type. Refer to the section "Caching requirements for different assets" in this article for the TTL requirements.
The attached sample code also has a commented section in the vcl_fetch function for configuring TTL values.
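One way to reason about per-asset TTLs before writing them into VCL is a small lookup table keyed by URL pattern. The sketch below is hypothetical: the patterns are adapted from the regular expressions used in this article, while the one-hour "long" TTL, the `.f4m` manifest pattern, and the example URLs are assumptions.

```python
import re

FRAGMENT_S = 4   # assumption: fragment duration in seconds, as in the sample
LONG_TTL_S = 3600  # assumption: "a long time" is taken to be one hour

# (pattern, ttl in seconds); the first matching rule wins
TTL_RULES = [
    (re.compile(r"\.bootstrap"), FRAGMENT_S / 2),       # HDS bootstrap: short
    (re.compile(r"\.m3u8$"), FRAGMENT_S / 2),           # HLS playlist: short
    (re.compile(r"\wSeg[0-9]*-Frag[0-9]*"), LONG_TTL_S),  # HDS fragment: long
    (re.compile(r"\.ts$"), LONG_TTL_S),                 # HLS segment: long
    (re.compile(r"\.f4m$"), LONG_TTL_S),                # HDS manifest: long
]


def ttl_for(url: str, default: float = FRAGMENT_S) -> float:
    """Return the TTL for a URL, falling back to one fragment interval."""
    for pattern, ttl in TTL_RULES:
        if pattern.search(url):
            return ttl
    return default


if __name__ == "__main__":
    for url in ["/live/example/stream.bootstrap",
                "/live/example/streamSeg1-Frag7",
                "/hls-live/example/stream.m3u8"]:
        print(url, ttl_for(url))
```

The fallback of one fragment interval mirrors the blanket TTL that the sample vcl_fetch applies when the back-end is healthy.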
Other functions:
The functions vcl_error, vcl_deliver, vcl_hit and vcl_miss are unaltered and added as-is from the default VCL file.
vcl_hit is called right after an object has been found (hit) in the cache. You can also change the TTL or issue a purge here.
vcl_miss is called right after an object was looked up and not found in the cache.
vcl_deliver is the common last exit point for all code paths (except vcl_pipe). It is often used to add and remove debug headers.
vcl_error is used to generate content from within Varnish, without talking to a web server.

## If no packager contains
sub vcl_error {
    # Add your logic here
}

## Called when an object is in the cache, its a hit.
sub vcl_hit {
    if (obj.ttl == 0s) {
    }
}

## Called when the requested object was not found in the cache
sub vcl_miss {
    # Add your logic here
}

## Called before a cached object is delivered to the client
sub vcl_deliver {
    set resp.http.X-Served-By = server.hostname;
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
        set resp.http.X-Cache-Hits = obj.hits;
    } else {
        set resp.http.X-Cache = "MISS";
    }
}

Where to go from here
HTTP Streaming failover lets you control how AMS handles certain streaming problems.
VCL is a small domain-specific language used to define request handling and document caching policies for the Varnish HTTP accelerator.
Learn how to configure HTTP streaming failover in Adobe Media Server here.
Learn about Varnish cache here.
Learn more about Adobe Media Server: