
How to set up Nginx and Varnish as a reverse proxy for Node.js?


My website, built with the Astro framework (Node.js SSR adapter), is deployed on a single shared-cpu-1x@256MB fly.io instance in the Amsterdam region, which automatically handles gzip and TLS termination.

The initial setup was Varnish on port 80 -> Nginx on 8080 -> Node.js on 3000.

Varnish handles all caching, for both static assets and dynamic requests; Nginx mostly rewrites/redirects URLs and serves error pages on top of the main application.

After some research, I found that Nginx is better suited for serving static content, so Varnish would receive the already rewritten (if needed) URL and only serve dynamic content. Also, in the previous configuration I had trouble with the Vary header being duplicated for static assets marked by Varnish. Is this a better way to set things up than the previous one?

New setup: Nginx on port 80 -> Varnish on 8080 -> Node.js on 3000.

How do I properly configure caching of the static assets in /var/www/html/client for a year? Will this interfere with the dynamic routes served by Varnish? Thank you very much.

nginx/nginx.conf

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /dev/stdout main;
    error_log stderr info;

    upstream varnish {
        server localhost:8080;
    }

    server {
        listen 80 default_server;
        listen [::]:80 default_server;

        root /var/www/html/client;
        index index.html;

        server_tokens off;

        error_page 404 /404.html;

        location = /404.html {
            internal;
        }

        location = /robots.txt {
            log_not_found off; access_log off; allow all;
        }

        location ~* \.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$ {
            log_not_found off;
            # Note: combining `expires max;` with this add_header would emit two
            # Cache-Control headers, so the one-year policy is set in a single header
            add_header Cache-Control "public, max-age=31536000, immutable";
            add_header X-Static-File "true";
        }

        # Redirect URLs with a trailing slash to the URL without the slash
        location ~ ^(.+)/$ {
            return 301 $1$is_args$args;
        }

        # Redirect static pages to URLs without `.html` extension
        location ~ ^/(.*)(\.html|index)(\?|$) {
            return 301 /$1$is_args$args;
        }

        location / {
            try_files $uri $uri/index.html $uri.html @proxy;
        }

        location @proxy {
            proxy_http_version 1.1;
            proxy_cache_bypass $http_upgrade;

            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_redirect off;
            proxy_pass http://varnish;

            proxy_intercept_errors on;
        }
    }
}

varnish/default.vcl

vcl 4.1;

import std;

backend default {
    .host = "127.0.0.1";
    .port = "3000";
}

acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}

sub vcl_recv {
    // Remove empty query string parameters
    // e.g.: www.example.com/index.html?
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }

    // Remove port number from host header
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

    // Sorts query string parameters alphabetically for cache normalization purposes
    set req.url = std.querysort(req.url);

    // Remove the proxy header to mitigate the httpoxy vulnerability
    // See https://httpoxy.org/
    unset req.http.proxy;

    // Purge logic to remove objects from the cache.
    // This has to run before the method filtering below, otherwise PURGE
    // requests would be piped away and never reach this block.
    if (req.method == "PURGE") {
        if (client.ip !~ purge) {
            return (synth(405, "Method Not Allowed"));
        }
        return (purge);
    }

    // Only handle relevant HTTP request methods
    if (
        req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "PATCH" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE"
    ) {
        return (pipe);
    }

    // Only cache GET and HEAD requests
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    // Mark static files with the X-Static-File header, and remove any cookies
    // X-Static-File is also used in vcl_backend_response to identify static files
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        set req.http.X-Static-File = "true";
        unset req.http.Cookie;
        return (hash);
    }

    // No caching of special URLs, logged in users and some plugins
    if (
        req.http.Authorization ||
        req.url ~ "^/preview=" ||
        req.url ~ "^/\.well-known/acme-challenge/"
    ) {
        return (pass);
    }

    // Remove any cookies left
    unset req.http.Cookie;

    return (hash);
}

sub vcl_pipe {
    // If the client request includes an "Upgrade" header (e.g., for WebSocket or HTTP/2),
    // set the same "Upgrade" header in the backend request to preserve the upgrade request
    if (req.http.upgrade) {
        set bereq.http.upgrade = req.http.upgrade;
    }
    return (pipe);
}

sub vcl_backend_response {
    // Inject URL & Host header into the object for asynchronous banning purposes
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;

    // Set the default grace period if backend is down
    set beresp.grace = 1d;

    // Stop cache insertion when a backend fetch returns a 5xx error
    if (beresp.status >= 500 && bereq.is_bgfetch) {
        return (abandon);
    }

    // Cache 404 response for short period
    if (beresp.status == 404) {
        set beresp.ttl = 60s;
    }

    // Create cache variations per protocol/encoding without duplicating Vary values
    if (!beresp.http.Vary) {
        set beresp.http.Vary = "X-Forwarded-Proto, Accept-Encoding";
    } else {
        if (beresp.http.Vary !~ "(?i)X-Forwarded-Proto") {
            set beresp.http.Vary = beresp.http.Vary + ", X-Forwarded-Proto";
        }
        if (beresp.http.Vary !~ "(?i)Accept-Encoding") {
            set beresp.http.Vary = beresp.http.Vary + ", Accept-Encoding";
        }
    }

    // If the file is marked as static cache it for 1 year
    if (bereq.http.X-Static-File == "true" && beresp.http.Cache-Control == "public, max-age=0") {
        unset beresp.http.Set-Cookie;
        set beresp.http.X-Static-File = "true";
        set beresp.ttl = 1y;
    }
}

sub vcl_deliver {
    // Check if the object has been served from cache (HIT) or fetched from the backend (MISS)
    if (obj.hits > 0) {
        // For cached objects with a TTL of 0 seconds but still in grace mode, mark as STALE
        if (obj.ttl <= 0s && obj.grace > 0s) {
            set resp.http.X-Cache = "STALE";
        } else {
            // For regular cached objects, mark as HIT
            set resp.http.X-Cache = "HIT";
        }
    } else {
        // For uncached objects, mark as MISS
        set resp.http.X-Cache = "MISS";
    }

    // Set the X-Cache-Hits header to show the number of times the object has been served from cache
    set resp.http.X-Cache-Hits = obj.hits;

    // Unset certain response headers to hide internal information from the client
    unset resp.http.x-url;
    unset resp.http.x-host;
    unset resp.http.x-varnish;
    unset resp.http.via;
}

Nginx is a great web server, Varnish is a great cache, both are great reverse proxy servers.

If you're only using Nginx for URL rewriting, redirection & error handling, you don't really need Nginx. Varnish can do this just as well.

VCL template

The basic VCL configuration I would recommend is the following: https://www.varnish-software.com/developers/tutorials/example-vcl-template/

It's Varnish Software's recommended non-framework-specific VCL. It covers the following items:

  • Stripping of campaign parameters from the URL (see the sketch just after this list)
  • Sorting query strings
  • Header cleanup
  • Static file caching
  • Backend health checking
  • Edge Side Include parsing
  • Setting the X-Forwarded-Proto header
  • Stripping off tracking cookies
  • Creating protocol-aware cache variations
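As an illustration of the first item, stripping campaign parameters usually looks something like the sketch below in vcl_recv. This is a rough approximation with an illustrative parameter list, not necessarily the template's exact code:

sub vcl_recv {
    // Sketch: drop common tracking parameters so they don't fragment the cache.
    // The parameter list here is illustrative, not exhaustive.
    if (req.url ~ "(\?|&)(utm_[a-z]+|gclid|fbclid)=") {
        set req.url = regsuball(req.url, "(utm_[a-z]+|gclid|fbclid)=[^&]*&?", "");
        set req.url = regsub(req.url, "(\?|&)$", "");
    }
}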

URL rewriting

If you want to perform URL rewriting, you can write if-statements in VCL and reset the URL via set req.url = "...". You can also do find/replace with regular expressions using the regsuball() function.

See https://www.varnish-software.com/developers/tutorials/varnish-configuration-language-vcl for a basic VCL tutorial.
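For instance, a purely internal rewrite (no redirect sent to the client) could look like the following sketch; the /old-blog/ prefix is just a hypothetical placeholder:

sub vcl_recv {
    // Hypothetical example: internally rewrite /old-blog/... to /blog/...
    // before the URL is used for cache lookups and backend fetches
    if (req.url ~ "^/old-blog/") {
        set req.url = regsub(req.url, "^/old-blog/", "/blog/");
    }
}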

Here's a VCL interpretation of the 2 rewrite rules in your Nginx config:

sub vcl_recv {
    if(req.url ~ "^(.+)/$") {
        return(synth(301,regsuball(req.url,"^(.+)/$","\1")));
    }

    if(req.url ~ "^/(.*)(\.html|index)(\?|$)") {
        // \3 keeps the "?" so an existing query string isn't glued onto the path
        return(synth(301,regsuball(req.url,"^/(.*)(\.html|index)(\?|$)","/\1\3")));
    }
}

sub vcl_synth {
    if(resp.status == 301) {
        set resp.http.Location = resp.reason;
        set resp.reason = "Moved Permanently";
        set resp.body = "Redirecting.";
        return(deliver);
    }
}

This example code will redirect /test/ to /test and /test.html to /test, just like in your Nginx config.

Error handling

Errors coming from the backend are handled in VCL's vcl_backend_error subroutine and can be customized.

You also have the ability to generate your own errors in Varnish based on an incoming request. You do this by calling return(synth(INT status, STRING reason)); in your VCL code. We already did this in the URL redirection example.

Customizing the output of synthetic responses is similar to backend errors and happens in the vcl_synth subroutine.

Here's an example of how you can modify the output template of backend & synthetic errors. The example uses an HTML template: https://www.varnish-software.com/developers/tutorials/vcl-synthetic-output-template-file/
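As a minimal inline sketch (assuming VCL 4.1 and skipping the external template file from that tutorial), a customized backend error could look like this; vcl_synth works the same way, using resp instead of beresp:

sub vcl_backend_error {
    // Minimal inline error page; the linked tutorial loads a full HTML template instead
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    set beresp.body = {"<html><body><h1>"} + beresp.status + " " + beresp.reason + {"</h1></body></html>"};
    return (deliver);
}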

This should give you a clear indication of how to handle errors coming from your NodeJS app.

Keep Nginx in the setup or not?

Based on how you're describing the situation, you don't really need Nginx. All the caching and reverse proxy logic can easily be done in Varnish.

However, there are two reasons that could justify keeping Nginx in this project:

  • TLS handling
  • Caching large volumes of static data

Let's talk about TLS first: the open source version of Varnish doesn't support native TLS. The commercial version does, but with the open source version you need to terminate TLS in front of Varnish.

We developed our own TLS proxy. It's called Hitch and works really well with Varnish. See https://www.varnish-software.com/developers/tutorials/terminate-tls-varnish-hitch/ for a tutorial.

But one could argue that if you're already committed to using Nginx, you might as well use it to terminate the TLS session before connecting to Varnish.
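If you go that route, a minimal Nginx TLS termination block in front of Varnish could look roughly like this (the certificate paths and the Varnish port are assumptions for the sketch):

server {
    listen 443 ssl;
    listen [::]:443 ssl;

    # Placeholder certificate paths
    ssl_certificate     /etc/nginx/certs/example.com.fullchain.pem;
    ssl_certificate_key /etc/nginx/certs/example.com.key;

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_pass http://127.0.0.1:8080; # Varnish
    }
}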

The other reason could be static data. Don't get me wrong: Varnish is great at caching static data and might even be faster than Nginx at it. However, in Varnish, caching large volumes of static data might eat into your cache space.

In Varnish you have to decide how much memory is assigned for caching. If you only have 1GB of memory assigned and 2GB of static files to cache, your cache may end up completely full. That's not a big issue, because the Least Recently Used (LRU) algorithm will automatically clear space by evicting long-tail content. But if that is not acceptable, Nginx can still be used for the static files.
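For reference, the cache size is set when starting varnishd via its -s storage option; the sizes below are arbitrary examples, not a recommendation for a 256MB instance:

# Example startup: listen on :8080, load the VCL, cap the in-memory cache at 200MB
varnishd -a :8080 -f /etc/varnish/default.vcl -s malloc,200m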

If your static file collection is 1GB, but your cache is bigger, you don't really need to add Nginx.

Predaytor:
Thank you very much. Great explanation.
Predaytor:
Does Nginx use memory like Varnish, or does it serve static assets directly from disk (is that fast)? Should I just use something like `location ~ {regex} { add_header Cache-Control "public, max-age=31536000"; }`? Does this rule only target the client's browser, or does it work like `max-age` in Varnish (a shared cache for both the browser and a CDN)? Thank you.
Thijs Feryn:
Nginx will only send the `Cache-Control` headers to the browser. However, it's fast enough as a web server to serve static content directly to the client without needing to proxy it. Dynamic requests coming from your NodeJS app, on the other hand, will need proxy caching. See http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache for more info.
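(For context, if you ever wanted Nginx itself to cache the proxied Node.js responses instead of Varnish, a minimal proxy_cache sketch could look like the excerpt below; the zone name, sizes and TTLs are illustrative only.)

# Illustrative excerpt: proxy_cache_path belongs in the http block,
# proxy_cache in the location that proxies to the Node.js app
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=dynamic:10m max_size=100m inactive=10m;

location @proxy {
    proxy_cache dynamic;
    proxy_cache_valid 200 301 302 1m;
    proxy_cache_use_stale error timeout updating;
    proxy_pass http://127.0.0.1:3000;
}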
Predaytor:
Thanks for the answer. I've seen your videos on Varnish, they're fantastic! The website documentation is top-notch as well!