Score:0

Varnish - detach and attach cookies


First of all, I'm asking to find out whether this is possible or not.

As far as I can see, Varnish will not cache a response if the request has cookies.

My idea is this: when an incoming request has cookies, Varnish stores them in variables, removes them from the request, and then lets the request through to the backend. Once the backend has finished processing it, Varnish puts the cookies back.

Is this possible?

If it is, what would the VCL look like?

What do you mean by `put back`?
Budianto IP
Let's say a page has a Google Analytics cookie named `_ga`. When the page gets reloaded, the request will go to Varnish, and Varnish will try to identify that cookie via the `vcl_recv` function, and then have it stored in a variable and then unset that cookie. Later on, when Varnish gets a response from the backend via `vcl_backend_response`, Varnish will try to set the cookie back.
Score:2

Context

Varnish is conservative when it comes to caching and assumes that the use of cookies implies a level of personalization of the response.

Caching a personalized response could result in privacy or security issues. It could also result in inconsistent output.

Imagine caching a page that has a shopping cart. Caching the whole page would result in everyone having the same shopping cart value.

Here's the built-in VCL behavior for Varnish for cookies:

  • When Varnish sees a Cookie header in the request, it will not serve the object from the cache because it assumes the content is personalized.

See https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/#authorization-headers-and-cookies-are-not-cacheable

  • When Varnish sees a Set-Cookie header in the response, it will not store the object in the cache because setting a cookie is a state change that also implies personalization.

See https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/#dont-cache-responses-with-set-cookie-headers
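The two rules above correspond roughly to the following logic. This is a simplified sketch of the built-in VCL, not a verbatim copy: the real built-in VCL checks more conditions, and the details differ between Varnish versions. It is meant to be read rather than loaded, so the backend definition is left out.

```vcl
vcl 4.1;

sub vcl_recv {
    # Requests that carry credentials or cookies bypass the cache
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
}

sub vcl_backend_response {
    # Responses that set a cookie are marked uncacheable for 2 minutes
    if (beresp.http.Set-Cookie) {
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
        return (deliver);
    }
}
```

Because the built-in VCL runs after your own VCL whenever your subroutine does not return, removing the Cookie header in your own vcl_recv is enough to keep the first rule from triggering.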

Stripping cookies

Once a cookie is set through the Set-Cookie response header, it will be passed as a Cookie request header for every subsequent request, even for pages that don't really need these cookies.

That's why it's important to determine which pages require cookies and which don't. It's equally important to identify tracking cookies, because they are processed by the client in JavaScript and not on the server.

Here's a VCL example where we strip off all cookies, except the ones we need on the server:

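vcl 4.1;

sub vcl_recv {
    if (req.http.Cookie) {
        # Prefix with ";" so every cookie, including the first one, starts with ";"
        set req.http.Cookie = ";" + req.http.Cookie;
        # Drop the spaces that follow each ";" separator
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        # Mark the cookies to keep by putting a space back after their ";"
        set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
        # Remove every cookie that was not marked with a space
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        # Trim leftover leading and trailing ";" and spaces
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    }
    # If nothing is left, drop the header so the request can be served from the cache
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }
}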

In this case only the PHPSESSID cookie is kept, because it is required to keep track of logins. If after the replace logic this cookie is still there, Varnish will not serve the response from cache. If this cookie was not set, but there were a bunch of tracking cookies, all cookies are removed and the page can be served from the cache.
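If the regular expressions feel hard to maintain, the same filtering could be sketched with the cookie VMOD. This assumes Varnish 6.4 or later, where vmod_cookie ships with Varnish itself, and the backend address is a placeholder that you would replace with your own.

```vcl
vcl 4.1;

import cookie;

# Placeholder backend (assumption), adjust to your setup
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    if (req.http.Cookie) {
        # Load the Cookie header into the VMOD's internal list
        cookie.parse(req.http.Cookie);
        # Keep only the listed cookies (comma-separated names)
        cookie.keep("PHPSESSID");
        # Write the filtered list back to the request
        set req.http.Cookie = cookie.get_string();
    }
    # If nothing is left, drop the header so the request can be served from the cache
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }
}
```

To keep more than one cookie, pass a comma-separated list to cookie.keep(), for example "PHPSESSID,lang", where lang is a made-up cookie name.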

Removing cookies for certain pages

While the previous example featured stripping off select cookies and only keeping the PHPSESSID cookie, you could still end up on pages that don't actually need that session cookie.

Here's a VCL example where cookies are stripped off entirely, except on pages that actually need a specific cookie, like the admin pages and the shopping cart page:

vcl 4.1;

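sub vcl_recv {
    # Pages that depend on cookies bypass the cache and keep their cookies
    if (req.url ~ "^/admin" || req.url == "/cart") {
        return (pass);
    }
    # All other requests are treated as cookie-free and can be served from the cache
    unset req.http.Cookie;
}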

Removing Set-Cookie headers

As shown on https://www.varnish-software.com/developers/tutorials/varnish-builtin-vcl/#dont-cache-responses-with-set-cookie-headers, the built-in VCL will not store an object in the cache if a Set-Cookie header is used, because it implies a state change.

However, this doesn't mean that the page will never end up in the cache: the next backend response will probably not contain the Set-Cookie header because the cookie is already set. In that case the next response without the Set-Cookie header may end up in the cache.

However, if you're certain you don't need to set a cookie for certain pages, you can also decide to strip off the Set-Cookie header.

The following example prevents a backend from setting a cookie for static content:

vcl 4.1;

sub vcl_backend_response {
    if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|ogg|ogm|opus|otf|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        unset beresp.http.Set-Cookie;
    }
}

Re-attaching cookies

While it is possible to store the cookie value in a header, strip off the cookies in VCL so the response can be served from the cache, and finally re-attach them for backend requests, it doesn't really make much sense.

Why would you send cookies for a backend request if the cacheable response doesn't need a cookie to create its output?

If a backend request needs the cookie value to compose the output, it probably means that the response won't be cacheable, so you might as well bypass the cache altogether for that request.
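For completeness, here is a minimal sketch of what detaching and re-attaching could look like. The X-Saved-Cookie header name and the backend address are made up for illustration. Keep in mind that the cached object is then shared by all visitors regardless of their cookies, so this is only safe when the backend output does not depend on the cookie value.

```vcl
vcl 4.1;

# Placeholder backend (assumption), adjust to your setup
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Never trust a client-supplied copy of the temporary header
    unset req.http.X-Saved-Cookie;
    if (req.http.Cookie) {
        # Park the cookies in a temporary header (hypothetical name)
        set req.http.X-Saved-Cookie = req.http.Cookie;
        # Without a Cookie header the built-in VCL allows a cache lookup
        unset req.http.Cookie;
    }
}

sub vcl_backend_fetch {
    # Only runs on a cache miss or pass: restore the cookies for the backend
    if (bereq.http.X-Saved-Cookie) {
        set bereq.http.Cookie = bereq.http.X-Saved-Cookie;
        unset bereq.http.X-Saved-Cookie;
    }
}
```

The re-attach step belongs in vcl_backend_fetch rather than vcl_backend_response, because the backend has to receive the cookies before it produces its response. The browser keeps its own cookies either way: removing the Cookie header in Varnish only changes what the cache and the backend see.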

Next steps

Because your question wasn't very specific, I also had to give you a very general response.

If my response doesn't fully answer your question, I invite you to add more details about your use case, and in return I'll give you a more detailed response.

Budianto IP
Hi, this is the 2nd time you helped me with this varnish case, and again it's pretty clear to understand. Yes, I can see it now it doesn't make sense to strip off the cookies. I like the way you explain these cookies, these are complete information in one go. I think it gives me a lot of clues, I have to identify which pages are personalized and which are not, and definitely, I have to exclude the admin and cart pages. Lastly, thank you for your help!