Score:0

How to configure HAProxy to ignore client header `Pragma: no-cache`?

We have a web service behind an HAProxy server running as a caching reverse proxy. The backend servers send correct Cache-Control headers for all responses, so HAProxy can cache every response according to the HTTP spec.

However, when the end user presses Shift+Reload in e.g. Google Chrome, the client (Chrome) sends Pragma: no-cache and Cache-Control: no-cache, which forces HAProxy to always fetch the response from the backend server. Obviously, DDoS attacks can use the same trick to easily put more load on the backend servers.

Since we know that the cache headers are correct, how can we configure HAProxy to ignore the client-submitted Pragma: no-cache and avoid calling the backend when the request could be fulfilled directly from the HAProxy cache?

I know that ignoring this header would not be okay for generic proxy use, but in this case we control both the reverse proxy and the backend, so we know this is fine.

Here's an example of a response from the backend server that HAProxy will fetch from the backend again when the client sends cache-control: no-cache and pragma: no-cache:

cache-control: public, max-age=31536000, s-maxage=31536000
content-length: 463
content-type: image/svg+xml
date: Thu, 24 Jun 2021 14:14:19 GMT
etag: "338"
expires: Fri, 24 Jun 2022 14:14:19 GMT
server: Apache
x-content-type-options: nosniff

It's obviously pointless to fetch this from the backend servers again because it's valid for one year for any user requesting the given URL. Also worth noting is that NGINX does not honor the [client] Pragma header by default.

In reality we have multiple redundant frontends running in parallel and multiple redundant backends, but that doesn't really change anything about the problem, so I wrote the simplified question above as if there were only one frontend and one backend server.
https://www.haproxy.com/documentation/aloha/latest/traffic-management/lb-layer7/caching-small-objects/ says that "*Objects are cached only if all the following are true: [...] Response does not have a "Cache-Control: no-cache" header*", which suggests that this is not currently configurable. That page is technically about the ALOHA product, but I'd guess it applies to HAProxy as a whole, too.
Note that nowadays (at least with HTTP/2) Google Chrome sends `cache-control: max-age=0` instead of `Pragma: no-cache`, but the HAProxy behavior is still the same.
Score:1

Web browsers send Cache-Control and Pragma request headers that interfere with HAProxy's cache and make it virtually unusable, because HAProxy will not use the cache when either header says "no-cache". To work around this, delete the Cache-Control header and then the Pragma header with http-request del-header before you attempt to use or store anything in the cache:

http-request del-header Cache-Control
http-request del-header Pragma
http-request cache-use mycache
http-response cache-store mycache
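
For context, here is a minimal sketch of how those four lines might sit in a complete configuration. Only the cache name `mycache` and the four rules come from the snippet above; the frontend and backend names (`fe_main`, `be_app`), the bind port, the server address, and all cache sizes are assumptions made up for illustration, and the global/defaults sections and timeouts are left out.

```haproxy
# Sketch only: names, addresses and sizes below are assumptions.
cache mycache
    total-max-size 64        # total cache size in megabytes (assumed value)
    max-object-size 1000000  # largest cacheable object in bytes (assumed value)
    max-age 3600             # cap in seconds on how long an object is kept (assumed value)

frontend fe_main             # hypothetical frontend name
    mode http
    bind :80
    default_backend be_app

backend be_app               # hypothetical backend name
    mode http
    # Remove the client's cache directives before the cache lookup
    http-request del-header Cache-Control
    http-request del-header Pragma
    http-request cache-use mycache
    http-response cache-store mycache
    server app1 192.0.2.10:80   # placeholder backend address
```

Two things to keep in mind with this approach: the deleted headers are also no longer forwarded to the backend, and the `max-age` setting of the cache section caps the lifetime of cached objects even when the backend sends a longer max-age.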
Great finding, and welcome to Server Fault! I verified that this is indeed the correct way to configure HAProxy to allow caching in all cases. Nitpick: in the future, I would recommend including only the meaningful details in an answer. For example, you can edit the answer to remove the first paragraph about how hard this info was to find. You should assume that somebody is reading this answer 10 years into the future and write the answer accordingly.