We have a web service behind an HAProxy server running as a caching reverse proxy. The backend servers send correct Cache-Control headers for all responses, so HAProxy can cache every response according to the HTTP spec.
However, when the end user hits Shift+Reload in e.g. Google Chrome, the client (Chrome) sends Pragma: no-cache and Cache-Control: no-cache, which forces HAProxy to always fetch the response from the backend server. Obviously, DDoS attacks can use this same trick to easily cause more load on the backend servers.
As we know that the cache headers are correct, how can we configure HAProxy to ignore the client-submitted Pragma: no-cache (and Cache-Control: no-cache) and avoid calling the backend when the request could be fulfilled directly from the HAProxy cache?
I know that ignoring these headers would not be okay for generic proxy use, but in this case we control both the reverse proxy and the backend, so we know this is fine.
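
To make the goal concrete, here is a minimal, untested sketch of the kind of configuration being asked about. It assumes HAProxy's built-in cache is in use. The names static_cache, fe_web, be_app and app1, the backend address, and the size and lifetime values are all hypothetical placeholders. The idea is to delete the client's Pragma and Cache-Control request headers before the cache lookup runs. Whether the lookup then really ignores the original client headers depends on the HAProxy version and should be verified.

```
# Hypothetical cache section; name and limits are placeholders.
cache static_cache
    total-max-size 64      # total cache size in megabytes
    max-age 3600           # upper bound on entry lifetime, in seconds

frontend fe_web
    mode http
    bind :80
    default_backend be_app

backend be_app
    mode http
    # Assumption: removing these request headers before cache-use stops a
    # forced reload from bypassing the cache.
    http-request del-header Pragma
    http-request del-header Cache-Control
    http-request cache-use static_cache
    http-response cache-store static_cache
    server app1 192.0.2.10:8080
```

Note that deleting the headers also removes them from requests forwarded to the backend, and that dropping the whole Cache-Control request header discards any other request directives too (for example the max-age=0 sent on a normal reload).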
Here's an example of a response from the backend server that will be fetched from the backend again whenever the client sends cache-control: no-cache and pragma: no-cache:
cache-control: public, max-age=31536000, s-maxage=31536000
content-length: 463
content-type: image/svg+xml
date: Thu, 24 Jun 2021 14:14:19 GMT
etag: "338"
expires: Fri, 24 Jun 2022 14:14:19 GMT
server: Apache
x-content-type-options: nosniff
It's obviously pointless to fetch this from the backend servers again, because it's valid for one year for any user requesting the given URL. Also worth noting is that NGINX does not honor the client's Pragma header by default.