Score:2

Apache behind nginx reverse proxy, setting the correct Host header

nl flag

I'm running my application using Apache in a Docker container. I have nginx acting as a reverse proxy running in another Docker container which has Apache as its upstream. I'm using the proxy_pass directive for this.

Apache runs at the URL https://example-8gnm1aqrns-lz.a.run.app and nginx runs at http://example.com.

The whole thing is running on Google Cloud Run. The problem is that Cloud Run assigns every service a URL and uses the Host header to differentiate between the applications, so I'm forced to send the header Host: example-8gnm1aqrns-lz.a.run.app when nginx connects to the upstream; otherwise the request is not routed correctly.

This causes a problem with my application because it thinks it's running at the URL https://example-8gnm1aqrns-lz.a.run.app and not at http://example.com.

Is it possible to use some Apache .htaccess configuration to overwrite the Host header based on the X-Forwarded-Host sent by nginx?

Score:2
co flag

Instead of having Apache do that legwork, have NGINX do it before it hands the request off to Apache: set the Host header that Apache is expecting as part of the proxy_pass handoff, using one extra configuration option.

NGINX has the proxy_set_header directive, which changes the headers passed to the proxied backend. So you'd have something like this:

...

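location / {
    # Forward the request to the backend (placeholder address).
    proxy_pass http://10.20.30.40;
    # Send the public hostname as Host instead of the default $proxy_host.
    proxy_set_header Host example.com;
}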
...

in your NGINX configuration for the reverse proxy. Then Apache won't care about X-Forwarded-Host, because the Host header is set directly and Apache uses Host to decide what to serve.

This should fix things. I use multiple NGINX systems this way when the domain reached by the browser differs from the backend's responding host, and so far it works with Django backends, PHP backends, Apache, even a Python HTTP server I use for testing things. My understanding of Apache is that it prioritizes the Host header over X-Forwarded-Host.

John Hanley avatar
cn flag
I might be wrong. If you make a request from one Cloud Run service to another, the Host header must match the service being called, otherwise, you will get a 404. This is an example where Cloud Run Custom Domains plus your solution can solve the problem.
co flag
@JohnHanley Ahh, yes you're probably right, however if OP is expecting the endpoint to serve example.com and getting other URLs they need a generic listener that matches all potential domains in here. Cloud Run Custom Domains I think will work better for OP (sounds like that should be a unique answer)
John Hanley avatar
cn flag
I think your solution is the correct half to start with. Then the OP can review Cloud Run Custom Domains, etc so that his domain logic works. I don't understand his dependency on domain names for a backend.
co flag
@JohnHanley well, if $BACKEND server is running 50 separate domains and serves content based on the Host header (which is how VirtualHosts and their ServerName / ServerAlias fields work in Apache), then the Host header is needed to specify how to connect and with what Host to pass so content is served correctly. That's also, incidentally, how NGINX determines what zones to serve: the server_name field (regex, wildcard, etc.) matching determines what server/site config to serve for a given request.
John Hanley avatar
cn flag
However, that is not how Cloud Run operates. The Host header determines which service the Google Cloud GFE routes the request to. If the backend handles 50 domains, then the Cloud Run service would need 50 Custom Domains. Given that each custom domain will require a domain mapping, I am not sure that is even supported (quota limits). The backend will need another method to support that type of identification. A custom header will probably work. However, this is an example of shoehorning a design into a service that might not be a good match for the architecture.
co flag
@JohnHanley I think all I have to know is where NGINX sits. If I'm reading OP right, it's example.com -> NGINX -> Google Cloud Run. So the location block with the proxy_pass just needs to set Host to the GCR unique URL for the request, assuming that NGINX is actually sitting at example.com.
fairport avatar
nl flag
Unfortunately this answer won’t work because I can’t connect to the Apache container using IP address, I have to use the domain with the Host header so Google knows how to route the request to the right server. I would like to fix the Host header in Apache so the application builds its routes correctly. The correct Host header is in X-Forwarded-Host or I could use an environment variable to define it too.
co flag
@fairport I'm not sure there's functionality in Apache to rewrite the X-Forwarded-Host header. I was looking but didn't see such functionality. I think that if your app is not going to work the way that Google expects it you need to rebuild / restructure your app with Google's requirements in mind, rather than try to hack it into place. Which makes sense with regards to GCR because they're structuring it in a very specific way. (I may delete my answer at some point after you read this comment)
fairport avatar
nl flag
Yeah, I’m currently looking into the documentation of the Apache RequestHeader directive to see if I can replace the Host header using an environment variable. Another way to fix this could be setting up a VPC network; this way the Cloud Run service should have a static IP address I can use instead of a URL. The last way I can think of is modifying the application itself so it replaces the Host header using X-Forwarded-Host or an environment variable, but this seems hacky. I’ll answer my own question if I find a good way to solve this problem.
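For illustration only, the application-level option from the last comment might look roughly like this if the application were a Python WSGI app. That is an assumption; the thread never says what the application is written in. The class name ForwardedHostMiddleware and the demo_app function are hypothetical, and APP_HOST simply reuses the environment variable name that appears later in this thread. A minimal sketch, not a definitive implementation:

```python
import os


class ForwardedHostMiddleware:
    """Hypothetical WSGI middleware: override HTTP_HOST before the app sees it."""

    def __init__(self, app, fixed_host=None):
        self.app = app
        # Assumption: a fixed public hostname may be supplied directly or
        # through an APP_HOST environment variable.
        self.fixed_host = fixed_host or os.environ.get("APP_HOST")

    def __call__(self, environ, start_response):
        # Prefer the fixed host; otherwise fall back to X-Forwarded-Host.
        host = self.fixed_host or environ.get("HTTP_X_FORWARDED_HOST")
        if host:
            environ["HTTP_HOST"] = host.strip()
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Tiny demo app that echoes the Host value it was given."""
    body = environ.get("HTTP_HOST", "").encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Wrapping the app as ForwardedHostMiddleware(demo_app) leaves the proxy configuration untouched, but it trusts whatever X-Forwarded-Host arrives, so it only makes sense when the service is reachable solely through the proxy.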
Score:1
nl flag

I solved this issue by defining a new environment variable called APP_HOST in the Google Cloud Run control panel and setting it to example.com.

Then I added the following configuration to the .htaccess file in the document root:

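<IfModule mod_env.c>
    # Expose the container's APP_HOST environment variable to Apache.
    PassEnv APP_HOST
</IfModule>

<IfModule mod_headers.c>
    # Overwrite Host with APP_HOST, but only when APP_HOST is defined.
    RequestHeader set Host "%{APP_HOST}e" env=APP_HOST
</IfModule>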

This allowed me to override the Host header from example-8gnm1aqrns-lz.a.run.app to example.com based on the environment variable APP_HOST.

I could of course have hardcoded the hostname, but I think that using an environment variable gives you more flexibility if you want to use the same .htaccess file in different contexts, such as on a staging server.

Edit

Here's how you can solve this using X-Forwarded-Host, for example if you run virtual hosts and you need the site to be accessible from multiple domains:

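<IfModule mod_setenvif.c>
    # Capture a non-empty X-Forwarded-Host value into REAL_HOST_HEADER.
    SetEnvIf X-Forwarded-Host "(.+)" REAL_HOST_HEADER=$1
    <IfModule mod_headers.c>
        # Only overwrite Host when the variable was actually set.
        RequestHeader set Host "%{REAL_HOST_HEADER}e" env=REAL_HOST_HEADER
    </IfModule>
</IfModule>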

This will grab the header from X-Forwarded-Host and set the Host header based on the value.
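One thing to keep in mind, as far as I know: when more than one proxy sits in front of Apache, X-Forwarded-Host can arrive as a comma-separated list, and a capture group that takes the whole header value would copy the entire list into Host. The "take the first entry" logic can be sketched in Python as below. The function name first_forwarded_host is hypothetical and not part of Apache or any library; it only models the parsing step.

```python
def first_forwarded_host(value):
    """Return the first non-empty entry of an X-Forwarded-Host value, or None."""
    if not value:
        return None
    # Entries are separated by commas when several proxies add their own.
    for part in value.split(","):
        part = part.strip()
        if part:
            return part
    return None
```

With a single nginx proxy that sets the header once, as in the setup described here, the value should already be a single hostname and this step changes nothing.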
