I'd like to start by saying that I know there are literally hundreds of topics about this, which I've followed before to get things working. Yet this configuration, which I had working for months, is not working in a different environment.
The requirement is pretty straightforward: take users from the site on port 80 to the same site on port 443.
We have said site, with Apache as the front end and Tomcat instances for our actual services. These services are hosted on two different machines: one virtual, on GCP Compute Engine, and the other bare metal, which we will eventually move away from.
We have our domain, example.com, which contains an index of the services. It is served statically by Apache 2.4. We can access it over both HTTP and HTTPS. From this index, we redirect using mod_proxy to app1, app2 and app3. App2 and App3 are on our bare-metal server. We use DNS to take users to them by their subdomains, app2.example.com/app2 and app3.example.com/app3. I will address the redundancy in the future. Because of certain limitations, and the fact that app2 and app3 are not open to the public just yet, we access them only over HTTP.
App1 has the same SSL certificate as the Apache server, as they are hosted in the same VM. Again, we can access both example.com and example.com/app1 over HTTP and HTTPS.
Now the problems:
As stated in the title, I'm getting 502 errors whenever I use a configuration that I had already tried before and that works in a different environment. Said config is roughly as follows:
<VirtualHost *:80>
    Redirect permanent / https://bar.com/
</VirtualHost>
<VirtualHost *:443>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    # SSL stuff
    SSLEngine On
    SSLCertificateFile /file.pem
    SSLCertificateKeyFile /file.key
    SSLVerifyClient none
    # Proxies
    ProxyRequests Off
    SSLProxyEngine on
    SSLProxyVerify none
    SSLProxyCheckPeerCN off
    SSLProxyCheckPeerName off
    SSLProxyCheckPeerExpire off
    ProxyPass /api/ https://localhost:8443/api/
    ProxyPassReverse /api/ https://localhost:8443/api/
</VirtualHost>
bar.com was my test environment. I took this configuration and tweaked it as follows for example.com:
<VirtualHost *:80>
    Redirect permanent / https://example.com/
</VirtualHost>
<VirtualHost *:443>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    # SSL stuff
    SSLEngine On
    SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
    SSLVerifyClient none
    # Proxies
    SSLProxyEngine on
    SSLProxyVerify none
    SSLProxyCheckPeerCN on
    SSLProxyCheckPeerExpire on
    Redirect permanent /app2 http://app2.example.com/app2
    Redirect permanent /app3 http://app3.example.com/app3
    <Location /app1>
        ProxyPass https://localhost:8443/app1
        ProxyPassReverse https://localhost:8443/app1
    </Location>
</VirtualHost>
But this leads to 502 errors. Again, I understand that the vhost on port 80 is OK. I have even read that this is the recommended way (I'm currently looking for where I read this). I haven't tried mod_rewrite, as I believe the error is not in the configuration itself but in the interaction between the SSL and proxy modules and GCP.
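For reference, the mod_rewrite variant mentioned above could look roughly like the sketch below. It is not part of the configuration under test. It assumes mod_rewrite is enabled and that the load balancer adds an X-Forwarded-Proto header to the requests it forwards, so that only traffic that originally arrived over plain HTTP gets redirected.

```apache
<VirtualHost *:80>
    RewriteEngine On
    # Assumption: the balancer sets X-Forwarded-Proto to "http" or "https".
    # Redirect only requests that reached the balancer over plain HTTP.
    RewriteCond %{HTTP:X-Forwarded-Proto} =http
    RewriteRule ^ https://example.com%{REQUEST_URI} [R=301,L]
</VirtualHost>
```

Requests that carry no such header (for example, direct probes against the VM) would fall through and be served normally instead of being redirected.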
On the GCP side, we have the basic firewall rules, and we can access the site over either HTTPS or HTTP with the basic configuration (that is, the vhost I have for 443 but listening on port 80, without the first vhost). I set up a load balancer, but mostly to make use of Google's CDN. It has one backend group with the VM where Apache and app1 are hosted. For the front end, I have rules for both port 80 and 443, which means I have another SSL certificate specifically for the load balancer, which is managed by Google. I have set up health checks for both ports, and I have disabled and re-enabled the frontends to make sure the problem is not with the ports themselves; so far that doesn't seem to be the issue.
I'm still wrapping my head around the whole GCP ecosystem, so my best guess is that I'm missing something in that part of the equation, as the Apache config itself seems OK to me.
What I've yet to try:
- Take the balancer off, assign our static IP to the VM and try there.
- Use the same cert in the frontend and in the VM.
- Disable port 80 in the frontend and set up Apache to listen on port 443 (I don't want to do this, as it is more of a hack; the config itself would be working as if it were HTTP).
Why haven't I tried these things yet? Well, as you might have guessed, poking around in the configuration on our live server is not really an option. I don't have access to bar.com, so I can't try things as I'd like, and using WSL, while useful, has not proven to be close enough to the real thing (this could also be something on my machine). Since I can only work near midnight, I can't really try as many things as thoroughly as I'd like.
Does anyone know what could be throwing the 502 errors?
If you get the idea but need more info, please go ahead and ask. I'll be as detailed as I can, but keep in mind I'm still learning how IaaS works.
EDIT:
After a while, I'm able to work on this further. I'm using the same configuration as the second code block I originally posted.
In response to @John Hanley (please note I changed the domain and IPs to 'generic' values):
Using curl -l -v against the domain, I get the following:
* Trying 1.1.1.1:80...
* TCP_NODELAY set
* Connected to example.com (1.1.1.1) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 502 Bad Gateway
< Content-Type: text/html; charset=UTF-8
< Referrer-Policy: no-referrer
< Content-Length: 332
< Date: Sun, 21 Nov 2021 08:57:31 GMT
<
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
* Connection #0 to host example.com left intact
and
* Trying 1.1.1.1:443...
* TCP_NODELAY set
* Connected to example.com (1.1.1.1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=example.com
* start date: Oct 4 13:25:53 2021 GMT
* expire date: Jan 2 13:25:52 2022 GMT
* subjectAltName: host "example.com" matched cert's "example.com"
* issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1D4
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55b622463820)
> GET / HTTP/2
> Host: example.com
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
< HTTP/2 502
< content-type: text/html; charset=UTF-8
< referrer-policy: no-referrer
< content-length: 332
< date: Sun, 21 Nov 2021 08:57:56 GMT
< alt-svc: clear
<
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
* Connection #0 to host example.com left intact
I think the HTTPS request fails here due to missing headers, since I can access the site in my browser.
From this I see that the request is being processed properly by the load balancer (or at least that's what I understand), but that "Connection state changed" message is bugging me, especially because it takes a couple of seconds there, as if waiting for a response.
From apachectl -S:
VirtualHost configuration:
*:80 vm-name.region.gcp-project.internal (/etc/apache2/sites-enabled/0080-default.conf:1)
*:443 example.com (/etc/apache2/sites-enabled/0443-secured.conf:6)
ServerRoot: "/etc/apache2"
Main DocumentRoot: "/var/www/html"
Main ErrorLog: "/var/log/apache2/error.log"
Mutex watchdog-callback: using_defaults
Mutex ssl-stapling-refresh: using_defaults
Mutex ssl-stapling: using_defaults
Mutex proxy: using_defaults
Mutex ssl-cache: using_defaults
Mutex default: dir="/var/run/apache2/" mechanism=default
PidFile: "/var/run/apache2/apache2.pid"
Define: DUMP_VHOSTS
Define: DUMP_RUN_CFG
User: name="www-data" id=33
Group: name="www-data" id=33
This one is interesting. While the vhost for port 80 references the GCP VM instance itself, the vhost for port 443 doesn't. I'm not sure what to take from this... I tried accessing the VM by its external IP with curl. HTTP shows the right status for the permanent redirect, whereas HTTPS throws an error due to the mismatched CN and SAN (which makes sense, since the cert is for a domain, not for an IP). With this, I'm somewhat certain that changing the VM IP to the static IP we have and letting Apache do the load balancing will fulfill my requirement, but we are adding at least one more VM in the rather near future, so doing that at this point seems... bad. I still think the CDN is a good feature and we should use it after all.
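A likely explanation for the difference in the apachectl -S listing: a vhost without its own ServerName is listed under the server-wide name, which Apache derives from the system hostname when none is configured. A minimal sketch of the port 80 vhost with an explicit name, assuming example.com is the intended name for it:

```apache
<VirtualHost *:80>
    # Without this line the vhost inherits the server-wide name,
    # which falls back to the machine's hostname.
    ServerName example.com
    Redirect permanent / https://example.com/
</VirtualHost>
```

This only changes how the vhost is named and matched; on its own it is not expected to affect the 502 responses.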
About the Load Balancer:
This one is somewhat hard to explain, so I'll try to be as detailed yet direct as possible (is there something in the GCP CLI to display this config? I haven't touched the CLI yet):
The load balancer itself:
Frontends:
|Protocol|Port|Certificate|SSL Policy|
|-|-|-|-|
|HTTP|80|-|-|
|HTTPS|443|for example.com, GCP managed|Defaults|
Backends:
|Endpoint protocol|Named port|Timeout|Health check|
|-|-|-|-|
|HTTP|http|30 secs|hc-1 (Failing)|
|HTTP/2|https|30 secs|hc-2 (Passing)|
*Both backends point to the same group, but different ports. This group contains our only VM.
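Since the HTTPS backend uses HTTP/2 as its endpoint protocol, the balancer presumably needs Apache to offer h2 on port 443. Whether that is the case here is not shown in the post. A minimal sketch of how the 443 vhost would advertise it, assuming mod_http2 is installed and enabled (on Debian-style layouts, a2enmod http2) and that the MPM in use supports it:

```apache
<VirtualHost *:443>
    # Offer HTTP/2 first, with HTTP/1.1 as the fallback (requires mod_http2).
    Protocols h2 http/1.1
    SSLEngine On
    SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem
</VirtualHost>
```

The alternative under the same assumption would be to leave Apache as it is and switch the backend's endpoint protocol to HTTPS.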
Hosts/Path rules:
All unmatched requests go to the HTTP backend. Requests to /* go to the HTTPS backend.
While hc-1 fails, it actually resolves to the main page. But when I try to access app1, it fails (which makes sense, since that vhost doesn't contain a proxy mapping to app1 to begin with). On the other hand, hc-2 passes, but accessing the site as https://example.com throws the 502 errors.
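One thing that may explain hc-1: with the redirect vhost in place, every request on port 80 gets a 301, and an HTTP health check that only accepts a 200 response would treat that as a failure. A hedged sketch that leaves one probe path unredirected, assuming the health check can be pointed at a path such as /healthz (a hypothetical name) and that a file with that name exists under the DocumentRoot:

```apache
<VirtualHost *:80>
    DocumentRoot /var/www/html
    # Redirect everything except the hypothetical probe path /healthz.
    # RedirectMatch uses PCRE, so the negative lookahead skips that one path.
    RedirectMatch permanent "^/(?!healthz$)(.*)" "https://example.com/$1"
</VirtualHost>
```

With this in place, /healthz would be served directly on port 80 while every other path is still sent to HTTPS.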
Finally, as for how I enabled the sites, I used a2ensite for 0080-default.conf and 0443-secured.conf.
Oh, and also, this week our site wasn't working. I think it was on Tuesday, the same day Spotify had issues. I had to open the load balancer, make any change, save, undo said change and save again to get it working again, but hc-1 has been failing since then.
I've been trying many permutations of my config, starting from the one I know works (That is, just one load balancer for HTTP and the one VHost on port 80).