Score:0

NGINX proxy stops working after upstream was unreachable

We recently started transitioning from domain binding to using nginx as a proxy for our web apps.

Requests to the wildcard subdomain *.domain.tld are load-balanced by our firewall across two Linux machines (Debian 11), proxy-01 and proxy-02. Both run nginx with proxy configurations for the *.domain.tld subdomains.

proxy-01 and proxy-02 both have an /etc/hosts entry for webserver-07.

An example config for test.domain.tld:

upstream test {
        server  webserver-07:44309;
}

server {
        server_name test.domain.tld;

        listen 443 ssl;

        location / {
                proxy_pass                              https://test;
                proxy_set_header Host                   $host;
                proxy_set_header X-Real-IP              $remote_addr;
                proxy_set_header X-Forwarded-For        $proxy_add_x_forwarded_for;
        }

        client_max_body_size                    0;
        large_client_header_buffers             4 32k;
        proxy_busy_buffers_size                 512k;
        proxy_buffers                           4 512k;
        proxy_buffer_size                       256k;
        proxy_read_timeout                      600s;

        ssl_certificate                         /etc/ssl/certs/_.domain.tld.crt;
        ssl_certificate_key                     /etc/ssl/private/_.domain.tld.key;
        ssl_trusted_certificate                 /etc/ssl/certs/Root_Cert.pem;

        access_log      /var/log/nginx/test.domain.tld_access.log;
        error_log       /var/log/nginx/test.domain.tld_error.log;
}

This setup has been up and running smoothly for the past ~6 months, until tonight, when webserver-07 lost its network connection for several hours for a reason unknown to me.

Whatever the issue was, our hardware guy got the machine back on the network. But even after webserver-07 was back, connecting to the website on test.domain.tld showed the nginx error page 500 Internal Server Error, and neither proxy-01 nor proxy-02 logged any requests to test.domain.tld_access.log when we watched it with tail -f.

However, rebooting both proxy-01 and proxy-02 fixed the issue.

We believe the upstream connection must have somehow become stale or corrupt when webserver-07 dropped off the network.

Can anyone tell me what exactly caused nginx to fail to proxy requests to the upstream even though the upstream machine was reachable again? Are we missing any config parameters? How do we prevent similar issues from occurring in the future?
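
For reference, here is a minimal sketch of the failure-handling directives we are aware of but have not set. The directive names come from the nginx documentation. The proxy_connect_timeout value is an assumption chosen for illustration, and we have not confirmed that any of these settings addresses the problem described above. The certificate, header, and buffer directives are omitted and would stay as in the config shown earlier.

```nginx
upstream test {
        # max_fails and fail_timeout are written out with their documented
        # defaults (1 and 10s). Per the nginx documentation, both are ignored
        # when the group contains only a single server, as it does here.
        server  webserver-07:44309 max_fails=1 fail_timeout=10s;
}

# Fragment of the existing "server" block for test.domain.tld:
location / {
        proxy_pass              https://test;

        # Assumed value for illustration: stop waiting for a connection to
        # the upstream after 5 seconds (the documented default is 60s).
        proxy_connect_timeout   5s;

        # Documented default, written out explicitly: pass the request to
        # the next server in the group on a connection error or timeout.
        # With a single upstream server there is no other server to try.
        proxy_next_upstream     error timeout;
}
```

Since our upstream group has only one server, we are unsure whether these parameters have any effect in our setup, which is part of why we are asking.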

Regards
