We run nginx as a reverse proxy with TLS termination.
We recently moved from a 4-core machine to a 1-core machine by mistake, and noticed under load that our "request queuing" metric shot up from a normal of ~10ms to around 10 seconds on average. When we realised the sizing error and upgraded back to 4 cores, the problem went away, but I'd like to understand exactly what was responsible for the slowdown, so that we can tell whether we're approaching a similar situation in the future as traffic increases.
The "request queuing" metric is simply calculated as a delta between a timestamp that the nginx proxy adds to proxied requests (X-Request-Start
), and the time that that request is processed by our application on the backend web servers.
This value normally indicates queuing on our backend servers, as there's an explicit request queuing mechanism built into the backend web servers. In this case, there wasn't any queuing on the backend servers.
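For concreteness, this is roughly how the delta is derived on the backend side (a minimal sketch rather than our actual code, assuming a Python/WSGI-style app; the function name is hypothetical):

import time

def request_queue_ms(environ):
    # nginx sets X-Request-Start to "t=<seconds.millis>" via $msec
    raw = environ.get("HTTP_X_REQUEST_START", "")
    if not raw.startswith("t="):
        return None
    started = float(raw[2:])
    return (time.time() - started) * 1000.0  # queueing delay in ms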
Our worker_connections/worker_processes configuration is:
worker_processes auto;
worker_rlimit_nofile 20000;
events {
    worker_connections 10000;
}
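If I understand the limits correctly, this is the back-of-envelope ceiling (a sketch, assuming worker_processes auto resolves to the CPU core count, and that each proxied request holds two connections, one to the client and one to the upstream):

import os

def max_proxied_clients(worker_connections=10000, workers=None):
    workers = workers or os.cpu_count()  # what "auto" resolves to
    # worker_connections counts *all* connections per worker, including
    # those to upstream servers, so halve it for a proxying workload
    return workers * worker_connections // 2

print(max_proxied_clients(workers=1))  # 1 core: ~5,000 in-flight clients
print(max_proxied_clients(workers=4))  # 4 cores: ~20,000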
Observations at the time:
- The $request_time figures displayed in nginx's access logs were in the normal range (~100ms average)
- The system metrics showed a fairly steady average CPU utilisation of 60%
- Active connections reported by nginx's stub_status were around 6k, and rose to around 10k after the upgrade (polled roughly as in the sketch below)
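For reference, this is approximately how we poll those counters (a sketch; it assumes stub_status is exposed on a local /nginx_status location, which is our setup rather than an nginx default):

import re
import time
import urllib.request

STATUS_URL = "http://127.0.0.1/nginx_status"  # assumed stub_status location

def poll():
    body = urllib.request.urlopen(STATUS_URL, timeout=2).read().decode()
    active = int(re.search(r"Active connections:\s*(\d+)", body).group(1))
    r, w, q = re.search(r"Reading: (\d+) Writing: (\d+) Waiting: (\d+)", body).groups()
    return active, int(r), int(w), int(q)

while True:
    print(time.strftime("%H:%M:%S"), "active/reading/writing/waiting:", poll())
    time.sleep(10)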
What is the technical explanation, in terms of nginx internals, for where this delay came from? It seems to have arisen after the connection was accepted and the header was added, but before the request reached the backend. How can I monitor it, and can I calculate the level of traffic at which we might hit it again on 4 cores?
Nginx config (the bits I felt were pertinent):
user www-data;
worker_processes auto;
worker_rlimit_nofile 20000;
pid /run/nginx.pid;

include /etc/nginx/modules-enabled/*.conf;

events {
    worker_connections 10000;
}

http {
    log_format upstreamlog '[$time_local] $remote_addr - $remote_user - $host '
                           'to: $upstream_addr: $request '
                           'upstream_response_time $upstream_response_time '
                           'msec $msec request_time $request_time '
                           'status $status sent_bytes $bytes_sent';

    upstream app_backend {
        least_conn;
        server 100.64.32.11:8080; # app01
        server 100.64.32.41:8080; # app02
        # ... etc ...
    }

    server {
        listen 80;
        listen [::]:80;
        server_name ~^[^\.]+\.app\.net$;
        return 301 https://$host$request_uri;
    }

    server {
        listen 443 ssl;
        listen [::]:443 ssl;
        server_name ~^[^\.]+\.app\.net$;

        ssl_certificate /etc/nginx/ssl/app.crt;
        ssl_certificate_key /etc/nginx/ssl/app.key;

        location / {
            proxy_pass http://app_backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_set_header X-Request-Start "t=${msec}";

            access_log /var/log/nginx/access.log upstreamlog;

            proxy_send_timeout 300;
            proxy_read_timeout 300;
        }
    }
}