
Web server returning high number of 502 and 504s


We have a web application (LAMP stack) with Traefik as a reverse proxy that is suddenly giving HTTP 502 and 504 errors on >50% of requests, for both static files and PHP scripts. In the Traefik dashboard I can see a count of these errors, but the logs there don't reveal any useful information. My initial suspicion was that Apache was timing out, possibly from being overloaded.

However, looking at the Apache logs, I only see successfully processed requests, as if Apache never even sees the requests that are failing. We haven't seen any spike in usage, server CPU utilization hovers around 60% as is typical, and there is ample disk space. I'm at a loss as to how to diagnose what specifically is going on and how to fix it.

For further context, the application is dockerized, with Traefik, Apache, and MySQL each running in its own container, and it runs on a DigitalOcean VPS. The software versions are as follows:

Apache: 2.4.57, PHP: 7.2.34-39, Traefik: 1.7.33, MySQL: 5.7.35 (mysql client Ver 14.14)
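
Since the Traefik dashboard only shows a count of the errors, my next step is to turn up Traefik's own logging to see why it classifies these requests as 502/504. For reference, this is roughly what I'm planning to add to traefik.toml (1.7 syntax, the log path is just an example):

    # traefik.toml (Traefik 1.7), logging sketch, not our exact config
    logLevel = "DEBUG"    # surfaces backend connection errors in the container logs

    [accessLog]
      filePath = "/var/log/traefik/access.log"   # one line per request, including the status code
      format = "json"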

Any insight or suggestions would be greatly appreciated!

Marcel:
Why do you need Traefik while having Apache? It's not a LAMP stack if there's Traefik in between, no?
bkane521:
Traefik isn't strictly necessary in the current setup; I believe the thinking was preemptive preparation for load balancing that hasn't come to fruition.
Marcel:
I'd remove Traefik from the stack for now to see how the rest of the components behave without it and whether that improves the experience.
bkane521:
Actually, my mistake: Traefik is handling SSL termination, and there's some automation set up around that. Still not strictly necessary, but it does serve a purpose.
bkane521:
My current suspicion is that this is an issue with the volume of requests (~700 per second) and that Traefik is returning 504 when Apache doesn't respond in time. That doesn't explain the 502s, which I would expect if Apache were returning an error, but I don't see any errors at all in the Apache logs.
bkane521:
Another interesting tidbit: when the containers are restarted, the first 2000-3000 requests all succeed, and right around that point the 50x errors start pouring in alongside the successful ones.
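One thing I still need to rule out (this is only a guess on my part) is whether we're exhausting sockets between Traefik and Apache, or hitting Apache's worker limit, once those first few thousand requests have churned through connections. Roughly what I plan to check, assuming our containers are named traefik and apache (adjust for your names):

    # Connection-state summary on the VPS; a large TIME_WAIT pile-up would suggest
    # sockets are not being reused between Traefik and Apache
    ss -s
    ss -tan state time-wait | wc -l

    # Which MPM Apache is running and its worker/keep-alive settings
    # (apachectl may be apache2ctl, and the config path may differ, depending on the base image)
    docker exec apache apachectl -V | grep -i mpm
    docker exec apache grep -RiE "MaxRequestWorkers|KeepAlive" /etc/apache2/

    # Traefik's own view of the failures
    docker logs --tail 200 traefik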
https://support.stackpath.com/hc/en-us/articles/360001458723-Learn-and-Troubleshoot-502-and-504-Errors#:~:text=A%20502%20or%20a%20504,will%20return%20a%205xx%20error.
Marcel:
Given your current suspicion around the volume of requests, have you tried enabling keep-alive connections from Traefik to Apache? I'd start by tweaking the timeout values first to see whether that improves the 504s. The next thing would be to enable keep-alive everywhere so the kernel can reuse sockets. HTTP/2 would also help AFAIK.
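Something along these lines on the Traefik side, for example (untested, Traefik 1.7 global options; the values are only starting points to experiment with):

    # traefik.toml (Traefik 1.7), sketch only, tune to your traffic
    # Keep more idle keep-alive connections open to the Apache backend instead of
    # re-dialling for every request
    maxIdleConnsPerHost = 400

    # Timeouts towards the backend; requests exceeding these typically surface as 504s
    [forwardingTimeouts]
      dialTimeout = "30s"             # time allowed to establish the TCP connection to Apache
      responseHeaderTimeout = "60s"   # time allowed for Apache to start responding

    # Optionally retry once if the connection attempt to the backend fails
    [retry]
      attempts = 2

On the Apache side the matching knobs would be KeepAlive On, KeepAliveTimeout, MaxKeepAliveRequests, and (with the prefork MPM) MaxRequestWorkers.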
Wilson Hauck:
Additional DB information request, please. OS and version? RAM size, number of cores, any SSD or NVMe devices on the MySQL host server? Post the TEXT data on justpaste.it and share the links. From your SSH root login, text results of: A) SELECT COUNT(*), sum(data_length), sum(index_length), sum(data_free) FROM information_schema.tables; B) SHOW GLOBAL STATUS; (after a minimum of 24 hours uptime) C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; E) STATUS; (not SHOW STATUS, just STATUS) G) SHOW ENGINE INNODB STATUS; for server workload tuning analysis to provide suggestions.
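If it helps, something like this run on the MySQL host, or via docker exec into the MySQL container (credentials are placeholders), would capture each item to a text file that can be posted:

    # A) overall data/index footprint
    mysql -uroot -p -e "SELECT COUNT(*), SUM(data_length), SUM(index_length), SUM(data_free) FROM information_schema.tables;" > tables_summary.txt

    # B) only meaningful after at least 24 hours of uptime
    mysql -uroot -p -e "SHOW GLOBAL STATUS;" > global_status.txt

    # C), D), G)
    mysql -uroot -p -e "SHOW GLOBAL VARIABLES;" > global_variables.txt
    mysql -uroot -p -e "SHOW FULL PROCESSLIST;" > processlist.txt
    mysql -uroot -p -e "SHOW ENGINE INNODB STATUS\G" > innodb_status.txt

    # E) STATUS; is a mysql client command; run \s from an interactive mysql session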