Score:2

Apache "freezes" a few times a day

in flag
JYD

I'm writing here after weeks spent fighting an issue that causes Apache to stop responding until it is restarted. It happens 3-4 times a day: sometimes after hours, sometimes after a few minutes, sometimes after a day. There's no relation (at least no evidence of one) with the number of concurrent connections to the server: it happens both during heavy-traffic periods (between 8:00 and 18:00) and during the night, when traffic is very low.

Configuration: VM on VMware ESXi 7 - OS: Ubuntu 20.04, Apache 2.4.41, PHP 8.0.15, MSSQL drivers 17.8.1.1-1. 6 CPUs "Xeon(R) Gold 5218", 12 GB RAM. 3 websites running in "pure" PHP (no CMS or framework like WordPress, Drupal, Ruby on Rails, etc.). AWStats shows that the intranet site, which has no external access, serves < 10k pages a day; the others serve about 200k pages a day. Most of the time CPU usage sits at about 1% and memory usage at about 2 GB. When the issue happens, no CPU/memory/network "spikes" are detected.

At the moment I have installed and configured Monit, which every 20 seconds uses curl to test this minimal PHP page:

<?php
echo "ok";
?>

Normally it prints "ok". During the "freeze", even this simple page isn't served; curl ends with a timeout error and triggers Monit to run "service apache2 restart". After 2-3 seconds the websites come back to normal functionality (until the next freeze).
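For readers who want to reproduce the check outside Monit, here is a minimal sketch of the same probe in Python. It is not the Monit configuration used above: the function name `probe`, the 10-second timeout and the injectable `opener` parameter are assumptions made for illustration.

```python
import http.client
import urllib.request


def probe(url, timeout=10.0, expected="ok", opener=urllib.request.urlopen):
    """Return True if `url` answers within `timeout` seconds with the expected body.

    `opener` is injectable so the check can be exercised without a real server.
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # URLError, refused connections and socket timeouts are OSError subclasses;
        # a frozen Apache would typically show up here as a timeout.
        return False
    return body.strip() == expected
```

A wrapper script could call `probe()` against the test page every 20 seconds and restart Apache when it returns False, which is what Monit does in the setup described here.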

Here is a list of unsuccessful remediations (not in chronological order):

  • Removed certbot/Let's Encrypt and used a purchased Sectigo SSL certificate
  • Switched Apache from mpm_worker to mpm_event
  • Disabled a bunch of unused Apache modules
  • Disabled a bunch of unused PHP modules
  • Disabled most of the non-critical cron jobs (even though there's no evidence that the freeze happens during cron job execution).
  • Changed virtual network adapter from VMXNET3 to E1000
  • Enabled verbose logging: no useful information or errors are recorded; there's simply a 25-30 second gap between the last page served just before the hang and the first one served when the restart completes.
  • Enabled mod_log_forensic for some days: no (!) errors are reported by the check_forensic utility
  • Double-checked the few rewrite rules in .conf and in .htaccess
  • Changed Apache's configuration; relevant values are:
    StartServers 10
    MinSpareThreads 40
    MaxSpareThreads 120
    ThreadLimit 100
    ThreadsPerChild 75
    MaxRequestWorkers 450
    MaxConnectionsPerChild 1000
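
The 25-30 second gap in the access log is the only visible trace of each freeze, so it can help to list those gaps and compare their times with other system activity. Below is a rough sketch that assumes the standard bracketed timestamp of the common/combined log format; the function name `find_gaps` and the 20-second threshold are illustrative choices, not something taken from the server.

```python
import re
from datetime import datetime

# Matches timestamps such as [10/Feb/2022:10:00:00 +0100]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def find_gaps(lines, min_gap_seconds=20.0):
    """Yield (previous_time, next_time, gap_seconds) for each silent period."""
    previous = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # ignore lines without a recognizable timestamp
        current = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap >= min_gap_seconds:
                yield previous, current, gap
        previous = current
```

Passing an open access log file to `find_gaps()` lists every silent period. During low-traffic hours some gaps will be natural, so the output is only a starting point for correlating freezes with cron jobs or other daemons.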

There's no evident pattern in the "last" page/file served before the issue: sometimes it is a PHP page (not always the same one), sometimes a PNG/JPEG image. Reading the logs, I cannot find abnormal, malformed, or excessive client requests.

The issue is 99.99% Apache-related: the PHP-FPM service works perfectly and does not need to be restarted after a freeze. None of the other services running on the server are affected.

Before writing here, I read tons of web pages but didn't find any hint that was useful for me.

Thanks in advance

Ciao

JYD

jp flag
When Apache hangs, check the process status with `ps`. Check Apache `mod_status`. Use `strace` to find out what the processes are doing.
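
As a rough illustration of the `ps` step, the sketch below reads process states straight from Linux's /proc. It assumes the process name is "apache2", as on Ubuntu; the helper names `parse_state` and `states_by_name` are made up for this example. A worker stuck in state D (uninterruptible sleep) would hint at blocked I/O rather than a problem inside Apache itself.

```python
import os


def parse_state(status_text):
    """Extract the one-letter state code from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split()[1]  # e.g. "S" from "State:\tS (sleeping)"
    return None


def states_by_name(name="apache2", proc_root="/proc"):
    """Map PID -> state code for every process whose command name equals `name`."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                text = handle.read()
        except OSError:
            continue  # the process exited while we were scanning
        fields = dict(
            line.split(":", 1) for line in text.splitlines() if ":" in line
        )
        if fields.get("Name", "").strip() == name:
            result[int(entry)] = parse_state(text)
    return result
```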
fr flag
Maybe the number of httpd threads is influencing this? As it is a virtual machine, maybe it is running on a hypervisor with memory ballooning?
in flag
JYD
@AlexD I added strace logging to a file and I'll post the results here
in flag
JYD
@kazak No "ballooned" memory; the ESX monitor always shows 0 KB. All 12 GB are reserved for this VM
in flag
JYD
Finally I got it!!!!
in flag
JYD
The problem was the filesystem-event daemon "incron", which was misconfigured and had its log disabled. In its configuration file, one of the watched events had a wrongly escaped command. When I enabled incron's log file, the .log started growing by hundreds of lines per second and quickly reached dozens of MB. This strange behaviour was caused by a wrong escape sequence in its conf file: one line had "$\" instead of "\$", creating a very unpredictable race condition. Once I fixed it, the Apache freezes were gone.
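
For anyone hunting a similar typo, a heuristic check along these lines might flag it. It assumes, per my reading of incrontab(5), that incron expands `$@`, `$#`, `$%`, `$&` and `$$` in commands, and it treats a backslash as escaping the next character, as in the fix described above. The function name `suspicious_dollars` is hypothetical, and the check only looks at the command part of a table line.

```python
# Characters that may follow "$" in an incron command: $@ $# $% $& $$
INCRON_WILDCARDS = set("@#%&$")


def suspicious_dollars(command):
    """Return indexes of '$' characters that are neither escaped nor a known wildcard."""
    hits = []
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes, e.g. "\$"
            continue
        if ch == "$":
            nxt = command[i + 1] if i + 1 < len(command) else ""
            if nxt in INCRON_WILDCARDS:
                i += 2  # a recognized wildcard pair
                continue
            hits.append(i)  # "$" followed by something unexpected, e.g. "$\"
        i += 1
    return hits
```

An empty list means nothing looked odd. Any reported index deserves a manual look, since a legitimate command could still trip this heuristic.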