I'm having an issue with a CentOS 7 Apache server running PHP 7.3.27 in PHP-FPM mode. Apache is running in mpm-worker mode. The server hosts WordPress sites running W3 Total Cache. Redis version 3 is being used for the W3TC cache storage.
We've been getting CPU spikes that last 1-3 minutes every 10-12 hours. This started last week without any known changes.
RAM is good, with more than 50% free.
I/O is good with disk usage around 5% at the time of the spike.
The network load looks normal, with no abnormal spikes.
A perf capture (perf record -F 99 -ag -- sleep 10) shows __memcpy_ssse3_back -> async_page_fault as the top CPU consumer during the spike.
Can anyone offer some guidance on what could be causing this and/or ideas for further investigation? This is a live production server so I need to be careful what kind of tests I perform.
Thanks!
Update 12-28-21:
We launched a new EC2 instance from a snapshot, ran yum update, upgraded Apache to 2.52^, and upgraded Redis to the latest version. The issue continued on the new server once I brought over the site files.
We've checked all the logs: the Apache error log, the PHP-FPM error log, the PHP-FPM slow log, and dmesg. I've monitored TCP connections and they remain flat leading up to the CPU spike. Running perf top -a -F 99 shows php-fpm zend_memnstr_ex as the top overhead during the spike.
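
Since the spikes only last 1-3 minutes and come every 10-12 hours, I'm considering automating the capture instead of waiting at a terminal. This is a rough sketch of what I have in mind, not something I've run in production yet. It samples /proc/stat and starts the same perf record run as above when overall CPU usage crosses a threshold. THRESHOLD, INTERVAL, OUT_DIR, and the function names are placeholders I made up for illustration, and it would need to run as root for perf to see all CPUs.

```shell
#!/usr/bin/env bash
# Hypothetical spike watcher. THRESHOLD, INTERVAL and OUT_DIR are
# illustrative defaults, not values measured on the server.
THRESHOLD=${THRESHOLD:-85}                 # percent busy that counts as a spike
INTERVAL=${INTERVAL:-5}                    # seconds between samples
OUT_DIR=${OUT_DIR:-/var/tmp/perf-spikes}   # where perf data files are written

# Print "<busy> <total>" jiffies from the aggregate "cpu" line.
# Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal;
# idle and iowait (fields 5 and 6) are treated as not busy.
read_cpu() {
    awk '/^cpu /{ total = 0; for (i = 2; i <= 9; i++) total += $i;
                  print total - ($5 + $6), total }' "${1:-/proc/stat}"
}

# Integer busy percentage between two samples: busy1 total1 busy2 total2.
busy_pct() {
    local db=$(( $3 - $1 )) dt=$(( $4 - $2 ))
    (( dt > 0 )) || { echo 0; return; }
    echo $(( 100 * db / dt ))
}

# Poll until interrupted; record a 10 second profile whenever usage is high.
watch_cpu() {
    mkdir -p "$OUT_DIR"
    local prev cur pct
    prev=$(read_cpu)
    while sleep "$INTERVAL"; do
        cur=$(read_cpu)
        pct=$(busy_pct $prev $cur)   # unquoted on purpose: each sample is two words
        if (( pct >= THRESHOLD )); then
            # Same sampling options as the manual perf record run,
            # written to a timestamped file so captures are not overwritten.
            perf record -F 99 -a -g \
                -o "$OUT_DIR/perf-$(date +%Y%m%dT%H%M%S).data" -- sleep 10
            cur=$(read_cpu)          # resample so the capture window is skipped
        fi
        prev=$cur
    done
}

# Only start polling when asked to, so the file can be sourced safely.
if [[ "${1:-}" == "--watch" ]]; then
    watch_cpu
fi
```

The idea would be to start it with the --watch argument inside screen or tmux and then inspect each capture afterwards with perf report -i on the saved file. The polling itself should be cheap, but I'd welcome opinions on whether this is safe enough for a live server.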