I have a CentOS 7 server on AWS with 4 CPUs and 8 GB RAM, fully up to date with the latest packages. Starting late last week, I've been getting high load spikes several times a day, with Apache appearing to be the culprit.
To try to isolate the cause, I installed Monitorix so I can see graphs of the spikes, and I also started logging uptime every minute to see exactly when they happen. They all seem to hit at roughly 6-hour intervals (midnight, 6 am, noon, and 6 pm):
18:07:02 up 1381 days, 17:55, 2 users, load average: 0.51, 0.67, 1.13
18:08:04 up 1381 days, 17:56, 2 users, load average: 6.86, 2.01, 1.54
18:09:30 up 1381 days, 17:57, 2 users, load average: 101.18, 32.53, 12.26
18:10:02 up 1381 days, 17:58, 2 users, load average: 80.26, 34.50, 13.59
18:11:01 up 1381 days, 17:59, 2 users, load average: 30.24, 28.51, 12.84
18:12:01 up 1381 days, 18:00, 2 users, load average: 11.99, 23.59, 12.13
18:13:01 up 1381 days, 18:01, 2 users, load average: 5.26, 19.50, 11.44
18:14:01 up 1381 days, 18:02, 2 users, load average: 2.14, 16.03, 10.75
...
00:57:01 up 1382 days, 45 min, 2 users, load average: 0.28, 0.39, 0.50
00:58:04 up 1382 days, 46 min, 2 users, load average: 11.42, 2.85, 1.30
00:59:01 up 1382 days, 47 min, 2 users, load average: 4.90, 2.49, 1.26
01:00:01 up 1382 days, 48 min, 2 users, load average: 1.80, 2.04, 1.18
...
12:42:01 up 1382 days, 12:30, 2 users, load average: 0.20, 0.47, 0.56
12:43:09 up 1382 days, 12:31, 2 users, load average: 35.63, 8.82, 3.32
12:44:02 up 1382 days, 12:32, 2 users, load average: 48.94, 18.08, 6.86
12:45:02 up 1382 days, 12:33, 2 users, load average: 18.50, 14.97, 6.49
12:46:01 up 1382 days, 12:34, 2 users, load average: 8.13, 12.56, 6.19
12:47:01 up 1382 days, 12:35, 2 users, load average: 2.99, 10.27, 5.80
12:48:01 up 1382 days, 12:36, 2 users, load average: 1.14, 8.42, 5.44
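For reference, the per-minute uptime logging is just a cron entry along these lines (the file and log path are examples, nothing special):

# /etc/cron.d/uptime-log -- append uptime to a log every minute
* * * * * root /usr/bin/uptime >> /var/log/uptime.log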
When I look at the Monitorix graphs, it appears to be the Apache workers causing the issue, since they spike as well. When I manage to strace a PID during a spike, I see a ton of reads on the website files (this server hosts multiple WordPress sites). There's a slight jump in memory allocation, and when I run top at the time, swap isn't 100% used.
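For what it's worth, this is roughly the strace invocation I use when I can catch a busy Apache child (the PID and output path are just examples):

# attach to one Apache child and log file opens/reads for later review
strace -f -p 12345 -e trace=open,openat,read -s 128 -o /tmp/apache-strace.log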
I started logging MySQL slow queries, and I don't believe what's logged is the cause: they're simple WordPress SELECT statements, and they only show up at the time of the spikes (so an effect, not the cause).
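In case it's relevant, the slow query logging is just the usual my.cnf settings, roughly like this (the threshold and log path are what I picked, nothing canonical):

# /etc/my.cnf, [mysqld] section
slow_query_log      = 1
slow_query_log_file = /var/log/mariadb/mysql-slow.log
long_query_time     = 2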
I also installed fail2ban, but that's not really logging/blocking anything.
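I'm checking it with the client, something like this (the jail name below is just an example):

# list jails, then inspect one jail's counters
fail2ban-client status
fail2ban-client status sshd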
Lastly, I checked all the cron jobs, and nothing I can see is scheduled to run every 6 hours that would cause this.
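This is roughly what I ran to go through everything scheduled (the systemd timer check is just for good measure, since CentOS 7 has those too):

# per-user crontabs
for u in $(cut -d: -f1 /etc/passwd); do echo "== $u =="; crontab -l -u "$u" 2>/dev/null; done
# system-wide cron files and drop-in directories
cat /etc/crontab /etc/cron.d/* 2>/dev/null
ls /etc/cron.hourly /etc/cron.daily /etc/cron.weekly /etc/cron.monthly
# systemd timers
systemctl list-timers --all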
I do see disk I/O spike as well, roughly around these times (Wordfence scans?). Apache workers also jump at these times: they average around 10-15, then spike to 40, 80, or even 180 today alone.
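The worker numbers are just a quick process count on top of the Monitorix Apache graph, something like this (assuming the stock httpd process name on CentOS 7):

# count running Apache processes
ps -C httpd --no-headers | wc -l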
What else can I look at or log? I'm 99% sure Apache is causing the issue. All of the WordPress sites are up to date and have Wordfence running (with the scan settings set to lower resource usage, and blocking set to not record everything).
TIA!