For the past few days I've been stress testing PHP-FPM (process manager set to static) behind NGINX on Ubuntu Server 22.04, using wrk and writing down the results for every scenario. Looking at them, the value of max_children doesn't seem to matter much. This puzzles me, since all the articles on the web suggest various methods of calculating the right value to maximize performance, yet none of them really seem to apply to my case.
I run this on 4 cores with 4 GB RAM.
My test PHP script pulls around 11k lines from a MySQL table and prints them, producing a document of about 1 MB.
The average process size is 12.5 MB.
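For reference, one common memory-based rule of thumb can be sketched for this machine as follows. Only the 4 GB total and the 12.5 MB process size come from my setup; the 1024 MB reserved for the OS, NGINX, and MySQL is an assumed figure, and the function name is just for illustration.

```python
# Memory-based upper bound for max_children (a rule of thumb, not a tuning target).
TOTAL_RAM_MB = 4 * 1024      # 4 GB, from the setup above
RESERVED_MB = 1024           # ASSUMPTION: headroom for the OS, NGINX, and MySQL
AVG_PROCESS_MB = 12.5        # measured average process size


def memory_bound_children(total_mb: float, reserved_mb: float, process_mb: float) -> int:
    """Return how many worker processes fit into the RAM left after the reserve."""
    if process_mb <= 0:
        raise ValueError("process_mb must be positive")
    return int(max(total_mb - reserved_mb, 0) // process_mb)


if __name__ == "__main__":
    print(memory_bound_children(TOTAL_RAM_MB, RESERVED_MB, AVG_PROCESS_MB))
```

Under that assumed reserve the bound comes out at 245 processes. It only says how many workers fit in RAM, not how many the CPU can keep busy.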
The results are identical for max_children 10, 50, 100. With 200, the server gets overloaded.
I thought it might be related to the number of cores, but no: with max_children set to only 4, the results are lower than those mentioned above, although not by much. With 8, they are already nearly identical.
All the max_children scenarios (4, 8, 10, 50, 100) can handle up to 500 connections before timeouts and dropouts start to occur.
Can someone please explain how that is possible? What am I missing?
EDIT:
As per Rick's suggestion, I raised MySQL max_connections from the
default 151 to 351, after calculating the maximum from the
available RAM. This didn't affect the results at all, so I went back
to square one and started testing upwards from 1 max_children and
1 wrk thread. While doing so, I encountered more behavior I
couldn't explain.
The only max_children value that seems to make sense is 6. Nothing
above that yields any additional requests.
When testing with a single wrk thread, the CPU load rose with
every PHP-FPM child added, but for a given number of children it
stayed the same no matter how many connections wrk threw at it.
The ceiling for the average total over a 30 s stress test is
around 11371 requests. It is reached from 6 max_children upwards
and seems to be related to CPU load, which reaches 90% at this
peak. No value of max_children produces more requests than
that.
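To put that ceiling into per-second terms, here is a small back-of-the-envelope calculation. It assumes all 6 workers are busy for the whole run (Little's law with a concurrency of 6), which I have not verified; the function names are only for illustration.

```python
# Convert the 30 s ceiling into throughput and per-request worker time.
TOTAL_REQUESTS = 11371   # average total for one 30 s run (measured)
DURATION_S = 30          # length of the wrk run
WORKERS = 6              # max_children at which the ceiling is first reached


def throughput_rps(total_requests: int, duration_s: float) -> float:
    """Requests completed per second over the whole run."""
    return total_requests / duration_s


def busy_time_per_request_ms(workers: int, rps: float) -> float:
    """Worker time per request in ms, ASSUMING every worker is busy the whole run."""
    return workers / rps * 1000


if __name__ == "__main__":
    rps = throughput_rps(TOTAL_REQUESTS, DURATION_S)
    print(f"{rps:.0f} requests/s")
    print(f"{busy_time_per_request_ms(WORKERS, rps):.1f} ms of worker time per request")
```

That works out to roughly 379 requests per second, or about 16 ms of worker time per request under the stated assumption.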
Rather strange behavior occurred when testing with 4 wrk threads.
The results for an equal number of connections are lower than in
the single-thread test, but the latency is lower too, which makes
sense I guess. As for CPU load, anything up to 8 connections sits
at around 70%, no matter the max_children. But once the number of
connections goes one above that, to 9, the CPU load increases
drastically, by over 20% with higher values of max_children. The
only explanation that comes to mind is that it must be related to
the number of CPU cores, 4 in this case, each of which can only
handle 2 simultaneous connections. Other than that, I'm lost.
My assumption is that, at least in my case, the bottleneck is the
CPU. I never run out of memory, and I have high limits set for both
NGINX and MySQL connections. I would still appreciate it if this
could be confirmed and perhaps explained further. Best regards.