Score:1

PHP-FPM static max_children amount doesn't seem to matter much during stress tests


I've been stress testing PHP-FPM (configured as static) with wrk for the past few days, behind NGINX on Ubuntu Server 22.04, writing down the results for every scenario. Looking at them, the value of max_children doesn't seem to matter much. This puzzles me, since all the articles on the web suggest various methods of calculating the right value to max out performance, but none of them seem to apply to my case.

I run this on 4 cores with 4 GB of RAM. My test PHP script pulls and prints around 11k rows from a MySQL table, producing a document of about 1 MB.

The average process size is 12.5 MB. The results are identical for max_children of 10, 50, and 100. With 200, the server gets overloaded.
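For reference, the memory-based rule that most articles give can be sketched as below. This is only an illustration: the 1024 MB reserved for the OS, NGINX, and MySQL is an assumed figure, not something measured here, and `max_children_by_memory` is a hypothetical helper name. The rule only bounds how many workers fit in RAM; it says nothing about CPU.

```python
def max_children_by_memory(total_ram_mb, reserved_mb, avg_process_mb):
    """Upper bound on PHP-FPM workers that fit in RAM (not an optimum)."""
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or avg_process_mb <= 0:
        return 0
    # Round down: a partial worker does not fit.
    return int(usable_mb // avg_process_mb)


# 4 GB of RAM and 12.5 MB per worker are from the post;
# the 1024 MB reserve is an assumption for illustration.
print(max_children_by_memory(4096, 1024, 12.5))
```

With these inputs the bound comes out far above the 6 to 10 workers at which the results stop improving, which suggests memory is not the limiting resource in these tests.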

This got me thinking that it might be related to the number of cores, but no: with only 4 max_children the results are lower than those above, although not by much. With 8 max_children they are already nearly identical.

All the scenarios (4, 8, 10, 50, 100) for max_children can handle up to 500 connections before timeouts and dropped connections start to occur.

Can someone please explain how that is possible? What am I missing?

EDIT:

As per Rick's suggestion, I raised MySQL max_connections from the default 151 to 351, after calculating the maximum from the available RAM. This didn't affect the results at all, so I went back to square one and started testing from 1 max_children and 1 wrk thread upwards. While doing so, I encountered more behavior I couldn't explain.

The only max_children value that seems to make sense is 6. Nothing above that brings any extra requests.

While testing with a single wrk thread, the CPU load kept rising with every PHP-FPM child added, and it stayed the same no matter how many connections wrk threw at it.

The ceiling for average total requests in a 30 s stress test is around 11371. It is reached from 6 max_children onward and seems to be related to CPU load, which reaches 90% at this peak. No more requests can be served, no matter the value of max_children.

A rather strange behavior occurred when testing with 4 wrk threads. For an equal number of connections, the results are lower than in the single-thread test, but the latency is lower too, which makes sense I guess. As for CPU load, anything up to 8 connections gives around 70% load regardless of max_children, but once the number of connections goes one above that, to 9, the CPU load increases drastically, by over 20% with higher values of max_children. The only explanation that comes to mind is that it is related to the number of CPU cores, 4 in this case, each of which can only handle 2 simultaneous connections. Other than that, I'm lost.

My assumption is that, at least in my case, the bottleneck is the CPU. I never run out of memory, and I have high limits set for both NGINX and MySQL connections. I would still appreciate it if this could be confirmed and perhaps explained further. Best regards.
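A rough back-of-the-envelope check of the CPU-bound hypothesis, using the numbers above (11371 requests in 30 s, 4 cores, about 90% load). It assumes the 90% figure is averaged over all 4 cores and covers the whole stack (PHP, MySQL, NGINX) on the same machine; the function names are hypothetical.

```python
def cpu_seconds_per_request(total_requests, duration_s, cores, cpu_utilization):
    """Average CPU time one request consumes across the whole stack."""
    throughput = total_requests / duration_s  # requests per second
    return cores * cpu_utilization / throughput


def cpu_bound_throughput(cores, cpu_s_per_request):
    """Requests per second if every core were 100% busy."""
    return cores / cpu_s_per_request


cost = cpu_seconds_per_request(11371, 30, cores=4, cpu_utilization=0.90)
print(f"CPU cost per request: {cost * 1000:.1f} ms")
print(f"Ceiling at 100% CPU: {cpu_bound_throughput(4, cost):.0f} req/s")
```

Under these assumptions each request costs roughly 9.5 ms of CPU, so the theoretical ceiling is only about 10% above the measured 379 req/s. Adding workers beyond the point where the cores are saturated cannot raise that number, which would be consistent with the plateau from 6 max_children onward.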

Score:1
ua flag
  • The important metric is "connections per hour". This is the "capacity" of your system. Stress testing measures "simultaneous connections", which is mostly irrelevant.

  • The number of cores is rarely the limiting factor in the setup you describe. On the other hand, if you do run out of CPU oomph, we should look at the slow queries to see what can be optimized.

  • If a connection takes only 20ms to process a page, then none of the knobs you are tweaking matter much. The connections are coming and going fast enough. If you can shrink the 20 to 15, you can handle 20/15 times as many "connections per hour". That is, by improving the code, each "active" connection finishes faster.

  • Stress testing generally tests how rapidly new connections can be created before the system overloads and appears to "crash". Such a "crash" is really lots of processes stumbling over each other trying to share the same resources (CPU, locks, ram, cache, network, I/O, whatever).

  • MySQL is designed to share evenly. This means that the connections are stumbling over each other. The real cure is to push the problem up the stack. That is, limit the number of connections, preferably at the webserver layer.

  • MySQL can handle only a few dozen active connections simultaneously. max_connections includes "Sleep" connections (not running SQL). Your numbers showed that 100 did not "crash", but I suspect that response time was already suffering.

  • If you set the web server's "max children" to more than MySQL's "max_connections", a stress test may find that the web server is sometimes getting "unable to connect".

  • In the real system (not just the stress test), it is likely to be better to keep "max children" lower than "max_connections" since the browsers understand what to do with that error.
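The capacity arithmetic and the max_children versus max_connections check from the points above can be sketched like this. It is a simplified model, not a measurement: it assumes each PHP-FPM worker holds at most one MySQL connection and that no other clients use MySQL; both function names are hypothetical.

```python
def requests_per_hour(service_time_ms, concurrent_workers=1):
    """Capacity if each worker handles requests back to back."""
    return concurrent_workers * 3_600_000 / service_time_ms


def fpm_can_exhaust_mysql(max_children, max_connections):
    """True if PHP-FPM workers alone could use up every MySQL connection.

    Assumes one MySQL connection per worker (an assumption, not from the post).
    """
    return max_children > max_connections


before = requests_per_hour(20)
after = requests_per_hour(15)
print(f"{before:.0f} -> {after:.0f} requests/hour per worker "
      f"(x{after / before:.2f})")

# Values tried in the question: 200 workers against the default 151.
print(fpm_can_exhaust_mysql(200, 151))
```

Shrinking the service time from 20 ms to 15 ms raises capacity by the factor 20/15, independent of how many workers are configured.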

(More)

  • "calculating the max amount..of RAM" -- This is futile. There is no adequate formula for MySQL.
  • What is a "work thread"? In MySQL, watch Threads_running; if it gets over about a dozen, MySQL is about as busy as it can be. If that number gets much higher, the threads stumble over each other, and things actually slow down.
  • top, htop, etc. are ways to watch the number of cores in use. I almost never see all cores busy, even on a very busy server. Other resources are usually hit first.
  • If CPU is the bottleneck, we need to look at the queries and the schema.
  • Users want "low latency". Adding too many threads leads to poor latency.
Medito Di Terra
Thank you for your thorough answer, Rick. Simultaneous connections are exactly what I'm tuning the server for. The main load will come from API calls from mobile applications, one of which is a social network (a small one, but still). I will look into MySQL max_connections as suggested and see if that is the bottleneck (it makes sense that it has to be something outside of PHP when the numbers don't differ at all across settings).
Medito Di Terra
I have done some further testing and described the results in detail in an edit to the original post. Could you tell me more with regard to the edit, please?
ua flag
I added more...