Score:1

Apache MPM workers stuck in G (gracefully finishing) growing - "scoreboard is full"


Running MPM worker, Apache 2.4.46, Debian 9

The number of gracefully finishing workers just grows over time; they don't seem to ever finish. Eventually I run out of capacity and get the "scoreboard is full" error. If I restart Apache they get released.

I don't believe it has anything to do with my website code (PHP), as many of the hanging requests are plain image GETs with no PHP involved.


<IfModule mpm_worker_module>
    ServerLimit              500
    StartServers             10
    MinSpareThreads          50
    MaxSpareThreads          100
    ThreadLimit              64
    ThreadsPerChild          64
    MaxRequestWorkers        500
    MaxConnectionsPerChild   0
</IfModule>

[Image: scoreboard]

[Image: example G workers]

[Image: Apache over a week, free slots dwindling]


Tried with KeepAlive both on and off.


Where I'm at with it now: I've realised that the stuck G PIDs shown in the Apache status page are not even running. Running kill 21734 gives: sh: 1: kill: No such process
Score:1

When you use MPM worker, requests are handled by threads that exist in processes.

From https://httpd.apache.org/docs/2.4/mod/worker.html

A single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive.

On Linux, a process 'contains' threads; that is, one PID can have multiple threads which share memory (amongst other resources) with the other threads in that PID.

As a matter of fact, Linux really only cares about 'tasks'. A non-multi-threaded process is simply a PID containing one task.
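To see the process/task relationship for yourself, here is a minimal sketch that lists the task (thread) IDs under a PID by reading /proc/<pid>/task, which is where Linux exposes them. The helper name list_tasks is my own, not anything from Apache. It returns an empty list when the PID no longer exists, which is the same condition that makes kill report "No such process".

```python
import os


def list_tasks(pid):
    """Return the sorted task (thread) IDs under one PID, or [] if the PID is gone.

    Linux-only: relies on the /proc/<pid>/task directory.
    """
    try:
        return sorted(int(tid) for tid in os.listdir(f"/proc/{pid}/task"))
    except FileNotFoundError:
        # No such process: the kernel has no entry for this PID.
        return []


# Example: inspect this interpreter's own tasks.
own_pid = os.getpid()
print(own_pid, list_tasks(own_pid))
```

For an Apache worker child you would pass the PID shown in the status page. If the result is an empty list, the process is not running, whatever the scoreboard says.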

When you gracefully reload Apache, you're terminating the containing process. Apache makes each thread wait until all the threads in that process have completed before it restarts the container PID.

So, in your case, each of the processes in that list contains a thread that is still busy or stuck somehow.

You've got a few options.

  1. Just give up waiting anyway and restart.
  2. Find the problem thread (might be a bug in the application) and fix it.

Option 1 is easy. Add the GracefulShutdownTimeout directive with a value that is high but not absurd, say 900 seconds. The default of 0 means wait indefinitely, so your threads wait forever for the problem thread to finish.

The main downside is that you may hit a process in the middle of something critical, and terminating it could corrupt a file or subtly break the application. You also run a (vanishingly small) risk of cutting off a client midway through a request.

Option 2 involves spotting the stuck thread in the list of workers and then diagnosing what its connection is doing. You are likely to uncover a design flaw, and you can then account for the behaviour with more confidence than if you simply blow away the problem thread.
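As a starting point for option 2, here is a rough sketch that tallies worker states from the machine-readable mod_status output (the ?auto variant), which includes a "Scoreboard:" line with one character per slot: "G" is gracefully finishing and "." is an open slot. The function name scoreboard_counts and the sample text are made up for illustration. In practice the text would come from your own status URL, for example http://localhost/server-status?auto, assuming mod_status is enabled there.

```python
from collections import Counter


def scoreboard_counts(status_text):
    """Count each worker-state character on the 'Scoreboard:' line."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            return Counter(line.split(":", 1)[1].strip())
    # No scoreboard line found in the input.
    return Counter()


# Hypothetical sample, shaped like the output of /server-status?auto.
sample = "BusyWorkers: 3\nIdleWorkers: 2\nScoreboard: __WGG.K..\n"
counts = scoreboard_counts(sample)
print("gracefully finishing:", counts["G"])
print("open slots:", counts["."])
```

Logging these counts every few minutes shows when the G slots start accumulating, which narrows down which requests or events to look at.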

First, thanks for your answer. Some of what you say (multiple tasks per PID) makes sense and has given me some clues, but I don't believe the part about Apache gracefully restarting is related: as far as I know, GracefulShutdownTimeout only affects Apache's behaviour when you issue a restart. I'm not issuing a restart; my problem builds up over time with no restarts. What I will investigate, though, is all the tasks with the same PID; maybe there is a PHP process in that list that is hanging.
Still happening. I looked over my PIDs and there are no PHP tasks among the threads stuck in G, so I'm not sure what's going on. It is only static image/asset requests that never get out of gracefully finishing.
I got the idea that Apache server-status only shows static files, not php-fpm activity, which turned out to be the case. So I set up the php-fpm status page to see what it is up to, since it seems the most likely thing to be hanging, and I also set up the slow log to catch it. https://gist.github.com/Jiab77/a9428050ab9bb3f17c5e33343da94fd8
Still no luck. I set up PHP process monitoring and the PID that is stuck in G is not in there. I then realised I didn't even need to do that, as I could have just looked for the stuck PID in the list of running processes. So it seems the PID is either not running while Apache thinks it is, or it is in some state that keeps it out of the normal process display?