I'm seeing weird behaviour on my Pi 4 running Ubuntu Server 21.04. It runs correctly at first, but after a while I can see a process that has been using 100% CPU for hours, and if I wait longer there are 2, 3, or more other processes at 100% CPU. They seem to be launched by a cron job (from the home automation software Jeedom), but that is not my question.
The weird thing is that I cannot kill them, even as root with kill -9. The process is in state R (running), but it does not respond.
#ps aux | grep 46149
www-data 46149 99.7 0.0 2040 80 ? R Oct04 633:33 sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep"
#sudo kill -9 46149
#ps aux | grep 46149
www-data 46149 99.7 0.0 2040 80 ? R Oct04 633:36 sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep"
In this example, the blocked process is 'ps', but it is not always the same one. If I power off the Pi, it restarts normally, but another blocked process appears after a while. And I have to cut the power, because 'reboot' does not work.
Using 'ps axjf' to see the process tree:
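In case it helps with diagnosis, here is a minimal sketch of how the signal state of such a process could be inspected. It assumes a Linux /proc filesystem and bash. The function names `sig_in_mask` and `report_sigkill` are made up for illustration. The fields SigPnd, ShdPnd, SigBlk and SigIgn in /proc/<pid>/status are hexadecimal bitmasks in which bit N-1 stands for signal N, so SIGKILL (9) corresponds to 0x100. I have not included any output from it here.

```shell
#!/usr/bin/env bash
# Sketch only: inspect the signal masks of a process via /proc/<pid>/status.
# Assumes Linux and bash; helper names are hypothetical.

# sig_in_mask MASK SIGNUM
# Succeeds if bit (SIGNUM - 1) is set in the hexadecimal MASK.
sig_in_mask() {
    local mask=$1 sig=$2
    (( (0x$mask >> (sig - 1)) & 1 ))
}

# report_sigkill PID
# Prints the state and signal masks, and says whether SIGKILL is pending.
report_sigkill() {
    local pid=$1 field mask
    local status="/proc/$pid/status"
    [ -r "$status" ] || { echo "cannot read $status" >&2; return 1; }

    grep -E '^(State|SigPnd|ShdPnd|SigBlk|SigIgn):' "$status"

    for field in SigPnd ShdPnd; do
        mask=$(awk -v f="$field:" '$1 == f {print $2}' "$status")
        if [ -n "$mask" ] && sig_in_mask "$mask" 9; then
            echo "SIGKILL is pending in $field"
        fi
    done

    # Kernel function the task is waiting in, if the kernel exposes it.
    if [ -r "/proc/$pid/wchan" ]; then
        echo "wchan: $(cat "/proc/$pid/wchan")"
    fi
    return 0
}

# Run only when a PID is given on the command line.
if [ "$#" -gt 0 ]; then
    report_sigkill "$1"
fi
```

Saved under a name of your choice, it would be run as root with the PID as its only argument, for example 46149 from the output above. A SIGKILL that stays pending would suggest the process never gets back to the point where the kernel delivers signals, but that is only my assumption about what this could show.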
1 7317 7317 1799 ? -1 Sl 0 0:56 /usr/bin/containerd-shim-runc-v2 -namespace moby -id bf40089312cdb1d7707096fe6fc46520c7c1a17a70eac305473761976c1f4b7d -address /run/cont
7317 7337 7337 7337 ? -1 Ss 0 1:12 \_ /usr/bin/python2 /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
7337 7391 7391 7337 ? -1 S 0 0:02 | \_ /usr/sbin/cron -f -L4
7391 104917 7391 7337 ? -1 S 0 0:00 | | \_ /usr/sbin/CRON -f -L4
104917 104919 104919 104919 ? -1 Ss 0 0:00 | | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/watchdog.php >> /dev/null
104919 104920 104919 104919 ? -1 R 0 1521:41 | | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/watchdog.php >> /dev/null
7391 395309 7391 7337 ? -1 S 0 0:00 | | \_ /usr/sbin/CRON -f -L4
395309 395312 395312 395312 ? -1 Ss 33 0:00 | | \_ /bin/sh -c /usr/bin/php /var/www/html/core/php/jeeCron.php >> /dev/null
395312 395313 395312 395312 ? -1 S 33 0:00 | | \_ /usr/bin/php /var/www/html/core/php/jeeCron.php
395313 395341 395312 395312 ? -1 S 33 0:00 | | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=4$" | grep -v "grep"
395341 395344 395312 395312 ? -1 R 33 109:29 | | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=4$" | grep -v "grep"
7337 7392 7392 7337 ? -1 S 1 0:00 | \_ /usr/sbin/atd -f
7337 8613 8613 7337 ? -1 Sl 0 6:16 | \_ /usr/bin/python3 /usr/bin/fail2ban-server -fc /etc/fail2ban/
7337 11223 10184 10184 ? -1 S 33 0:08 | \_ php /var/www/html/core/class/../php/jeeCron.php cron_id=452778
7337 18465 18465 18465 ? -1 SNs 0 0:08 | \_ /usr/sbin/apache2 -k start
18465 168788 18465 18465 ? -1 SN 33 0:48 | | \_ /usr/sbin/apache2 -k start
18465 354445 18465 18465 ? -1 SN 33 0:27 | | \_ /usr/sbin/apache2 -k start
18465 356077 18465 18465 ? -1 SN 33 0:24 | | \_ /usr/sbin/apache2 -k start
18465 356301 18465 18465 ? -1 SN 33 0:25 | | \_ /usr/sbin/apache2 -k start
18465 362824 18465 18465 ? -1 SN 33 0:16 | | \_ /usr/sbin/apache2 -k start
18465 364208 18465 18465 ? -1 SN 33 0:14 | | \_ /usr/sbin/apache2 -k start
18465 366422 18465 18465 ? -1 SN 33 0:12 | | \_ /usr/sbin/apache2 -k start
18465 366848 18465 18465 ? -1 SN 33 0:12 | | \_ /usr/sbin/apache2 -k start
18465 367416 18465 18465 ? -1 SN 33 0:10 | | \_ /usr/sbin/apache2 -k start
18465 367576 18465 18465 ? -1 SN 33 0:11 | | \_ /usr/sbin/apache2 -k start
18465 405605 18465 18465 ? -1 SN 33 0:03 | | \_ /usr/sbin/apache2 -k start
7337 18824 18465 18465 ? -1 SN 33 174:59 | \_ php /var/www/html/core/class/../php/jeeCron.php cron_id=301554
7337 35774 18465 18465 ? -1 SNl 33 0:31 | \_ node /var/www/html/plugins/alexaapi/resources/alexaapi.js http://app_jeedom amazon.fr alexa.amazon.fr OtAkaDFZj3YlSEQg6T1VGk8Jq8
7337 44738 44738 44738 ? -1 SNs 106 0:00 | \_ /usr/bin/dbus-daemon --system
7337 44767 44766 44766 ? -1 SN 107 1:13 | \_ avahi-daemon: running [bf40089312cd.local]
44767 44768 44766 44766 ? -1 SN 107 0:00 | | \_ avahi-daemon: chroot helper
7337 45616 18465 18465 ? -1 SNl 33 4:20 | \_ homebridge
45616 45664 18465 18465 ? -1 SNl 33 2:10 | | \_ homebridge-config-ui-x
7337 46149 46102 46102 ? -1 R 33 1931:04 | \_ sh -c (ps ax || ps w) | grep -ie "cron_id=7$" | grep -v "grep"
7337 407386 18465 18465 ? -1 RN 33 0:00 | \_ php /var/www/html/core/class/../php/jeeListener.php listener_id=2 event_id=379484 value='1310' datetime='2021-10-06 06:36:25'
7317 22607 22607 22607 ? 22607 Ss+ 0 0:00 \_ /bin/bash
I tried killing the parents: every level of the process tree was killed, except the parent and the blocked processes (2 processes this time, with the same parent). Now I have:
root 5790 0.0 0.0 0 0 ? Ss Oct09 0:14 \_ [sh]
www-data 267740 99.4 0.0 2040 84 ? RN 05:05 1032:49 \_ sh -c ps ax | grep "resources/alexaapi.js" | grep -v "grep" | wc -l HOME=/var/www LOGNAME=www-data PATH=/usr/bin:/bin SHELL=/bi
www-data 357120 99.5 0.0 2040 80 ? RN 14:00 501:07 \_ sh -c (ps ax || ps w) | grep -ie "cron_id=469432$" | grep -v "grep" HOME=/var/www LOGNAME=www-data PATH=/usr/bin:/bin SHELL=/bin
And with 'ps -ef':
root 5790 5760 0 Oct09 ? 00:00:14 [sh]
www-data 267740 5790 99 Oct10 ? 1-01:58:16 sh -c ps ax | grep "resources/alexaapi.js" | grep -v "grep" | wc -l
www-data 357120 5790 99 Oct10 ? 17:06:33 sh -c (ps ax || ps w) | grep -ie "cron_id=469432$" | grep -v "grep"