
Understanding ulimits / process limits, or maybe something else: new processes stop spawning (fork errors)


I'm a little bit lost and need some help understanding what exactly is happening with my server.

So this is a Proxmox (Debian) server with several LXC containers running on it, and from time to time everything just starts failing because new processes/children apparently cannot be spawned. The syslog starts filling up with messages like this:

May 24 18:19:44 pvirtual08 ksmtuned[1645]: /usr/sbin/ksmtuned: fork: retry: Resource temporarily unavailable
May 24 18:19:46 pvirtual08 pve-firewall[4013]: status update error: command 'iptables-save' failed: open3: fork failed: Resource temporarily unavailable at /usr/share/perl5/PVE/Tools.pm line 449.
May 24 18:19:47 pvirtual08 pvestatd[4012]: command 'lxc-info -n 124 -p' failed: open3: fork failed: Resource temporarily unavailable at /usr/share/perl5/PVE/Tools.pm line 449.
May 24 18:19:47 pvirtual08 pvestatd[4012]: command 'lxc-info -n 404 -p' failed: open3: fork failed: Resource temporarily unavailable at /usr/share/perl5/PVE/Tools.pm line 449.

until eventually the entire server just crashes. It seems clear that the server stops being able to fork new processes after some time, but I don't quite understand why. Everything I read points to either a ulimit being reached or the server running out of RAM. Last time I checked, the server was nowhere near having its RAM full. Regarding ulimits, and this is where I'm a bit lost, from what I can tell those are not being reached either.

These are the current values from ulimit -a:

root@pvirtual08:/var/log# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 514673
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 514656
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
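
From what I understand, ulimit -a only shows the limits of the shell I run it in, and every process carries its own copy of these limits, so a daemon could in theory be running with different values. If I'm reading the procfs docs right, the effective limits of an already-running process can be checked like this (using the pvestatd PID from the log above):

grep -E 'processes|open files' /proc/4012/limits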

Last time it happened I checked the number of processes running with

ps -eLf | grep -v root | wc -l

and it was nowhere near the limit either, but maybe I'm just counting wrong, or checking the wrong limit.

Is there a way to know exactly which limit is being hit when the server stops being able to fork? And is there a way to monitor current usage against the limit, so I could write a monitoring script, for example?
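
To illustrate, this is roughly the kind of check I have in mind, comparing current counts against the kernel-wide ceilings and the per-user task limit (I'm assuming these are the right counters to watch, and the cgroup path at the end is a guess based on the cgroup v1 layout):

#!/bin/bash
# Sketch of a monitoring check: current usage vs. the limits I know of.

# Kernel-wide ceilings:
echo "threads-max: $(cat /proc/sys/kernel/threads-max)"
echo "pid_max:     $(cat /proc/sys/kernel/pid_max)"

# Current number of tasks on the whole host (ps -L prints one line per thread):
echo "tasks now:   $(ps -eL --no-headers | wc -l)"

# Tasks owned by root (the PVE services in the log run as root),
# to compare against 'max user processes' (ulimit -u):
echo "root tasks:  $(ps --no-headers -u root -L | wc -l)"
echo "nproc limit: $(ulimit -u)"

# If a container is capped by the pids cgroup controller, something
# like this should show it (path assumed; 124 is a container from the log):
# cat /sys/fs/cgroup/pids/lxc/124/pids.current /sys/fs/cgroup/pids/lxc/124/pids.max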

I apologize for any dumb questions, but the way ulimits work is still a bit new and confusing to me.

Ginnungagap:
Limits are per-user and can also be set by systemd per-service; do you have a `LimitNPROC` set in the unit file of the service encountering issues? What user is it running as (I'm assuming root, given it's playing with iptables)?
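You can see the effective value with e.g. `systemctl show -p LimitNPROC pvestatd.service` (assuming pvestatd is one of the affected units), or `grep processes /proc/<pid>/limits` for an already-running PID.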
ItsJustMe:
The errors start appearing for all sorts of services, so I'm assuming the limit I would be hitting is the per-user one from ulimit, but my counts don't show that. Unless there's another limit being hit that I'm not aware of.

