I am the system administrator of an Arch Linux-based workstation cluster. It uses Slurm as the workload manager and consists of one master machine and four computation nodes. Over the past few months, we have observed that processes on some nodes get stuck from time to time, and rebooting the node solves the problem. The stuck processes are in state D (uninterruptible "disk sleep"), but when we check the node's I/O with top or other tools, the I/O is in fact quite low.
When some processes on a node are in state D, everything on that node is slow, but only for normal users. When we run commands (including python) as root on the stuck node, everything works just fine; but as soon as we switch to a normal user with su NORMAL_USER, the shell gets stuck again. ps aux shows that the -bash process started for NORMAL_USER is in state D. We have tried to strace the stuck process and have dug into /proc/PID, but found nothing useful. We also could not find any relevant messages in journalctl. Maybe we are missing something.
We would appreciate any advice or comments.
Our kernel version is 5.10.47-1-lts.
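For reference, this is roughly how we enumerate the D-state processes on a node and dump their kernel stacks (a minimal sketch, run as root; the PIDs are of course node-specific):

# List D-state tasks with their kernel wait channel (wchan), then dump
# the kernel stack of each one. Reading /proc/PID/stack requires root.
ps -eo pid,user,stat,wchan:32,comm | awk 'NR==1 || $3 ~ /^D/'
for pid in $(ps -eo pid=,stat= | awk '$2 ~ /^D/ {print $1}'); do
    echo "=== PID $pid ==="
    cat /proc/"$pid"/stack
done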
Here is /proc/PID/status for the process in state D. It is the bash process spawned by su NORMAL_USER, and it is single-threaded.
Name: bash
Umask: 0022
State: D (disk sleep)
Tgid: 3136723
Ngid: 0
Pid: 3136723
PPid: 3136722
TracerPid: 0
Uid: 1000093 1000093 1000093 1000093
Gid: 1000000 1000000 1000000 1000000
FDSize: 256
Groups: 1000000 1000083
NStgid: 3136723
NSpid: 3136723
NSpgid: 3136723
NSsid: 3110369
VmPeak: 16904 kB
VmSize: 16904 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 3788 kB
VmRSS: 3744 kB
RssAnon: 412 kB
RssFile: 3332 kB
RssShmem: 0 kB
VmData: 608 kB
VmStk: 132 kB
VmExe: 588 kB
VmLib: 1948 kB
VmPTE: 52 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 1
SigQ: 12/772094
SigPnd: 0000000000000000
ShdPnd: 0000000008000002
SigBlk: 0000000000000000
SigIgn: 0000000000384004
SigCgt: 000000004b813efb
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ffff,ffffffff
Cpus_allowed_list: 0-47
Mems_allowed: 00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 4
nonvoluntary_ctxt_switches: 1
Here is /proc/PID/stack for the same process.
[<0>] nfs_wait_bit_killable+0x1e/0x90 [nfs]
[<0>] nfs4_wait_clnt_recover+0x60/0x90 [nfsv4]
[<0>] nfs4_client_recover_expired_lease+0x17/0x50 [nfsv4]
[<0>] nfs4_do_open+0x2f4/0xbe0 [nfsv4]
[<0>] nfs4_atomic_open+0xe7/0x100 [nfsv4]
[<0>] nfs_atomic_open+0x1e1/0x520 [nfs]
[<0>] path_openat+0x5f5/0xfc0
[<0>] do_filp_open+0x91/0x130
[<0>] do_sys_openat2+0x96/0x150
[<0>] __x64_sys_openat+0x53/0x90
[<0>] do_syscall_64+0x33/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
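The stack appears to be waiting in NFSv4 client state recovery (nfs4_client_recover_expired_lease), so this is roughly what we check next on a stuck node; a minimal sketch, where nfs-server is a placeholder for our real NFS server:

# Which NFS mounts exist on the node, and with what options
grep ' nfs' /proc/mounts
nfsstat -m
# Basic reachability of the NFS server from the node
ping -c 3 nfs-server
# Recent NFS client messages in the kernel log
dmesg | grep -i nfs | tail -n 20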