Score:0

How to debug systemd shutdown failing to gracefully shutdown?

sd flag
gcb

I have a host that runs qemu vms.

Every restart will leave the qemu pidfile around, showing its process wasn't properly terminated but possibly killed ...question here is exactly how to find out what happened.

$ journalctl -b -1
...
systemd[1]: [email protected]: Deactivated successfully.
systemd[1]: Stopped User Manager for UID 1000.
...
systemd[1]: Stopped Create Static Device Nodes in /dev.
systemd[1]: Reached target System Shutdown.
systemd[1]: Reached target Late Shutdown Services.
systemd[1]: systemd-poweroff.service: Deactivated successfully.
systemd[1]: Finished System Power Off.
systemd[1]: Reached target System Power Off.
systemd[1]: Shutting down.
systemd-shutdown[1]: Syncing filesystems and block devices.
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-journald[523]: Received SIGTERM from PID 1 (systemd-shutdow).
systemd-journald[523]: Journal stopped

qemu logs doesn't have anything besides startup, which tells me it never got a sigterm signal but was killed. I can't find any mention of it on any logs though...

$ systemctl status systemd-shutdown
Unit systemd-shutdown.service could not be found.

$ systemctl show systemd-shutdown
ExitType=main
Restart=no
NotifyAccess=none
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
RuntimeRandomizedExtraUSec=0
WatchdogUSec=infinity
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=0
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ReloadResult=success
CleanResult=success
UID=[not set]
GID=[not set]
NRestarts=0
ReloadSignal=1
ExecMainStartTimestampMonotonic=0
ExecMainExitTimestampMonotonic=0
ExecMainPID=0
ExecMainCode=0
ExecMainStatus=0
ControlGroupId=0
MemoryCurrent=[not set]
MemoryAvailable=infinity
CPUUsageNSec=[not set]
TasksCurrent=[not set]
IPIngressBytes=[no data]
IPIngressPackets=[no data]
IPEgressBytes=[no data]
IPEgressPackets=[no data]
IOReadBytes=18446744073709551615
IOReadOperations=18446744073709551615
IOWriteBytes=18446744073709551615
IOWriteOperations=18446744073709551615
Delegate=no
CPUAccounting=yes
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
CPUQuotaPeriodUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=yes
DefaultMemoryLow=0
DefaultMemoryMin=0
MemoryMin=0
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryZSwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=37597
IPAccounting=no
ManagedOOMSwap=auto
ManagedOOMMemoryPressure=auto
ManagedOOMMemoryPressureLimit=0
ManagedOOMPreference=none
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=infinity
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=1073741816
LimitNOFILESoft=1073741816
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=125324
LimitNPROCSoft=125324
LimitMEMLOCK=4116402176
LimitMEMLOCKSoft=4116402176
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=125324
LimitSIGPENDINGSoft=125324
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
OOMScoreAdjust=0
CoredumpFilter=0x33
Nice=0
IOSchedulingClass=2
IOSchedulingPriority=4
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
CPUAffinityFromNUMA=no
NUMAPolicy=n/a
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=inherit
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
LogRateLimitIntervalUSec=0
LogRateLimitBurst=0
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend cap_audit_read cap_perfmon cap_bpf cap_checkpoint_restore
DynamicUser=no
RemoveIPC=no
PrivateTmp=no
PrivateDevices=no
ProtectClock=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectKernelLogs=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
PrivateMounts=no
PrivateIPC=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=2147483646
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
TimeoutCleanUSec=infinity
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictSUIDSGID=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=private
ProtectProc=default
ProcSubset=all
ProtectHostname=no
KillMode=control-group
KillSignal=15
RestartKillSignal=15
FinalKillSignal=9
SendSIGKILL=yes
SendSIGHUP=no
WatchdogSignal=6
Id=systemd-shutdown.service
Names=systemd-shutdown.service
Description=systemd-shutdown.service
LoadState=not-found
ActiveState=inactive
FreezerState=running
SubState=dead
StateChangeTimestampMonotonic=0
InactiveExitTimestampMonotonic=0
ActiveEnterTimestampMonotonic=0
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=no
CanStop=yes
CanReload=no
CanIsolate=no
CanFreeze=yes
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnSuccessJobMode=fail
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=no
AssertResult=no
ConditionTimestampMonotonic=0
AssertTimestampMonotonic=0
LoadError=org.freedesktop.systemd1.NoSuchUnit "Unit systemd-shutdown.service not found."
Transient=no
Perpetual=no
StartLimitIntervalUSec=10s
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
SuccessAction=none
CollectMode=inactive

There's no systemd unit for qemu as the users should start them adhoc for development. The usual way to stop it is pkill -F <qemu pid file>, which sends a SIGTERM and it exits fine and very quickly (couple milliseconds) and results in a shutdown message on the qemu log file.

$ time pkill ...
real    0m0.036s
user    0m0.015s
sys     0m0.012s

(it is usually 0.003s. the above command was issue during vm startup, which is the worst case for a shutdown). So not a termination timeout by any means.

Only a few times i will see this (possibly when a dev is running qemu without the daemonize/background option):

systemd[735]: Stopped Firefox Web Browser - Web Browser.
systemd[735]: app-org.kde.konsole-123.scope: Failed to kill control group /user.slice/user-1000.slice/[email protected]/app.slice/app-org.kde.konsole-123.scope, ignoring: Operation not permitted
systemd[735]: app-org.kde.konsole-123.scope: Killing process 13618 (qemu-system-x86) with signal SIGKILL.
systemd[735]: app-org.kde.konsole-123.scope: Killing process 13622 (kvm-nx-lpage-recovery-13618) with signal SIGKILL.
systemd[735]: app-org.kde.konsole-123.scope: Failed to kill control group /user.slice/user-1000.slice/[email protected]/app.slice/app-org.kde.konsole-123.scope, ignoring: Operation not permitted
systemd[735]: Stopped Konsole - Terminal.
systemd[735]: app-org.kde.konsole-123.scope: Consumed 7min 31.417s CPU time.
systemd[735]: Removed slice User Background Tasks Slice.
systemd[735]: background.slice: Consumed 5min 46.912s CPU time.

QEMU emulator version 8.0.2
systemd 253.6
kernel 6.4.3

Any ideas on how to debug or fix this bad shutdown of qemu during systemd-shutdown?

djdomi avatar
za flag
welcome, please read [ask] as there is information missing. and please read it carefully there is also a guide attached how to ask a good question ;)
sd flag
gcb
not clear what is missing. have all relevant versions and system logs. english is my 4th language, so my questions will not have perfect prose nor i will get any stab at my writing in it, unless they are direct and to the point.
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.