In my system, multiple logical volumes are created, mounted and /etc/fstab
entries are added after systemd starts up.
When system is rebooted for the first time, during shutdown many umount failures are observed.
Interestingly, in subsequent reboots, this problem is not observed.
The difference in behavior between first and subsequent reboots might be caused by the fact that during first reboot alone, the *.mount
fragments are not generated by systemd-fstab-generator
. I am particularly interested in fixing the issue for /var/log
as I'm facing data loss here.
Following the advice from here, I added override conf for systemd-journal-flush.service
as:
[Unit]
PartOf=var.mount var-log.mount
[Service]
ExecStop=/usr/local/bin/debugger.sh
Also, a debugging script was added in ExecStop to check for any open fds.
#!/bin/sh
#
date > /root/out
echo "=============" >> /root/out
fuser -m /var/log >> /root/out
echo "=============" >> /root/out
lsof | grep /var/log >> /root/out
echo "===============" >> /root/out
ps aux >> /root/out
With this change as well, umount fails in first reboot and works in subsequent reboots.
[FAILED] Failed unmounting /var/crash/kernel.
[FAILED] Failed unmounting /var/lib/systemd/coredump.
[FAILED] Failed unmounting /var/log/audit.
[FAILED] Failed unmounting /var/sonic.
[FAILED] Failed unmounting /var/tmp.
[ OK ] Stopped target Swap.
Unmounting /mnt/sftp...
Unmounting /var/crash...
[FAILED] Failed unmounting /var/crash.
[ OK ] Stopped Flush Journal to Persistent Storage.
Unmounting /var/log...
[FAILED] Failed unmounting /var/log.
From the debugging script output, both lsof
and fuser
outputs are empty.
I don't find anything suspicious in ps
output as well. (There is one umount
in D
state. Not sure if it is something that needs to be checked.)
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 27.4 0.2 188048 37548 ? Ss 15:43 5:26 /usr/lib/systemd/systemd --show-status=1 --log-target=journal-or-kmsg --log-level=notice
root 2 0.0 0.0 0 0 ? S 15:43 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? I< 15:43 0:00 [rcu_gp]
root 4 0.0 0.0 0 0 ? I< 15:43 0:00 [rcu_par_gp]
root 5 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/0:0-cgroup_destroy]
root 6 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/0:0H-kblockd]
root 8 0.0 0.0 0 0 ? I< 15:43 0:00 [mm_percpu_wq]
root 9 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/0]
root 10 0.2 0.0 0 0 ? I 15:43 0:02 [rcu_sched]
root 11 0.0 0.0 0 0 ? I 15:43 0:00 [rcu_bh]
root 12 0.0 0.0 0 0 ? S 15:43 0:00 [migration/0]
root 13 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/0]
root 14 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/1]
root 15 0.0 0.0 0 0 ? S 15:43 0:00 [migration/1]
root 16 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/1]
root 18 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/1:0H-kblockd]
root 19 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/2]
root 20 0.0 0.0 0 0 ? S 15:43 0:00 [migration/2]
root 21 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/2]
root 23 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/2:0H-kblockd]
root 24 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/3]
root 25 0.0 0.0 0 0 ? S 15:43 0:00 [migration/3]
root 26 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/3]
root 28 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/3:0H-kblockd]
root 29 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/4]
root 30 0.0 0.0 0 0 ? S 15:43 0:00 [migration/4]
root 31 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/4]
root 33 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/4:0H-kblockd]
root 34 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/5]
root 35 0.0 0.0 0 0 ? S 15:43 0:00 [migration/5]
root 36 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/5]
root 38 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/5:0H-kblockd]
root 39 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/6]
root 40 0.0 0.0 0 0 ? S 15:43 0:00 [migration/6]
root 41 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/6]
root 43 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/6:0H-kblockd]
root 44 0.0 0.0 0 0 ? S 15:43 0:00 [cpuhp/7]
root 45 0.0 0.0 0 0 ? S 15:43 0:00 [migration/7]
root 46 0.0 0.0 0 0 ? S 15:43 0:00 [ksoftirqd/7]
root 48 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/7:0H-kblockd]
root 49 0.0 0.0 0 0 ? S 15:43 0:00 [kdevtmpfs]
root 50 0.0 0.0 0 0 ? I< 15:43 0:00 [netns]
root 51 0.0 0.0 0 0 ? S 15:43 0:00 [kauditd]
root 53 0.0 0.0 0 0 ? S 15:43 0:00 [khungtaskd]
root 54 0.0 0.0 0 0 ? S 15:43 0:00 [oom_reaper]
root 55 0.0 0.0 0 0 ? I< 15:43 0:00 [writeback]
root 56 0.0 0.0 0 0 ? S 15:43 0:00 [kcompactd0]
root 57 0.0 0.0 0 0 ? SN 15:43 0:00 [ksmd]
root 58 0.0 0.0 0 0 ? I< 15:43 0:00 [crypto]
root 59 0.0 0.0 0 0 ? I< 15:43 0:00 [kblockd]
root 60 0.0 0.0 0 0 ? I< 15:43 0:00 [ata_sff]
root 61 0.0 0.0 0 0 ? I< 15:43 0:00 [md]
root 62 0.0 0.0 0 0 ? I< 15:43 0:00 [devfreq_wq]
root 63 0.0 0.0 0 0 ? S 15:43 0:00 [watchdogd]
root 65 0.0 0.0 0 0 ? S 15:43 0:00 [kswapd0]
root 73 0.0 0.0 0 0 ? I< 15:43 0:00 [kthrotld]
root 74 0.0 0.0 0 0 ? S 15:43 0:00 [irq/30-pciehp]
root 76 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/1:1-memcg_kmem_cache]
root 77 0.0 0.0 0 0 ? I 15:43 0:01 [kworker/3:1-events]
root 78 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/4:1-cgroup_destroy]
root 79 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/5:1-events]
root 80 0.0 0.0 0 0 ? I 15:43 0:01 [kworker/6:1-memcg_kmem_cache]
root 82 0.0 0.0 0 0 ? I< 15:43 0:00 [acpi_thermal_pm]
root 84 0.0 0.0 0 0 ? I< 15:43 0:00 [tpm-vtpm]
root 85 0.0 0.0 0 0 ? S 15:43 0:00 [scsi_eh_0]
root 86 0.0 0.0 0 0 ? I< 15:43 0:00 [scsi_tmf_0]
root 87 0.0 0.0 0 0 ? S 15:43 0:00 [scsi_eh_1]
root 88 0.0 0.0 0 0 ? I< 15:43 0:00 [scsi_tmf_1]
root 89 0.0 0.0 0 0 ? S 15:43 0:00 [scsi_eh_2]
root 90 0.0 0.0 0 0 ? I< 15:43 0:00 [scsi_tmf_2]
root 91 0.0 0.0 0 0 ? S 15:43 0:00 [scsi_eh_3]
root 92 0.0 0.0 0 0 ? I< 15:43 0:00 [scsi_tmf_3]
root 94 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/u16:3-flush-252:18]
root 97 0.0 0.0 0 0 ? I< 15:43 0:00 [ipv6_addrconf]
root 99 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/1:1H-kblockd]
root 118 0.0 0.0 0 0 ? I< 15:43 0:00 [ixgbe]
root 126 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/6:2-events]
root 133 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/2:1H-kblockd]
root 135 0.0 0.0 0 0 ? I< 15:43 0:00 [kdmflush]
root 138 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/4:1H-kblockd]
root 139 0.0 0.0 0 0 ? S 15:43 0:00 [jbd2/dm-0-8]
root 140 0.0 0.0 0 0 ? I< 15:43 0:00 [ext4-rsv-conver]
root 142 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/0:1H-kblockd]
root 145 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/6:1H-kblockd]
root 147 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/7:1H-events_highpri]
root 158 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/3:1H-kblockd]
root 166 0.2 0.3 80384 57352 ? Ss 15:43 0:02 /usr/lib/systemd/systemd-journald
root 171 0.0 0.0 0 0 ? I< 15:43 0:00 [kworker/5:1H-kblockd]
root 172 0.0 0.0 0 0 ? I< 15:43 0:00 [ipmi-msghandler]
root 173 0.0 0.0 0 0 ? I 15:43 0:00 [kworker/2:2-events_power_efficient]
root 183 0.2 0.0 7164 5260 ? Ss 15:43 0:02 /usr/lib/systemd/systemd-udevd
root 227 0.0 0.0 0 0 ? S 15:43 0:00 [scsi_eh_4]
root 228 0.0 0.0 0 0 ? I< 15:43 0:00 [scsi_tmf_4]
root 231 0.0 0.0 0 0 ? S 15:43 0:00 [usb-storage]
root 237 0.0 0.0 0 0 ? I< 15:43 0:00 [ttm_swap]
root 272 0.0 0.0 0 0 ? SN 15:43 0:00 [kipmi0]
root 68631 0.0 0.0 0 0 ? I 15:48 0:00 [kworker/3:0-cgroup_destroy]
root 68632 0.0 0.0 0 0 ? I 15:48 0:00 [kworker/1:2-mm_percpu_wq]
root 95828 0.1 0.0 0 0 ? I 15:51 0:00 [kworker/2:0-cgroup_destroy]
root 98121 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98348 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98522 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98540 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98644 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98816 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98879 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 98912 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99063 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99098 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99132 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99450 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99503 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99536 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 99856 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100023 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100049 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100153 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100189 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100244 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100279 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 100316 0.0 0.0 0 0 ? I< 15:51 0:00 [kdmflush]
root 101223 0.2 0.0 0 0 ? I 15:51 0:01 [kworker/u16:1-events_unbound]
root 101857 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-2-8]
root 101858 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101864 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-3-8]
root 101865 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101872 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-4-8]
root 101873 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101946 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-14-8]
root 101947 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101953 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-15-8]
root 101954 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101960 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-16-8]
root 101961 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101981 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-20-8]
root 101982 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101988 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/dm-22-8]
root 101989 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 101995 0.0 0.0 0 0 ? S 15:51 0:00 [jbd2/sda3-8]
root 101996 0.0 0.0 0 0 ? I< 15:51 0:00 [ext4-rsv-conver]
root 102071 0.1 0.0 0 0 ? I 15:51 0:00 [kworker/7:0-events]
root 102103 0.0 0.0 0 0 ? I 15:51 0:00 [kworker/5:0-rcu_gp]
root 102283 0.1 0.0 0 0 ? I 15:51 0:00 [kworker/4:0-memcg_kmem_cache]
root 102829 0.0 0.0 0 0 ? I 15:52 0:00 [kworker/7:3-cgroup_pidlist_destroy]
root 102941 0.0 0.0 0 0 ? I 15:52 0:00 [kworker/0:1-rcu_gp]
root 116853 0.0 0.0 0 0 ? I 15:59 0:00 [kworker/u16:0-events_unbound]
root 116965 0.0 0.0 0 0 ? I 15:59 0:00 [kworker/1:0-mm_percpu_wq]
root 118574 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/u16:2-flush-252:22]
root 118585 0.0 0.0 7164 3708 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118586 0.0 0.0 7164 3644 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118587 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118588 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118589 0.5 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118590 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118591 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118592 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118593 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118594 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118595 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118596 0.0 0.0 7164 3620 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118597 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118598 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118599 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118600 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118601 1.6 0.0 7164 3780 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118603 0.0 0.0 7164 3640 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118604 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118605 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118606 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118607 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118608 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118609 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118610 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118611 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118612 0.0 0.0 7164 3528 ? S 16:02 0:00 /usr/lib/systemd/systemd-udevd
root 118640 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/3:2-cgroup_destroy]
root 118731 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/3:3-events]
root 118745 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/4:2-cgroup_destroy]
root 118823 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/7:1-cgroup_destroy]
root 118898 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/0:2]
root 118899 0.2 0.0 0 0 ? I 16:02 0:00 [kworker/0:3-events]
root 119014 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/6:0-mm_percpu_wq]
root 119052 0.0 0.0 0 0 ? I 16:02 0:00 [kworker/2:1-events]
root 119111 0.0 0.0 6136 932 ? R 16:02 0:00 /bin/umount /var/lib/docker -c
root 119117 0.0 0.0 6808 2936 ? Ss 16:02 0:00 /bin/sh /usr/local/bin/debugger.sh
root 119120 0.0 0.0 6136 876 ? D 16:02 0:00 /bin/umount /mnt/sftp -c
root 119124 0.0 0.0 6132 2036 ? R 16:02 0:00 ps aux
So my questions are:
- Am i missing any graceful handling that is needed when
/etc/fstab
entries are added at runtime. Any specific way to inform systemd to create *.mount
and systemd-fsck@*
fragments?
- If no processes are using the mount points, what else can cause the umount failure? How to debug this further?
- Any options to solve data loss? Adding another service to do lazy/force umount could help?