I have a trivial systemd service (called "testpath") which simply prints some output to be captured by the journal. The journal is intermittently and unpredictably losing part of the output. The behaviour seems so erratic, and the test case so trivial, that it's hard to see how it could be any kind of intended behaviour.
Can anyone offer a explanation in terms of expected systemd behaviour? Failing that, can anyone suggest a strategy to try to isolate this as a bug?
The service simply runs this script:
% cat ~/tmp/testpath.bin
#!/bin/bash
sleep 0.1
ps -p $$
echo Echo statement.
The expected behaviour is that both the ps
and echo
outputs should appear in the journal at each system invocation. But instead the the ps
output appears maybe half the time, and the echo
output maybe 90% of the time. For example, manually running systemctl --user start testpath
produced this journal:
% journalctl --user -u testpath.service --follow
<some lines elided>
Oct 30 00:47:36 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:47:36 skierfe testpath.bin[862530]: PID TTY TIME CMD
Oct 30 00:47:36 skierfe testpath.bin[862530]: 862527 ? 00:00:00 testpath.bin
Oct 30 00:47:36 skierfe testpath.bin[862527]: Echo statement.
Oct 30 00:53:17 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:17 skierfe testpath.bin[863505]: Echo statement.
Oct 30 00:53:20 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:21 skierfe testpath.bin[863545]: Echo statement.
Oct 30 00:53:23 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:23 skierfe testpath.bin[863553]: Echo statement.
Oct 30 00:53:26 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:26 skierfe testpath.bin[863558]: Echo statement.
Oct 30 00:53:28 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:33 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:33 skierfe testpath.bin[863577]: PID TTY TIME CMD
Oct 30 00:53:33 skierfe testpath.bin[863577]: 863574 ? 00:00:00 testpath.bin
Oct 30 00:53:33 skierfe testpath.bin[863574]: Echo statement.
Oct 30 00:53:37 skierfe systemd[1518]: Started Testpath service.
Oct 30 00:53:37 skierfe testpath.bin[863579]: Echo statement.
The unit file:
skierfe:(f)..config/systemd/user % cat ~/.config/systemd/user/testpath.service
[Unit]
Description = Testpath service
[Service]
ExecStart = %h/tmp/testpath.bin
Things I've considered or tried (none of which have helped):
- System resource contention: no, new Core i5 system with low loadavg
- Rate-limit parameters per #774809: no,
RateLimitInterval
and RateLimitBurst
are commented out in /etc/systemd/journald.conf, and the logging rate is trivial anyway
- Extending or removing the
sleep
command (timing issues?)
- Changing the service to
Type=oneshot
- Setting explicit
StandardOutput=journal
per #1089887
None of these have changed the behaviour.
I'm on x86_64 Ubuntu 22.04 with current package updates.
skierfe:(f)~/tmp % systemd --version
systemd 249 (249.11-0ubuntu3.6)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified