Description of problem:
Sometimes, journal has message

systemd: chronyd.service: Supervising process 1117 which is not our child. We'll most likely not notice when it exits.

in it.

Version-Release number of selected component (if applicable):
chrony-4.3-1.el9.s390x

How reproducible:
Very non-deterministic.

Steps to Reproduce:
1. Have chronyd.service enabled.
2. Boot the system.
3. Check journal for "Supervising process"

Actual results:
chronyd.service: Supervising process 1117 which is not our child. We'll most likely not notice when it exits.

Expected results:
No such message.

Additional info:
I have reason to believe that this is caused by a race condition in go_daemon (upstream currently at https://git.tuxfamily.org/chrony/chrony.git/tree/main.c#n344). If the initial process exits before the first forked process does, the second forked process, whose PID was written to the PID file, still has the first forked process as its parent. So when systemd checks the PPID of the PID found in /run/chrony/chronyd.pid, it still sees the first forked child, not itself.
The same issue on usbguard.service (bug 2042345) got fixed with https://github.com/USBGuard/usbguard/pull/554/files.
The chronyd grandparent process waits for one end of a pipe to be closed by both the middle process and the grandchild. The middle process doesn't close it explicitly; it's left up to the kernel to close it when the process terminates. It seems that the kernel closes it before the ppid of the child is changed, so the grandparent can exit while the grandchild still has the middle process as its parent. Thanks for the report.
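To illustrate the ordering described above, here is a minimal sketch of one way to avoid the race: the initial process reaps the middle process with waitpid() before doing anything else, instead of relying on the pipe being closed. This is not chrony's code and is not claimed to match the upstream fix. The function name and both pipes are hypothetical, and it assumes Linux behaviour, where children are reparented before the exiting process can be reaped.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of chrony.
   Returns 1 if the grandchild's parent is no longer the middle process once
   the initial process has reaped the middle process, 0 if it still is,
   and -1 on error. */
int grandchild_reparented_after_wait(void)
{
  int report[2], go[2];
  pid_t middle, child, ppid_seen = 0;
  char c = 0;

  if (pipe(report) < 0 || pipe(go) < 0)
    return -1;

  middle = fork();
  if (middle < 0)
    return -1;

  if (middle == 0) {
    /* Middle process: start a new session, fork again, then exit */
    setsid();
    child = fork();
    if (child != 0)
      _exit(child < 0 ? 1 : 0);

    /* Grandchild (the would-be daemon) */
    close(report[0]);
    close(go[1]);

    /* Block until the initial process has reaped the middle process */
    if (read(go[0], &c, 1) != 1)
      _exit(1);

    /* Sample the parent PID, as a service manager would see it now */
    ppid_seen = getppid();
    if (write(report[1], &ppid_seen, sizeof ppid_seen) != sizeof ppid_seen)
      _exit(1);
    _exit(0);
  }

  /* Initial process */
  close(report[1]);
  close(go[0]);

  /* Wait for the middle process to terminate ... */
  if (waitpid(middle, NULL, 0) != middle)
    return -1;

  /* ... and only then let the grandchild look at its parent */
  if (write(go[1], &c, 1) != 1)
    return -1;
  if (read(report[0], &ppid_seen, sizeof ppid_seen) != sizeof ppid_seen)
    return -1;

  close(go[1]);
  close(report[0]);

  return ppid_seen != middle;
}
```

In a real daemon the initial process would exit right after the waitpid() and the readiness check, so by the time systemd reads the PID file the daemon is no longer a child of the middle process. The second pipe exists only so the sketch can observe the parent PID at a well-defined point.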
Upstream fix: https://gitlab.com/chrony/chrony/-/commit/0db30fd0b169b01890c428a3cfba611a222e3509
This issue will be fixed by rebase to chrony-4.4 (bug #2231078).