Bug 2051991 - systemd complains that sendmail and sm-client PID files can't be read after service start
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: systemd
Version: CentOS Stream
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: systemd maint
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-08 13:42 UTC by Jonathan Kamens
Modified: 2023-08-08 07:28 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-08 07:28:35 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
patch (809 bytes, patch)
2022-02-08 13:42 UTC, Jonathan Kamens


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-111470 | 0 | None | None | None | 2022-02-08 13:45:26 UTC

Description Jonathan Kamens 2022-02-08 13:42:50 UTC
Created attachment 1859798: patch

Jan 17 18:25:32 jik4 systemd[1]: sm-client.service: Failed to parse PID from file /run/sm-client.pid: Invalid argument
Jan 17 20:25:03 jik4 systemd[1]: sendmail.service: Can't open PID file /run/sendmail.pid (yet?) after start: No such file or directory

The fix is to put a brief sleep in an ExecStartPost command in each service unit file, to give the daemons time to fork and create their PID files in the child.

See the attached patch, which introduces delays that I've determined empirically (by restarting sendmail and sm-client daily with these delays in place for several weeks) to be long enough for the PID files to be created.

Comment 1 Jonathan Kamens 2022-02-12 21:18:57 UTC
0.2 seconds, the delay in my diff, apparently isn't a long enough sleep on reboot. I just rebooted and got the error about sendmail.pid not existing yet. I suggest using 0.3 instead of 0.2.

Comment 2 Jonathan Kamens 2022-02-19 20:28:19 UTC
Welp, apparently 0.3 isn't long enough either, so I suppose at least 0.4 seconds is necessary. Rounding up to 1 second probably wouldn't hurt anything and would leave a comfortable margin of error.

Comment 4 Jaroslav Škarvada 2023-07-13 16:12:17 UTC
Both of these are systemd issues.

> Jan 17 18:25:32 jik4 systemd[1]: sm-client.service: Failed to parse PID from file /run/sm-client.pid: Invalid argument

This is because /run/sm-client.pid has historically contained two lines:
# cat /run/sm-client.pid
6018
/usr/sbin/sendmail -L sm-msp-queue -Ac -q1h

I.e. the PID and the command. It has behaved this way for decades, and it's unlikely sendmail upstream would accept a change because of backward compatibility, so systemd should cope with it.
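To illustrate what "coping with it" could mean, here is a minimal Python sketch of a tolerant PID-file parser that takes the PID from the first line and ignores anything after it. This is not systemd's implementation; the function name `parse_pid_file_text` is hypothetical and used only for this illustration.

```python
def parse_pid_file_text(text: str) -> int:
    """Return the PID found on the first line of PID-file content.

    Any further lines (such as the command line sendmail appends)
    are ignored. Raises ValueError if the first line is not a
    positive decimal integer.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("PID file is empty")

    first = lines[0].strip()
    # Accept only plain ASCII digits, so signs and stray text are rejected.
    if not (first.isascii() and first.isdigit()):
        raise ValueError(f"first line is not a PID: {first!r}")

    pid = int(first)
    if pid <= 0:
        raise ValueError(f"PID must be positive, got {pid}")
    return pid


# The two-line content shown in the comment above.
sample = "6018\n/usr/sbin/sendmail -L sm-msp-queue -Ac -q1h\n"
print(parse_pid_file_text(sample))  # prints 6018
```

An empty or half-written file raises ValueError here, so a caller can tell "not ready yet" apart from a valid PID.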

> Jan 17 20:25:03 jik4 systemd[1]: sendmail.service: Can't open PID file /run/sendmail.pid (yet?) after start: No such file or directory

This is because of the way the sendmail daemon is written: the parent process can exit before the child writes its PID file. Again, it has behaved this way for decades, and it's unlikely sendmail upstream would accept a rewrite of the daemon's forking routine. AFAIK systemd contains some mitigations for legacy daemons that behave this way, so I don't know why they don't work in your case. I think that for a forking service systemd could wait for the PID file, e.g. a wait limit of 1 second shouldn't cause any harm. Empirically tuned sleeps in the service file aren't the way to go. Maybe this is just a diagnostic message from systemd saying that the PID file isn't there at the time the parent process exits, and systemd will correctly read the PID once the file appears. In that case there could be a mechanism to silence such diagnostic messages on production systems.
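The difference between a bounded wait and a fixed sleep can be sketched in Python as below. This is only an illustration of the idea, not how systemd works internally; the function name `wait_for_pid_file` and the 50 ms polling interval are assumptions made for the example, and the 1 second limit is the figure suggested above.

```python
import time
from typing import Optional


def wait_for_pid_file(path: str, timeout: float = 1.0,
                      interval: float = 0.05) -> Optional[int]:
    """Poll for a PID file until it holds a valid PID or the timeout expires.

    Returns the PID from the first line of the file, or None if no valid
    PID appeared within `timeout` seconds. Returns as soon as the PID is
    readable, so the full timeout is only spent in the failure case.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path, "r", encoding="ascii", errors="replace") as f:
                first_line = f.readline().strip()
            # A missing, empty, or half-written file is treated as "not yet".
            if first_line.isdigit() and int(first_line) > 0:
                return int(first_line)
        except FileNotFoundError:
            pass

        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

Unlike a hard-coded sleep of 0.2, 0.3 or 0.4 seconds, the wait ends as soon as the file is valid, and the limit only matters when the daemon never writes its PID.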

Comment 5 RHEL Program Management 2023-08-08 07:28:35 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

