Bug 1422771 - sendmail.service tweak
Summary: sendmail.service tweak
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: sendmail
Version: 25
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jaroslav Škarvada
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-02-16 07:41 UTC by Fabrice Bellet
Modified: 2017-03-08 11:18 UTC

Fixed In Version: sendmail-8.15.2-8.fc25
Clone Of:
Environment:
Last Closed: 2017-03-03 03:54:33 UTC
Type: Bug
Embargoed:


Description Fabrice Bellet 2017-02-16 07:41:15 UTC
The sendmail and sm-client systemd services could perhaps be improved by adding a "StartLimitInterval=0" directive. The reason for this change is that the NetworkManager dispatcher script may restart the service as often as the user requests a network change, and this can easily hit systemd's default start rate limit (5 starts per 10 seconds). When that happens, the service is silently _not_ restarted:

Feb 15 18:00:43 bonobo.bellet.info systemd[1]: sendmail.service: Failed with result 'start-limit-hit'.

As a side comment, I'm also wondering whether the nm-dispatcher script for sendmail could restart sendmail only when the hostname is modified. I think that is what matters for sendmail (knowing its own hostname, to decide whether delivery should be local or relayed).

Comment 1 Jaroslav Škarvada 2017-02-16 10:22:26 UTC
(In reply to Fabrice Bellet from comment #0)
> The systemd  and sm-client services maybe could be improved by adding a
> "StartLimitInterval=0" instruction. The reason of this modification is that
> the network-manager dispatcher script may restart the service at the same
> rate the user requests a network change, and this may easily hit the default
> start-hit-rate limit of systemd (5 bursts starts per 10 secs). When it
> happens, the service is silently _not_ restarted:
> 
> Feb 15 18:00:43 bonobo.bellet.info systemd[1]: sendmail.service: Failed with
> result 'start-limit-hit'.
>
Thanks for info. But didn't you mean "StartLimitIntervalSec"?

> As a side comment, I'm also wondering if the nm-dispatcher script for
> sendmail could not only restart sendmail when the hostname is modified ?
> This is what is important for sendmail I think (knowing its own hostname, to
> decide when delivery should be local or relayed).

The last time I looked into this problem, I wasn't able to find anything related in NM. Maybe it's worth an NM RFE?

Comment 2 Fabrice Bellet 2017-02-16 15:35:42 UTC
(In reply to Jaroslav Škarvada from comment #1)
> Thanks for info. But didn't you mean "StartLimitIntervalSec"?

Ah, that's interesting, because I didn't notice the option name mismatch. I copied this option from other service files (lvm2-pvscan@.service and kdump.service on my Fedora 25 box).

StartLimitIntervalSec=0 doesn't do the job. StartLimitBurst=0 seems to be the required option instead, if we want to stay with "documented" options :)

> 
> > As a side comment, I'm also wondering if the nm-dispatcher script for
> > sendmail could not only restart sendmail when the hostname is modified ?
> > This is what is important for sendmail I think (knowing its own hostname, to
> > decide when delivery should be local or relayed).
> 
> Last time when I looked on this problem, I wasn't able to find anything
> related in NM. Maybe worth NM RFE?

As far as I know of current NetworkManager capabilities, dispatcher scripts can react to a hostname change initiated by NetworkManager itself (arg $2 == hostname). This event can be triggered automatically when receiving the IP address from a DHCP server (from address lookup), or manually when the user changes the hostname with "nmcli general hostname newhostname". This second possibility _also_ modifies the static hostname in /etc/hostname.

There may also be subtleties, depending on whether /etc/hostname is empty, or whether hostname==localhost.localdomain at NetworkManager startup, but I'm not 100% sure about that.
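
As an illustration of the dispatcher behaviour described above, here is a minimal sketch of a script that reacts only to the "hostname" action. It assumes, as stated above, that the action arrives in $2. The function name should_restart is hypothetical, and the packaged sendmail dispatcher script may look quite different. Whether reacting to the hostname action alone is sufficient remains an open question.

```shell
#!/bin/sh
# Sketch only: decide whether a NetworkManager dispatcher event should
# trigger a sendmail restart. The names here are illustrative and are not
# taken from the packaged dispatcher script.

# should_restart IFACE ACTION
# Succeeds only when the action (second argument) is "hostname".
should_restart() {
    action=${2:-}
    case "$action" in
        hostname) return 0 ;;
        *)        return 1 ;;
    esac
}

if should_restart "$@"; then
    # On a real system this branch would request the restart, for example:
    #   systemctl try-restart sendmail.service
    echo "hostname action received: sendmail restart would be requested here"
fi
```

Run with arguments such as "eth0 up", the sketch prints nothing; only an invocation whose second argument is "hostname" reaches the restart branch.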

Comment 3 Jaroslav Škarvada 2017-02-16 15:53:50 UTC
(In reply to Fabrice Bellet from comment #2)
> (In reply to Jaroslav Škarvada from comment #1)
> > Thanks for info. But didn't you mean "StartLimitIntervalSec"?
> 
> Ah. that's interesting, because I didn't notice the option name mismatch. I
> copied this option from other service files (lvm2-pvscan@.service and
> kdump.service on my fedora 25 box).
> 
> StartLimitIntervalSec=0 doesn't do the job. StartLimitBurst=0 seems the
> required option instead, if we want to stay with "documented" options :)
>
Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it out with the systemd guys, because it doesn't seem to be documented.

> > 
> > > As a side comment, I'm also wondering if the nm-dispatcher script for
> > > sendmail could not only restart sendmail when the hostname is modified ?
> > > This is what is important for sendmail I think (knowing its own hostname, to
> > > decide when delivery should be local or relayed).
> > 
> > Last time when I looked on this problem, I wasn't able to find anything
> > related in NM. Maybe worth NM RFE?
> 
> For what I know of current networkmanager capabilities, dispatcher scripts
> can react on hostname change initiated by networkmanager itself (arg $2 ==
> hostname). This event can be triggered automatically when receiving the IP
> address from a DHCP server (from address lookup), or manually when the user
> changes the hostname with "nmcli general hostname newhostname". This second
> possibility _also_ modifies the static hostname in /etc/hostname.
> 
> There may also be subtleties, depending on whether /etc/hostname is empty,
> or whether hostname==localhost.localdomain at NetworkManager startup, but
> I'm not 100% sure about that.

Thanks for the info, but I am currently not 100% sure whether that's enough, because IIRC sendmail does a reverse DNS check, which could reveal an inconsistency if the IP changes.

Comment 4 Fabrice Bellet 2017-02-16 19:29:20 UTC
(In reply to Jaroslav Škarvada from comment #3)
> Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it
> out with the systemd guys, because it doesn't seem to be documented.

Yes, "StartLimitInterval=0" has the desired behaviour for me (both in sendmail.service and sm-client.service)

Comment 5 Jaroslav Škarvada 2017-02-17 09:59:23 UTC
(In reply to Fabrice Bellet from comment #4)
> (In reply to Jaroslav Škarvada from comment #3)
> > Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it
> > out with the systemd guys, because it doesn't seem to be documented.
> 
> Yes, "StartLimitInterval=0" has the desired behaviour for me (both in
> sendmail.service and sm-client.service)

Interesting, it seems StartLimitInterval is just an obsolete alias for StartLimitIntervalSec (in f25, i.e. systemd-231-13.fc25). Adding Michal to sort it out.

Comment 6 Michal Sekletar 2017-02-20 09:32:53 UTC
(In reply to Jaroslav Škarvada from comment #5)
> (In reply to Fabrice Bellet from comment #4)
> > (In reply to Jaroslav Škarvada from comment #3)
> > > Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it
> > > out with the systemd guys, because it doesn't seem to be documented.
> > 
> > Yes, "StartLimitInterval=0" has the desired behaviour for me (both in
> > sendmail.service and sm-client.service)
> 
> Interesting, it seems StartLimitInterval is just obsoleted alias for
> StartLimitIntervalSec (in f25, i.e. systemd-231-13.fc25). Adding Michal to
> sort it out.

StartLimitInterval is the proper name of the option, while StartLimitIntervalSec is the name of the DBus property. In configuration files, please use the former.

For the complete list of options (with pointers to additional docs), see "man 7 systemd.directives".

Comment 7 Jaroslav Škarvada 2017-02-20 21:32:41 UTC
(In reply to Michal Sekletar from comment #6)
> (In reply to Jaroslav Škarvada from comment #5)
> > (In reply to Fabrice Bellet from comment #4)
> > > (In reply to Jaroslav Škarvada from comment #3)
> > > > Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it
> > > > out with the systemd guys, because it doesn't seem to be documented.
> > > 
> > > Yes, "StartLimitInterval=0" has the desired behaviour for me (both in
> > > sendmail.service and sm-client.service)
> > 
> > Interesting, it seems StartLimitInterval is just obsoleted alias for
> > StartLimitIntervalSec (in f25, i.e. systemd-231-13.fc25). Adding Michal to
> > sort it out.
> 
> StartLimitInterval is the proper name of the option while
> StartLimitIntervalSec is name of the DBus property. In configuration files
> please use the former.
> 
> For complete list of options (with pointers to additional docs) see "man 7
> systemd.directives"

Sorry, I cannot find it in "man 7 systemd.directives". Is it a systemd bug or an undocumented option?

$ man 7 systemd.directives | grep StartLimitInterval
       StartLimitIntervalSec=
       DefaultStartLimitIntervalSec=

$ man systemd.unit | grep StartLimitInterval
       StartLimitIntervalSec=, StartLimitBurst=
           modified. Use StartLimitIntervalSec= to configure the checking interval (defaults to DefaultStartLimitIntervalSec= in
           Configure the action to take if the rate limit configured with StartLimitIntervalSec= and StartLimitBurst= is hit. Takes

Comment 8 Michal Sekletar 2017-02-21 09:08:35 UTC
(In reply to Jaroslav Škarvada from comment #7)
> (In reply to Michal Sekletar from comment #6)
> > (In reply to Jaroslav Škarvada from comment #5)
> > > (In reply to Fabrice Bellet from comment #4)
> > > > (In reply to Jaroslav Škarvada from comment #3)
> > > > > Has the "StartLimitInterval=0" worked for you? If yes I am going to sort it
> > > > > out with the systemd guys, because it doesn't seem to be documented.
> > > > 
> > > > Yes, "StartLimitInterval=0" has the desired behaviour for me (both in
> > > > sendmail.service and sm-client.service)
> > > 
> > > Interesting, it seems StartLimitInterval is just obsoleted alias for
> > > StartLimitIntervalSec (in f25, i.e. systemd-231-13.fc25). Adding Michal to
> > > sort it out.
> > 
> > StartLimitInterval is the proper name of the option while
> > StartLimitIntervalSec is name of the DBus property. In configuration files
> > please use the former.
> > 
> > For complete list of options (with pointers to additional docs) see "man 7
> > systemd.directives"
> 
> Sorry I cannot find it in the "man 7 systemd.directives", is it systemd bug
> / non documented option?:
> 
> $ man 7 systemd.directives | grep StartLimitInterval
>        StartLimitIntervalSec=
>        DefaultStartLimitIntervalSec=
> 
> $ man systemd.unit | grep StartLimitInterval
>        StartLimitIntervalSec=, StartLimitBurst=
>            modified. Use StartLimitIntervalSec= to configure the checking
> interval (defaults to DefaultStartLimitIntervalSec= in
>            Configure the action to take if the rate limit configured with
> StartLimitIntervalSec= and StartLimitBurst= is hit. Takes

You are right. Sorry for the confusion. My systemd version still has StartLimitInterval, though.

Here is the upstream commit that renamed StartLimitInterval to StartLimitIntervalSec:

https://github.com/systemd/systemd/commit/f0367da7d1a61ad698a55d17b5c28ddce0dc265a

Note that the old name still works. Hence I'd stick to the old name if you don't want to have different unit file versions for different Fedora releases. But it is your call.

Comment 9 Jaroslav Škarvada 2017-02-21 10:16:29 UTC
Thanks for the info. I can see clearly in the code that it's just a rename, so I don't understand why StartLimitInterval=0 worked for the reporter but StartLimitIntervalSec=0 didn't.

Michal, what's the correct way to disable the rate limiting in the unit file (i.e. to allow an unlimited number of service restarts even with multiple failures)? I am unable to find it in the documentation. According to the reporter in comment 2, it seems that StartLimitInterval=0 works, StartLimitIntervalSec=0 doesn't work, and StartLimitBurst=0 works, which seems quite strange to me (I am going to retest).

Comment 10 Michal Sekletar 2017-02-21 10:56:02 UTC
(In reply to Jaroslav Škarvada from comment #9)
> Thanks for info. I can see clearly in the code it's just rename, so I don't
> understand why StartLimitInterval=0 worked for reporter, but
> StartLimitIntervalSec=0 didn't.
> 
> Michal what's the correct way to disable the rate limiting in the unit file
> (i.e. to allow unlimited number of service restarts even with multiple
> failures)? I am unable to find it in the documentation. According to the
> reporter in the comment 2, it seems that StartLimitInterval=0 works,
> StartLimitIntervalSec=0 doesn't work, and StartLimitBurst=0 works, which seems
> quite strange for me (I am going to retest).

The safest bet is to set both to zero. However, according to our docs, setting just the interval should be sufficient:

"Use StartLimitInterval= to configure the checking interval (defaults to DefaultStartLimitInterval= in manager configuration file, set to 0 to disable any kind of rate limiting)."

Comment 11 Jaroslav Škarvada 2017-02-21 16:24:46 UTC
As I expected, and as the following simple experiment shows, StartLimitIntervalSec works the same as StartLimitInterval. I have the default systemd settings, which means "by default, units which are started more than 5 times within 10 seconds are not permitted to start any more times until the 10 second interval ends". This means six quick restarts should trigger the limit. Reproducer:

Default sendmail.service:
# systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail
Job for sendmail.service failed.
See "systemctl status sendmail.service" and "journalctl -xe" for details.
# echo $?
1

sendmail.service with StartLimitIntervalSec=0:
# systemctl daemon-reload
# systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail
# echo $?
0

sendmail.service with StartLimitInterval=0
# systemctl daemon-reload
# systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail
# echo $?
0

sendmail.service with StartLimitBurst=0
# systemctl daemon-reload
# systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail; systemctl restart sendmail
# echo $?
0

I.e. all three variants work. I will probably go with StartLimitIntervalSec instead of StartLimitInterval, which is marked as obsolete, and I am not going to backport this feature to older Fedoras.
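
To make the arithmetic behind these results concrete, here is a simplified model of a "burst starts per interval" limiter. It is a sketch under assumptions, not systemd's implementation: the function names start_allowed and count_allowed and the state variables are hypothetical, time is passed in as a plain number of seconds, and a zero interval or a zero burst is assumed to switch the limiter off, which is what the experiment above suggests.

```shell
#!/bin/sh
# Simplified model of a start rate limiter; NOT systemd's actual code.
# State: rl_begin = start of the current window (seconds, -1 if none),
#        rl_num   = number of starts counted in the current window.
rl_begin=-1
rl_num=0

# start_allowed INTERVAL BURST NOW
# Exit status 0 if a start at time NOW is allowed, 1 if the limit is hit.
start_allowed() {
    interval=$1
    burst=$2
    now=$3
    # Assumption: a zero interval or a zero burst disables the limiter.
    if [ "$interval" -le 0 ] || [ "$burst" -le 0 ]; then
        return 0
    fi
    # Open a new window if none is open or the current one has expired.
    if [ "$rl_begin" -lt 0 ] || [ "$now" -gt $((rl_begin + interval)) ]; then
        rl_begin=$now
        rl_num=0
    fi
    # Allow the start only while fewer than BURST starts were counted.
    if [ "$rl_num" -lt "$burst" ]; then
        rl_num=$((rl_num + 1))
        return 0
    fi
    return 1
}

# count_allowed INTERVAL BURST N
# Simulate N starts at the same instant and print how many were allowed.
count_allowed() {
    rl_begin=-1
    rl_num=0
    allowed=0
    i=0
    while [ "$i" -lt "$3" ]; do
        if start_allowed "$1" "$2" 0; then
            allowed=$((allowed + 1))
        fi
        i=$((i + 1))
    done
    echo "$allowed"
}

count_allowed 10 5 6   # default limits: prints 5, the sixth start is refused
count_allowed 0 5 6    # interval set to 0: prints 6
count_allowed 10 0 6   # burst set to 0: prints 6
```

In this model the sixth quick restart is the first one refused under the default limits, while either zero setting lets all six through, which is consistent with the results above.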

Comment 12 Fabrice Bellet 2017-02-21 17:13:44 UTC
That's odd, because I have a different behaviour when adding StartLimitInterval=0 or StartLimitIntervalSec=0. 

[root@bonobo ~]# rpm -q systemd
systemd-231-12.fc25.x86_64
[root@bonobo ~]# rpm -V systemd
S.5....T.  c /etc/systemd/coredump.conf
S.5....T.  c /etc/systemd/logind.conf
.......T.  c /etc/systemd/system.conf
[root@bonobo ~]# systemctl is-enabled sendmail
enabled


I modified /usr/lib/systemd/system/sendmail.service, adding StartLimitInterval=0 or StartLimitIntervalSec=0 at the end of section [Service]. Could the section be relevant in that case ?

Comment 13 Jaroslav Škarvada 2017-02-21 17:35:35 UTC
(In reply to Fabrice Bellet from comment #12)
> That's odd, because I have a different behaviour when adding
> StartLimitInterval=0 or StartLimitIntervalSec=0. 
> 
> [root@bonobo ~]# rpm -q systemd
> systemd-231-12.fc25.x86_64
> [root@bonobo ~]# rpm -V systemd
> S.5....T.  c /etc/systemd/coredump.conf
> S.5....T.  c /etc/systemd/logind.conf
> .......T.  c /etc/systemd/system.conf
> [root@bonobo ~]# systemctl is-enabled sendmail
> enabled
> 
> 
> I modified /usr/lib/systemd/system/sendmail.service, adding
> StartLimitInterval=0 or StartLimitIntervalSec=0 at the end of section
> [Service]. Could the section be relevant in that case ?

AFAIK (and according to the docs) it should be in the [Unit] section, and my tests in comment 11 had it there, i.e.:

# cat /usr/lib/systemd/system/sendmail.service
[Unit]
Description=Sendmail Mail Transport Agent
After=syslog.target network.target
Conflicts=postfix.service exim.service
Wants=sm-client.service
StartLimitIntervalSec=0

[Service]
Type=forking
PIDFile=/run/sendmail.pid
Environment=SENDMAIL_OPTS=-q1h
EnvironmentFile=-/etc/sysconfig/sendmail
ExecStartPre=-/etc/mail/make
ExecStartPre=-/etc/mail/make aliases
ExecStart=/usr/sbin/sendmail -bd $SENDMAIL_OPTS $SENDMAIL_OPTARG

[Install]
WantedBy=multi-user.target
Also=sm-client.service
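
Since the section appears to matter here, a quick way to check which section a directive sits in is sketched below. The helper name section_of and the sample file are hypothetical; the sample is a trimmed copy of the unit file shown above, written to a temporary file so that nothing under /usr/lib is touched.

```shell
#!/bin/sh
# Sketch only: report the section header under which a directive appears.

# section_of KEY FILE
# Prints the most recent "[Section]" header preceding each "KEY=" line.
section_of() {
    awk -v key="$1" '
        /^\[.*\]$/               { section = $0; next }
        index($0, key "=") == 1  { print section }
    ' "$2"
}

# Trimmed sample unit file, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[Unit]
Description=Sendmail Mail Transport Agent
StartLimitIntervalSec=0

[Service]
Type=forking
EOF

section_of StartLimitIntervalSec "$sample"   # prints [Unit]
section_of Type "$sample"                    # prints [Service]
rm -f "$sample"
```

Pointing section_of at the real /usr/lib/systemd/system/sendmail.service would show whether the directive ended up in [Unit] or in [Service].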

Comment 14 Jaroslav Škarvada 2017-02-21 17:38:17 UTC
Maybe StartLimitInterval somehow bubbles up from the [Service] section, or maybe it's allowed there for backward compatibility. I didn't check this in the systemd code.

Comment 15 Jaroslav Škarvada 2017-02-21 17:41:57 UTC
Yup, it seems so in the systemd code (load-fragment-gperf.gperf.m4):

m4_dnl The following three only exist for compatibility, they moved into Unit, see above
Service.StartLimitInterval,      config_parse_sec,                   0,                             offsetof(Unit, start_limit.interval)
Service.StartLimitBurst,         config_parse_unsigned,              0,                             offsetof(Unit, start_limit.burst)
Service.StartLimitAction,        config_parse_failure_action,        0,                             offsetof(Unit, start_limit_action)

And there is no corresponding Service.StartLimitIntervalSec entry.

Comment 16 Jaroslav Škarvada 2017-02-21 17:43:46 UTC
Michal, I expected an answer similar to comment 11 and comment 15, just nitpicking ;)

Comment 17 Jaroslav Škarvada 2017-02-21 17:44:49 UTC
Fabrice, could you confirm that StartLimitIntervalSec worked for you in the [Unit] section?

Comment 18 Fabrice Bellet 2017-02-21 19:27:35 UTC
(In reply to Jaroslav Škarvada from comment #17)
> Fabrice could you confirm StartLimitIntervalSec worked for you in the [unit]
> section?

Yes, that's right: StartLimitIntervalSec=0 works too, when inserted in the [Unit] section.

Comment 19 Jaroslav Škarvada 2017-02-21 23:30:28 UTC
Thanks, I am happy we sorted it out.

Comment 20 Fedora Update System 2017-02-21 23:57:09 UTC
sendmail-8.15.2-8.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-b64ee22a45

Comment 21 Fabrice Bellet 2017-02-22 10:10:25 UTC
Should sm-client.service be modified too ? I notice that the start-limit-hit failed state is now reported on sm-client.

Comment 22 Fedora Update System 2017-02-22 21:09:01 UTC
sendmail-8.15.2-8.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-b64ee22a45

Comment 23 Jaroslav Škarvada 2017-02-23 12:46:30 UTC
(In reply to Fabrice Bellet from comment #21)
> Should sm-client.service be modified too ? I notice that the start-limit-hit
> failed state is now reported on sm-client.

Yes, you are right, it's also worth the fix.

Comment 24 Michal Sekletar 2017-02-23 13:38:49 UTC
(In reply to Jaroslav Škarvada from comment #16)
> Michal, I expected answer similar to comment 11 and comment 15, just
> nitpicking ;)

Mehh, if I were to try and experiment with all configurations/options/setups I see in different bugs that I am handling then I wouldn't get any work done at all because I'd be just configuring systems all the time ;)

Comment 25 Fedora Update System 2017-03-03 03:54:33 UTC
sendmail-8.15.2-8.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 26 joopbraak 2017-03-07 10:59:15 UTC
This limit in restarting sendmail service is there for a reason I assume? Why does sendmail have to be patched to fix a buggy networkmanager? The fix should be in networkmanager, not in sendmail.

Comment 27 Jaroslav Škarvada 2017-03-07 12:22:06 UTC
(In reply to joopbraak from comment #26)
> This limit in restarting sendmail service is there for a reason I assume?
> Why does sendmail have to be patched to fix a buggy networkmanager? The fix
> should be in networkmanager, not in sendmail.

It's a workaround. The limit is on by default for all services; I guess it's there to prevent loops of never-ending restarts, which shouldn't be the case here. Other services also make use of this functionality. But feel free to propose a better solution / patches. Unfortunately, at the moment we have nothing better.

Comment 28 joopbraak 2017-03-08 10:36:01 UTC
Like Fabrice Bellet said:

"As a side comment, I'm also wondering if the nm-dispatcher script for sendmail could not only restart sendmail when the hostname is modified ? This is what is important for sendmail I think (knowing its own hostname, to decide when delivery should be local or relayed)."

or just build in a delay in restarting the service?

I'm no coder, but it doesn't seem that difficult to me.

Comment 29 Jaroslav Škarvada 2017-03-08 11:18:25 UTC
(In reply to joopbraak from comment #28)
> Like Fabrice Bellet said:
> 
> "As a side comment, I'm also wondering if the nm-dispatcher script for
> sendmail could not only restart sendmail when the hostname is modified ?
> This is what is important for sendmail I think (knowing its own hostname, to
> decide when delivery should be local or relayed)."
>
To be honest, I don't know. Also see comment 2; there may be more complications. The dispatcher script would also have to compare the new hostname with the current hostname and restart only if there is a change - and there may be races, so we could miss a restart. Not to mention that I currently don't know whether it's enough to react only to the hostname change.

Please note I am not against such a change, but there is no code at the moment. And even with the code, it would require testing.
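
For reference, the comparison step on its own could look roughly like the sketch below. It is untested against a real dispatcher setup. The function name hostname_changed and the state file are hypothetical, and there is deliberately no locking, so it does not address the races mentioned above.

```shell
#!/bin/sh
# Sketch only: remember the last hostname seen and report whether it changed.

# hostname_changed STATE_FILE NEW_HOSTNAME
# Exit status 0 if NEW_HOSTNAME differs from the name stored in STATE_FILE
# (the file is then updated), 1 if it is the same. A missing state file
# counts as a change. No locking: concurrent callers can race.
hostname_changed() {
    state_file=$1
    new_name=$2
    old_name=
    if [ -r "$state_file" ]; then
        old_name=$(cat "$state_file")
    fi
    if [ "$new_name" = "$old_name" ]; then
        return 1
    fi
    printf '%s\n' "$new_name" > "$state_file"
    return 0
}

# Demonstration in a scratch directory with made-up hostnames.
scratch=$(mktemp -d)
for name in alpha alpha beta; do
    if hostname_changed "$scratch/last-hostname" "$name"; then
        echo "$name: changed, a restart would be requested"
    else
        echo "$name: unchanged, nothing to do"
    fi
done
rm -rf "$scratch"
```

In the demonstration only the first "alpha" and "beta" count as changes; the repeated "alpha" is skipped.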

> or just build in a delay in restarting the service?
> 
> I'm no coder, but it doesn't seem that difficult to me.

This would need some layer in between to count the delay (i.e. the manager), locking, etc. Not trivial. I am currently not aware of such systemd functionality.

