Bug 719931 - sendmail fails to start with systemd test update
Summary: sendmail fails to start with systemd test update
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: sendmail
Version: 15
Hardware: All
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Jaroslav Škarvada
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 494832 702621
TreeView+ depends on / blocked
 
Reported: 2011-07-08 12:59 UTC by Michal Schmidt
Modified: 2011-11-25 02:14 UTC (History)
13 users (show)

Fixed In Version: sendmail-8.14.5-2.fc15.2
Doc Type: Bug Fix
Doc Text:
Clone Of: 719884
: 748416 (view as bug list)
Environment:
Last Closed: 2011-11-25 02:14:52 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
Experimental patch that prevents the PID to change on HUP (477 bytes, patch)
2011-07-10 19:13 UTC, Jaroslav Škarvada
no flags Details | Diff
enable foreground for sm-client (714 bytes, patch)
2011-07-21 14:26 UTC, Jaroslav Škarvada
no flags Details | Diff
Proposed rawhide changes (3.09 KB, patch)
2011-07-22 09:12 UTC, Jaroslav Škarvada
no flags Details | Diff

Description Michal Schmidt 2011-07-08 12:59:00 UTC
+++ This bug was initially created as a clone of Bug #719884 +++

Created attachment 511883 [details]
log messages for bug

Description of problem: sendmail fails to start after installing systemd test update:

$ rpm -qa systemd\*
systemd-units-26-7.fc15.x86_64
systemd-sysv-26-7.fc15.x86_64
systemd-26-7.fc15.x86_64

Not reproducible with previous stable release: 26-5.fc15

--- Additional comment from mschwendt on 2011-07-08 05:56:23 EDT ---

Created attachment 511884 [details]
log messages for working systemd version

--- Additional comment from mschmidt on 2011-07-08 07:43:45 EDT ---

It seems this is not exactly a new bug, but the fixes in systemd-26-7.fc15 uncovered a previously latent bug. The reason I believe so is that the log from the working version (26-5.fc15) shows some unexpected behaviour too:

1. sendmail is started (OK)
2. sendmail is reloaded (OK)
3. Then however sendmail exits unexpectedly:
 [   35.800771] systemd[1]: Received SIGCHLD from PID 1275 (sendmail).
 [   35.801981] systemd[1]: Got SIGCHLD for process 1275 (sendmail)
 [   35.802026] systemd[1]: Child 1275 died (code=exited, status=0/SUCCESS)
 [   35.802031] systemd[1]: Child 1275 belongs to sendmail.service
 [   35.802039] systemd[1]: sendmail.service: main process exited, code=exited, status=0
 [   35.802145] systemd[1]: sendmail.service changed running -> exited

What does "systemctl status sendmail.service" show after booting with the working version (26-5.fc15)?
And what does "cat /var/run/sendmail.pid" show?

--- Additional comment from mschmidt on 2011-07-08 07:49:25 EDT ---

And here's another thing to test: Remove the line "# pidfile: /var/run/sendmail.pid" from /etc/init.d/sendmail and see what happens after boot with the new systemd version (26-7.fc15).

--- Additional comment from mschwendt on 2011-07-08 07:59:29 EDT ---

Re: comment 2

# systemctl status sendmail.service
sendmail.service - LSB: start and stop sendmail
	  Loaded: loaded (/etc/rc.d/init.d/sendmail)
	  Active: active (exited) since Fri, 08 Jul 2011 13:56:45 +0200; 1min 8s ago
	 Process: 1502 ExecReload=/etc/rc.d/init.d/sendmail reload (code=exited, status=0/SUCCESS)
	 Process: 1130 ExecStart=/etc/rc.d/init.d/sendmail start (code=exited, status=0/SUCCESS)
	Main PID: 1258 (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/sendmail.service
		  ├ 1516 sendmail: accepting connections
		  └ 1518 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue


# cat /var/run/sendmail.pid
1516
/usr/sbin/sendmail -bd -q1h


Reply to comment 3 will follow...

--- Additional comment from mschwendt on 2011-07-08 08:02:06 EDT ---

Re: comment 3

With the removed pid line, it started:

# systemctl status sendmail.service
sendmail.service - LSB: start and stop sendmail
	  Loaded: loaded (/etc/rc.d/init.d/sendmail)
	  Active: active (running) since Fri, 08 Jul 2011 14:00:33 +0200; 25s ago
	 Process: 1487 ExecReload=/etc/rc.d/init.d/sendmail reload (code=exited, status=0/SUCCESS)
	 Process: 1131 ExecStart=/etc/rc.d/init.d/sendmail start (code=exited, status=0/SUCCESS)
	  CGroup: name=systemd:/system/sendmail.service
		  ├ 1505 sendmail: accepting connections
		  └ 1507 sendmail: Queue runner@01:00:00 for /var/spool/cli...

# cat /var/run/sendmail.pid
1505
/usr/sbin/sendmail -bd -q1h

--- Additional comment from mschmidt on 2011-07-08 08:55:45 EDT ---

I see. sendmail does not merely reload its configuration on SIGHUP (the reload action). It does a full restart itself and it then runs with a new PID.
(http://www.c3.hu/docs/oreilly/tcpip/sendmail/ch26_03.htm)

This is incorrect behaviour. LSB specifies the "reload" action as:
  cause the configuration of the service to be reloaded without actually
  stopping and restarting the service

If a service does not support a real reload, its initscript should not provide the action.


Could systemd workaround this somehow? It could re-read the pidfile after '/etc/init.d/sendmail reload' finishes. But the reload action merely sends SIGHUP to the sendmail process and nothing guarantees that sendmail has finished its restarting by the time the initscript exits.

For now I'll remove the patch from systemd-26-7.fc15 that uncovered this bug (0001-service-pidfile-in-SysV-chkconfig-header-implies-a-r.patch). However the patch fixes some other bugs, so it would be nice to be able to put it back eventually.


Michael,
see if using "restart" instead of "reload" in /NetworkManager/dispatcher.d/10-sendmail fixes it.

Comment 1 Michael Schwendt 2011-07-08 15:09:14 UTC
Yes, success:

# rpm -V sendmail
5S.T.....    /etc/NetworkManager/dispatcher.d/10-sendmail
# systemctl status sendmail.service
sendmail.service - LSB: start and stop sendmail
	  Loaded: loaded (/etc/rc.d/init.d/sendmail)
	  Active: active (running) since Fri, 08 Jul 2011 17:07:00 +0200; 35s ago
	 Process: 1396 ExecStop=/etc/rc.d/init.d/sendmail stop (code=exited, status=0/SUCCESS)
	 Process: 1422 ExecStart=/etc/rc.d/init.d/sendmail start (code=exited, status=0/SUCCESS)
	Main PID: 1435 (sendmail)
	  CGroup: name=systemd:/system/sendmail.service
		  ├ 1435 sendmail: accepting connections
		  └ 1444 sendmail: Queue runner@01:00:00 for /var/spool/cli...
# rpm -qa systemd\*
systemd-26-7.fc15.x86_64
systemd-units-26-7.fc15.x86_64
systemd-sysv-26-7.fc15.x86_64

Comment 2 Jaroslav Škarvada 2011-07-10 19:13:21 UTC
Created attachment 512109 [details]
Experimental patch that prevents the PID to change on HUP

Michal thanks for the analysis. Experimental patch that prevents the PID to change is attached. Scratch build with this patch applied is also available:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3189837

I will consult this issue with the sendmail upstream. In case the patch will not be accepted we will have to workaround this in systemd or the reload function will have to be removed.

Comment 3 Jaroslav Škarvada 2011-07-10 19:24:27 UTC
F15 and rawhide experimental scratch builds:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3189842
http://koji.fedoraproject.org/koji/taskinfo?taskID=3189839

Let's wait for the upstream opinion.

Comment 4 Michal Schmidt 2011-07-12 14:04:28 UTC
(In reply to comment #2)
> Created attachment 512109 [details]
> Experimental patch that prevents the PID to change on HUP

Nice trick.

(In reply to comment #3)
> F15 and rawhide experimental scratch builds:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=3189842

Works for me as expected.

Comment 5 Jaroslav Škarvada 2011-07-20 17:10:50 UTC
It was rejected by upstream. Copy of the communication follows:

>> it seems there is unnecessary fork on HUP if running as
>> daemon. The result is that the PID changes on HUP.
>> Maybe this is a feature, but it could cause a problem for
>> distros that uses this functionality in initscripts for
>> reload (e.g. http://bugzilla.redhat.com/show_bug.cgi?id=719931).
>
> "What's the problem you are trying to solve?"
> It seems that is "just" a semantics problem with the way your
> init scripts work: "reload" vs. "restart" -- kill -HUP is
> a restart, not a reload (AFAICT).
>
>> Would it be possible to fix this? I attached simple
>> patch, but I am not sure about the possible side-effects,
>
> Me neither. Moreover, you introduce a new call that isn't used in
> sendmail (getsid()). Last, but not least, this is a change in
> behaviour and hence would have to wait for 8.15.
>
> Summary: I don't see much gain from making such a change, but I see
> a lot of work to avoid breaking things.

In summary: Sendmail uses HUP for restart not for reload and changing this behaviour is too risky with low benefit.

Michal, is it possible to fix this in systemd? Otherwise I will switch all reloads to restarts.

Comment 6 Michal Schmidt 2011-07-21 13:15:11 UTC
OK, upstream says the sendmail daemon does offer an actual reload. Fair enough.
Please replace all uses of 'service sendmail reload' with 'service sendmail restart' and remove the reload action from the initscript in F15.

As long as an initscript offers a reload action that:
 - makes the daemon change PID
   and
 - is asynchronous (exits before the daemon's PID change is complete),
it is not possible to fix/workaround it in systemd.


I had a look at sendmail.service in Rawhide. It has some ExecReload= lines:
  ExecReload=-/etc/mail/make
  ExecReload=-/etc/mail/make aliases
I am not very familiar with sendmail, but I don't think this is sufficient to make the daemon take the new configuration into account.
If it's indeed buggy, perhaps you should remove these two lines too.


(As a sidenote, I find the option to override DAEMON="-bd" in sysconfig dubious. What benefit does it bring? I'd just hardcode it in the unit file. Since the unit type is defined as "forking", using anything else than "-bd" in sysconfig would break it. For the future I suggest looking into the possibility of using "-bD" and changing the type to "simple".)

Comment 7 Jaroslav Škarvada 2011-07-21 13:36:00 UTC
(In reply to comment #6)
> I had a look at sendmail.service in Rawhide. It has some ExecReload= lines:
>   ExecReload=-/etc/mail/make
>   ExecReload=-/etc/mail/make aliases
> I am not very familiar with sendmail, but I don't think this is sufficient to
> make the daemon take the new configuration into account.
> If it's indeed buggy, perhaps you should remove these two lines too.
>
Sorry, I am not familiar with systemd, do we need explicit:
ExecReload=/bin/kill -HUP $MAINPID
?

The 'make' command synchronously regenerates the config files (only if sendmail-cf package is installed, otherwise it fails).

> (As a sidenote, I find the option to override DAEMON="-bd" in sysconfig
> dubious. What benefit does it bring? I'd just hardcode it in the unit file.
> Since the unit type is defined as "forking", using anything else than "-bd" in
> sysconfig would break it. For the future I suggest looking into the possibility
> of using "-bD" and changing the type to "simple".)
>
Interesting, as I wrote I am not familiar with systemd. When the sendmail runs in foreground (with -bD) it doesn't change its PID on HUP. Could we use foreground mode and simple? It would also resolve the reload problem. Or is there any side effect?

Comment 8 Paul Howarth 2011-07-21 13:54:11 UTC
(In reply to comment #7)
> Interesting, as I wrote I am not familiar with systemd. When the sendmail runs
> in foreground (with -bD) it doesn't change its PID on HUP. Could we use
> foreground mode and simple? It would also resolve the reload problem. Or is
> there any side effect?

The manpage doesn't mention any side-effects, so I'd give it a go. Some daemons do logging to stdout when run in the foreground like this for debugging, but I don't think sendmail does.

Comment 9 Paul Howarth 2011-07-21 14:00:40 UTC
I don't see any way of making sm-client run in the foreground and still do regular queue runs though.

Comment 10 Michal Schmidt 2011-07-21 14:03:52 UTC
(In reply to comment #7)
> Sorry, I am not familiar with systemd, do we need explicit:
> ExecReload=/bin/kill -HUP $MAINPID
> ?

I did not test it, but I believe so, yes. Either that or remove the ExecReload items altogether.

> Interesting, as I wrote I am not familiar with systemd. When the sendmail runs
> in foreground (with -bD) it doesn't change its PID on HUP. Could we use
> foreground mode and simple? It would also resolve the reload problem.

Correct.

> Or is there any side effect?

Hmm, yes, there would be a side-effect. The ordering requirement in sm-client.service would not be effective anymore:
  After= ... sendmail.service
That's because for "Type=simple" service there is no way to let systemd know when the service is ready to accept external connections, so systemd just fires off the service and then goes on starting subsequent services. This problem goes away if the service is patched to use socket activation, because then the socket can always accept connections.

Comment 11 Jaroslav Škarvada 2011-07-21 14:23:33 UTC
> > Or is there any side effect?
> 
> Hmm, yes, there would be a side-effect. The ordering requirement in
> sm-client.service would not be effective anymore:
>   After= ... sendmail.service
> That's because for "Type=simple" service there is no way to let systemd know
> when the service is ready to accept external connections, so systemd just fires
> off the service and then goes on starting subsequent services. This problem
> goes away if the service is patched to use socket activation, because then the
> socket can always accept connections.

Is the ordering for exec preserved? So, does it fire sendmail.service and then immediately sm-client service or is it possible that it e.g. fires sm-client service and then after some time sendmail.service (random ordering)?

Comment 12 Jaroslav Škarvada 2011-07-21 14:26:55 UTC
Created attachment 514215 [details]
enable foreground for sm-client

> I don't see any way of making sm-client run in the foreground and still do
regular queue runs though.

If we would go in foreground, I think we could patch it.

Comment 13 Michal Schmidt 2011-07-21 14:37:52 UTC
(In reply to comment #11)
> Is the ordering for exec preserved? So, does it fire sendmail.service and then
> immediately sm-client service

Yes, but this does not buy you much, does it? Scheduling of the resulting processes is not deterministic...

> or is it possible that it e.g. fires sm-client
> service and then after some time sendmail.service (random ordering)?

No, it won't do that.

Comment 14 Jaroslav Škarvada 2011-07-21 14:47:21 UTC
(In reply to comment #13)
> (In reply to comment #11)
> > Is the ordering for exec preserved? So, does it fire sendmail.service and then
> > immediately sm-client service
> 
> Yes, but this does not buy you much, does it? Scheduling of the resulting
> processes is not deterministic...
> 
It is still better than truly random. IMHO the slight delay during the startup shouldn't be problem. But we will have to split the make files.

Comment 15 Jaroslav Škarvada 2011-07-21 14:51:54 UTC
Or implement locking.

Or drop the foreground idea and go without reload.

Michal, with forking is it guaranteed that the ExecStartPre of service B is not run before (or simultaneously) ExecStartPre of service A, if service B has "after" service A?

Comment 16 Michal Schmidt 2011-07-21 15:28:23 UTC
(In reply to comment #15)
> Or implement locking.
> 
> Or drop the foreground idea and go without reload.

Yes, this option is easy and safe. We can revisit Type=simple in the future when sendmail gets patched for socket activation.

> Michal, with forking is it guaranteed that the ExecStartPre of service B is not
> run before (or simultaneously) ExecStartPre of service A, if service B has
> "after" service A?

Yes, this is guaranteed. And it is guaranteed even all types, even for Type=simple.

Let's suppose that a start of A.service and B.service is requested. B.service has "After=A.service". The ordering will be as follows:
 1. all ExecStartPre of A.service will be run sequentially.
 2. ExecStart of A.service will be run.
    What happens next depends on the type of A.service:
    a) if it's "simple" -  A.service is now considered running and systemd
       will proceed immediately with step 3.
    b) if it's "forking", systemd will wait for the ExecStart process to exit,
       because that is the event that signifies the readiness of the daemon.
 3. all ExecStartPre of B.service will be run sequentially.
 4. ExecStart of B.service will be run.

Notice that whatever type the services are, step 3 will always be ordered after step 1.

Comment 17 Jaroslav Škarvada 2011-07-22 09:12:58 UTC
Created attachment 514643 [details]
Proposed rawhide changes

Comment 18 Fedora Update System 2011-07-22 10:27:28 UTC
sendmail-8.14.5-2.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/sendmail-8.14.5-2.fc15

Comment 19 Paul Howarth 2011-07-22 12:04:36 UTC
Changing the variable names in /etc/sysconfig/sendmail presents a problem at upgrade when sendmail is restarted, as SENDMAIL_OPTS will be empty if the user has edited /etc/sysconfig/sendmail and so the new one gets installed as /etc/sysconfig/sendmail.rpmnew. This results in the restarted sendmail daemon not doing queue runs and the client failing altogether.

This could be prevented by putting default values into the service files using "Environment = SENDMAIL_OPTS=-q1h SENDMAIL_OPTARG="
before the EnvironmentFile lines.

Comment 20 Jaroslav Škarvada 2011-07-22 14:28:17 UTC
Thanks, we could simplify it to:

Environment = SENDMAIL_OPTS=-q1h

because there is no change in SENDMAIL_OPTARG and it will be initially blank if not overridden in /etc/sysconfig/sendmail.

Or probably we could simplify it even more and use only SENDMAIL_OPTARG in /etc/sysconfig/sendmail, e.g. SENDMAIL_OPTARG=-q1h - The backward compatibility of /etc/sysconfig/sendmail was already broken during update from 8.14.5-2 to 8.14.5-3, so users with non default settings had to update by hand.

Comment 21 Fedora Update System 2011-07-22 22:09:22 UTC
sendmail-8.14.5-3.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/sendmail-8.14.5-3.fc15

Comment 22 Fedora Update System 2011-07-23 02:06:04 UTC
Package sendmail-8.14.5-3.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sendmail-8.14.5-3.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/sendmail-8.14.5-3.fc15
then log in and leave karma (feedback).

Comment 23 Paul Howarth 2011-07-25 07:48:03 UTC
(In reply to comment #20)
> Thanks, we could simplify it to:
> 
> Environment = SENDMAIL_OPTS=-q1h
> 
> because there is no change in SENDMAIL_OPTARG and it will be initially blank if
> not overridden in /etc/sysconfig/sendmail.

True.

> Or probably we could simplify it even more and use only SENDMAIL_OPTARG in
> /etc/sysconfig/sendmail, e.g. SENDMAIL_OPTARG=-q1h - The backward compatibility
> of /etc/sysconfig/sendmail was already broken during update from 8.14.5-2 to
> 8.14.5-3, so users with non default settings had to update by hand.

The largest number of users upgrading to Rawhide sendmail will come when F16 is released; some of those people may have customized SENDMAIL_OPTARG in /etc/sysconfig/sendmail. It would benefit those people not to use SENDMAIL_OPTARG for the queue option, migrating the old DAEMON and QUEUE options into SENDMAIL_OPTS as it is at the moment, so when they update and sendmail is restarted, it'll still start properly, albeit with the default queue period rather than anything they'd changed QUEUE to.

Comment 24 Jaroslav Škarvada 2011-07-25 14:17:12 UTC
Thanks

Comment 25 Fedora Update System 2011-09-15 08:13:29 UTC
sendmail-8.14.5-2.fc15.1 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/sendmail-8.14.5-2.fc15.1

Comment 26 Michal Schmidt 2011-10-24 11:39:42 UTC
A tester's remark in http://www.abclinuxu.cz/poradna/linux/show/346943 (in Czech) has just made me realize that "restart" will start the service even if it is disabled. The dispatcher script should use "try-restart" instead.

Comment 27 Fedora Update System 2011-10-24 12:36:13 UTC
sendmail-8.14.5-2.fc15.2 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/sendmail-8.14.5-2.fc15.2

Comment 28 Jaroslav Škarvada 2011-10-24 12:51:05 UTC
(In reply to comment #26)
> A tester's remark in http://www.abclinuxu.cz/poradna/linux/show/346943 (in
> Czech) has just made me realize that "restart" will start the service even if
> it is disabled. The dispatcher script should use "try-restart" instead.
>
Tracked in bug 748416.

Comment 29 Fedora Update System 2011-11-25 02:14:52 UTC
sendmail-8.14.5-2.fc15.2 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.