Bug 781657

Summary: systemd-38-4.fc17 prevents machine from rebooting
Product: [Fedora] Fedora Reporter: Michal Jaegermann <michal>
Component: systemd Assignee: systemd-maint
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: hvtaifwkbgefbaei, johannbg, metherid, mschmidt, notting, plautrba, robatino, sysoutfran, systemd-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: systemd-38-6.git9fa2f41.fc17 Doc Type: Bug Fix
Last Closed: 2012-01-22 21:31:20 UTC Type: ---

Description Michal Jaegermann 2012-01-14 02:27:39 UTC
Description of problem:

After a machine has been booted using systemd-38-3.fc17, a 'reboot' command typed on a text console by root terminates a few services, and the last line printed on the screen is:

network [1331]: Shutting down interface eth1: [OK]

and nothing happens afterwards.  The only way out, after a long wait, is the power switch.  There were no problems with rebooting or shutting down with systemd-37-4.fc17, and reverting to it restores sanity.

Version-Release number of selected component (if applicable):
systemd-38-3.fc17

How reproducible:
on every attempt

Additional info:
When the system was booted with systemd-37-4.fc17 and systemd was then replaced with systemd-38-3.fc17, and ONLY then, at least some "Failed at step STDOUT spawning" messages were displayed on the screen and logged before the machine refused to reboot.  I am not entirely sure if these were the same messages, but at least this was logged:


systemd[5185]: Failed at step STDOUT spawning /lib/systemd/systemd-logind: No such file or directory
systemd[5202]: Failed at step STDOUT spawning /bin/umount: No such file or directory
systemd[5204]: Failed at step STDOUT spawning /sbin/swapoff: No such file or directory
systemd[5206]: Failed at step STDOUT spawning /sbin/swapoff: No such file or directory
systemd[5208]: Failed at step STDOUT spawning /sbin/swapoff: No such file or directory
systemd[5212]: Failed at step STDOUT spawning /sbin/modprobe: No such file or directory

right after the following showed up in the logs:

systemd[1]: Reexecuting.
systemd[1]: systemd 38 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +SYSVINIT +LIBCRYPTSETUP; fedora)
ed: systemd-38-3.fc17.x86_64
ed: systemd-sysv-38-3.fc17.x86_64

All later attempts to reboot left me entirely in the dark about possible reasons.

Comment 1 Michal Jaegermann 2012-01-14 02:52:48 UTC
Oh, when booted to runlevel 1, both shutdown and reboot work without any issues. It goes wrong only when the machine is started the way one would actually want to use it.

Comment 2 Michal Schmidt 2012-01-14 11:53:07 UTC
Try adding these two lines to /lib/systemd/system/syslog.socket:

Conflicts=shutdown.target
Before=shutdown.target

(as in http://cgit.freedesktop.org/systemd/systemd/commit/?id=ead51eb4ed55981f290e40a871ffbca6480c4cd3)

Comment 3 Michal Schmidt 2012-01-14 11:57:31 UTC
... and don't forget to do 'systemctl daemon-reload' before you reboot after editing.
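
The edit from comment 2 can also be scripted. Below is a minimal sketch, assuming the two lines belong in the [Unit] section of the unit file (they are ordering and conflict directives); the helper name add_shutdown_ordering and the sample unit text are hypothetical stand-ins, not the real contents of syslog.socket.

```python
def add_shutdown_ordering(unit_text: str) -> str:
    """Return unit_text with the two directives from comment 2 added to [Unit].

    Hypothetical helper for illustration; it only recognises directives
    written exactly as in comment 2, one per line.
    """
    wanted = ["Conflicts=shutdown.target", "Before=shutdown.target"]
    lines = unit_text.splitlines()

    # Locate the [Unit] section header.
    start = next((i for i, l in enumerate(lines) if l.strip() == "[Unit]"), None)
    if start is None:
        raise ValueError("no [Unit] section found")

    # The section ends at the next "[...]" header or at end of file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )

    # Skip directives that are already present, so the edit is idempotent.
    present = {l.strip() for l in lines[start + 1:end]}
    missing = [d for d in wanted if d not in present]

    # Insert after the last non-blank line of the section, keeping any
    # blank separator lines before the next section header.
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1
    lines[insert_at:insert_at] = missing
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Simplified, made-up unit text used only to demonstrate the helper.
    sample = "[Unit]\nDescription=Syslog Socket\n\n[Socket]\nListenDatagram=/dev/log\n"
    print(add_shutdown_ordering(sample), end="")
```

After writing the result back to /lib/systemd/system/syslog.socket, 'systemctl daemon-reload' is still needed before the reboot.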

Comment 4 Michal Jaegermann 2012-01-14 18:18:17 UTC
(In reply to comment #2)
> Try adding these two lines to /lib/systemd/system/syslog.socket:
> 
> Conflicts=shutdown.target
> Before=shutdown.target

Yes, this makes a substantial difference.  Thanks. After the "Shutting down interface ..." line I now see

Sending SIGTERM to remaining process

and the rest of shutdown or reboot follows as expected.

I still think that it is too easy to get something wrong here and the whole design is too brittle.  I had cases (not understood, not repeatable) when systemd died for unfathomable reasons and the machine was sort of running but became impossible to reboot other than by pulling the plug.

Comment 5 Michal Schmidt 2012-01-14 19:47:14 UTC
syslog-related problems are often nasty. Try sending a SIGSTOP to the syslog daemon sometime ;-)

When running Rawhide, I definitely recommend having SysRq enabled for the Alt+SysRq + {S,U,B} emergency combination.
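
Whether the S, U and B keys are actually permitted depends on the value in /proc/sys/kernel/sysrq. The sketch below decodes that value using the bitmask documented in the kernel's sysrq documentation (1 enables everything, 16 is sync, 32 is remount read-only, 128 is reboot/poweroff); the function names are hypothetical and chosen for illustration.

```python
# Bit values as listed in the kernel's sysrq documentation.
SYSRQ_SYNC = 16         # Alt+SysRq+S: sync filesystems
SYSRQ_REMOUNT_RO = 32   # Alt+SysRq+U: remount filesystems read-only
SYSRQ_REBOOT = 128      # Alt+SysRq+B: immediate reboot


def sub_allowed(value: int) -> bool:
    """Return True if the S, U and B functions are all permitted."""
    if value == 1:
        # 1 means "enable all functions".
        return True
    needed = SYSRQ_SYNC | SYSRQ_REMOUNT_RO | SYSRQ_REBOOT
    return value & needed == needed


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    current = read_sysrq()
    print(f"kernel.sysrq = {current}, S/U/B allowed: {sub_allowed(current)}")
```

Writing 1 to /proc/sys/kernel/sysrq as root enables all functions until the next boot; setting kernel.sysrq = 1 in /etc/sysctl.conf makes the setting persistent.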

Comment 6 Michal Jaegermann 2012-01-14 21:28:09 UTC
(In reply to comment #5)
> When running Rawhide, I definitely recommend having SysRq enabled for the
> Alt+SysRq + {S,U,B} emergency combination.

In theory this should be somewhat different from "pulling a plug"; in practice not necessarily. :-)

Comment 7 Frank Murphy 2012-01-15 11:47:41 UTC
(In reply to comment #5)
> syslog-related problems are often nasty. Try sending a SIGSTOP to the syslog
> daemon sometime ;-)
> 
> When running Rawhide, I definitely recommend having SysRq enabled for the
> Alt+SysRq + {S,U,B} emergency combination.

This is new to me
A pointer?

Comment 8 Andre Robatino 2012-01-15 16:25:24 UTC
(In reply to comment #2)
> Try adding these two lines to /lib/systemd/system/syslog.socket:
> 
> Conflicts=shutdown.target
> Before=shutdown.target
> 
> (as in
> http://cgit.freedesktop.org/systemd/systemd/commit/?id=ead51eb4ed55981f290e40a871ffbca6480c4cd3)

I did this, then ran "systemctl daemon-reload" before rebooting, and it worked the first time, but not any more. Now it hangs during shutdown as before.

Comment 9 Michal Jaegermann 2012-01-15 17:40:38 UTC
(In reply to comment #7)

> > When running Rawhide, I definitely recommend having SysRq enabled for the
> > Alt+SysRq + {S,U,B} emergency combination.
> 
> This is new to me
> A pointer?

Ahem!
https://www.kernel.org/doc/Documentation/sysrq.txt
https://fedoraproject.org/wiki/QA/Sysrq
https://en.wikipedia.org/wiki/Magic_SysRq_key

If Wikipedia has an article about it then it cannot be that new. :-)

Comment 10 Frank Murphy 2012-01-15 17:45:37 UTC
Trust me, we had no PCs at school, many still don't.
I'm almost middle-aged and on catch-up. :)

Comment 11 Andre Robatino 2012-01-16 17:22:01 UTC
(In reply to comment #8)

> I did this, then ran "systemctl daemon-reload" before rebooting, and it worked
> the first time, but not any more. Now it hangs during shutdown as before.

I reverted the edit, then hard powered off, then did the edit and systemctl daemon-reload command again and rebooted, and it seems to work persistently this time. Only difference is that I rebooted rather than powering off after originally making the changes (normally I power off each day). I don't see anything in the systemctl man page indicating that should make a difference.

I'm sure the original edit was correct, since as I said it worked the first time, and the edit itself was persistent (though not its effect).

Comment 12 Andre Robatino 2012-01-16 18:18:40 UTC
Sorry, meant to say that I powered off after making the original changes (which only worked once). This time, I rebooted instead.
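
For the works-once-then-hangs behaviour described in comments 8 and 11, one thing that can be ruled out is whether the running manager has loaded the edited unit at all. 'systemctl show -p Conflicts -p Before syslog.socket' prints the loaded values as KEY=VALUE lines; the sketch below parses such output. The function names and the sample strings in the test of this idea are hypothetical, not captured from the affected machines.

```python
import subprocess


def parse_properties(output: str) -> dict:
    """Parse 'systemctl show' KEY=VALUE lines into {key: [values]}."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            # List-valued properties are space-separated on one line.
            props[key] = value.split()
    return props


def fix_is_loaded(output: str) -> bool:
    """True if shutdown.target appears in both Conflicts and Before."""
    props = parse_properties(output)
    return all(
        "shutdown.target" in props.get(key, [])
        for key in ("Conflicts", "Before")
    )


def query_unit(unit: str = "syslog.socket") -> str:
    """Ask the running manager for the two properties (needs systemd)."""
    result = subprocess.run(
        ["systemctl", "show", "-p", "Conflicts", "-p", "Before", unit],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("fix loaded:", fix_is_loaded(query_unit()))
```

A False result after editing the file would suggest that 'systemctl daemon-reload' has not been run or did not take effect; a True result points the search elsewhere.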

Comment 13 Andre Robatino 2012-01-22 14:18:39 UTC
Same problem with systemd-38-4.fc17, same fix works (comment 2 and comment 3).

Comment 14 Michal Schmidt 2012-01-22 21:31:20 UTC
Fixed in systemd-38-6.git9fa2f41.fc17.