Bug 1196417

Summary: During live image boot, systemd decides run-initramfs-squashfs.mount 'is bound to an inactive unit' and tries to stop it, sometimes triggering a kernel crash
Product: [Fedora] Fedora Reporter: Adam Williamson <awilliam>
Component: systemd Assignee: systemd-maint
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 22 CC: johannbg, jsynacek, kparal, lbrabec, lnykryn, msekleta, pschindl, robatino, samuel-rhbugs, s, systemd-maint, vpavlin, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: AcceptedBlocker
Fixed In Version: systemd-219-6.fc22 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-05 01:13:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1043121    

Description Adam Williamson 2015-02-25 23:18:01 UTC
This bug is splitting out the systemd part of https://bugzilla.redhat.com/show_bug.cgi?id=1195899 .

Also see https://bugzilla.redhat.com/show_bug.cgi?id=1195761 , which is a different bug caused by the same change.

After systemd 219 landed in Fedora 22, we noticed that nightly live images would sometimes fail to boot, due to a kernel null pointer dereference.

On further investigation we noted that systemd-219 repeatedly tries to stop and unmount run-initramfs-squashfs.mount, which systemd-218 does not. These messages appear when booting the 2015-02-21 and later nightlies:

Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too.
Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs...
Feb 24 20:03:30 localhost umount[526]: umount: /run/initramfs/squashfs: target is busy
Feb 24 20:03:30 localhost umount[526]: (In some cases useful info about processes that
Feb 24 20:03:30 localhost umount[526]: use the device is found by lsof(8) or fuser(1).)
Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount mount process exited, code=exited status=32
Feb 24 20:03:30 localhost systemd[1]: Failed unmounting /run/initramfs/squashfs.
Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too.
Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs...
Feb 24 20:03:30 localhost umount[529]: umount: /run/initramfs/squashfs: target is busy
Feb 24 20:03:30 localhost umount[529]: (In some cases useful info about processes that
Feb 24 20:03:30 localhost umount[529]: use the device is found by lsof(8) or fuser(1).)
Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount mount process exited, code=exited status=32
Feb 24 20:03:30 localhost systemd[1]: Failed unmounting /run/initramfs/squashfs.
Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too.
Feb 24 20:03:30 localhost systemd[1]: Unmounting /run/initramfs/squashfs...
Feb 24 20:03:30 localhost systemd[1]: Unmounted /run/initramfs/squashfs.
Feb 24 20:03:30 localhost systemd[1]: Unit run-initramfs-squashfs.mount entered failed state.
Feb 24 20:03:30 localhost systemd[1]: run-initramfs-squashfs.mount failed to run 'mount' task: No such file or directory
Feb 24 20:03:30 localhost systemd[1]: Failed to mount /run/initramfs/squashfs.
Feb 24 20:03:30 localhost systemd[1]: Mounting /run/initramfs/squashfs...

No such messages appear when booting 2015-02-18 and earlier nightlies.
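
To compare boots of different nightlies, the check for these messages can be scripted. The following is a minimal sketch, not anything used in the original investigation: it assumes the journal has been exported as plain text lines in the format quoted above, and the function name count_bound_stops is hypothetical.

```python
import re
from collections import Counter

# Matches the systemd 219 message quoted in this report, e.g.
# "... systemd[1]: Unit run-initramfs-squashfs.mount is bound to inactive unit. Stopping, too."
# Assumption: the message text is exactly as it appears in the log excerpt above.
BOUND_RE = re.compile(
    r"systemd\[1\]: Unit (\S+) is bound to inactive unit\. Stopping, too\."
)


def count_bound_stops(lines):
    """Return a Counter mapping unit name -> number of 'bound to inactive unit' stops."""
    counts = Counter()
    for line in lines:
        match = BOUND_RE.search(line)
        if match:
            # group(1) is the unit name, e.g. run-initramfs-squashfs.mount
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of a saved `journalctl -b` dump from an affected image should give a non-zero count for run-initramfs-squashfs.mount, and an empty result for an image built with the reverted systemd.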

After poking through various dracut and systemd changes, I came to suspect this systemd commit:

http://cgit.freedesktop.org/systemd/systemd/commit/?id=06e97888883e2cc12eb6514e80c7f0014295f59b

When I went to build a systemd package with it reverted for testing, I found that cgwalters had beaten me to it:

http://koji.fedoraproject.org/koji/buildinfo?buildID=614715

So I built a live image with that systemd-219-5.fc22 package. It boots successfully on multiple attempts, and does not show the log messages listed above.

I have left #1195899 open because the kernel obviously should not crash when this happens. But it's systemd that changed its behaviour, and this 'try multiple times to unmount /run/initramfs/squashfs, then re-mount it' behaviour does not seem to be what we actually *want*. So I believe there are two bugs: the kernel crash, and this behaviour from systemd. As systemd is the one that changed and the new behaviour really looks wrong, fixing it should be the higher priority.

The revert in 219-5 does seem to address this issue, but I'm guessing that won't be considered the 'correct' long-term fix, so it seems a good idea to have a bug open.

Proposing as an Alpha blocker: conditional violation of "All release-blocking images must boot in their supported configurations."

Comment 1 Fedora Update System 2015-02-25 23:35:08 UTC
systemd-219-5.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/systemd-219-5.fc22

Comment 2 Lukas Brabec 2015-02-26 09:11:43 UTC
This seems to be similar: https://bugzilla.redhat.com/show_bug.cgi?id=1195750

Comment 3 Adam Williamson 2015-02-26 16:47:36 UTC
*** Bug 1195750 has been marked as a duplicate of this bug. ***

Comment 4 Fedora Update System 2015-02-26 17:42:10 UTC
Package systemd-219-5.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing systemd-219-5.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-2655/systemd-219-5.fc22
then log in and leave karma (feedback).

Comment 5 Petr Schindler 2015-03-02 19:09:20 UTC
Discussed at today's blocker review meeting [1].

This bug was accepted as Alpha Blocker - This bug is a clear violation of the following criterion: "All release-blocking images must boot in their supported configurations."

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-02/

Comment 7 Fedora Update System 2015-03-04 01:55:42 UTC
systemd-219-6.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/systemd-219-6.fc22

Comment 8 Fedora Update System 2015-03-05 01:13:45 UTC
systemd-219-5.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 Fedora Update System 2015-03-09 08:27:55 UTC
systemd-219-6.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.