Bug 847418 - MS_SHARED breaks pivot_root(), causes booting trouble in switch-root and shutdown
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 789285 847477 854611
Depends On:
Blocks:
Reported: 2012-08-11 00:22 UTC by Michal Jaegermann
Modified: 2013-05-29 14:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-11-02 13:17:17 UTC
Type: Bug
Embargoed:


Attachments
dmesg.3.6.0-0.rc1.git3.2.fc18.x86_64 with a systemd debugging information from a failed boot (140.43 KB, text/plain)
2012-08-11 00:24 UTC, Michal Jaegermann
dmesg for the same kernel as above but with initramfs using systemd-187-3.fc18 (299.56 KB, text/plain)
2012-08-11 00:27 UTC, Michal Jaegermann

Description Michal Jaegermann 2012-08-11 00:22:04 UTC
Description of problem:

After an update to systemd-188-2.fc18 an attempt to boot using initramfs generated with this systemd version results in the following on my screen:


dracut-initqueue[892]: Mounted root filesystem /dev/sdc11
dracut-pre-pivot[898]: Checking ext3: /dev/disk/by-label/\x2fusr1
dracut-pre-pivot[898]: issuing e2fsck -a  /dev/disk/by-label/\x2fusr1
dracut-pre-pivot[898]: e2fsck: Cannot continue, aborting.
dracut-pre-pivot[898]: Warning: e2fsck returned with 8
dracut-pre-pivot[898]: Warning: /dev/disk/by-label/x2fusr1 is mounted.
dracut-pre-pivot[898]: Warning: *** An error occurred during the file system check.
dracut-pre-pivot[898]: Warning: *** Dropping you to a shell; the system will try
dracut-pre-pivot[898]: Warning: *** to mount the filesystem(s), when you leave the shell.
dracut-pre-pivot[898]: Warning:

At this point I am dropped into "Repair" and I am stuck.  It does not matter whether I mount or unmount some file systems, run fsck, or do anything else: after exiting this shell I end up with another series of messages like the above, and we can play that game again.  There is no apparent way around that bogosity.


Version-Release number of selected component (if applicable):
systemd-188-2.fc18

How reproducible:
always

Expected results:
systemd not preventing boot

Additional info:
Downgrading to systemd-187-3.fc18 and using that to produce the initramfs gets me back into bug 840242 territory, but at least I can get to a shell prompt.

Comment 1 Michal Jaegermann 2012-08-11 00:24:07 UTC
Created attachment 603655 [details]
dmesg.3.6.0-0.rc1.git3.2.fc18.x86_64 with a systemd debugging information from a failed boot

Comment 2 Michal Jaegermann 2012-08-11 00:27:51 UTC
Created attachment 603656 [details]
dmesg for the same kernel as above but with initramfs using systemd-187-3.fc18

Added for comparison purposes.  Using that requires a manual intervention in a boot process to mount "forgotten" disks, as described in bug 840242, but at least does not block me entirely.

Comment 3 Kevin Fenzi 2012-08-11 21:58:12 UTC
*** Bug 847477 has been marked as a duplicate of this bug. ***

Comment 4 Lennart Poettering 2012-08-11 23:06:27 UTC
Hmm, so my guess is that this is actually a kernel bug triggered by the fact that we now remount everything MS_SHARED very early on. The ref counting of the fs in the kernel is broken, which results in pivot_root() breaking.

We can work around this I guess by remounting things MS_PRIVATE right before switching root. But this probably should be fixed in the kernel instead.

A work-around for this issue is to boot without initrd. In grub you can edit the commands for your boot, drop the initrd line and replace the root=UUID... line with root=/dev/sda6 (or wherever your root fs is located; if you are on LVM you are fucked, use an older initrd image/kernel)

Comment 5 Lennart Poettering 2012-08-11 23:56:56 UTC
I have now added a work-around to git upstream, and backported it to F18:

http://koji.fedoraproject.org/koji/taskinfo?taskID=4379957

Will reassign this to the kernel now, since I am quite sure there's something wrong with the fs ref-counting and mount semantics.

Kernel folks: file systems marked MS_SHARED cannot be moved with mount(); MS_MOVE fails with EINVAL.

Comment 6 Lennart Poettering 2012-08-12 00:18:55 UTC
To clarify that, neither pivot_root() nor MS_MOVE is compatible with MS_SHARED.

Comment 7 Lennart Poettering 2012-08-12 00:26:21 UTC
Hmm, judging by do_move_mount() in namespace.c this actually appears to be intended behaviour of the kernel. But I do wonder why.

Hmm, if this is supposed to stay that way then we probably should file bugs against util-linux too, so that the switch-root and pivot-root utils remount things MS_PRIVATE before moving things, too.
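
For illustration, the "remount MS_PRIVATE before moving" idea from the comments above could look roughly like the following. This is only a sketch, not the actual systemd or util-linux change: the helper names `switch_root_plan` and `run_plan` are made up for this example, `/sysroot` is just an assumed new-root path, and only the mount(2) calls are covered (a real switch-root also needs chdir/chroot around them). The flag values are the ones from `<sys/mount.h>`.

```python
import ctypes
import os

# Flag values as defined in <sys/mount.h> / <linux/mount.h>.
MS_MOVE = 8192
MS_REC = 16384
MS_PRIVATE = 1 << 18


def switch_root_plan(new_root):
    """Return the mount(2) calls to issue, in order.

    Each entry is (source, target, fstype, flags). Hypothetical helper:
    it only describes the calls so they can be inspected without root.
    """
    return [
        # 1. Recursively mark everything private so propagation is off.
        (None, "/", None, MS_REC | MS_PRIVATE),
        # 2. Only then move the new root on top of "/".
        (new_root, "/", None, MS_MOVE),
    ]


def run_plan(plan):
    """Execute a plan via libc mount(2). Needs root; raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [
        ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
        ctypes.c_ulong, ctypes.c_void_p,
    ]
    libc.mount.restype = ctypes.c_int
    for source, target, fstype, flags in plan:
        args = [s.encode() if s is not None else None
                for s in (source, target, fstype)]
        if libc.mount(args[0], args[1], args[2], flags, None) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), target)


if __name__ == "__main__":
    # Print the plan only; calling run_plan() would really change mounts.
    for step in switch_root_plan("/sysroot"):
        print(step)
```

Keeping the plan separate from its execution means the ordering (private first, move second) can be checked on any machine without touching real mounts.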

Comment 8 Harald Hoyer 2012-08-13 07:33:54 UTC
(In reply to comment #5)
> I have now added a work-around to git upstream, and backported it to F18:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=4379957
> 
> Will reassign this to the kernel now, since I am quite sure there's
> something wrong with the fs ref-counting and mount semantics.
> 
> Kernel folks: file systems marked MS_SHARED cannot be moved with mount(),
> MS_MOVE fails with EINVAL.

build failed

Comment 9 Lennart Poettering 2012-08-13 13:31:06 UTC
(In reply to comment #8)

> build failed

Yes, sorry for the confusion, a later build did work:

http://koji.fedoraproject.org/koji/buildinfo?buildID=347337

Comment 10 Josh Boyer 2012-08-13 14:22:46 UTC
(In reply to comment #7)
> Hmm, judging by do_move_mount() in namespace.c this actually appears to be
> intended behaviour of the kernel. But I do wonder why.
> 
> Hmm, if this is supposed to stay that way then we probably should file bugs
> against util-linux too, so that the switch-root and pivot-root utils remount
> things MS_PRIVATE before moving things, too.

As far as I can see, the code in question has been in place since 2005 so this isn't new.  Documentation/filesystems/sharedsubtree.txt has a brief blurb that says:

"NOTE: moving a mount residing under a shared mount is invalid."

I've added Al to CC to see if he has any insight here, but I don't think this is a kernel bug at the moment.  Just intended behavior.
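
To see whether a given system is affected by the rule quoted above, one option is to look at the optional `shared:N` tags in /proc/self/mountinfo. The following is a rough sketch under assumptions: the function names and the sample mount table are invented for this example, and the check only mirrors the documented note (a mount whose parent is shared cannot be moved), not every condition the kernel tests.

```python
def parse_mountinfo(text):
    """Parse mountinfo text into {mount_id: (parent_id, mount_point, is_shared)}."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        # Fields 1-6 are fixed; optional fields follow until a lone "-".
        if len(fields) < 7 or "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        optional = fields[6:sep]
        is_shared = any(f.startswith("shared:") for f in optional)
        mounts[int(fields[0])] = (int(fields[1]), fields[4], is_shared)
    return mounts


def move_would_fail(mounts, mount_point):
    """True if the mount at mount_point sits under a shared parent mount."""
    match = None
    for parent_id, point, _shared in mounts.values():
        if point == mount_point:
            match = parent_id  # keep the last (topmost) mount at this path
    if match is None:
        raise KeyError(mount_point)
    parent = mounts.get(match)
    return parent is not None and parent[2]


# Made-up sample resembling an initramfs where "/" was remounted shared.
SAMPLE = """\
1 1 0:1 / / rw shared:1 - rootfs rootfs rw
20 1 8:43 / /sysroot rw,relatime shared:2 - ext4 /dev/sdc11 rw
21 1 0:5 / /dev rw - devtmpfs devtmpfs rw
"""

if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        table = parse_mountinfo(f.read())
    for _parent, point, _shared in table.values():
        print(point, "parent shared:", move_would_fail(table, point))
```

With the sample above, `/sysroot` is reported as unmovable because its parent `/` carries a `shared:` tag; dropping that tag from the first line flips the result.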

Comment 11 Michal Jaegermann 2012-08-16 22:29:42 UTC
systemd-188-3.fc18 is at least as broken as systemd-187-3.fc18 but at least it does not drop me into "Repair" and can be coaxed to boot in some sense.

Comment 12 Michal Jaegermann 2012-08-26 22:04:14 UTC
With a combination of
systemd-188-3.fc19.x86_64
dracut-023-2.fc18.x86_64
kernel-3.6.0-0.rc2.git2.1.fc18.x86_64
we are back to square one, i.e. boot misbehaves in exactly the same way as described in the original report.

After dropping back to systemd-188-3.fc18 (fc18 and NOT fc19) and redoing initramfs I can boot once again - well, modulo an outstanding bug 840242.

Comment 13 Kevin Fenzi 2012-08-26 22:06:13 UTC
Yep. The master branch never got the patch... so 188-3 is different in f18/f19. 

Also, 189 builds for f18 here are also broken, so I guess the patch never went upstream for the 189 release? 

It would be nice if we converged f18 and master branches and built both moving forward.

Comment 14 Jeff Layton 2012-09-06 12:43:45 UTC
*** Bug 854611 has been marked as a duplicate of this bug. ***

Comment 15 Tom "spot" Callaway 2012-09-06 13:12:55 UTC
Looking at Lennart's patches, both of them are applied in -189, so if the systemd-188-3.fc18 build worked, the -189.fc18 builds _should also_ work, unless something new in 189 causes a problem.

Comment 16 Jeff Layton 2012-09-06 13:26:12 UTC
Great -- sounds like we just need for someone to build these packages for rawhide (f19).

Comment 17 Tom "spot" Callaway 2012-09-06 13:28:51 UTC
Normally, I'd just push 189 into rawhide, but the spec file has a multi-paragraph warning about why the systemd maintainers do not want me to do this, so I opted out. Paging Lennart to do it.

Comment 18 Lennart Poettering 2012-09-07 16:26:06 UTC
We don't build Rawhide packages separately. We want Rawhide to simply inherit from F18 as long as possible. Unfortunately somebody who updated the packages didn't know that and updated the package in Rawhide, so that inheriting was disabled from that point on. We then updated F18 a couple of times, and those updates never ended up in Rawhide.

I have now manually untagged the broken package from Rawhide so that we inherit from F18 again. I have also added the aforementioned message to the .spec file to ensure that other folks who update the package don't make the same mistakes.

Honestly I believe the entire git logic in Fedora is backwards. Instead of keeping master all the time around it should just fork off the next version from the previous one when necessary and just get rid entirely of master.

Anyway, this is all corrected now, as the package got untagged and people should get the proper version from F18 again. If you run rawhide then please make sure to downgrade to the latest systemd rpm from F18 again. Thanks.

Comment 19 Dennis Gilmore 2012-09-07 16:38:27 UTC
Lennart, regardless of your personal beliefs or preferences, the workflow you should be using is to put all changes in rawhide first, then merge down to f18 and lower as appropriate. Please do so.

Comment 20 Dennis Gilmore 2012-09-07 16:42:32 UTC
Lennart, you're also not allowed to untag builds that have been pushed out; you need to build a higher NVR from master that fixes the issue. I have tagged the build back in. Please do the right thing and do a fixed build in master.

Comment 21 Richard W.M. Jones 2012-09-12 07:33:09 UTC
(In reply to comment #18)
> We don't build Rawhide packages separately. We want Rawhide to simply
> inherit from F18 as long as possible.

What I do for several packages is to put all the changes into
master, and then merge those into f18.

With 'fedpkg clone -B' this is particularly easy:

(1) cd master
(2) add changes, commit, push
(3) cd ../f18
(4) git pull ../master  # merges the changes into f18
(5) fedpkg push

Of course this only works so long as no specific patch has
to be pushed to f18 only.  Once that happens we use git cherry-pick
instead of merging, but I try to delay that happening as long
as possible.

Comment 22 Adam Williamson 2012-11-01 22:45:09 UTC
Does this still need to be open? Hasn't it been cleaned up for weeks? Or is there still an underlying kernel bug?

Comment 23 Josh Boyer 2012-11-02 13:17:17 UTC
There's underlying kernel behavior that has existed since shared mounts went into the kernel.  It's been this way for years.  Comment #10 still applies.

I'm going to close this out as ERRATA and reassign it to systemd.  If Lennart or others would like to see different behavior, I would suggest taking it to the upstream VFS developers.

Comment 24 Harald Hoyer 2013-05-29 13:53:46 UTC
*** Bug 789285 has been marked as a duplicate of this bug. ***

