Bug 1232411
Summary: | Rawhide (23) boot.iso nightlies do not boot with dracut 042+
---|---
Product: | [Fedora] Fedora
Reporter: | Adam Williamson <awilliam>
Component: | dracut
Assignee: | dracut-maint-list
Status: | CLOSED RAWHIDE
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent
Priority: | unspecified
Version: | 23
CC: | bcl, bmj001, dracut-maint-list, harald, jonathan, mvollmer, robatino, satellitgo, zbyszek
Target Milestone: | ---
Target Release: | ---
Hardware: | All
OS: | Linux
Whiteboard: | AcceptedBlocker
Fixed In Version: | anaconda-23.12-1
Doc Type: | Bug Fix
Last Closed: | 2015-07-16 15:04:11 UTC
Type: | Bug
Bug Blocks: | 1170817
Description
Adam Williamson
2015-06-16 17:03:53 UTC
OK, so this is still related to the dmsquash-live stuff. It seems that the non-live installer images do actually rely on dracut's dmsquash-live. If you boot a boot.iso from before dracut-042 with rd.debug you see this:

    /sbin/dmsquash-live-root@271(main): printf 'mount %s /dev/mapper/live-rw %s\n'' /sysroot

and there's a file /lib/dracut/hooks/mount/01-661-live.sh:

    mount /dev/mapper/live-rw /sysroot

and indeed, /dev/mapper/live-rw is mounted as /sysroot.

That's the bit that dracut 042 disabled when systemd is active:

https://git.kernel.org/cgit/boot/dracut/dracut.git/commit/modules.d/90dmsquash-live/dmsquash-live-root.sh?id=8ff624df9f3f300a008711d114a8769464a054db

but the new generator does not cope with the way installer images are set up. As I read the generator:

https://git.kernel.org/cgit/boot/dracut/dracut.git/tree/modules.d/90dmsquash-live/dmsquash-generator.sh

it will only actually create the mount if the cmdline has something like 'root=live' or 'root=live:SOMETHINGOROTHER', but for installer images (and probably other scenarios), this is not the case. The cmdline for a boot.iso is:

    BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=Fedora-rawhide-x86_64 quiet

There's a lot of detail to it, but basically, if you compare dmsquash-generator.sh and dmsquash-live-root.sh, the latter clearly handles a lot more cases than the former, so in dracut-042 - where the latter doesn't actually mount anything if systemd is in use - we break some cases.

Aha. So I got to wondering how dmsquash-live-root got triggered at all for installer images, as dracut itself doesn't look like it would do it, and turns out there's an interaction with anaconda-dracut.
anaconda-dracut does some of the root discovery / prep itself, then calls dmsquash-live-root:

https://github.com/rhinstaller/anaconda/blob/master/dracut/anaconda-diskroot#L51
https://github.com/rhinstaller/anaconda/blob/master/dracut/anaconda-lib.sh#L68
https://github.com/rhinstaller/anaconda/blob/master/dracut/anaconda-lib.sh#L100

but as noted in #c1, this is now broken because dmsquash-live-root doesn't actually *mount* the device any more when systemd is in use.

I don't know what the best solution here is, but at least I know what the problem is. CCing bcl as he seems to touch anaconda-dracut a lot.

Possibly a dumb idea, but is it really necessary for the systemd generator to re-do all the 'check for "valid" cmdline parameter' stuff from parse-dmsquash-live.sh (see https://git.kernel.org/cgit/boot/dracut/dracut.git/tree/modules.d/90dmsquash-live/parse-dmsquash-live.sh )? Couldn't it be simplified to simply create the mount unit so long as /dev/mapper/live-rw exists? That avoids duplicating code and also makes it work in this case (where some other module is setting up /dev/mapper/live-rw in a way that wasn't expected...)

If there's a race problem there (the generator may get run before /dev/mapper/live-rw is actually set up), it could always create the unit, but use a systemd unit condition:

    ROOTFLAGS="$(getarg rootflags)"
    {
        echo "[Unit]"
        echo "Before=initrd-root-fs.target"
        echo "ConditionPathExists=/dev/mapper/live-rw"
        echo "[Mount]"
        echo "Where=/sysroot"
        echo "What=/dev/mapper/live-rw"
        [ -n "$ROOTFLAGS" ] && echo "Options=${ROOTFLAGS}"
    } > "$GENERATOR_DIR"/sysroot.mount

which would cause the mount to only be tried if /dev/mapper/live-rw existed. I may be missing something here, of course, it's just an initial idea.

Ah, no, I see the problem with that.
Now I understand why the logic is duplicated between parse-dmsquash-live.sh and dmsquash-generator.sh: it's because the generator is run by systemd, and we only want it to produce a sysroot.mount when we know the dmsquash-live stuff is actually going to kick in.

The reason is that when we produce sysroot.mount we are overriding systemd-fstab-generator. If we produce a sysroot.mount which doesn't do anything, we preclude systemd from trying to mount root with its own generator, and thus break all the cases which systemd's generator should handle (i.e. all the 'normal' root=something cases).

So it's basically correct that the dmsquash generator tries to only actually produce a mount file when it knows one is needed, and because the generator is run entirely outside of dracut itself, the logic more or less has to be duplicated. However, an unfortunate consequence is that it breaks this case, where another module is making use of the dmsquash-live module in a way dracut doesn't know about.

anaconda-dracut could of course ship its own generator (or just a simple static sysroot.mount file, I guess). Not sure if there's a better fix.

Created attachment 1040642 [details]
Proposed fix for anaconda-dracut usage of dmsquash-live-root
Created attachment 1040643 [details]
anaconda-dracut part of the fix
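
For readers following along, the mismatch described in the first comments can be sketched in a few lines of shell. This is only an illustration of the condition as it is described in this report (a mount unit is produced only for 'root=live' or 'root=live:...'), not the actual dmsquash-generator.sh code. The helper names `cmdline_root` and `is_live_root` and the `CDLABEL=Example` value are hypothetical; the boot.iso cmdline is the one quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last root= parameter found
# in a kernel command line string (empty if there is none).
cmdline_root() {
    value=
    for arg in $1; do
        case "$arg" in
            root=*) value=${arg#root=} ;;
        esac
    done
    printf '%s\n' "$value"
}

# Hypothetical check modelled on the behaviour described in this report:
# succeed only for 'root=live' or 'root=live:SOMETHING'.
is_live_root() {
    root=$(cmdline_root "$1")
    [ "${root%%:*}" = "live" ]
}

# boot.iso cmdline as quoted in the description: there is no root= at all.
bootiso='BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=Fedora-rawhide-x86_64 quiet'
# Made-up live image cmdline, for contrast only.
livecd='BOOT_IMAGE=vmlinuz root=live:CDLABEL=Example rd.live.image quiet'

for cmdline in "$bootiso" "$livecd"; do
    if is_live_root "$cmdline"; then
        echo "live root: a sysroot.mount would be generated"
    else
        echo "no live root: generator creates nothing"
    fi
done
```

With the boot.iso cmdline the check fails, so under this model nothing ever mounts /dev/mapper/live-rw on /sysroot, even though anaconda-dracut has set the device up.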
I've looked at this a bit today and the simple solution is to add an argument to dmsquash-live-root so that the old mount hook will be created when anaconda-dracut calls it.

I tried to find a more generic way for the dmsquash generator to run, possibly triggering the creation of the sysroot.mount based on /dev/mapper/live-rw but the timing isn't right (that path didn't exist when the generator is run).

There may be a cleaner solution to all of this, but for now these 2 patches should get things booting again. I don't want to do anything like carry a sysroot.mount in anaconda-dracut, it really shouldn't have to know anything about that.

Harald, can you please give your opinion on Brian's proposal and merge the dracut part if you agree with it? We need to get Rawhide boot.iso working again so we can test it properly.

Why can't the anaconda-dracut part install the mount hook?

We could -- but that makes this code more of a mess, not less. We'd then be carrying the hook code in 2 places. Ultimately we'd like to see less custom code in anaconda-dracut, not more of it.

Created attachment 1042468 [details]
patch to always mount /dev/mapper/live-rw
How about this? If /dev/mapper/live-rw has been created, no matter the method, it should be mounted on /sysroot
I think the potential problem with that is that it might conflict with systemd's own systemd-fstab-generator - but I'll have to take a look at exactly how that works to know for sure.

So yeah, the problem is we have two scenarios:

1) We want dracut (or anaconda-dracut) to mount /sysroot
2) We want systemd to mount /sysroot

Scenario 2) is handled by systemd's systemd-fstab-generator, the source for which is http://cgit.freedesktop.org/systemd/systemd/tree/src/fstab-generator/fstab-generator.c . So far as /sysroot goes it basically looks for a 'root=' parameter on the cmdline and if it finds one, generates a mount unit called 'sysroot.mount'. If a unit called sysroot.mount already exists it will bail out and not do anything.

The current dracut generator also creates a file called sysroot.mount - so when dracut's generator kicks in, systemd's generator will not. This means that dracut's generator should *only* kick in if it's actually going to successfully mount /sysroot (which is how it's currently set up; dracut's generator won't create a sysroot.mount unless it's actually going to be used).

The problem with #c12 (which was my initial idea as well) is that it will break normal boots, where systemd should mount /sysroot based on the 'root=' cmdline parameter - the sysroot.mount which is now *unconditionally created* by the dracut generator will cause systemd's own generator to be effectively disabled, even when dracut's mount doesn't actually *do* anything because the ConditionPathExists is not satisfied.

Obviously we could rename dracut's mount unit to anything other than sysroot.mount, and then we wouldn't have that problem. But then we'd have a different problem: we'd have *two* competing /sysroot mount units. I don't know what systemd's behaviour would be in that case, but it doesn't sound like a good idea. It might work OK in 'normal' boot cases because the dracut mount wouldn't do anything, but what happens when we're actually going to use the dracut mount?
Both it and the systemd one try to kick in? What happens then?

I suppose one thing we could do is send a patch for *systemd*'s generator that makes it mount /dev/mapper/live-rw as /sysroot if it exists. That seems like it'd solve things somewhat elegantly, but it might be a bit too 'special sauce'.

sigh, no, that doesn't work because setting a root= on the cmdline bypasses the anaconda-dracut bits. Grr.

Sorry, I missed a bit there. I tried just adding 'root=/dev/mapper/live-rw' on the cmdline of a boot.iso to see if that'd be enough to make things work, but it isn't, because it causes the anaconda-dracut bits not to run.

Created attachment 1042645 [details]
Proposed patch for anaconda
Here is my proposed patch for anaconda. No dracut patch needed.
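
The generator conflict discussed in the preceding comments can be reproduced in miniature with plain shell. The following is a simulation only: neither function is real dracut or systemd code, and the names `dracut_generator_unconditional` and `fstab_generator` as well as the `/dev/sda2` device are made up for illustration. It assumes the behaviour described in this report, namely that the fstab generator does nothing once a sysroot.mount already exists.

```shell
#!/bin/sh
# Simulation only: a temp directory stands in for the generator output dir.
unit_dir=$(mktemp -d)
trap 'rm -rf "$unit_dir"' EXIT

# Stand-in for a dracut generator that always writes sysroot.mount and
# relies on ConditionPathExists to skip the mount at runtime.
dracut_generator_unconditional() {
    {
        echo "[Unit]"
        echo "ConditionPathExists=/dev/mapper/live-rw"
        echo "[Mount]"
        echo "What=/dev/mapper/live-rw"
        echo "Where=/sysroot"
    } > "$unit_dir/sysroot.mount"
}

# Stand-in for systemd-fstab-generator as described in this report:
# it bails out if a sysroot.mount is already present.
fstab_generator() {
    root_dev=$1
    [ -e "$unit_dir/sysroot.mount" ] && return 0
    {
        echo "[Mount]"
        echo "What=$root_dev"
        echo "Where=/sysroot"
    } > "$unit_dir/sysroot.mount"
}

# A 'normal' boot with a hypothetical root=/dev/sda2 on the cmdline.
dracut_generator_unconditional
fstab_generator /dev/sda2
grep '^What=' "$unit_dir/sysroot.mount"
```

In this model the surviving unit still points at /dev/mapper/live-rw, so on a normal boot its condition fails and nothing mounts the real root device. That is the reason given above for only creating the unit when it is actually going to be used.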
(In reply to Harald Hoyer from comment #18)
> Created attachment 1042645 [details]
> Proposed patch for anaconda
>
> Here is my proposed patch for anaconda. No dracut patch needed.

Right, and if we can't come up with something better I guess I'll go with that. The problem I have is that now we've got 3 layers of stuff deciding how to mount root and handle the cmdline args. We're making this MORE complex instead of less and it's going to be harder to maintain in the future.

if we go with that it'd probably make sense to at least stick in a comment briefly explaining what's going on and maybe referencing this bug.

(In reply to Brian Lane from comment #19)
> (In reply to Harald Hoyer from comment #18)
> > Created attachment 1042645 [details]
> > Proposed patch for anaconda
> >
> > Here is my proposed patch for anaconda. No dracut patch needed.
>
> Right, and if we can't come up with something better I guess I'll go with
> that. The problem I have is that now we've got 3 layers of stuff deciding
> how to mount root and handle the cmdline args. We're making this MORE
> complex instead of less and it's going to be harder to maintain in the
> future.

Well, isn't this a very specialized case of default handling, if "root=" is not present on the kernel command line, which normally leads to kernel panic?

I've fixed this in anaconda-dracut for now.

slightly off topic: bfo.iso [1] works for installs of rawhide, using this source:
https://kojipkgs.fedoraproject.org/mash/rawhide/x86_64/os/

[1] http://dl.fedoraproject.org/pub/alt/bfo/bfo.iso

I installed f23 cinnamon to VirtualBox with it.

References:
http://wiki.sugarlabs.org/go/Fedora_23#Alternate_to_netinstall_.28boot.iso.29

This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life.
Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23

This is fixed.

Hey, I have an idea: Instead of dealing with the complexities of dracut, we could take just those parts needed to boot out of /usr and put them into a small root filesystem, with, say, something like /bin and /lib. And instead of systemd, we could just use some shell scripts, in something like, say, /etc/rc.d/init.d/. It might take a little work initially to decide what has to go into /bin and /lib, and it might take 5 or 10 seconds longer to boot, but just think how much simpler it would be!

Bugzilla is for fixing bugs, not for trolling. And no, this stuff wasn't at all simple before dracut and systemd either.