Created attachment 892670 [details]
v1: add support for btrfs in grub2
Description of problem:
With grub2-2.02 and os-prober-1.58-5 in rawhide, the only thing still blocking booting with /boot on a btrfs subvol, or with /boot as a directory on a rootfs ("/") that is on a subvol, is grubby support.
No more! Attached is the patch to provide that support.
This patch adds getSubvolPrefix() for handling a btrfs subvol
prefix on the filenames for grub2. If getRootSpecifier() results
in a NULL rootspec, then getSubvolPrefix() is executed to extract any
btrfs subvol prefix. With this modification, booting off
/boot on a btrfs subvol is supported.
To determine if booting is to be performed off a btrfs volume
or subvolume, the lines of the entry are scanned to detect
if "insmod btrfs" is present. If it is, then btrfs is assumed.
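The detection step described above can be sketched in a few lines. This is only an illustrative Python sketch, not the C code from the patch; the function name entry_uses_btrfs and the idea of passing an entry as a list of text lines are assumptions made for the example.

```python
def entry_uses_btrfs(entry_lines):
    """Return True if any line of a grub2 menu entry is 'insmod btrfs'.

    entry_lines is assumed to be the raw text lines of one menu entry.
    """
    for line in entry_lines:
        # Compare whitespace-split tokens so leading tabs or spaces do not matter.
        if line.split()[:2] == ["insmod", "btrfs"]:
            return True
    return False
```

For example, an entry containing the lines "insmod gzio" and "insmod btrfs" would be treated as a btrfs boot, while one that only loads ext2 would not.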
If the kernel filename includes only one slash, then there is
no btrfs prefix. Otherwise, the first part defined by "/../"
will be the btrfs prefix if booting btrfs. Unless, of course,
there are 2 slashes in the filename but bootPrefix is zero length
in which case we are booting off a btrfs volume!
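The slash-counting rule above might look roughly like the following. This is a hedged Python sketch of the rule as worded in this description, not the actual getSubvolPrefix() from the patch; the function name, its parameters and the sample paths are all assumptions for illustration.

```python
def subvol_prefix(kernel_path, boot_prefix, is_btrfs):
    """Guess the btrfs subvol prefix of a kernel path in a grub2 entry.

    kernel_path: path as written on the linux/linux16 line (assumed absolute).
    boot_prefix: grubby's boot prefix, "" when /boot is not a separate mount.
    is_btrfs:    True if the entry was found to load the btrfs module.
    """
    if not is_btrfs:
        return ""
    slashes = kernel_path.count("/")
    if slashes <= 1:
        # e.g. "/vmlinuz-3.15": nothing in front of the file name.
        return ""
    if slashes == 2 and boot_prefix == "":
        # e.g. "/boot/vmlinuz-3.15" directly on the btrfs volume, no subvol.
        return ""
    # Otherwise the first path component is taken as the subvol prefix.
    end = kernel_path.find("/", 1)
    return kernel_path[:end]
```

With these assumptions, "/root/boot/vmlinuz" with an empty boot prefix yields "/root", while "/boot/vmlinuz" with an empty boot prefix yields no prefix at all.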
Besides the kernel, the subvol prefix must also be on the
initrd filename. The subvol prefix for initrd is based on
the entry's kernel's filename. Note that the lines of
the entry must be scanned to determine if we are booting on
btrfs.
Code was added to addLineTmpl() so that the subvol prefix is not added
for an initrd, since updateInitrd() does that already.
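The point about not adding the prefix twice can be illustrated with a small sketch. This is hypothetical Python, not the patch's C code; the helper name prefixed_initrd and the example paths are assumptions, and the prefix is taken to have been derived from the entry's kernel filename already.

```python
def prefixed_initrd(initrd_path, subvol_prefix):
    """Return initrd_path with the kernel's subvol prefix applied exactly once.

    subvol_prefix is assumed to come from the entry's kernel filename,
    and is "" when there is no btrfs subvol prefix.
    """
    if subvol_prefix and not initrd_path.startswith(subvol_prefix + "/"):
        return subvol_prefix + initrd_path
    # Either there is no prefix, or an earlier step already added it.
    return initrd_path
```

Calling it a second time on its own output leaves the path unchanged, which is the behaviour wanted when two code paths both touch the initrd line.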
Besides the code needed to provide the functionality, this patch
includes debugging code to support the implementation. This
debugging code will have no effect unless DEBUG is enabled.
This patch also includes a couple of minor fixes. In findTemplate()
the saved_entry code is fixed so it works.
Add code findValidEntryByIndex() to test for the
comment "### END /etc/grub.d/10_linux ###" indicating that the
previous entry was the last "valid" entry. This prevents findTemplate()
from looking at entries created by 30_os-prober and 40_custom.
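The end-of-section check can be sketched as follows. This is an illustrative Python sketch under the assumption that grub.cfg is scanned line by line and that each entry starts with a "menuentry" line; the function name count_valid_entries is made up for the example and is not the grubby function.

```python
END_MARKER = "### END /etc/grub.d/10_linux ###"

def count_valid_entries(cfg_text):
    """Count menuentry lines that appear before the 10_linux END marker.

    Entries after the marker (for example from 30_os-prober or 40_custom)
    are not counted.
    """
    count = 0
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped == END_MARKER:
            break  # everything after this comes from other generators
        if stripped.startswith("menuentry "):
            count += 1
    return count
```

An index is then only "valid" for template purposes if it is smaller than the returned count.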
Created attachment 892671 [details]
Lots of dbPrintf() message to aid debugging btrfs support.
http://koji.fedoraproject.org/koji/taskinfo?taskID=6820037 is a Rawhide scratch build with the patch applied, for testing. I'll spin up a live image that includes it and run it through a few tests.
I put an email out to test.org advertising the patch and asking for volunteers, but this build is a better source.
This is OK for testing if nothing currently working breaks but to really do anything with btrfs, you need an updates image for anaconda such as the ones I have on ftp://czarc.net
oh, you did one already? I was about to do that next. That'll be the ones in ftp://czarc.net/pub/updates/ , I guess?
So I built a live image with my patched grubby build; it also has os-prober 1.58-5. I applied ftp://czarc.net/pub/updates/anaconda-21.35-1a-updates.img , and used custom partitioning to create a layout where /boot was a btrfs subvol.
Installation was successful, but attempting to boot the installed system produces a kernel panic. Screenshot of the VM will be attached.
Created attachment 893044 [details]
screenshot of kernel panic from attempt to boot system with /boot as btrfs subvol
Created attachment 893046 [details]
grub.cfg from install where boot fails
(In reply to Adam Williamson from comment #6 & #7)
The grub.cfg is malformed. It only contains a linux16 entry, no initrd entry. I'm not sure why it's using linux16 instead of linux, or why it doesn't have an initrd.
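A quick way to spot this kind of malformed entry is to scan grub.cfg for entries that have a kernel line but no initrd line. The following is a rough Python sketch, not part of grubby or anaconda; the function name is hypothetical, and it assumes each entry opens with a "menuentry" line and closes with a "}" on its own line.

```python
def entries_missing_initrd(cfg_text):
    """Return zero-based indexes of menu entries with a kernel but no initrd."""
    missing = []
    index = -1
    in_entry = has_kernel = has_initrd = False
    for line in cfg_text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "menuentry":
            index += 1
            in_entry, has_kernel, has_initrd = True, False, False
        elif in_entry and words[0] == "}":
            if has_kernel and not has_initrd:
                missing.append(index)
            in_entry = False
        elif in_entry and words[0] in ("linux", "linux16"):
            has_kernel = True
        elif in_entry and words[0] in ("initrd", "initrd16"):
            has_initrd = True
    return missing
```

Both the linux/initrd and linux16/initrd16 spellings are accepted, since either pair may appear depending on how the file was generated.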
My recollection from F20 testing is that there was a change in anaconda, possibly threading, where it would call grub2-mkconfig before dracut finished building the initramfs, and hence we routinely get a grub.cfg without an initrd line. But then grubby seemed to fix this problem for /boot on ext4 but doesn't fix it on Btrfs. So grubby is covering up an anaconda bug; grub2-mkconfig shouldn't be called until dracut is complete.
Adam, first of all, you need to have DEBUG enabled so that the spew of messages can give some idea of what is going on. The messages will be in anaconda.log for an install.
Second, I do not have your exact build but I was able to use a rawhide root.iso (both x86_64 and i686) to install with updates= pointed to one of my update images.
I looked at the koji task cited but I cannot seem to find the related src.rpm so that I could see exactly what was included.
I have had some "strange" situations where things happened differently when DEBUG was enabled or not. I will run a test on my side with DEBUG not enabled for a rawhide install.
Adam, bad news ... I cannot duplicate your problem.
I just did an x86_64 qemu-kvm gui install using the rawhide root.iso with anaconda 21.34-1, my updates image anaconda-21.35-1a-updates.img, and the non-debug grubby in my local rpm repository which I added to the install. Installed with no problems and booted with no problems with /, /home and /boot on btrfs subvols.
I selected an LXDE desktop since this is easier to run on a virtual system.
BTW, the linux16/initrd16 is something new with grub2-2.02 and has something to do with the boot protocol ... I don't understand it, it does work, and the grub2 folks seem to believe it is important. Also, if you have grub2-2.02 installed but an old format grub.cfg file with linux/initrd, things still work just fine.
And now I am having another problem with anaconda. When I attempt to install using either the i386 or x86_64 boot.iso with anaconda-21.35-1, I get a blank (gray) screen with nothing on it.
What is really strange is that I installed with these boot.iso earlier and it worked fine.
BTW, the boot parameter inst.debug=1 does not work.
Strange problem with anaconda. The patches to pyanaconda/bootloader.py and grubby.c are working, but there is "now" a very long pause entering and leaving custom disk configuration. The virtual system is not locked up, in that you can switch to a virtual terminal. The vcpu looks to be about 35% busy, so something is out there "spinning". Then again, I was giving a liveusb install a try (it has the patched anaconda and the updated grubby) AND IT STARTED WORKING as normal again???
Anyway, if there is some way you could get me access to the iso image you had, maybe I can figure out what is wrong.
OK, I created a small LXDE livecd and then installed that on a USB flash drive. Booted the live image and checked to make sure all of the packages and repos were correct. Then did a live install. That worked, but when I rebooted I got boot errors.
I rebooted the live USB, mounted all of the subvols and chroot'ed into the system. I then attempted to run grub2-install and grub2-mkconfig ... both errored out.
I then rebooted anaconda in rescue mode, chrooted in, ran both grub2-install and grub2-mkconfig successfully.
Reboot and it worked just fine.
BTW, I checked anaconda.program.log, locating the execution of "new-kernel-pkg" and, sure enough, there were the huge number of grubby messages ending with a successful update.
I have not seen anything like this from a boot.iso. I am now building a "full monty" rawhide DVD iso with patched anaconda and updated grubby.
Oh, and even better, the liveUSB had an "old" kernel [a new kernel was just installed in rawhide] so I got a chance to test the kernel install ... worked great.
Scratch build of a grubby package with the debugging patch applied failed: http://kojipkgs.fedoraproject.org//work/tasks/4236/6824236/build.log . I can probably disable 'make check' to make it work, but interesting that just enabling debug mode apparently causes the tests to fail.
Sorry about that. Should have mentioned that previously. When DEBUG is enabled, there is some change to the output of grubby which causes some tests to fail. You indeed need to disable #%check when DEBUG is enabled.
When you do get something built and another iso built, is there a way I could get hold of that iso so that I could try and see what is happening?
I have a live iso I built myself but that will not be the same as yours.
See https://bugzilla.redhat.com/show_bug.cgi?id=983685 for hang problem. With patch applied and using the updated systemd, the install works great.
OK, I looked at the grub.cfg and the kernel panic is because the initrd16 line is missing. I need anaconda.program.log with DEBUG enabled so I can see if grubby is contributing to the problem.
I also had a problem running /usr/bin/liveinst. With systemd-212-4 applied, the udevadm settle problem goes away so you can actually get through the install. However, the boot failed. It looks like grubby was OK but was handed a bad grub.cfg ... at least entry=0 was the rescue kernel and entry=1 the main kernel. I am not sure what is going on here but I do not see a problem with grubby itself.
I am currently running pungi to build a new iso so I can test doing a regular install.
I also need to look into anaconda more. While kickstart should work fine with just btrfs re-enabled, the gui needs some work so that if it is btrfs AND you are allocating /boot AND there is no space left to allocate a small ext4 partition, it will try using a subvol rather than erroring out.
That GUI bit ought to work already, I think.
Heh. I think I may have put grubby-debuginfo .rpm but not grubby .rpm in my first test iso. *oops*
building a new one now with the debugging patch included to see how that goes.
Oops ... Houston, we have a problem ... but not with grubby.
Built a livecd-lxde iso with just "as of this morning 20140509" rawhide.
Booted up and ran /usr/bin/liveinst
produces something bad so that there is a boot error ... this can be corrected by booting up in rescue mode and running grub2-install /dev/vda as well as grub2-mkconfig -o /boot/grub2/grub.cfg
Opening a new BZ report as soon as I gather the supporting doc.
OK, I am at a loss here. What does " Custom partitioning screen rendering broken at 1024x768 in Russian" have to do with grubby and btrfs?
Also, rawhide testing resuming since I learned that the installer works with 2 or more vcpu. I have also started testing on a dual core hardware. So far, it just plain works.
"OK, I am at a loss here. What does " Custom partitioning screen rendering broken at 1024x768 in Russian" have to do with grubby and btrfs?"
Their tabs were open next to each other in my browser. =) Just a mistake, that's why I changed it.
I have a new website http://czarc.org
updates.img files for anaconda-21.37 and updated patch files for grubby-8.35 are there. There is also a yum repository with updated rpms:
Also, see https://bugzilla.redhat.com/show_bug.cgi?id=1099627
live install again has a good grub.cfg
gene: can't seem to get to your site at present.
Problem at my end. Now fixed.
NOTE: The current anaconda (21.39-1) has the patches to enable booting off btrfs and LVM LVs, as well as fixes for Live Install.
I've built a live image with anaconda 21.39 and a patched grubby build, and done one successful install run already. However, on a second run, there seems to have been an exception in anaconda at run time. I believe this is affected by the existing layout of the disk:
18:03:50,064 DEBUG anaconda: running handleException
18:03:50,064 DEBUG anaconda: Traceback (most recent call last):
File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 227, in run
threading.Thread.run(self, *args, **kwargs)
File "/usr/lib64/python2.7/threading.py", line 766, in run
File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 189, in storageInitialize
File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 471, in reset
File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2069, in populate
File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2133, in _populate
File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1245, in addUdevDevice
File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1859, in handleUdevDeviceFormat
File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1703, in handleBTRFSFormat
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 4713, in __init__
super(BTRFSVolumeDevice, self).__init__(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 4627, in __init__
super(BTRFSDevice, self).__init__(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 2231, in __init__
super(ContainerDevice, self).__init__(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 567, in __init__
self.format = fmt
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 1038, in <lambda>
lambda d,f: d._setFormat(f),
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 4733, in _setFormat
self.name = "btrfs.%d" % self.id
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 406, in <lambda>
lambda s, v: s._setName(v),
File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 686, in _setName
raise errors.DeviceError("Cannot rename existing device.")
DeviceError: Cannot rename existing device.
OK, I think that's not related to the btrfs /boot work. Well, so far this looks OK in my testing: I was able to install with /boot as a btrfs subvol, and anaconda correctly prevented me from making it a subvol of an *encrypted* btrfs volume. I can boot both regular and rescue kernels on the installed system. Next to check installing kernel updates.
Created attachment 903279 [details]
add code to validate grub2 entry
Created attachment 903280 [details]
v2 add support for btrfs when grub2 is bootloader
refactored with just update to support /boot on btrfs
Created attachment 903281 [details]
v2 add lots of debugging
added code to support development and debugging of support for /boot on btrfs
working on adding tests and will post that patch as soon as I get it.
So far, my tests with the refactoring of the single patch into three patches are working well.
Created attachment 903283 [details]
v2 add code to validate a grub2 entry
Oops. Fixed problem in src.rpm and forgot to update repository. Now fixed.
Created attachment 906996 [details]
v2.1 add support for btrfs when grub2 is bootloader
updated to support add-kernel & initrd in same execution
Created attachment 906997 [details]
v2.1 add compile time enabled debugging code for btrfs
Created attachment 906998 [details]
v2.1 add tests for btrfs support
Created attachment 907052 [details]
v2.2 add support for btrfs when grub2 is bootloader
bugfix whitespace error
as you seem to be revving the patches quite a bit, it might be interesting to carry your work as a git branch? you could fork grubby git upstream onto czarc.org, keep czarc's master branch synced with upstream, and add your work as a branch?
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.
More information and reason for this action is here:
This message is a reminder that Fedora 23 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 23. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 23 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
This is still an issue in F25.
What is the current remaining state of this issue?
Grub2 has been working with btrfs for years; I use btrfs for /boot on Ubuntu and Arch derivative distros all the time. What is holding Fedora back here?
I don't think anyone who could potentially fix it is very interested in doing so. Note that the anaconda development team was drastically reduced in size last year.
The distinguishing feature in Fedora is grubby, which other distros don't use. We can't stop using it in Fedora for...various reasons.
I think an awful lot of users who install /boot in btrfs also boot with something like rEFInd (I do).
Perhaps the shortest-path workable solution for btrfs users would be for anaconda to have a simple option to skip installation of a bootloader; then it doesn't need to complain where the user put /boot, and grubby isn't a problem.
I actually find it quite annoying when other distros don't have such an option; the result is 2 launchers in rEFInd for the same install, the /boot that it finds on its own, and also the bootloader that was installed.
This is effectively a duplicate of bug 864198. Anyway, Gene Czarninski died two years ago, so he's not around to discuss the proposed patches, which were submitted upstream and initially accepted but then rejected for unspecified reasons, with work not progressing further.
What's needed to progress it further? A volunteer who can fix the bug and submit it upstream. Whether the patches in this bug give a clue how to go about the problem, I don't know.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
Thank you for reporting this bug and we are sorry it could not be fixed.
I have updated the code and submitted a pull request.
COPR builds are available here:
grubby-8.40-10.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-c0e98e4635
Proposed as a Freeze Exception for 28-beta by Fedora user npmccallum using the blocker tracking app because:
This was actually ready to go before the freeze, but I was advised to wait for bodhi to get testing. However, this means that we can't land in the beta, which is where we'd get most of our testing. This directly affects the ability to ship the parallel feature in anaconda in beta. So we'd like an exception.
The feature itself has multiple (upstream) tests that all pass during rpm build. It has also been manually tested by multiple people.
I believe this is a prime candidate for a freeze exception precisely because it affects the install process. In F27, it is impossible to install with this configuration. But, if this is accepted as a freeze exception, F28 will be able to install in this configuration.
grubby-8.40-10.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-c0e98e4635
*** Bug 864198 has been marked as a duplicate of this bug. ***
Discussed during the 2018-03-12 blocker review meeting: 
The decision to classify this bug as an AcceptedFreezeException was made as this is a feature that people have been asking for for a long time that would add significant value to the Fedora 28 release, and if we want to add that feature, it would be best to test it in Beta instead of Final.
grubby-8.40-10.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
grubby-8.40-8.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-badf6d0f9e
grubby-8.40-8.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-badf6d0f9e
grubby-8.40-8.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
Still doesn't work in Fedora 31