Bug 864198 - grubby fatal error updating grub.cfg when /boot is btrfs
Status: ASSIGNED
Product: Fedora
Classification: Fedora
Component: grubby
Version: 24
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Duplicates: 895606 1012646 1045790 1207390
Depends On: 1094489
Blocks: 689509 989644
Reported: 2012-10-08 16:32 EDT by Chris Murphy
Modified: 2016-08-13 21:40 EDT
Doc Type: Bug Fix
Type: Bug

Attachments
grubby debug output (4.71 KB, text/plain), 2012-10-23 12:07 EDT, Chris Murphy
grub.cfg (5.23 KB, text/plain), 2012-10-23 12:07 EDT, Chris Murphy
info requested from comment 5 (3.09 KB, text/plain), 2012-10-24 11:13 EDT, Chris Murphy
address potential security problem in grubby findtTemplate() (1.15 KB, patch), 2013-11-21 08:23 EST, Gene Czarcinski
if grub2 and btrfs, use grub2-mkconfig (4.30 KB, patch), 2013-11-21 08:25 EST, Gene Czarcinski
when rootfs is btrfs, use grub2-mkconfig instead of grubby (4.69 KB, patch), 2013-11-22 13:56 EST, Gene Czarcinski
Suggestion for grubby (based on 031447907d40441e7c8778bb4b9feb496659a632) (3.51 KB, patch), 2015-04-12 10:42 EDT, Philip

Description Chris Murphy 2012-10-08 16:32:37 EDT
Description of problem:
If /boot is on btrfs (as tested, specifically a btrfs subvolume mounted on /boot), grubby fails to update grub.cfg.

Version-Release number of selected component (if applicable):
grubby-8.20-1.fc18.x86_64
kernel-3.6.0-3.fc18.x86_64


How reproducible:
100% so far

Steps to Reproduce:
1. F18 TC1 anaconda creates root and home subvols, while boot goes on ext4.
2. Create a boot subvol on the existing btrfs volume.
3. cp -a contents of ext4 boot to the btrfs boot subvol.
4. Update fstab, update grub.cfg with grub2-mkconfig; on one attempt I had to relabel with restorecon.
[at this point there is a working bootable system]
5. yum update that includes a kernel update
  
Actual results:
Error message within yum updating:

Installing : kernel-3.6.0-3.fc18.x86_64          96/190 
grubby fatal error: unable to find a suitable template

The resulting grub.cfg lacks an entry for this new kernel.

Expected results:
dlehman has said /boot as a btrfs subvol should be possible for F18, instead of only allowing ext4; in which case I'd expect grubby can handle updating grub.cfg.

Otherwise, this is somewhat cosmetic because I can manually run grub2-mkconfig which takes care of the problem.

Additional info:
I have not tested /boot as btrfs on its own partition, sans any subvolumes, but it seems either grubby should be able to update grub.cfg for /boot on a btrfs subvol, or it should call grub2-mkconfig, which does work.
Comment 1 Chris Murphy 2012-10-08 16:34:28 EDT
Bug 530108 might be related.
Comment 2 Peter Jones 2012-10-23 10:13:57 EDT
Please attach your grub.cfg.  Also, if you could, run "rpm -q --scripts -p $KERNEL_PACKAGE", and manually run each of the grubby commands from %post and %posttrans with "--debug" until it shows the failure.  Then attach the output from that to this bug.
Comment 3 Chris Murphy 2012-10-23 12:07:01 EDT
Created attachment 632180 [details]
grubby debug output

# bash -x /sbin/new-kernel-pkg --package kernel --install 3.6.2-2.fc18.x86_64

produces this line resulting in the error:

/sbin/grubby --grub2 -c /boot/grub2/grub.cfg --add-kernel=/boot/vmlinuz-3.6.2-2.fc18.x86_64 --copy-default --make-default --title 'Fedora (3.6.2-2.fc18.x86_64)' '--args=root=UUID=780b8553-4097-4136-92a4-c6fd48779b0c ' '--remove-kernel=TITLE=Fedora (3.6.2-2.fc18.x86_64)'

Attached is the result of the above with --debug added.
Comment 4 Chris Murphy 2012-10-23 12:07:31 EDT
Created attachment 632183 [details]
grub.cfg
Comment 5 Peter Jones 2012-10-24 08:37:40 EDT
Huh.  So it's trying to find kernels in /boot/boot/ .  Can you show me /proc/mounts and the output of "stat /" and "stat /boot"?
Comment 6 Chris Murphy 2012-10-24 11:13:44 EDT
Created attachment 632845 [details]
info requested from comment 5

Summary from cat /proc/mounts
/dev/sda1 / btrfs rw,seclabel,relatime,space_cache 0 0
/dev/sda1 /home btrfs rw,seclabel,relatime,space_cache 0 0
/dev/sda1 /boot btrfs rw,seclabel,relatime,space_cache 0 0

Summary stat

  File: ‘/’
  Size: 152       	Blocks: 0          IO Block: 4096   directory
Device: 1fh/31d	Inode: 256         Links: 1
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)

  File: ‘/boot’
  Size: 1470      	Blocks: 0          IO Block: 4096   directory
Device: 26h/38d	Inode: 256         Links: 1
Access: (0555/dr-xr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Comment 7 Chris Murphy 2012-10-24 11:21:53 EDT
subvols show up in mountinfo

# cat /proc/self/mountinfo | grep btrfs
34 1 0:29 /root / rw,relatime shared:1 - btrfs /dev/sda1 rw,seclabel,space_cache
41 34 0:29 /home /home rw,relatime shared:26 - btrfs /dev/sda1 rw,seclabel,space_cache
43 34 0:29 /boot /boot rw,relatime shared:27 - btrfs /dev/sda1 rw,seclabel,space_cache

The fstab also shows subvol boot is mounted at /boot. But for some reason grubby is imagining the btrfs volume mounted at its default subvol? And then following the boot subvol as if it were a folder? In that case it would be /boot/boot.

UUID=780b8553-4097-4136-92a4-c6fd48779b0c /                       btrfs   subvol=root     1 1
UUID=780b8553-4097-4136-92a4-c6fd48779b0c /boot                   btrfs   subvol=boot     1 2
UUID=780b8553-4097-4136-92a4-c6fd48779b0c /home                   btrfs   subvol=home     1 2
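
To make the mountinfo reading above concrete: the fourth field of each line is the path of the mounted tree inside the filesystem (here, the subvolume) and the fifth is the mount point. The sketch below is only an illustration, not grubby's code. It maps a path as the running system sees it to a path relative to the top of the btrfs volume, which is presumably how grub.cfg names the kernel when no default subvolume is set. The function names are hypothetical.

```python
import posixpath

# The three btrfs lines from /proc/self/mountinfo quoted in this comment.
MOUNTINFO = """\
34 1 0:29 /root / rw,relatime shared:1 - btrfs /dev/sda1 rw,seclabel,space_cache
41 34 0:29 /home /home rw,relatime shared:26 - btrfs /dev/sda1 rw,seclabel,space_cache
43 34 0:29 /boot /boot rw,relatime shared:27 - btrfs /dev/sda1 rw,seclabel,space_cache
"""


def parse_mountinfo(text):
    """Return (fs_root, mount_point) pairs from mountinfo-formatted text."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # Field 4 is the root of the mount inside the filesystem,
            # field 5 is where it is mounted.
            pairs.append((fields[3], fields[4]))
    return pairs


def path_in_volume(path, mountinfo_text):
    """Map an absolute path to its location relative to the volume top level."""
    best = None
    for fs_root, mount_point in parse_mountinfo(mountinfo_text):
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Keep the longest mount point that covers the path.
            if best is None or len(mount_point) > len(best[1]):
                best = (fs_root, mount_point)
    if best is None:
        raise ValueError("no mount covers " + path)
    fs_root, mount_point = best
    rel = posixpath.relpath(path, mount_point)
    return posixpath.normpath(posixpath.join(fs_root, rel))
```

With the layout above, `path_in_volume("/boot/vmlinuz-3.6.2-2.fc18.x86_64", MOUNTINFO)` gives a path that already starts with `/boot`, so a tool that then prepends the `/boot` mount point would end up looking in `/boot/boot/`. With only the `root` subvolume mounted at `/`, the same mapping yields a `/root/boot/` prefix.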
Comment 8 Chris Murphy 2012-10-26 12:02:04 EDT
I'm not sure how grubby looks for subvolumes, if at all? But there was a change in subvolume list format, which affected anaconda in Bug 868468.
Comment 9 Chris Murphy 2013-01-05 16:35:31 EST
This bug is still alive, and anaconda will let the user create a layout as described in comment 7, meaning that post-install, users who produce this layout will need to manually run grub2-mkconfig to update grub.cfg after any kernel update.
Comment 10 Reartes Guillermo 2013-01-18 11:18:01 EST
Bug 895606 seems like a DUP of this one.
Comment 11 Dmitri 2013-03-04 10:27:40 EST
I have the same issue, though my grubby seems to be trying to access /root/boot/*:

DBG: Image entry failed: access to /root/boot/vmlinuz-3.7.9-205.fc18.x86_64 failed

When I ran anaconda, I used the default btrfs partitioning scheme iirc, which gives me the following fstab line (I added compression later):

UUID=b67f524b-ae87-4409-ba81-4c1d1ede4c64 /                       btrfs   subvol=root,compress=lzo     1 1
Comment 12 Chris Murphy 2013-03-04 17:35:09 EST
(In reply to comment #11)
default btrfs partition scheme puts / on btrfs and /boot on ext4. So your install is custom, which is why you experience this problem also. FWIW if you use one of these, you'll get a correct grub.cfg that includes the new kernel:

BIOS
grub2-mkconfig -o /boot/grub2/grub.cfg

UEFI
grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
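
A small sketch of choosing between the two commands above. It assumes that the directory /sys/firmware/efi exists only when the system was booted through UEFI; the function names are hypothetical, and the two output paths are the ones given in this comment.

```python
import os

BIOS_CFG = "/boot/grub2/grub.cfg"
UEFI_CFG = "/boot/efi/EFI/fedora/grub.cfg"


def grub_cfg_path(efi_booted):
    """Return the grub.cfg location for a BIOS or UEFI boot."""
    return UEFI_CFG if efi_booted else BIOS_CFG


def mkconfig_command(efi_booted=None):
    """Build the grub2-mkconfig argument list for this machine."""
    if efi_booted is None:
        # Assumption: this directory is present only on UEFI boots.
        efi_booted = os.path.isdir("/sys/firmware/efi")
    return ["grub2-mkconfig", "-o", grub_cfg_path(efi_booted)]
```

The returned list could be passed to `subprocess.run` as root; it is left as data here so the choice can be inspected before anything is written.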
Comment 13 poma 2013-04-11 14:38:44 EDT
(In reply to comment #12)
> (In reply to comment #11)
> default btrfs partition scheme puts / on btrfs and /boot on ext4. So your
> install is custom, which is why you experience this problem also. FWIW if
> you use one of these, you'll get a correct grub.cfg that includes the new
> kernel:
> 
> BIOS
> grub2-mkconfig -o /boot/grub2/grub.cfg
> 
> UEFI
> grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

If this is indeed the default with respect to Btrfs management, can it be concluded that Fedora does not really support Btrfs in the true sense of the word?
Comment 14 Chris Murphy 2013-04-24 10:39:17 EDT
(In reply to comment #13)
Fedora is a project, it's ever changing and evolving, so I don't know what you mean by "support". I also don't know what you mean by "this" or "management". So if you want to try to write a more specific and coherent question, I'll try to answer it.

There are some valid reasons for putting /boot on ext4 by default, even though GRUB2 can handle /boot on btrfs or as a btrfs subvol. As for the grub.cfg being in different locations depending on BIOS vs UEFI, I think that's a legitimate gripe, and I'd like to know more why there's a difference in location, as it's inconsistent with upstream GRUB.
Comment 15 poma 2013-04-26 14:29:25 EDT
(In reply to comment #14)

I do understand, the Fedora is limited by the upstream tools capabilities.
At the moment of playfulness I forgot what it's all about. :)
Besides even Extlinux can't cope with multi device btrfs.

Syslinux 5.10 2013 - syslinux-5.10-pre2:
http://git.kernel.org/cgit/boot/syslinux/syslinux.git/tree/extlinux/main.c?id=syslinux-5.10-pre2#n1208
http://git.kernel.org/cgit/boot/syslinux/syslinux.git/tree/extlinux/main.c?id=syslinux-5.10-pre2#n1155
http://git.kernel.org/cgit/boot/syslinux/syslinux.git/tree/core/fs/btrfs/btrfs.c?id=syslinux-5.10-pre2#n366

Booting from Hard Disk...

SYSLINUX 4.05 EDD 2011-12-09 Copyright (C) 1994-2011 H. Peter Anvin et al
warning: only support single device btrfs
EDD: Error 0100 reading sector <lotto numbers?> :)
Comment 16 Matthew Miller 2013-05-11 09:51:17 EDT
(In reply to comment #15)
> I do understand, the Fedora is limited by the upstream tools capabilities.
> At the moment of playfulness I forgot what it's all about. :)
> Besides even Extlinux can't cope with multi device btrfs.

For this reason, I'm only talking about ext2/3/4 for Extlinux in Fedora 19.
Comment 17 Chris Murphy 2013-05-13 13:49:48 EDT
Still reproducible in F19, both with /boot on its own mount point set to Btrfs device type, and also with just a / mount point (which is likely what happened in comment 11).

1. Installation destination > Choose a new blank disk.
2. Installation options > set partition scheme to Btrfs, choose to modify/review.
3. Manual partitioning > create a new mount point /. By default this will be Btrfs device type, and / will be installed to a subvolume named root.
4. Proceed with the installation.
5. Reboot.
6. yum update kernel

Result:
grubby fatal error: unable to find a suitable template

Setting version to 19.
Comment 18 Chris Murphy 2013-09-24 21:33:19 EDT
Still a bug in F20 alpha and also with rawhide kernels, so I'm flipping version to rawhide.
Comment 19 Chris Murphy 2013-09-27 15:29:40 EDT
Proposing as a Fedora 20 beta blocker because it's the actual cause of bug 1012646. "A system installed without a graphical package set must boot to a working login prompt without any unintended user intervention, and all virtual consoles intended to provide a working login prompt must do so." And also "No part of any release-blocking desktop's panel (or equivalent) configuration may crash on startup or be entirely non-functional."


Brief summary: The installer allows the user to place boot on Btrfs subvolumes, but because of this bug grubby doesn't add the necessary initrd line to the installer created grub.cfg, so on reboot from a successfully installed system, it kernel panics.
Comment 20 Gene Czarcinski 2013-09-27 16:41:27 EDT
There are a number of "small" but significant problems that make using btrfs difficult (this one included).  They really need to be addressed sooner rather than later.  While some folks might not want to use BTRFS because it is still experimental, others will be precluded from using it because they cannot get their system installed on it.
Comment 21 Chris Murphy 2013-10-02 15:39:13 EDT
Adding to btrfs tracker.
Comment 22 Mike Ruckman 2013-10-02 16:27:08 EDT
Discussed this in the 2013-10-02 Blocker Review Meeting [1]. This has been voted a RejectedBlocker. This does not clearly violate any F20 beta criteria and is a corner case. Therefore, it does not qualify as a release blocking bug for f20 beta.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2013-10-02/
Comment 23 Gene Czarcinski 2013-10-03 06:13:43 EDT
While it is not a blocker, I believe that fixing it should be given a higher priority than it obviously has.

Fortunately, you can run grub2-mkconfig -o /boot/grub2/grub.cfg (or the UEFI version) after updating the kernel and have everything still work properly.
Comment 24 Chris Murphy 2013-10-09 14:17:01 EDT
I'm uncertain how this doesn't clearly violate the F20 beta criterion that the system needs to boot to either a desktop or a working login prompt. Proposing as a final blocker for the same reasoning; it is also a regression from F19, the system as installed is bootable there.

Rejecting blocker on the basis that boot on btrfs is unsupported is a better explanation; but I haven't heard anyone clearly say that boot on btrfs is unsupported.
Comment 25 Adam Williamson 2013-10-11 06:53:56 EDT
Chris: "I'm uncertain how this doesn't clearly violate the F20 beta criterion that the system needs to boot to either a desktop or a working login prompt."

Because of:

https://fedoraproject.org/wiki/Blocker_Bug_FAQ#What_about_hardware_and_local_configuration_dependent_issues.3F

We can't, practically speaking, accept absolutely any bug that can cause some extremely obscure configuration not to boot as a blocker bug. There are always going to be obscure bugs in kernel drivers and things.

There is some tension between that note and this language in the criteria right before the 'Expected installed system boot behavior' section:

"Except where otherwise specified, each of these requirements applies to all supported configurations described above."

which we should resolve a bit more clearly, but in practice, it has always been the case that blocker status is not guaranteed for very configuration-dependent bugs.

I do think this wasn't a complete slam-dunk 'non blocker' for Beta, it could possibly have been taken as one, since we do have some reasonably clear requirements for custom partitioning to work at Beta. But rejecting it as a blocker was acceptable under the current policy. I could see it possibly being accepted as a Final blocker, though. It is a pretty subjective call with the way things are currently written. It would be good if we could make it clearer, but it's very very hard to come up with an 'objective' written policy that covers every possible scenario sensibly :/
Comment 26 Adam Williamson 2013-10-11 06:58:43 EDT
We should definitely document this at least, noting that before I forget.
Comment 27 Chris Murphy 2013-10-11 11:30:17 EDT
(In reply to Adam Williamson from comment #25)
The problem with the locality exception is that this is universal; it's not at all analogous to the video card example. It always happens when /boot is on btrfs for everyone, and the installer makes it easy to arrive at this configuration. How is this configuration obscure?

If this were ext4 or XFS, the controversy would be it taking more than 12 hours to fix, rather than 12 months. The real issue isn't the configuration, which clearly has legitimate technical/resource limitations to make work, but whether btrfs is to have problem resolution parity with other file systems or if there's an exception.
Comment 28 Adam Williamson 2013-10-11 12:51:48 EDT
*** Bug 1012646 has been marked as a duplicate of this bug. ***
Comment 29 Adam Williamson 2013-10-11 12:57:15 EDT
Chris: it is dependent upon a specific configuration: the configuration that /boot is on a btrfs volume. The determination is entirely based on how common/important that configuration is considered to be.
Comment 30 Gene Czarcinski 2013-10-11 13:42:09 EDT
OK, I cannot help myself.

Anything involving BTRFS seems to take a low priority.  Yes, if you remember to run grub2-mkconfig after the kernel update or you have /boot on a regular partition or even a logical volume, then things work.  Why is there reluctance to address this problem?  Twelve months is a little silly, isn't it?

Another example is https://bugzilla.redhat.com/show_bug.cgi?id=892747 where the existing anaconda code simply ignores any existing btrfs subvolumes if they are specified in kickstart.  You can add some code to your post-install script to fixup your /etc/fstab but you should not have to do that.

Anyway, concerning grubby, if someone has the time, I suggest they take a look at the code and propose a fix.
Comment 31 Chris Murphy 2013-10-11 14:34:08 EDT
Adam: The bug prevents the configuration from being common or important. The workaround is esoteric. Therefore the determination is entirely based on circular reasoning. And that is a self-justification recipe for any btrfs bug being put on an infinite back burner with no plan of how to move forward.
Comment 32 Adam Williamson 2013-10-12 08:12:24 EDT
Chris, you're exaggerating drastically. A bug is not 'on an infinite back burner' because it was rejected as a release blocker. I'm only concerned with the release blocker determination, here. What priority the bug is assigned is up to the anaconda team. Obviously they kind of have to give blocker bugs a high priority, but the converse is not true: they are not required to give *non* blocker bugs a *low* priority.
Comment 33 Chris Murphy 2013-10-12 14:32:59 EDT
Adam: Except it's a grubby bug. And except you're confused about my complaint, which is usage of a logical fallacy to justify the rejection. And I don't want it applied to either this or future similar bugs: oh well, it's a rare configuration, so we won't block because of that, even though the bug is what causes it to be a rare configuration and it otherwise meets the requirements for blocker status.

I don't find that a convincing argument for any bug, let alone this one.
Comment 34 Chris Murphy 2013-10-14 09:54:19 EDT
*** Bug 895606 has been marked as a duplicate of this bug. ***
Comment 35 Tim Flink 2013-10-15 18:16:56 EDT
This was rejected as a beta blocker and re-proposed as a final blocker, changing whiteboard and blocks field to reflect this
Comment 36 Mike Ruckman 2013-11-12 16:13:55 EST
I followed the steps in comment 17 in a VM and it resulted in a functioning system. The updated kernel doesn't show as a boot option, but the system boots fine. 

Running:  grub2-mkconfig -o /boot/grub2/grub.cfg

Fixes the boot options and results in a working system with the F20 Beta DVD image.
Comment 37 Adam Williamson 2013-11-12 20:17:37 EST
"The updated kernel doesn't show as a boot option, but the system boots fine. "

That would be the expected outcome, yes.

I haven't tested it, but you could wind up in trouble after several kernel updates, when yum starts 'cleaning up' ones more than three releases old - though I believe it would never 'clean' the currently-running kernel.
Comment 38 Adam Williamson 2013-11-12 20:18:52 EST
Oh, sorry, I forgot that the bug as reported is that grubby creates a new entry but doesn't add the initrd line.
Comment 39 Adam Williamson 2013-11-13 14:25:25 EST
Discussed at 2013-11-13 blocker review meeting - http://meetbot.fedoraproject.org/fedora-blocker-review/2013-11-13/f20-final-blocker-review-1.2013-11-13-17.01.log.txt . Accepted as a blocker as a conditional violation of criterion "The installed system must be able to download and install updates with the default console package manager." - the update is not correctly 'installed' in the case of /boot being on btrfs (the updated kernel is not bootable), and this may have security implications (user believes a kernel that fixes a security bug has been installed and should be in use, but this is not the case).
Comment 40 Chris Murphy 2013-11-14 04:22:38 EST
summary: this bug has two consequences, a.) Fedora 20 installation when /boot is on Btrfs fails to boot, kernel panics, due to lack of initrd entry in the grub.cfg; b.) Fedora 20 kernel updates fail to be added to the grub.cfg.
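
Both consequences can be checked for in an existing grub.cfg. The following is a rough sketch, not part of grubby or grub2: it scans menuentry blocks, records the path on the kernel line, and notes whether an initrd line is present. The function names are hypothetical, and the parsing is deliberately naive (it assumes the usual generated layout with one command per line).

```python
import re

# GRUB command names that load a kernel or an initramfs.
LINUX_CMDS = {"linux", "linux16", "linuxefi"}
INITRD_CMDS = {"initrd", "initrd16", "initrdefi"}


def menu_entries(cfg_text):
    """Return one dict per menuentry: title, kernel path, has_initrd."""
    entries = []
    current = None
    for raw in cfg_text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "menuentry":
            match = re.match(r"\s*menuentry\s+(['\"])(.*?)\1", raw)
            current = {
                "title": match.group(2) if match else "",
                "kernel": None,
                "has_initrd": False,
            }
            entries.append(current)
        elif current is not None:
            if words[0] in LINUX_CMDS and len(words) > 1:
                current["kernel"] = words[1]
            elif words[0] in INITRD_CMDS:
                current["has_initrd"] = True
            elif words[0] == "}":
                # End of the current menuentry block.
                current = None
    return entries


def check_grub_cfg(cfg_text, kernel_version):
    """List problems: missing entry for a kernel, or entries without initrd."""
    problems = []
    entries = [e for e in menu_entries(cfg_text) if e["kernel"]]
    if not any(kernel_version in e["kernel"] for e in entries):
        problems.append("no entry for kernel " + kernel_version)
    for e in entries:
        if not e["has_initrd"]:
            problems.append("entry '%s' has no initrd line" % e["title"])
    return problems
```

Running `check_grub_cfg(open("/boot/grub2/grub.cfg").read(), version)` after a kernel update would return an empty list when the new kernel has an entry and every entry has its initrd line.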
Comment 41 Gene Czarcinski 2013-11-21 08:22:04 EST
I do not know what the status is for "fixing" grubby.  Nothing has been posted to the grubby git repository so I must assume, "not much."  I have a patch for grubby itself which corrects what I (anyway) consider a potential security problem: in findTemplate() a candidate may be found but is never used; the patch corrects that.  Also, the code at the end of findTemplate() merrily goes through the entire grub.cfg file looking for a candidate ... including the os-prober menuentry and 40_custom and any others ... I added a limit of two.

The second patch is a hack:  it "hurts" to use grubby if btrfs is the root file system.  OK, so do not use grubby if the root file system is btrfs AND this is being done for grub2.  The "hack" is to the bash script new-kernel-pkg.

I have done manual testing so far and will be doing more extensive testing which I will report on later.  See commit comments for more info.
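
The decision the second patch makes can be sketched as follows. This is an illustration of the idea only, written in Python rather than the bash of new-kernel-pkg, and it is not the attached patch; the function names are hypothetical.

```python
def root_fstype(mounts_text):
    """Return the filesystem type of the last mount on / in /proc/mounts text."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, ...
        if len(fields) >= 3 and fields[1] == "/":
            fstype = fields[2]
    return fstype


def choose_updater(mounts_text, bootloader):
    """Pick grub2-mkconfig for grub2 on a btrfs root, grubby otherwise."""
    if bootloader == "grub2" and root_fstype(mounts_text) == "btrfs":
        return "grub2-mkconfig"
    return "grubby"
```

For the /proc/mounts summary in comment 6, `choose_updater(text, "grub2")` selects grub2-mkconfig, while an ext4 root or a different bootloader keeps the existing grubby path.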
Comment 42 Gene Czarcinski 2013-11-21 08:23:30 EST
Created attachment 827180 [details]
address potential security problem in grubby findtTemplate()
Comment 43 Gene Czarcinski 2013-11-21 08:25:05 EST
Created attachment 827181 [details]
if grub2 and btrfs, use grub2-mkconfig

See commit comments
Comment 44 David Lehman 2013-11-21 11:28:12 EST
When/if we have full support in the tools for /boot on btrfs subvolume feel free to open a bug against python-blivet to get automatic partitioning changed accordingly.
Comment 45 Gene Czarcinski 2013-11-21 13:34:55 EST
Either I will or Chris will.  This is really silly because you already support installing /boot on btrfs ... just not in a separate subvolume.  I just completed an install and reboot of a two partition system: 1 is swap and the second is a BTRFS volume with two subvolumes: root ("/") and home ("/home").  Installed with no problems and works fine.

I have been kickstart installing on single-device and multi-device btrfs volumes with /boot on an ext4 partition, with /boot on a subvol, and with /boot simply as a directory on the root subvol.  They all install fine [except for that UUID thing that I would like to see sooner rather than later].

No, most of the problems are not with the installer (anaconda and friends) but with some of the support tools such as grubby, grub2, and os-prober.  There are "fixes" now available to address all the problems.
Comment 46 Adam Williamson 2013-11-21 14:24:33 EST
I don't think the 'hack' is the way to go, here. we just need to bug pjones hard enough to fix grubby.
Comment 47 Gene Czarcinski 2013-11-21 16:12:00 EST
Adam, have you ever looked at that code?  I do not envy pjones trying to fix this problem inside of grubby itself.  IMHO, a new problem could be caused just as easily as fixing this one.

For example, just a quick look at findTemplate() resulted in the attached patch.  Another example is that the new-kernel-pkg --rminitrd flag does not work ... it does nothing but, because apparently rpm cleans up /boot for all files with the version id of the kernel being removed, there is no problem.

Now, I was just about to report some test results.  Using the patches attached to this bugzilla report, I created an updated grubby-8.28-1 rpm.  I then installed using the Fedora-20-Beta DVD on single and dual device BTRFS volumes with /boot on a subvol and /boot being a directory on the root subvol.  

For each of the test system situations:
  1. Running on the 3.11.6 kernel
  2. apply updated rpms for grubby, grub2, grub2-tools, and os-prober (the last three have btrfs support fixes applied)
  3. edit /bin/kernel-install to add "-v" to new-kernel-pkg so they are more verbose
  4. with yum, update (install) kernel 3.11.8 and reboot to run 3.11.8
  5. OK, it runs, reboot back to 3.11.6
  6. running on 3.11.6, yum remove the 3.11.8 kernel

Everything works.  There are some differences since grub2-mkconfig can be somewhat verbose.
Comment 48 Gene Czarcinski 2013-11-22 13:56:20 EST
Created attachment 827953 [details]
when rootfs is btrfs, use grub2-mkconfig instead of grubby

This version has been through additional testing.  Also, unless "-v" (verbose) is specified for new-kernel-pkg, the messages from grub2-mkconfig are redirected to /dev/null and all of the debug messages os-prober sends to syslog are disabled.  Therefore, there is little to see that grub2-mkconfig is used.

And, best of all, it works.
Comment 49 Gene Czarcinski 2013-12-01 15:18:43 EST
After doing a lot more research, I have come to the conclusion that this is a lost cause.  I am not sure where (or if) a plan is laid out but the direction seems to be systemd with kernel-install and the bootloader spec:
http://www.freedesktop.org/wiki/Specifications/BootLoaderSpec/

Although Fedora does not use it yet, the code is starting to get added so that kernel-install will handle dracut creating the initramfs file and depmod getting run.  The question in my mind was, OK, so what.  The bootloader configuration still needed updating and then I found BootLoaderSpec.

A patch is in grub2 to support this and I read that an initial attempt has been made for zipl (s390).  Something for extlinux is still needed.  I can sort-of see the goal but the path getting there is not so clear.

There are still a lot of questions in my mind as to how this is going to work.  For example, I have a system with a rootfs mounted on "/".  Then I have the new, special, common to everyone and writable by all systems /boot partition which is either ext2/3/4 or vfat.  Then (most likely) I have another partition/LVM/btrfs-subvol mounted on a directory in /boot (directory named my machine-id).

I also wonder what release is the target: F21?  F22?

My suggestion for grubby and btrfs:  at this point, either implement something like I did or state the restriction that /boot cannot be on a BTRFS filesystem (and anaconda needs to stop supporting it).  The last option is consistent with the future kernel-install/BootLoaderSpec "plans" as I now understand them.
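
For readers unfamiliar with the BootLoaderSpec direction mentioned above, this is a rough sketch of what a per-kernel drop-in entry looks like under that scheme, as I understand the specification: one small text file per kernel under loader/entries on the boot partition, with keys such as title, version, machine-id, options, linux and initrd. The helper names are hypothetical and the values passed in are placeholders chosen by the caller.

```python
def bls_entry(title, version, machine_id, linux, initrd, options):
    """Render the text of one boot loader entry file."""
    pairs = [
        ("title", title),
        ("version", version),
        ("machine-id", machine_id),
        ("options", options),
        ("linux", linux),
        ("initrd", initrd),
    ]
    # One "key value" pair per line; keys are padded for readability.
    return "".join("%-10s %s\n" % (key, value) for key, value in pairs)


def bls_entry_filename(machine_id, version):
    """Entry file name relative to the boot partition (assumed naming)."""
    return "loader/entries/%s-%s.conf" % (machine_id, version)
```

Under this scheme, installing a kernel means dropping in one new file rather than editing grub.cfg in place, which is why it sidesteps the template search that fails here.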
Comment 50 Peter Jones 2013-12-02 09:31:05 EST
As dlehman said, this needs to be fixed if we want to enable installing to such a configuration, but right now we don't.

I'm all for keeping this open so we can track it and fix it, but this shouldn't be an F20 blocker at all.
Comment 51 Gene Czarcinski 2013-12-02 10:24:33 EST
After having done some research, I believe that pjones and I are now in agreement.  This should not be a blocker and the release notes should say that /boot in BTRFS is not currently supported.  If you want consistency, say to put it on a separate ext234 partition like extlinux requires.

That said, there really needs to be some documentation which lays out the plan for kernel updating and bootloader configuration.  If it is going to be systemd/kernel-install/BootLoaderSpec then that should be spelled out.

Also, why is anaconda installing into an unsupported configuration?
Comment 52 Adam Williamson 2013-12-02 11:45:58 EST
Apparently pjones wasn't aware that anaconda allows this configuration. I've confirmed Gene's report that it does: custom part happily allowed me to create a layout with btrfs /boot , btrfs / , and swap, no warnings or anything.

So, it sounds like pjones considers this an unsupported configuration and it's not realistic to fix it in F20 time frame, so the alternative is simply to fix anaconda not to allow it. We should be able to do that in time.
Comment 53 Gene Czarcinski 2013-12-02 13:53:27 EST
Adam, you put swap on BTRFS???  Now that is something that I believe will not work and I believe that the BTRFS-folks have no plans on ever supporting swap on their filesystem.

Now, if anaconda lets you put swap on btrfs, that is an anaconda bug ... fortunately one that I believe needs to be fixed maybe before F21.

And again, let me say that if the direction is systemd/BootLoaderSpec, then maybe we should force a separate partition either ext234 or vfat.
Comment 54 Adam Williamson 2013-12-02 14:36:42 EST
gene: no, I didn't. I wrote "btrfs /boot , btrfs / , and swap". Note the word 'btrfs' occurs next to '/boot' and again next to '/', but not next to 'swap'.
Comment 55 Gene Czarcinski 2013-12-02 14:42:32 EST
That is good!  However, I am going one step further and installing with just btrfs / --subvol --name=root  seven

You can install onto a system with no swap space defined and it even runs.  I did that accidentally and don't really know what would happen with no swap.
Comment 56 Adam Williamson 2013-12-02 15:02:44 EST
off-topic, this isn't a disk layout discussion forum. bug's 56 comments long already, please help keep it focused. thanks.
Comment 57 Adam Williamson 2013-12-02 16:44:47 EST
Bureaucracy note: as changing anaconda not to allow this layout doesn't really "fix" this bug, we should keep it open but drop the blocker status when we get an anaconda build that doesn't allow the layout any more.
Comment 58 David Lehman 2013-12-03 11:56:07 EST
Patch to disallow /boot on btrfs subvol has been posted for review.
Comment 59 Fedora Update System 2013-12-04 20:12:37 EST
anaconda-20.25.14-1.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/anaconda-20.25.14-1.fc20
Comment 60 Fedora Update System 2013-12-05 16:25:26 EST
Package anaconda-20.25.14-1.fc20, python-blivet-0.23.8-1.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing anaconda-20.25.14-1.fc20 python-blivet-0.23.8-1.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-22800/python-blivet-0.23.8-1.fc20,anaconda-20.25.14-1.fc20
then log in and leave karma (feedback).
Comment 61 Adam Williamson 2013-12-05 18:13:21 EST
The 'fix' here is the big-hammer-workaround of 'no /boot allowed on btrfs', for the record, so take that into account in testing. The grubby bug is not fixed, but you should not be able to hit it on a clean install of F20 because we won't let you put /boot on btrfs.

We shouldn't close the bug, because it still exists, but we can drop blocker status, assuming the 'fix' works, when anaconda goes stable.
Comment 62 Reartes Guillermo 2013-12-06 10:46:29 EST
@BCL

I tested TC5 and tried to put /boot on btrfs.
I tried many things, to see if i found a way (or a crash)... but no.

So the patch to prevent /boot on btrfs does work and i did not find any crash or weird stuff. So it looks stable. (Anaconda 20.25.14-1)
Comment 63 Adam Williamson 2013-12-10 17:37:52 EST
As per confirmation in c#62, the blocker issue is now addressed, so dropping blocker status.
Comment 64 Adam Williamson 2013-12-12 20:31:07 EST
Dropping commonbugs as this is unlikely to be common any more, since we don't allow it.
Comment 65 Will Woods 2014-01-13 12:39:47 EST
*** Bug 1045790 has been marked as a duplicate of this bug. ***
Comment 66 Chris Murphy 2014-02-12 02:46:40 EST
(In reply to Peter Jones from comment #50)
> As dlehman said, this needs to be fixed if we want to enable installing to
> such a configuration, but right now we don't.

Why don't we want to enable installing to such a configuration? Ubuntu, openSUSE, Debian, Arch, all support /boot on Btrfs. The Fedora installer is leading in this area, but grubby is seriously holding Fedora well behind the capability of other distributions.
Comment 67 Fabian Deutsch 2014-04-02 04:49:39 EDT
Hey,

is there actually someone working on solving this?
Comment 68 Gene Czarcinski 2014-04-21 14:10:19 EDT
This is still the long-pole-in-the-tent with respect to btrfs support on Fedora.

My fix/patch is attached.  The package maintainer does not like my solution of doing the fix in bash-shell code and using grub2-mkconfig to do the updating.

As I understand it, his reasoning is that grub2-mkconfig rebuilds all of grub.cfg whereas grubby updates grub.cfg leaving any user edits to grub.cfg in place.  HOWEVER, there is a longstanding statement by grub2 that the grub.cfg file should never be user modified.
Comment 69 Adam Williamson 2014-04-21 14:47:24 EDT
that line's in grub.cfg from upstream, on the assumption that the distro will call grub-mkconfig on kernel installs. it doesn't entirely apply to fedora's design...
Comment 70 Gene Czarcinski 2014-04-21 15:08:21 EDT
Adam, this is the first time that someone mentioned that Fedora's design is different.  In the interest of moving this forward, I see two possibilities:

1. fix grubby.c ... having examined this code, this is no small task!

2. An alternate package to grubby.  The problem here is that this package would need to be specified, installed, and used by anaconda.  I am skeptical that this would be accepted even though I like this option.

As mentioned by cmurph, all the other major distributions support /boot on btrfs.  Why is Fedora (and RHEL) so resistant to fixing this?  [Other than digging into grubby.c looks to be a bit of a nightmare]

One other question: Why is it Fedora's design to allow direct user modifications to grub.cfg?

As I recall, one of Fedora's goals is to keep packages as close to upstream as possible.
Comment 71 Gene Czarcinski 2014-04-21 15:16:04 EDT
Yet another question: Where is it articulated that it is a Fedora design goal to allow/support direct user modifications to grub.cfg (to preserve those changes over the event of a new kernel install)?
Comment 72 Adam Williamson 2014-04-21 15:35:24 EDT
by 'different design', I simply mean the use of grubby. it's not really that we intend to allow user changes to grub.cfg, it's just a side effect.
Comment 73 Adam Williamson 2014-04-21 15:36:17 EDT
"Why is Fedora (and RHEL) so resistant to fixing this?"

we aren't, at all. we just don't consider it a particularly high-priority issue. and pjones won't take a fix he doesn't consider correct/appropriate, and no-one else feels inclined to overrule him.
Comment 74 Gene Czarcinski 2014-04-22 05:47:52 EDT
There are some folks (such as myself and Chris Murphy) who believe that the priority of addressing this "problem" should be raised.  I propose that this discussion should be moved to the devel mailing list to get comments from more individuals.  Comments?

BTW, maybe this would raise the profile of grubby.c so that someone would step up to making it work correctly or, perhaps, there would be a consensus that using bash-script code in place of grubby.c is acceptable after all.

Since many folks supporting Fedora (and Red Hat) products seem to be in love with python, would it be acceptable to replace grubby.c with python code?
Comment 75 Adam Williamson 2014-04-22 12:20:00 EDT
discussions should be on mailing lists, yes.
Comment 76 Helmut Horvath 2014-11-03 04:47:43 EST
Any update on this issue? What is the conclusion of the mailing list discussion? Despite the possible workaround using grub2-mkconfig directly, I wonder why nothing is happening here, despite the willingness of Gene to fix this and the fact that restricting the installer is not the most clever idea (the installer supports subvolumes that trigger this problem too, forbidding that, would be similar to forbidding btrfs at all imho.).
Comment 77 Gene Czarcinski 2014-11-03 08:10:16 EST
A bit too late for Fedora 21. Peter (pjones) is working on it for Fedora 22.

If you need the update, take a look at http://czarc.org
Comment 78 Chris Murphy 2015-01-25 04:34:37 EST
Fedora 21 installer permits /boot on btrfs subvolume. However fedup upgrade of Fedora 21 to Fedora 22 will fail in this case due to this bug. Therefore Fedora 21 installations where /boot is on Btrfs can't be fedup'd unless there's a backport to grubby.fc21.
Comment 79 Chris Murphy 2015-02-07 15:18:01 EST
grubby-8.35-8.fc21.x86_64 still has this behavior but now the failure is silent.
Comment 80 Fedora Blocker Bugs Application 2015-03-20 18:36:02 EDT
Proposed as a Blocker for 22-beta by Fedora user chrismurphy using the blocker tracking app because:

 Same reasoning as in comment 39. What's changed is that Anaconda permits /boot on Btrfs by working around this bug at install time by calling grub2-mkconfig twice:
https://github.com/rhinstaller/anaconda/commit/0fc01e834082f20896728f330faea8e0b200a159
And tested patches to fix this in grubby have been available, just not yet merged.
Comment 81 Adam Williamson 2015-03-23 12:42:53 EDT
Discussed at 2015-03-23 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-23/f22-blocker-review.2015-03-23-16.02.log.txt .

This is a long and complex bug and we're not clear on what exactly is being proposed for blocker status. Chris, can you please clarify exactly what aspect of this bug you'd like to make a blocker, what its practical impact is, and what the expected resolution is?
Comment 82 Chris Murphy 2015-03-23 15:35:23 EDT
dnf/yum updates, gnome-software+systemd offline updates, and fedup upgrades that include new kernels don't have an updated grub menu entry at all due to this bug.

Recommending blocker under the same logic in comment 39 that made this bug a blocker back then: "The installed system must be able to download and install updates with the default console package manager" the update is not completely installed, and can also have security implications since the newer kernel isn't being booted.

What's new since the previous blocker was reverted:
1. Anaconda chose to fix bug 1200539 (a variant on this bug that affects live installs being unbootable) by running grub2-mkconfig twice, rather than what they've done in the past which is to prevent the user from putting /boot on Btrfs (comment 58).
2.  Gene Czarcinski has supplied tested patches (tested by him, AdamW, and me at least), and the last we heard is that Peter is working on this for Fedora 22; that is in comment 77 (Nov 3).

Basically pjones needs to merge Gene's patches into grubby.

The alternatives are: back to David's comment 58 patch, ergo regress back to Fedora 19/20 state where /boot on Btrfs isn't allowed; or do nothing and accept this bug as-is indefinitely.
Comment 83 Chris Murphy 2015-03-23 23:31:52 EDT
Beta criteria includes Btrfs support without exceptions for /boot; and it says the default graphical package manager must install updates, which as comment 39 correctly argued, doesn't happen.

However, a further detail. Using the default graphical package manager, this error is silent. The user only sees "grubby fatal error" with CLI updates.

I think any "not a blocker" suggestions require some explanation; or a challenge (proposed revision) to the release criteria.
Comment 84 Adam Williamson 2015-03-25 21:16:08 EDT
So the executive summary is that anaconda now allows /boot on btrfs again, but grubby was never fixed for its known bugs in this case. OK.

+1 blocker: we should either fix grubby or disallow /boot-on-btrfs once again.
Comment 85 Adam Williamson 2015-03-30 12:47:00 EDT
Discussed at 2015-03-30 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-30/f22-blocker-review.2015-03-30-16.04.log.txt . Accepted as a blocker as a violation of criterion "The installed system must be able to download and install updates with the default console package manager."

As before, we'll note that disallowing /boot-on-btrfs is a valid way to make this bug no longer a blocker.
Comment 86 Brian Lane 2015-04-01 19:54:24 EDT
*** Bug 1207390 has been marked as a duplicate of this bug. ***
Comment 87 Adam Williamson 2015-04-06 19:54:13 EDT
https://github.com/rhinstaller/anaconda/pull/63/files is an anaconda PR which disables /boot-on-btrfs again. That would 'address' this blocker (without resolving the bug).
Comment 88 Bernardo Donadio 2015-04-07 14:19:26 EDT
With btrfs being the proposed default filesystem for F23, wouldn't it be wiser, if this bug can't be fixed, to just forbid /boot from sitting on a subvolume of a btrfs tree?

This bug isn't an issue if we create a btrfs filesystem for / and /home, and another btrfs filesystem just for /boot, is it?

This would kind of break the idea of a single snapshot containing an entirely functioning system, but wouldn't prevent the use of snapshots and other btrfs features on /boot.

Just a thought.
Comment 89 Adam Williamson 2015-04-07 16:53:53 EDT
I've verified that this is 'addressed' in https://admin.fedoraproject.org/updates/libblockdev-0.9-1.fc22,python-blivet-1.0.7-1.fc22,anaconda-22.20.9-1.fc22 , by /boot-on-btrfs being disallowed.
Comment 90 Bernardo Donadio 2015-04-07 17:45:32 EDT
Should then existing F21 users of /boot on btrfs subvolume change the partition layout in order to perform an upgrade to F22?
Comment 91 Adam Williamson 2015-04-07 17:48:09 EDT
AFAIK this bug is no different from F21, grubby is in the same state in both. So if you've got things working in F21 somehow, then you should be fine with F22.
Comment 92 Bernardo Donadio 2015-04-07 19:20:32 EDT
That's the issue: I can't update to F22 because grubby isn't able to write the fedup entry on grub (also, kernel updates aren't possible either).

Yes yes, I know how to do it manually, but there should be an official set of steps to transition from a /boot btrfs subvolume to a standalone non-btrfs partition, shouldn't there?

Meaning: this HOWTO should be in the F22 release notes.
Comment 93 Chris Murphy 2015-04-07 21:20:22 EDT
(In reply to Bernardo Donadio from comment #92)
> That's the issue: I can't update to F22 because grubby isn't able to write
> the fedup entry on grub (also, kernel updates aren't possible either).

Yes, see comment 78. The behavior you describe is also a blocker.

(In reply to awilliam@redhat.com from comment #89)
> I've verified that this is 'addressed' in
> https://admin.fedoraproject.org/updates/libblockdev-0.9-1.fc22,python-blivet-
> 1.0.7-1.fc22,anaconda-22.20.9-1.fc22 , by /boot-on-btrfs being disallowed.

That only addresses new Fedora 22 installs. It doesn't address Fedora 21 updates or upgrades, so that cat's already out of the bag, the only way to actually address this for all three failure types is to fix grubby. And as we know there are tested patches available that have just not been merged for reasons unknown.
Comment 94 Adam Williamson 2015-04-07 23:58:42 EDT
Fixing grubby in F21 is not an F22 release blocker.
Comment 95 Bernardo Donadio 2015-04-08 00:39:09 EDT
"The installed system must be able to download and install updates with the default console package manager."

With this bug, a system with a valid F21 configuration is unable to update to F22 through normal means.

Should the update process be broken, forcing a very disruptive change (as any repartition is) or a complete reinstall onto systems to be updated?

IMHO, this is a blocker per the definition above.
Comment 96 Chris Murphy 2015-04-08 01:29:52 EDT
The more appropriate criterion is the one for upgrades "must be possible to successfully complete an upgrade". This has been used to block version n in order to compel fixes in n-1 components necessary to complete the upgrade. This just happened with bug 1185604.
Comment 97 Adam Williamson 2015-04-08 02:59:16 EDT
We do that when upgrade is completely broken, not conditional failures. Upgrade's never been 100%, and we've never considered it as such.

21 Final went out with this broken, no-one died. 22 Beta is going to go out with it less broken. I don't see the value in blocking the Beta release for this.
Comment 98 Chris Murphy 2015-04-08 10:13:34 EDT
I don't see the value in holding up the merging of a tested fix without comment, and no alternative mechanism to achieve that goal has been presented.
Comment 99 Adam Williamson 2015-04-08 10:28:18 EDT
So I checked with pjones, and the reason he's not taking the patches is because he thinks they are a bad approach. The correct way to divine information about the filesystem is not to parse path names, but to ask the filesystem. (I suspect the bit of the patch that's *explicitly labelled as being a big hack* might also be a problem.)
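
For illustration only, "asking the filesystem" could look like the shell sketch below. It is not what grubby does and not something pjones proposed; grubby is C and would presumably use the equivalent system calls. The sketch assumes GNU coreutils stat and relies on btrfs giving every subvolume root the inode number 256. All function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers that query the filesystem instead of parsing path names.

# is_btrfs PATH: succeed if PATH lives on a btrfs filesystem.
is_btrfs() {
    [ "$(stat -f -c %T "$1" 2>/dev/null)" = btrfs ]
}

# is_subvol_root_inode INODE: btrfs subvolume roots have inode number 256.
is_subvol_root_inode() {
    [ "$1" = 256 ]
}

# is_btrfs_subvol_root PATH: succeed if PATH is the root of a btrfs subvolume.
is_btrfs_subvol_root() {
    is_btrfs "$1" && is_subvol_root_inode "$(stat -c %i "$1" 2>/dev/null)"
}
```

A call such as `is_btrfs_subvol_root /boot` would then tell a script whether /boot is a subvolume without looking at how the path is spelled.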
Comment 100 Philip 2015-04-12 10:42:42 EDT
Created attachment 1013654 [details]
Suggestion for grubby (based on 031447907d40441e7c8778bb4b9feb496659a632)

Previous suggestions included changing new-kernel-pkg or calling grub2-mkconfig directly. I have taken a look at the grubby code. I'm not too familiar with grubby, I hope I'm not repeating something.

One affected system is a default Fedora 21 installation (except that btrfs was selected), so the root filesystem is on a subvolume called "root". I get the same errors: "access to /root/boot/vmlinuz... failed" / "unable to find a suitable template"
So grubby is checking if the old image exists, which is a path starting with the volume name (/root/...) rather than a volume-relative path because grub sees the whole btrfs array, in which each volume is a top-level directory.
So I have written a wrapper function for access() that tries to find out if the image path is on a subvolume if it's not directly accessible. It cuts the path into pieces, checks if the first piece is a subvolume and if the second piece exists and returns true if both conditions are met. In the case of "/root/boot/vmlinuz", it cuts out "root", checks if / is a subvolume and if /boot/vmlinuz exists. On my affected system, both conditions are met, so the old image has been found and grubby does not abort. Also, it remembers (yes, it's ugly, but it's a hack) that volume name "root", which is later prepended to the kernel path. I haven't checked other paths like that of initramfs, but with those modifications, the error message is gone and the full image path is written to the config file.

This is by no means a complete patch, I just wanted to see if I could easily get rid of the error (and I did not want to call /usr/sbin/btrfs and parse output). For example, I don't check if the path, which might be a btrfs subvolume, is a btrfs filesystem at all (that might be necessary). Remembering the root volume when checking an image path is bad. And I'm sure there are more parts of the program that have to be changed as well.
Anyway, I'm putting my test diff here, hoping it will be helpful for solving this bug.

Build call to get more debug output:
make RPM_OPT_FLAGS=-DDEBUG=1

Example call:
./grubby --debug --grub2 -c /boot/grub2/grub.cfg --add-kernel=/boot/vmlinuz-3.19.1-201.fc21.x86_64 --copy-default --make-default --title "Fedora ($(date +%T)) 21 (Twenty One)" --args=root=UUID=8cbc4d66-0b24-4e2d-835f-6145446e82da  --remove-kernel="TITLE=Fedora (3.19.1-201.fc21.x86_64) 21 (Twenty One)"

Debug output if image found on subvolume:
image /root/boot/vmlinuz-3.19.1-201.fc21.x86_64 exists in subvolume / (root)
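
The path handling of the access() wrapper described in this comment can be sketched in shell as follows. This is an illustration of the idea only, not the attached C diff: the function names are hypothetical, the subvolume test assumes that btrfs subvolume roots have inode 256, and, like the patch as described, it does not verify that / is btrfs at all.

```shell
#!/bin/bash
# Hypothetical shell rendering of the wrapper logic; the real patch is C.

# first_component PATH: "/root/boot/vmlinuz" -> "root"
first_component() {
    local rest=${1#/}          # drop the leading slash
    echo "${rest%%/*}"         # keep everything before the next slash
}

# strip_first_component PATH: "/root/boot/vmlinuz" -> "/boot/vmlinuz"
# Fails if PATH has only one component.
strip_first_component() {
    local rest=${1#/}
    case "$rest" in
        */*) echo "/${rest#*/}" ;;
        *)   return 1 ;;
    esac
}

# image_exists PATH: succeed if PATH is directly accessible, or if / looks
# like a subvolume root and PATH minus its first component exists.
image_exists() {
    [ -e "$1" ] && return 0
    local inner
    inner=$(strip_first_component "$1") || return 1
    [ "$(stat -c %i / 2>/dev/null)" = 256 ] && [ -e "$inner" ]
}
```

For the path from the debug output above, `first_component` yields the volume name to remember and `strip_first_component` yields the path that is actually probed on the mounted system.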
Comment 101 Petr Schindler 2015-04-13 14:45:55 EDT
Discussed at today's blocker review meeting [1].

we agreed on the meeting that disabling /boot on btrfs is enough to address this blocker. We are just waiting on confirmation that it is disabled on RC1/RC2.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2015-04-13/
Comment 102 Adam Williamson 2015-04-13 20:58:09 EDT
Confirmed that in RC1 /boot is not allowed to be on a btrfs subvol. Per the decision of Ze Committee, dropping blocker status.
Comment 103 Neal Gompa 2016-02-01 22:32:36 EST
I get that this was not a blocker for F22, but it's quite aggravating that after all this time, we *STILL* can't have /boot as a btrfs subvolume. openSUSE is able to pull it off, and has been doing so for over a year and a half.

How are we going to better test out btrfs in Fedora and get the ball rolling on integrating it better if we can't even do stuff like that?
Comment 104 Adam Williamson 2016-02-02 03:46:03 EST
I'm not saying it wouldn't be handy, but is it really that important? You can test btrfs just fine by using it for your actual data, is there any benefit to using it for the boot partition?
Comment 105 Neal Gompa 2016-02-02 11:05:42 EST
Yes, actually. Being able to snapshot /boot (whether it's a subvolume or not) is useful if you want to be able to actually do full rollback capabilities. Without being able to boot to a Btrfs volume, it's not possible to do something like SUSE's "boot to older snapshot in read-only mode" feature, which comes in quite handy when things get accidentally torched beyond normal levels of screwed up.
Comment 106 Dusty Mabe 2016-02-10 18:50:00 EST
I use full system snapshot/rollback, so it is useful. In order to achieve the setup I wanted I made it from scratch.

Neal, In order to get something close to SUSE you might want to follow the steps I did (for Fedora 22):

http://dustymabe.com/2015/07/14/fedora-btrfssnapper-part-1-system-preparation/
http://dustymabe.com/2015/07/19/fedora-btrfssnapper-part-2-full-system-snapshotrollback/
Comment 107 Jan Kurik 2016-02-24 10:30:16 EST
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase
Comment 108 Naftuli Tzvi Kay 2016-03-02 15:34:52 EST
I'm also affected on Fedora 23.

I have been using the following partition setup on Ubuntu for a number of years:

 - /dev/sda1: VFAT EFI partition.
 - /dev/sda4: BTRFS Linux Boot partition.
 - /dev/sda5: LUKS Encrypted Disk

Inside /dev/sda5 is an LVM volume group with two logical volumes:

 - /dev/mapper/vg-root: A BTRFS partition.
 - /dev/mapper/vg-swap: A swap partition.

I have two subvolumes on /dev/mapper/vg-root:
 - @: my root subvolume, which will be mounted at /.
 - @home: my home subvolume, which will be mounted at /home.

I'm not able to use the BTRFS filesystem at /dev/mapper/vg-root, subvolume @, as a root partition, it will not accept it.
Comment 109 Naftuli Tzvi Kay 2016-03-02 15:35:44 EST
Also, I'm using rEFInd as a boot manager, not GRUB, so I think that there should be a way to install without a boot loader and let it pass.
Comment 110 john getsoian 2016-08-13 21:40:33 EDT
(In reply to Neal Gompa from comment #105)
> Yes, actually. Being able to snapshot /boot (whether it's a subvolume or
> not) is useful if you want to be able to actually do full rollback
> capabilities. Without being able to boot to a Btrfs volume, it's not
> possible to do something like SUSE's "boot to older snapshot in read-only
> mode" feature, which comes in quite handy when things get accidentally
> torched beyond normal levels of screwed up.

agree with this! It really defeats the potential virtues of BTRFS to have to maintain a separate restoration process for /boot when an update set that includes a new kernel hoses your system.
