Red Hat Bugzilla – Bug 1289752
atomic installation on btrfs results in "failed to write boot loader configuration" error
Last modified: 2017-08-15 02:35:08 EDT
Description of problem:
Uncertain if this is a supported configuration, but
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Boot media, custom partitioning
2. Choose Btrfs partition scheme
3. Remove swap partition (it's a VM)
4. Begin Installation
Installation mostly proceeds, but then a dialog appears "failed to write boot loader configuration".
On reboot I get a grub prompt, due to lack of grub.cfg on the boot volume in /grub2/.
Bootloader config should be written.
No error with automatic partitioning, or with custom partitioning using lvmthinp. No error with a non-atomic ISO installing to Btrfs. So the problem is specific to the combination of Atomic and Btrfs.
14:03:21,782 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
14:03:21,881 INFO program: /usr/sbin/grub2-probe: error: cannot find a device for / (is /dev mounted?).
14:03:21,882 DEBUG program: Return code: 1
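To spot failures like this one without reading the whole log, a small filter can pair each "Running..." line with the "Return code" line that follows it. This is only a sketch, assuming the log keeps the format of the three lines quoted above; the function name failed_commands is made up for illustration.

```shell
# Print "RC: COMMAND" for every command in a program.log-style stream
# whose "Return code" is non-zero. Reads the log on stdin.
# Assumes the "Running... CMD" / "Return code: N" layout quoted above.
failed_commands() {
    awk '
        / program: Running\.\.\. / {
            cmd = $0
            sub(/^.* program: Running\.\.\. /, "", cmd)  # keep only the command
            next
        }
        / program: Return code: / {
            rc = $NF
            if (rc != 0 && cmd != "") print rc ": " cmd
            cmd = ""                                     # forget it once reported
        }
    '
}
```

In the install environment it could be run as "failed_commands < /tmp/program.log" (the log path is an assumption; adjust it to wherever the log actually lives).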
Clicked submit too soon.
Created attachment 1103724 [details]
Created attachment 1103725 [details]
Created attachment 1103726 [details]
Created attachment 1103729 [details]
results of mount command in the install environment
Pretty sure this is better set to grub2, since the failure happens in grub2 due to some confusion with the combination of btrfs and ostree. Others have hit this, see bug 1224560.
Created attachment 1201077 [details]
I've chrooted to the ostree deployment, and grub2-probe via grub2-mkconfig fails with this same message. This is a script showing 'strace grub2-probe /' and 'grub2-probe -v /' and also 'mount' all within the chroot.
I'm not sure the strace is useful. /dev/ is mounted here (it's visible with the mount command), but somehow grub2-probe still fails. On a non-ostree Btrfs, 'grub2-probe /' returns Btrfs, as does even 'grub2-probe /home/chris'.
'grub2-probe --device /dev/vda3' returns Btrfs, although I don't know if that's a valid workaround.
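The path-versus-device comparison described here can be wrapped in a small helper so both results show up side by side. This is a sketch under assumptions: compare_probe and the PROBE override variable are hypothetical names, and --target=fs is passed explicitly only to make the intent obvious.

```shell
# Compare what grub2-probe reports for a mounted path and for a block device.
# PROBE is a hypothetical override hook so a different binary (for example an
# upstream grub-probe build) or a stub can be substituted.
compare_probe() {
    path=$1
    dev=$2
    probe=${PROBE:-grub2-probe}
    # Probe by path; record a marker instead of aborting when it fails.
    by_path=$("$probe" --target=fs "$path" 2>/dev/null) || by_path="(failed)"
    # Probe the device directly, bypassing the path-to-device lookup.
    by_dev=$("$probe" --target=fs --device "$dev" 2>/dev/null) || by_dev="(failed)"
    printf 'path %s: %s\ndevice %s: %s\n' "$path" "$by_path" "$dev" "$by_dev"
}
```

Inside the chroot this would be called as "compare_probe / /dev/vda3". If the path lookup fails while the device probe succeeds, that points at the path-to-device resolution rather than at filesystem detection.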
Last tested with Fedora-Atomic-dvd-x86_64-25-20160912.n.0.iso which has ostree 2016.7.
Created attachment 1201091 [details]
OK, I did a default atomic dvd installation, converted ext4 to btrfs, and only changed fstab to reflect btrfs instead of ext4. It boots, and 'grub2-probe /' works, as does grub2-mkconfig. However, and this is really curious, when I check mount I get this:
/dev/mapper/fedora--atomic-root on /sysroot type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)
/dev/mapper/fedora--atomic-root on / type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0)
/dev/mapper/fedora--atomic-root on /var type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/var)
/dev/mapper/fedora--atomic-root on /usr type btrfs (ro,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0/usr)
/dev/mapper/fedora--atomic-root on /var/lib/docker/devicemapper type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/var/lib/docker/devicemapper)
subvolid=5 is correct, but none of the listed subvol paths is an actual subvolume; they are all bind mounts. Btrfs subvolumes can be mounted directly, and behind the scenes those mounts behave like bind mounts. So what's happening here is that Btrfs sees the ostree bind mount and lists its path as a subvolume path.
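To make the subvolid/subvol pairs easier to read than in the raw mount lines, they can be pulled out per mount point. A minimal sketch, assuming the "DEV on MNT type btrfs (opts)" layout shown above and mount points without spaces; the function name btrfs_mount_subvols is made up for illustration.

```shell
# Read `mount` output on stdin and print "MOUNTPOINT SUBVOLID SUBVOL"
# for every btrfs entry. A "-" means the option was not present.
btrfs_mount_subvols() {
    awk '
        $2 == "on" && $4 == "type" && $5 == "btrfs" {
            opts = $6
            gsub(/[()]/, "", opts)            # drop the surrounding parentheses
            n = split(opts, o, ",")
            id = "-"; path = "-"
            for (i = 1; i <= n; i++) {
                if (o[i] ~ /^subvolid=/) id = substr(o[i], 10)
                else if (o[i] ~ /^subvol=/) path = substr(o[i], 8)
            }
            print $3, id, path
        }
    '
}
```

Run as "mount | btrfs_mount_subvols". With the output above, every mount point reports subvolid 5, which makes it obvious that the differing subvol= values are paths inside the top-level subvolume and not separate subvolumes.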
Anaconda creates a root subvolume for its Btrfs installs, and through the GUI there is no way to avoid its creation and usage. I'm going to guess that the subsequent ostree bind mounts undo, confuse, or conflict with the Btrfs one, causing the grub2-probe problem.
The good news is that I'm probably wrong: after putting the installation on a subvolume, fixing up grub.cfg to use rootflags=subvol=<snapshotname>, fixing up fstab in that snapshot to have the mount option subvol=<snapshotname>, and rebooting, it works. And 'grub2-probe /' still returns Btrfs. So maybe this is an assembly problem just in the installation environment? Chroot related?
[chris@localhost ~]$ btrfs sub list -t /
ID gen top level path
-- --- --------- ----
258 61 5 snapsubvolid5
[chris@localhost ~]$ cat /etc/fstab
/dev/mapper/fedora--atomic-root / btrfs subvol=snapsubvolid5
$ mount | grep btrfs
/dev/mapper/fedora--atomic-root on /sysroot type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5)
/dev/mapper/fedora--atomic-root on / type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0)
/dev/mapper/fedora--atomic-root on /var type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/var)
/dev/mapper/fedora--atomic-root on /usr type btrfs (ro,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0/usr)
/dev/mapper/fedora--atomic-root on /var/lib/docker/devicemapper type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/var/lib/docker/devicemapper)
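The 'btrfs sub list -t /' table and the mount output above can be cross-checked mechanically: a subvol= value that matches a listed subvolume (or is "/") is a real subvolume, and anything else is a path bind-mounted from inside one. This is a sketch only; classify_btrfs_mounts is a hypothetical helper, and it assumes the two-line table header shown above, a non-empty table file, and paths without spaces or commas.

```shell
# $1: file holding `btrfs sub list -t /` output; stdin: `mount` output.
# Prints "MOUNTPOINT subvolume" or "MOUNTPOINT bind-mounted path".
classify_btrfs_mounts() {
    awk '
        NR == FNR {                          # first input: the subvolume table
            if (FNR > 2) known["/" $4] = 1   # skip the two header lines
            next
        }
        $2 == "on" && $4 == "type" && $5 == "btrfs" {
            path = $6
            sub(/^.*subvol=/, "", path)      # keep the text after "subvol="
            sub(/[,)].*$/, "", path)         # cut at the next option or ")"
            if (path == "/" || (path in known)) kind = "subvolume"
            else kind = "bind-mounted path"
            print $3, kind
        }
    ' "$1" -
}
```

For example: "btrfs sub list -t / > /tmp/subvols.txt; mount | classify_btrfs_mounts /tmp/subvols.txt" (the temporary file name is arbitrary). With the data above, only /sysroot is reported as a subvolume.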
The "installing EFI bootloader fails" for:
0. Atomic installation
1. custom partitioning
4. swap partition removed,
is still an issue for Fedora-Atomic-ostree-x86_64-25-20170106.0. I can confirm that the issue does not occur when using Fedora-Server-dvd-x86_64-25-1.3 as the install medium with identical custom partitioning.
Created attachment 1240938 [details]
Screenshot - Fedora Atomic 25 installation - Manual partitioning - New Fedora-Atomic 25 installation - Btrfs
*** Bug 1415846 has been marked as a duplicate of this bug. ***
Still applicable. Is going to need some scoping work - what specific BTRFS partition layouts are intended, etc. (For example, is /boot a separate partition or not)
Moving to ostree since I doubt the Anaconda team would look at this, and any changes would likely need to land in libostree first.
(In reply to Colin Walters from comment #14)
> Still applicable. Is going to need some scoping work - what specific BTRFS
> partition layouts are intended, etc. (For example, is /boot a separate
> partition or not)
Bug happens with the existing anaconda enforcement of a separate /boot as ext4.
Anaconda disallows /boot on Btrfs due to a grubby bug, and ostree installs don't depend on grubby. Disallowing it is a bit of a pain for any sort of snapshotting and complete btrfs send/receive based backup (snapper, btrbk).
Short and to the point
Long and crazy
I've asked upstream about this grub2-probe failure, and whether it's caused by the double bind mount of the root fs or is possibly a chroot problem.
I'm still working on figuring this out. The problem still happens with upstream GRUB built from current git.
Question I have now is, how is the installation environment (anaconda) doing assembly? Is there a chroot? And what is the exact chroot command being used? There is no chroot indicated in any of anaconda's logs. The program.log shows each mount command, including the bind and rbind ones. And then there's a normal grub-mkconfig, but that can't possibly work without it being in a chroot.
It's messy; anaconda indeed does a chroot, but then ostree itself *also* does a chroot, which is potentially what's messing things up here.
Big picture I'd like to drop os-prober and such, and basically not run any grub code at all to generate the bootloader config. That dramatically simplifies things and actually means grub no longer needs to be shipped with the OS.
I didn't know about the ostree chroot, so I didn't include that in my replicated environment and yet I can reproduce the error in this bug. I'm using
grub-probe / while in that chroot fails, but outside the chroot 'grub-probe /mnt/sysimage/ostree/deploy/fedora-workstation/deploy/e52f5de4c5d3f52211ab216d5bf5331b81c929e967f9cbfcba3ad85e45f8fd37.0/' returns btrfs. Whereas if the file system is ext4, grub-probe returns ext2 regardless of whether it's run on this path outside the chroot, or on / while inside the chroot.
If this big picture thing isn't happening soon, I'll persist down this rabbit hole to get this bug fixed, even if it means the work goes unused later.
Also what does "grub no longer needs to be shipped" mean? Dropping grub entirely in favor of which bootloader? Or just including a smaller subset, grub2-install for BIOS, and grubx64.efi for UEFI?
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle.
Changing version to '27'.