Bug 1289752 - atomic installation on btrfs results in "failed to write boot loader configuration" error
Status: ASSIGNED
Product: Fedora
Classification: Fedora
Component: ostree
Version: 27
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Colin Walters
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1415846
Reported: 2015-12-08 16:28 EST by Chris Murphy
Modified: 2017-08-15 02:35 EDT
Doc Type: Bug Fix
Type: Bug


Attachments
anaconda.log (33.92 KB, text/plain), 2015-12-08 16:31 EST, Chris Murphy
program.log (151.38 KB, text/plain), 2015-12-08 16:31 EST, Chris Murphy
storage.log (137.42 KB, text/plain), 2015-12-08 16:31 EST, Chris Murphy
mounts (9.89 KB, text/plain), 2015-12-08 16:39 EST, Chris Murphy
strace grub2-probe (62.29 KB, text/plain), 2016-09-14 22:37 EDT, Chris Murphy
mount ext2convertedbtrfs (3.25 KB, text/plain), 2016-09-15 00:42 EDT, Chris Murphy
Screenshot - Fedora Atomic 25 installation - Manual partitioning - New Fedora-Atomic 25 installation - Btrfs (1.13 MB, image/jpeg), 2017-01-15 08:36 EST, Ceriel Jacobs

Description Chris Murphy 2015-12-08 16:28:40 EST
Description of problem:

Uncertain if this is a supported configuration.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Boot media, custom partitioning
2. Choose Btrfs partition scheme
3. Remove swap partition (it's a VM)
4. Begin Installation

Actual results:

Installation mostly proceeds, but then a dialog appears "failed to write boot loader configuration".

On reboot I get a grub prompt, due to lack of grub.cfg on the boot volume in /grub2/.


Expected results:

Bootloader config should be written.

Additional info:

No error with automatic partitioning, or with custom partitioning using lvmthinp. No error with a non-atomic ISO installing to Btrfs. So the problem is specific to the combination of Atomic and Btrfs.

From program.log

14:03:21,782 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
14:03:21,881 INFO program: /usr/sbin/grub2-probe: error: cannot find a device for / (is /dev mounted?).
14:03:21,882 DEBUG program: Return code: 1
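
For triage, the failing command can be pulled out of program.log mechanically. What follows is only a minimal Python sketch, assuming the log keeps the "Running... <command>" / "Return code: <n>" pattern seen in the excerpt above. The name failed_commands is hypothetical and is not part of anaconda.

```python
import re

# Patterns assumed from the program.log excerpt; other anaconda versions may differ.
RUNNING = re.compile(r"INFO program: Running\.\.\. (?P<cmd>.+)$")
RETURN_CODE = re.compile(r"DEBUG program: Return code: (?P<rc>-?\d+)$")


def failed_commands(log_text):
    """Return (command, return_code, output_lines) for each non-zero exit."""
    failures = []
    current_cmd = None
    output = []
    for line in log_text.splitlines():
        line = line.rstrip()
        started = RUNNING.search(line)
        finished = RETURN_CODE.search(line)
        if started:
            current_cmd = started.group("cmd")
            output = []
        elif finished and current_cmd is not None:
            rc = int(finished.group("rc"))
            if rc != 0:
                failures.append((current_cmd, rc, output))
            current_cmd = None
        elif current_cmd is not None:
            # Keep only the message after the "program: " prefix.
            output.append(line.split("program: ", 1)[-1])
    return failures


SAMPLE = """\
14:03:21,782 INFO program: Running... grub2-mkconfig -o /boot/grub2/grub.cfg
14:03:21,881 INFO program: /usr/sbin/grub2-probe: error: cannot find a device for / (is /dev mounted?).
14:03:21,882 DEBUG program: Return code: 1
"""

for cmd, rc, out in failed_commands(SAMPLE):
    print(f"{cmd!r} exited {rc}: {out}")
```

Run against the full attached program.log, this would list every command that exited non-zero, not only grub2-mkconfig.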
Comment 1 Chris Murphy 2015-12-08 16:30:59 EST
Clicked submit too soon.

Fedora-Cloud_Atomic-x86_64-23.iso
anaconda 23.19.10-1
Comment 2 Chris Murphy 2015-12-08 16:31 EST
Created attachment 1103724
anaconda.log
Comment 3 Chris Murphy 2015-12-08 16:31 EST
Created attachment 1103725
program.log
Comment 4 Chris Murphy 2015-12-08 16:31 EST
Created attachment 1103726
storage.log
Comment 5 Chris Murphy 2015-12-08 16:39 EST
Created attachment 1103729
mounts

results of mount command in the install environment
Comment 6 Chris Murphy 2016-02-13 16:10:51 EST
Pretty sure this is better set to grub2, since the failure happens in grub2 due to some confusion with the combination of btrfs and ostree. Others have hit this, see bug 1224560.
Comment 7 Chris Murphy 2016-09-14 22:37 EDT
Created attachment 1201077
strace grub2-probe

I've chrooted to the ostree deployment, and grub2-probe via grub2-mkconfig fails with this same message. This is a script showing 'strace grub2-probe /' and 'grub2-probe -v /' and also 'mount' all within the chroot.

I'm not sure the strace is useful. /dev/ is mounted here (it's visible with the mount command), but somehow grub2-probe still fails. On a non-ostree Btrfs system, 'grub2-probe /' returns btrfs, as does even 'grub2-probe /home/chris'.
Comment 8 Chris Murphy 2016-09-14 22:40:56 EDT
'grub2-probe --device /dev/vda3' returns btrfs, although I don't know if that's a valid workaround.

Last tested with Fedora-Atomic-dvd-x86_64-25-20160912.n.0.iso which has ostree 2016.7.
Comment 9 Chris Murphy 2016-09-15 00:42 EDT
Created attachment 1201091
mount ext2convertedbtrfs

OK, I did a default Atomic DVD installation, converted ext4 to Btrfs, and changed only fstab to reflect btrfs instead of ext4. It boots, and 'grub2-probe /' works, as does grub2-mkconfig. However, this is really curious: when I check mount I get this:

/dev/mapper/fedora--atomic-root on /sysroot type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)
/dev/mapper/fedora--atomic-root on / type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0)
/dev/mapper/fedora--atomic-root on /var type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/var)
/dev/mapper/fedora--atomic-root on /usr type btrfs (ro,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0/usr)
/dev/mapper/fedora--atomic-root on /var/lib/docker/devicemapper type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/var/lib/docker/devicemapper)

subvolid=5 is correct, but none of the listed subvol paths exist as subvolumes; they are all bind mounts. Btrfs subvolumes can be mounted directly; behind the scenes, these are bind mounts. So what's happening here is that Btrfs sees each ostree bind mount and lists its path as a subvolume path.
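
To make the mismatch concrete, the subvol= values reported by mount can be compared against the subvolumes that really exist. This is a rough Python sketch, not a tested tool: parse_btrfs_mount and bind_mount_like are hypothetical names, and the set of real subvolumes is passed in by hand here (on a live system it would come from 'btrfs subvolume list'). The two sample lines are copied from the mount output above.

```python
import re

# Matches the 'mount' output format shown in this comment, btrfs entries only.
MOUNT_LINE = re.compile(
    r"^(?P<dev>\S+) on (?P<target>\S+) type btrfs \((?P<opts>[^)]*)\)$"
)


def parse_btrfs_mount(line):
    """Split one 'mount' output line into device, target, subvolid and subvol."""
    m = MOUNT_LINE.match(line.strip())
    if m is None:
        return None
    opts = dict(
        opt.split("=", 1) if "=" in opt else (opt, None)
        for opt in m.group("opts").split(",")
    )
    return {
        "device": m.group("dev"),
        "target": m.group("target"),
        "subvolid": int(opts["subvolid"]),
        "subvol": opts["subvol"],
    }


def bind_mount_like(entries, real_subvols):
    """Return targets whose reported subvol= path is not a real subvolume."""
    return [e["target"] for e in entries if e["subvol"] not in real_subvols]


LINES = [
    "/dev/mapper/fedora--atomic-root on /sysroot type btrfs "
    "(rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)",
    "/dev/mapper/fedora--atomic-root on /var type btrfs "
    "(rw,relatime,seclabel,space_cache,subvolid=5,subvol=/ostree/deploy/fedora-atomic/var)",
]

entries = [parse_btrfs_mount(line) for line in LINES]
# Assumption: only the top-level subvolume "/" exists on this file system.
print(bind_mount_like(entries, real_subvols={"/"}))
```

With only the top-level subvolume treated as real, /var is flagged because its subvol= value is a plain directory path reached through a bind mount.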

Anaconda creates a root subvolume for its Btrfs installs, and through the GUI there is no way to avoid its creation and usage. I'm going to guess that ostree's subsequent bind mounts undo, confuse, or conflict with the Btrfs one, causing the grub2-probe problem.
Comment 10 Chris Murphy 2016-09-15 01:11:09 EDT
Good news is that I'm probably wrong: after putting the installation on a subvolume, fixing up grub.cfg to use rootflags=subvol=<snapshotname>, fixing up fstab in that snapshot to have the mount option subvol=<snapshotname>, and rebooting, it works. And 'grub2-probe /' still returns btrfs. So maybe this is an assembly problem just in the installation environment? Chroot related?

[chris@localhost ~]$ btrfs sub list -t /
ID	gen	top level	path	
--	---	---------	----	
258	61	5		snapsubvolid5

[chris@localhost ~]$ cat /etc/fstab
/dev/mapper/fedora--atomic-root /         btrfs  subvol=snapsubvolid5

$ mount | grep btrfs
/dev/mapper/fedora--atomic-root on /sysroot type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5)
/dev/mapper/fedora--atomic-root on / type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0)
/dev/mapper/fedora--atomic-root on /var type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/var)
/dev/mapper/fedora--atomic-root on /usr type btrfs (ro,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/deploy/861c7ec50f181cbed578f8a8842d8ac7d228ca42cd97089b9c6643ef187129f2.0/usr)
/dev/mapper/fedora--atomic-root on /var/lib/docker/devicemapper type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5/ostree/deploy/fedora-atomic/var/lib/docker/devicemapper)
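
The manual fix-up above depends on fstab, the kernel command line, and the mounted /sysroot all naming the same subvolume. A small Python sketch of that consistency check follows. All four function names are hypothetical, and the input strings are copied from the fstab and mount output in this comment; it is an illustration, not something anaconda or ostree runs.

```python
def fstab_subvol(fstab_line):
    """Return the subvol= value from the options field of one fstab line, or None."""
    fields = fstab_line.split()
    if len(fields) < 4 or fields[2] != "btrfs":
        return None
    for opt in fields[3].split(","):
        if opt.startswith("subvol="):
            return opt[len("subvol="):]
    return None


def mounted_subvol(mount_line):
    """Return the subvol= value from one line of 'mount' output, or None."""
    start = mount_line.rfind("(")
    end = mount_line.rfind(")")
    if start == -1 or end < start:
        return None
    for opt in mount_line[start + 1:end].split(","):
        # "subvolid=..." does not match this prefix, so only subvol= is picked up.
        if opt.startswith("subvol="):
            return opt[len("subvol="):]
    return None


def same_subvol(a, b):
    """Compare two subvolume paths, ignoring a leading slash."""
    return a is not None and b is not None and a.lstrip("/") == b.lstrip("/")


def rootflags_arg(subvol):
    """Kernel command-line argument that selects the subvolume as root."""
    return f"rootflags=subvol={subvol}"


FSTAB = "/dev/mapper/fedora--atomic-root /         btrfs  subvol=snapsubvolid5"
SYSROOT = (
    "/dev/mapper/fedora--atomic-root on /sysroot type btrfs "
    "(rw,relatime,seclabel,space_cache,subvolid=258,subvol=/snapsubvolid5)"
)

wanted = fstab_subvol(FSTAB)
print(rootflags_arg(wanted))
print(same_subvol(wanted, mounted_subvol(SYSROOT)))
```

The leading slash is ignored in the comparison because fstab here uses subvol=snapsubvolid5 while mount reports subvol=/snapsubvolid5.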
Comment 11 Ceriel Jacobs 2017-01-14 09:34:48 EST
The "installing EFI bootloader fails" error with:
0. Atomic installation
1. custom partitioning
2. btrfs
3. efi
4. swap partition removed

is still an issue for Fedora-Atomic-ostree-x86_64-25-20170106.0. I can confirm that the issue does not occur when using Fedora-Server-dvd-x86_64-25-1.3 as the install medium with identical custom partitioning.
Comment 12 Ceriel Jacobs 2017-01-15 08:36 EST
Created attachment 1240938
Screenshot - Fedora Atomic 25 installation - Manual partitioning - New Fedora-Atomic 25 installation - Btrfs
Comment 13 Nathaniel McCallum 2017-05-17 15:11:03 EDT
*** Bug 1415846 has been marked as a duplicate of this bug. ***
Comment 14 Colin Walters 2017-05-17 16:51:12 EDT
Still applicable.  Is going to need some scoping work - what specific BTRFS partition layouts are intended, etc.  (For example, is /boot a separate partition or not)
Comment 15 Colin Walters 2017-05-17 16:53:07 EDT
Moving to ostree since I doubt the Anaconda team would look at this, and in any case changes would likely need to land in libostree first.
Comment 16 Chris Murphy 2017-07-03 19:11:44 EDT
(In reply to Colin Walters from comment #14)
> Still applicable.  Is going to need some scoping work - what specific BTRFS
> partition layouts are intended, etc.  (For example, is /boot a separate
> partition or not)

Bug happens with the existing anaconda enforcement of a separate /boot as ext4.

Anaconda disallows /boot on Btrfs due to a grubby bug [1], which ostree installs don't depend on. Disallowing it is a bit of a pain for any sort of snapshotting and complete btrfs send/receive based backup (snapper, btrbk).


[1]
Short and to the point
https://github.com/rhboot/grubby/issues/22
Long and crazy
https://bugzilla.redhat.com/show_bug.cgi?id=864198
Comment 17 Chris Murphy 2017-08-03 12:47:37 EDT
I've asked upstream about this grub2-probe failure, and whether it is caused by the double bind mount of the root fs or is possibly a chroot problem.
http://lists.gnu.org/archive/html/grub-devel/2017-08/msg00007.html
Comment 18 Chris Murphy 2017-08-08 13:39:21 EDT
I'm still working on figuring this out; the problem still happens with upstream GRUB built from current git.

Question I have now is, how is the installation environment (anaconda) doing assembly? Is there a chroot? And what is the exact chroot command being used? There is no chroot indicated in any of anaconda's logs. The program.log shows each mount command, including the bind and rbind ones. And then there's a normal grub-mkconfig, but that can't possibly work without it being in a chroot.
Comment 19 Colin Walters 2017-08-08 13:43:05 EDT
It's messy; anaconda indeed does a chroot, but then ostree itself *also* does a chroot, which is potentially what's messing things up here.

Big picture, I'd like to drop os-prober and such, and basically not run any grub code at all to generate the bootloader config.  That dramatically simplifies things and actually means grub no longer needs to be shipped with the OS.
Comment 20 Chris Murphy 2017-08-08 14:36:46 EDT
I didn't know about the ostree chroot, so I didn't include that in my replicated environment and yet I can reproduce the error in this bug. I'm using

chroot /mnt/sysimage/ostree/deploy/fedora-workstation/deploy/e52f5de4c5d3f52211ab216d5bf5331b81c929e967f9cbfcba3ad85e45f8fd37.0/

grub-probe / while in that chroot fails, but outside the chroot 'grub-probe /mnt/sysimage/ostree/deploy/fedora-workstation/deploy/e52f5de4c5d3f52211ab216d5bf5331b81c929e967f9cbfcba3ad85e45f8fd37.0/' returns btrfs. Whereas if the file system is ext4, grub-probe returns ext2 regardless of whether it's run on this path outside the chroot, or on / while inside the chroot.
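
The inside/outside comparison can be scripted so it is easy to repeat on different file systems. The Python sketch below is an assumption-laden illustration: probe_commands, run_probe and compare are hypothetical names, the probe binary defaults to Fedora's grub2-probe (upstream calls it grub-probe), and the chroot step needs root. Nothing is executed on import; the caller supplies the deployment path.

```python
import subprocess


def probe_commands(deploy_path, probe="grub2-probe"):
    """Build two argv lists: probe the path from outside, and probe / inside a chroot."""
    return {
        "outside": [probe, deploy_path],
        "inside": ["chroot", deploy_path, probe, "/"],
    }


def run_probe(argv):
    """Run one probe command and return (exit status, stdout, stderr)."""
    done = subprocess.run(argv, capture_output=True, text=True, check=False)
    return done.returncode, done.stdout.strip(), done.stderr.strip()


def compare(deploy_path, probe="grub2-probe", runner=run_probe):
    """Run both probes and record whether exit status and reported fs type agree."""
    results = {
        name: runner(argv)
        for name, argv in probe_commands(deploy_path, probe).items()
    }
    results["agree"] = results["outside"][:2] == results["inside"][:2]
    return results
```

Calling compare() as root with the deployment directory from this comment would be expected to show "agree" as False on Btrfs and True on ext4, going by the behaviour described above; that expectation comes from this comment, not from running the sketch.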

If this big picture thing isn't happening soon I'll persist down this rabbit hole, to get this bug fixed, even if it means the work goes unused later.

Also what does "grub no longer needs to be shipped" mean? Dropping grub entirely in favor of which bootloader? Or just including a smaller subset, grub2-install for BIOS, and grubx64.efi for UEFI?
Comment 21 Jan Kurik 2017-08-15 02:35:08 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle.
Changing version to '27'.
