Bug 1753485 - silverblue on btrfs, missing rootflags param causes startup failure
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm-ostree
Version: 32
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Colin Walters
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1829682 (view as bug list)
Depends On:
Blocks:
 
Reported: 2019-09-19 05:36 UTC by Chris Murphy
Modified: 2020-08-02 13:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-31 23:49:55 UTC
Type: Bug


Attachments
anaconda.log (54.69 KB, text/plain), 2019-09-19 05:44 UTC, Chris Murphy
fstab (890 bytes, text/plain), 2019-09-19 05:44 UTC, Chris Murphy
grub.cfg (6.48 KB, text/plain), 2019-09-19 05:45 UTC, Chris Murphy
journal (214.97 KB, text/plain), 2019-09-19 05:46 UTC, Chris Murphy
program.log (55.93 KB, text/plain), 2019-09-19 05:46 UTC, Chris Murphy
rdsosreport (222.56 KB, text/plain), 2019-09-19 05:46 UTC, Chris Murphy
storage.log (218.09 KB, text/plain), 2019-09-19 05:47 UTC, Chris Murphy

Description Chris Murphy 2019-09-19 05:36:50 UTC
Description of problem:

Now that bug 1289752 appears fixed, following a successful installation, the reboot fails during startup assembly.


Version-Release number of selected component (if applicable):
Fedora-Silverblue-ostree-x86_64-31-20190918.n.0.iso
anaconda 31.22.3-2.fc31


How reproducible:
Always

Steps to Reproduce:
1. Boot the media, Custom partitioning using Btrfs preset, change /home mountpoint to /var/home mountpoint, install, reboot.
2.
3.

Actual results:

Early failure during assembly.

[    4.103947] localhost ostree-prepare-root[549]: ostree-prepare-root: Couldn't find specified OSTree root '/sysroot//ostree/boot.0/fedora/bbee8b268e6e44783cbb5ade38b63e1c1e56ed69a79c37132651eb965737309d/0': No such file or directory


Expected results:

Should startup normally


Additional info:

From the grub.cfg

linuxefi /ostree/fedora-bbee8b268e6e44783cbb5ade38b63e1c1e56ed69a79c37132651eb965737309d/vmlinuz-5.3.0-0.rc6.git0.1.fc31.x86_64 resume=UUID=48082841-04e6-4eb3-b8e2-5be89daed652 rhgb quiet root=UUID=00682c4d-5d89-47f3-b92a-08d0a06e4ebd ostree=/ostree/boot.0/fedora/bbee8b268e6e44783cbb5ade38b63e1c1e56ed69a79c37132651eb965737309d/0


This is missing 'rootflags=subvol=root' found with conventional Anaconda installations for lives and netinstalls on Btrfs. So somehow the ostree specific anaconda code is omitting this hint, and therefore the 'root' subvol isn't mounted, and hence why ostree-prepare-root can't find what it's looking for.
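To make the difference between the two command lines easy to check, here is a minimal sketch that prints the `subvol=` value carried by `rootflags=` in a kernel command line. The helper name `cmdline_root_subvol` is made up for illustration and is not part of anaconda, ostree, or GRUB.

```shell
# Hypothetical helper (not part of any tool discussed in this bug).
# Prints the subvol= value found in the rootflags= argument of the kernel
# command line passed as $1; returns non-zero when there is none.
cmdline_root_subvol() {
    # Put each kernel argument on its own line, then pull subvol= out of the
    # comma-separated mount options that follow rootflags=.
    subvol=$(printf '%s\n' "$1" | tr ' ' '\n' |
        sed -n 's/^rootflags=\(.*,\)\{0,1\}subvol=\([^,]*\).*$/\2/p' | head -n 1)
    [ -n "$subvol" ] && printf '%s\n' "$subvol"
}
```

Run against `/proc/cmdline` contents, e.g. `cmdline_root_subvol "$(cat /proc/cmdline)"`, it prints nothing and returns non-zero for a command line like the linuxefi one above, which has no `rootflags=` at all.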

Comment 1 Chris Murphy 2019-09-19 05:44:45 UTC
Created attachment 1616561 [details]
anaconda.log

Comment 2 Chris Murphy 2019-09-19 05:44:58 UTC
Created attachment 1616562 [details]
fstab

Comment 3 Chris Murphy 2019-09-19 05:45:37 UTC
Created attachment 1616563 [details]
grub.cfg

Comment 4 Chris Murphy 2019-09-19 05:46:11 UTC
Created attachment 1616564 [details]
journal

journal of failed startup

Comment 5 Chris Murphy 2019-09-19 05:46:24 UTC
Created attachment 1616565 [details]
program.log

Comment 6 Chris Murphy 2019-09-19 05:46:43 UTC
Created attachment 1616566 [details]
rdsosreport

Also from failed startup

Comment 7 Chris Murphy 2019-09-19 05:47:01 UTC
Created attachment 1616567 [details]
storage.log

Comment 8 Chris Murphy 2019-09-22 01:34:10 UTC
Adding 'rootflags=subvol=root' manually by editing the grub entry does work, system starts up, and I get to g-i-s as expected.

[root@install ~]# grub2-editenv list
menu_auto_hide=1
boot_success=0
kernelopts=root=UUID=00682c4d-5d89-47f3-b92a-08d0a06e4ebd ro rootflags=subvol=root/ostree/deploy/fedora/deploy/fd3fe4bf58ab46d86fb8b1692d76b8c24fcbd55a806d4fc87f50f2a20ec10562.0 resume=UUID=48082841-04e6-4eb3-b8e2-5be89daed652 rhgb quiet
boot_indeterminate=5
[root@install ~]# mount | grep btrfs
/dev/vda4 on /sysroot type btrfs (rw,relatime,seclabel,space_cache,subvolid=256,subvol=/root)
/dev/vda4 on / type btrfs (rw,relatime,seclabel,space_cache,subvolid=256,subvol=/root/ostree/deploy/fedora/deploy/fd3fe4bf58ab46d86fb8b1692d76b8c24fcbd55a806d4fc87f50f2a20ec10562.0)
/dev/vda4 on /usr type btrfs (ro,relatime,seclabel,space_cache,subvolid=256,subvol=/root/ostree/deploy/fedora/deploy/fd3fe4bf58ab46d86fb8b1692d76b8c24fcbd55a806d4fc87f50f2a20ec10562.0/usr)
/dev/vda4 on /var type btrfs (rw,relatime,seclabel,space_cache,subvolid=259,subvol=/var)
/dev/vda4 on /var/home type btrfs (rw,relatime,seclabel,space_cache,subvolid=258,subvol=/home00)
[root@install ~]# 

That's interesting for a few reasons. 
1. The rootflags in the grubenv isn't correct.
2. The (incorrect) rootflags in the grubenv doesn't show up in either of the grub menu entries, suggesting grubenv isn't being read by this version of grub (?)
3. The 2nd and 3rd mount entries' subvolid=256 is correct, but the subvol= values are not pointing to subvolumes, but rather bind mounts.

The 3rd problem is known and reported to upstream kernel devs. Two possible workarounds:
a. trust /etc/fstab's entry for / to figure out what rootflags option to use in grubenv, and ignore the confusing mount info until hopefully kernel devs figure out a way to fix it
b. create subvolumes for anything that will be bind mounted, so that the mount info actually is correct, i.e. fd3fe4bf58ab46d86fb8b1692d76b8c24fcbd55a806d4fc87f50f2a20ec10562.0 and fd3fe4bf58ab46d86fb8b1692d76b8c24fcbd55a806d4fc87f50f2a20ec10562.0/usr would be subvolumes instead of dirs

It's also reasonable to first figure out whether rpm-ostree should try to take advantage of Btrfs subvolumes and snapshotting, or if it should treat it as a generic file system as much as possible and only handle exceptions where necessary.
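Workaround (a) could look roughly like the sketch below: read the `subvol=` mount option from the fstab entry for `/` instead of trusting the mount info. The function name `fstab_root_subvol` is hypothetical, and the sketch assumes the fstab line for `/` is a btrfs entry with `subvol=` in its fourth field; it is an illustration, not what grub2-mkconfig or anaconda do today.

```shell
# Hypothetical helper: print the subvol= option of the btrfs entry for "/"
# in an fstab file (defaults to /etc/fstab). Prints nothing if there is none.
fstab_root_subvol() {
    awk '
        # Skip comments; look only at the btrfs line whose mount point is "/"
        $1 !~ /^#/ && $2 == "/" && $3 == "btrfs" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] ~ /^subvol=/) {
                    # Drop the key and an optional leading slash
                    sub(/^subvol=\/?/, "", opts[i])
                    print opts[i]
                    exit
                }
            }
        }
    ' "${1:-/etc/fstab}"
}
```

The result could then be used as `rootflags=subvol=$(fstab_root_subvol)`, which avoids the bind-mount paths that show up in the mount output above.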

Comment 9 Chris Murphy 2019-09-22 01:53:28 UTC
OK I sorta see what's going on here, there's the older rpm-ostree style of BLS snippet that doesn't contain 'options $kernelopts' but rather contains an explicit options line.

And also, I think it's the grub2-mkconfig code that uses mount info to determine the btrfs subvolume, which ends up being wrong due to assuming it should be mounting to / during boot, rather than to /sysroot.

Comment 10 Chris Murphy 2019-11-30 03:09:45 UTC
Still a bug in Fedora-Workstation-Live-x86_64-Rawhide-20191129.n.0.iso, and the same workaround does work.

Comment 11 Chris Murphy 2019-12-01 01:58:07 UTC
On second thought, this is not an installer problem. It's a silverblue / rpm-ostree problem, in that they aren't yet using the Fedora BLS feature, like the other editions and spins. I'm not sure what component it should be set to, so I'll set it to rpm-ostree for now.

Comment 12 Jonathan Lebon 2019-12-02 21:28:39 UTC
Note that on Fedora 31, one can use BLS entries. And in fact, we have the opposite problem, where both BLS entries and traditional grub2 menu entries show up (see https://fedoraproject.org/wiki/Common_F31_bugs#On_Fedora_Silverblue.2FIoT.2C_the_GRUB_menu_shows_duplicate_entries for details).

> OK I sorta see what's going on here, there's the older rpm-ostree style of BLS snippet that doesn't contain 'options $kernelopts' but rather contains an explicit options line.

Hmm, are you suggesting that our BLS entries should include `$kernelopts`? Does that get fed from `GRUB_CMDLINE_LINUX` in /etc/default/grub? I don't actually see that variable on my F31 Silverblue:

```
[root@lux ~]# grub2-editenv list
boot_success=1
boot_indeterminate=0
[root@lux ~]#
```

Note at the very least, we'd still need to append the `ostree=` specific karg for that deployment.

Comment 13 Chris Murphy 2019-12-02 22:18:01 UTC
> Hmm, are you suggesting that our BLS entries should include `$kernelopts`?

Yes but...

> Does that get fed from `GRUB_CMDLINE_LINUX` in /etc/default/grub? I don't
> actually see that variable on my F31 Silverblue:

I think the GRUB blscfg.mod gets the variable's value from grubenv, and inserts it when building the menu entries. Javier? :D

Whereas on Silverblue the BLS snippets are consumed by grub2-mkconfig /etc/grub.d/30_ostree to get them into grub.cfg, during the time when GRUB couldn't directly read BLS snippets. And hence now the duplicates.

I defer to Javier who has more knowledge of the long term plan for this, including grubenv basically being unreliable on non-UEFI (or specifically non-FAT file systems) as a way to communicate environment variables. Also $kernelopts and looking at grubenv for it is really a Fedora specific invention, it's not found in either upstream GRUB or the upstream BLS spec. I guess the alternative is to explicitly write out the command line options per snippet, but what's the policy for creating new snippets? Use the most recent conf as a template? Use a separate conf template file? Is this discoverable, self-describing, or otherwise just rearranging the deck chairs?

It's possible what Fedora CoreOS is doing is relevant as well. And what boot counting and fallback is going to look like.

Fedora Workstation example
[chris@flap ~]$ sudo cat /boot/loader/entries/ce3f1eade82d42bd891a8c15714b13cf-5.4.0-2.fc32.x86_64.conf
title Fedora (5.4.0-2.fc32.x86_64) 32 (Rawhide)
version 5.4.0-2.fc32.x86_64
linux /boot/vmlinuz-5.4.0-2.fc32.x86_64
initrd /boot/initramfs-5.4.0-2.fc32.x86_64.img
options $kernelopts
grub_users $grub_users
grub_arg --unrestricted
grub_class kernel

[chris@flap ~]$ sudo cat /boot/efi/EFI/fedora/grubenv
# GRUB Environment Block
saved_entry=ce3f1eade82d42bd891a8c15714b13cf-5.3.13-300.fc31.x86_64
kernelopts=root=UUID=32291cea-4dee-4e0d-bdf3-813314e2ab10 ro rootflags=subvol=root rhgb quiet
boot_success=1
boot_indeterminate=0
...
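
To illustrate the indirection in the Workstation example, here is a rough sketch of how `options $kernelopts` ends up as a real command line. It is only a model of the behaviour described above, not the blscfg implementation; the function name `expand_bls_options` is made up, and it assumes the grubenv file is non-empty and stores `kernelopts=` on a single line.

```shell
# Hypothetical model of the $kernelopts substitution (not GRUB code).
# $1 = BLS snippet, $2 = grubenv file. Prints the effective options line.
expand_bls_options() {
    awk '
        # First file (grubenv): remember the value of kernelopts=
        NR == FNR { if (sub(/^kernelopts=/, "")) kopts = $0; next }
        # Second file (BLS snippet): expand $kernelopts in the options key
        /^options[ \t]/ {
            sub(/^options[ \t]+/, "")
            n = index($0, "$kernelopts")
            if (n) $0 = substr($0, 1, n - 1) kopts substr($0, n + length("$kernelopts"))
            print
        }
    ' "$2" "$1"
}
```

A snippet with an explicit options line, like the ones ostree writes, passes through unchanged, which is why whatever is in grubenv has no effect on Silverblue.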

Comment 14 Ben Cotton 2020-02-11 17:42:39 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 15 Andrey 2020-02-11 20:31:44 UTC
In case the system was installed to BTRFS subvolume with '/' mount point, the option to add is 'rootflags=subvol=00' for me.
'rootflags=subvol=root' doesn't work.

Comment 16 Chris Murphy 2020-02-11 20:34:57 UTC
Right, it needs to be set to the name of the subvolume used for sysroot. Ordinarily Anaconda defaults to naming this subvolume 'root' but I've variably seen it named '00' or even 'root00'.

Comment 17 Davide Cavalca 2020-07-08 04:04:59 UTC
I can confirm this repros with Fedora-Silverblue-ostree-x86_64-Rawhide-20200607.n.0.iso in a VM (using bios, not EFI). In my case I was able to successfully boot by manually adding rootflags=subvol=root to the kernel cmdline in the grub menu. 

/boot/grub2/grub.cfg (which points to /boot/loader.0/grub.cfg) does define kernelopts early on:

if [ -z "${kernelopts}" ]; then
  set kernelopts="root=UUID=6bb5b38c-4427-4628-b88b-59689abd340b ro rootflags=subvol=root/ostree/deploy/fedora/deploy/aa252edc5bcd385d5dc37899b90e404671a237e7f69b6a4091f2bbb5ca7682d7.0 resume=UUID=428a03ff-dee7-4eb0-a25f-a79d01ccf572 rhgb quiet "
fi

but doesn't reference it in the actual cmdline:

linux16 /ostree/fedora-d46f5670fdc7213af90050d974ad6833d22f89c0cbaffa0e0196dd0e9a62f9ab/vmlinuz-5.7.0-1.fc33.x86_64 resume=UUID=428a03ff-dee7-4eb0-a25f-a79d01ccf572 rhgb quiet root=UUID=6bb5b38c-4427-4628-b88b-59689abd340b ostree=/ostree/boot.0/fedora/d46f5670fdc7213af90050d974ad6833d22f89c0cbaffa0e0196dd0e9a62f9ab/0

Also note that in the kernelopts above:

rootflags=subvol=root/ostree/deploy/fedora/deploy/aa252edc5bcd385d5dc37899b90e404671a237e7f69b6a4091f2bbb5ca7682d7.0

is incorrect -- it should just be rootflags=subvol=root, per:

[root@localhost loader]# btrfs subvol list /
ID 256 gen 1774 top level 5 path root
ID 258 gen 1771 top level 5 path home

I'm not terribly familiar with Silverblue and ostree, so I'm not sure how to debug this further.

Comment 18 Javier Martinez Canillas 2020-07-08 12:36:41 UTC
(In reply to Chris Murphy from comment #13)
> > Hmm, are you suggesting that our BLS entries should include `$kernelopts`?
> 
> Yes but...
> 
> > Does that get fed from `GRUB_CMDLINE_LINUX` in /etc/default/grub? I don't
> > actually see that variable on my F31 Silverblue:
> 
> I think the GRUB blscfg.mod gets the variable's value from grubenv, and
> inserts it when building the menu entries. Javier? :D
>

Sorry, I missed before that you mentioned me in this BZ.
 
> Whereas on Silverblue the BLS snippets are consumed by grub2-mkconfig
> /etc/grub.d/30_ostree to get them into grub.cfg, during the time when GRUB
> couldn't directly read BLS snippets. And hence now the duplicates.
>

There seems to be some confusion on how all this works on ostree-based and
non-ostree-based variants, so I'll try to clarify some points.

1- The BLS snippets used by Silverblue (or any other ostree-based variant)
   are generated by ostree. I'm not completely sure where ostree gets the
   kernel cmdline from, but IIRC it is just from the current BLS snippet (plus
   any additional option provided by the user, i.e with rpm-ostree kargs).

   In traditional (non-ostree) Fedora the BLS snippets are generated by the
   kernel-install grub2 script. These snippets don't have the kernel cmdline
   but instead just have a $kernelopts variable and the cmdline is stored in
   the grubenv file. So the variable isn't relevant for ostree-based distros.
 
   I think the reason why this variable is set even for Silverblue is that
   grub2-mkconfig doesn't take into account the ostree case and sets the
   variable unconditionally.

2- The blscfg module in GRUB just parses the snippets in /boot/loader/entries,
   it doesn't care who created those. If there are variables there, it tries
   to expand them.

   But as mentioned, in the case of Silverblue the BLS snippets don't have any
   variable to expand and contain the full kernel cmdline in the options key.

3- Since GRUB didn't have support to parse the BLS snippets, for ostree a GRUB
   /etc/grub.d/15_ostree script was installed that reads the BLS snippets and
   adds menuentry commands to the GRUB config file.

   That's why there are duplicated entries. Because the blscfg command is added
   to the GRUB config file, the BLS entries are parsed and used to populate the
   boot menu. But then the menuentry commands added by 15_ostree will populate
   the same boot entries again.

   You can call now the grub2-switch-to-blscfg script to mark that the installed
   GRUB supports BLS and so the /etc/grub.d/15_ostree will avoid adding entries.

   But this has restrictions: it only works on EFI and if /boot is a mountpoint,
   due to GRUB not being updated for legacy BIOS and ostree always creating a BLS
   snippet with paths relative to a boot partition.
  
4- In the case of traditional Fedora, the kernel-install grub2 script calls
   grub2-mkrelpath to figure out the correct path for the kernel and initrd
   images. This is needed because GRUB can only mount a btrfs root subvolume
   so the paths need to be fixed if these images are in a btrfs subvolume.

   I don't know if ostree does something like that, but I guess it doesn't,
   given that it only supports /boot being a mountpoint as mentioned before.

5- The kernel cmdline root param set in the $kernelopts is generated by the
   grub2-mkconfig script. Part of it is figured out by the script (i.e: root
   param) and the rest comes from the GRUB_CMDLINE_LINUX in /etc/default/grub.

   I don't know if GRUB is to blame here (grub2-mkconfig setting the wrong
   rootflags) or Anaconda (GRUB_CMDLINE_LINUX not having the correct values).

   Since in Silverblue the BLS snippets are generated by ostree, then again
   I don't know if the wrong kernel cmdline is caused by Anaconda or ostree
   in that case.

> I defer to Javier who has more knowledge of the long term plan for this,
> including grubenv basically being unreliable on non-UEFI (or specifically
> non-FAT file systems) as a way to communicate environment variables. Also
> $kernelopts and looking at grubenv for it is really a Fedora specific
> invention, it's not found in either upstream GRUB or the upstream BLS spec.
> I guess the alternative is to explicitly write out the command line options
> per snippet, but what's the policy for creating new snippets? Use the most
> recent conf as a template? Use a separate conf template file? Is this
> discoverable, self-describing, or otherwise just rearranging the deck chairs?
>

As you know the $kernelopts variable was already dropped in F33. The goal was
to have a single place where the cmdline could be stored and to avoid mangling
the BLS snippets before installing them.

But this caused more harm than good and it also broke the BLS contract, since
the options key used to store the cmdline is a known key described in the spec.

I didn't want to push that for F32 since it is an intrusive change to be made in
a released version.

As for where the cmdline will be taken from for new snippets, the kernel-install
script mentions that it will be read from the following paths in order:

/etc/kernel/cmdline
/usr/lib/kernel/cmdline
/proc/cmdline

The first two are never shipped by any package nor created by Anaconda, so in
practice /proc/cmdline will be used unless a user creates one of those files.
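
The lookup order above amounts to "first readable file wins". A minimal sketch of that, assuming nothing beyond the three paths listed; the function name `pick_cmdline_source` and the optional root prefix argument are made up for illustration and are not part of the kernel-install script:

```shell
# Hypothetical helper: print the first readable cmdline source, checking the
# three paths in the order listed above. $1 is an optional root prefix
# (useful for inspecting an installed system from the installer environment).
pick_cmdline_source() {
    prefix=${1:-}
    for candidate in /etc/kernel/cmdline /usr/lib/kernel/cmdline /proc/cmdline; do
        if [ -r "$prefix$candidate" ]; then
            printf '%s\n' "$prefix$candidate"
            return 0
        fi
    done
    return 1
}
```

On a system where nobody created the first two files, this falls through to /proc/cmdline, i.e. new snippets inherit whatever the running kernel was booted with.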

Now all this is only relevant for traditional Fedora anyways. It would be good
if someone more familiar with ostree could confirm that the cmdline does come
from the current BLS deployment and how well ostree plays with a btrfs setup.

Comment 19 Davide Cavalca 2020-07-08 16:54:03 UTC
So looking at my test VM, I don't see rootflags in /etc/default/grub at all:

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=UUID=0a15aae0-27bc-4046-adc8-39a4ed926726 rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

There's also no /etc/kernel/cmdline or /usr/lib/kernel/cmdline. Running grub2-mkconfig produces a file identical to the /boot/grub2/grub.cfg on disk (i.e. like the one I pasted in the previous comment). Looking at /etc/grub.d/10_linux, I'm pretty sure this is the snippet responsible here:

case x"$GRUB_FS" in
    xbtrfs)
        if [ "x${SUSE_BTRFS_SNAPSHOT_BOOTING}" = "xtrue" ]; then
        GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} \${extra_cmdline}"
        else
        rootsubvol="`make_system_path_relative_to_its_root /`"
        rootsubvol="${rootsubvol#/}"
        if [ "x${rootsubvol}" != x ]; then
            GRUB_CMDLINE_LINUX="rootflags=subvol=${rootsubvol} ${GRUB_CMDLINE_LINUX}"
        fi
        fi;;
    xzfs)
        rpool=`${grub_probe} --device ${GRUB_DEVICE} --target=fs_label 2>/dev/null || true`
        bootfs="`make_system_path_relative_to_its_root / | sed -e "s,@$,,"`"
        LINUX_ROOT_DEVICE="ZFS=${rpool}${bootfs%/}"
        ;;
esac

Note the call to make_system_path_relative_to_its_root which itself calls grub2-mkrelpath, and indeed:

[root@localhost ~]# grub2-mkrelpath /
/root/ostree/deploy/fedora/deploy/aa252edc5bcd385d5dc37899b90e404671a237e7f69b6a4091f2bbb5ca7682d7.0

So there's likely multiple bugs in play here.
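
To make the effect concrete, here is a small sketch that mirrors only the string handling of the btrfs branch above, taking the grub2-mkrelpath output as an argument. The function name `rootflags_from_relpath` is made up, and the idea that `grub2-mkrelpath /sysroot` would print `/root` on this system is an assumption I haven't verified here.

```shell
# Hypothetical helper mirroring the 10_linux btrfs branch: strip the leading
# slash from a grub2-mkrelpath result and, if anything is left, print the
# rootflags argument that would be prepended to GRUB_CMDLINE_LINUX.
rootflags_from_relpath() {
    rootsubvol=${1#/}
    if [ -n "$rootsubvol" ]; then
        printf 'rootflags=subvol=%s\n' "$rootsubvol"
    fi
}
```

Fed the relpath of `/` shown above, it reproduces the bogus `rootflags=subvol=root/ostree/deploy/...` value from kernelopts; fed `/root` it gives `rootflags=subvol=root`, and fed `/` (no subvolume) it prints nothing.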

Comment 20 Davide Cavalca 2020-07-08 17:15:47 UTC
Ok, so I think the reason the rootflags isn't getting propagated down to the linux16 call is because update_bls_cmdline (in 10_linux) doesn't do anything, because it loops over the output of get_sorted_bls (also in 10_linux), which is empty. Now, *that* happens because it loops over BLS configs like this:

    files=($(for bls in ${blsdir}/${machine_id}-*.conf; do
        echo "$bls"
        if ! [[ -e "${bls}" ]] ; then
            echo "bad $bls"
            continue
        fi
        bls="${bls%.conf}"
        bls="${bls##*/}"
        echo "${bls}"
    done | ${kernel_sort} 2>/dev/null | tac)) || :

leading it to try and read stuff like /boot/loader/entries/1920fd7c72174e0c8d44a615d6897381-*.conf which isn't a thing on silverblue (we have /boot/loader/entries/ostree-1-fedora.conf instead). If I hack this to read the right file, grub2-mkconfig then fails with:

  error: No ostree= kernel argument found

in 15_ostree. Specifically, that seems to come out of ostree admin instutil grub2-generate.
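
The glob mismatch is easy to reproduce outside of grub2-mkconfig. A minimal sketch, assuming only the file naming described above; `list_bls_entries` is a made-up name, and the loop is a simplified stand-in for get_sorted_bls without the sorting:

```shell
# Hypothetical stand-in for the loop in get_sorted_bls (no sorting):
# list BLS entry names in directory $1 whose file name starts with "$2-".
list_bls_entries() {
    blsdir=$1
    prefix=$2
    for bls in "$blsdir"/"$prefix"-*.conf; do
        # An unmatched glob stays a literal pattern, so skip non-existent paths
        [ -e "$bls" ] || continue
        bls=${bls##*/}
        printf '%s\n' "${bls%.conf}"
    done
}
```

With a directory containing only `ostree-1-fedora.conf`, `list_bls_entries "$dir" "$(cat /etc/machine-id)"` prints nothing, while `list_bls_entries "$dir" ostree` prints `ostree-1-fedora`.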

Comment 21 Javier Martinez Canillas 2020-07-08 17:22:45 UTC
(In reply to Davide Cavalca from comment #19)

[snip]
 
> 
> [root@localhost ~]# grub2-mkrelpath /
> /root/ostree/deploy/fedora/deploy/
> aa252edc5bcd385d5dc37899b90e404671a237e7f69b6a4091f2bbb5ca7682d7.0
> 
> So there's likely multiple bugs in play here.

Right, so that explains the wrong rootflags value.

Comment 22 Davide Cavalca 2020-07-08 17:30:08 UTC
So, to recap:
- we need to fix the grub2-mkrelpath logic in 10_linux (probably to look for /sysroot instead of / on ostree systems)
- we need to fix get_sorted_bls in 10_linux to not look for machineid configs on ostree systems
- we need to add the ostree= flag to the bls configs on disk (probably in update_bls_cmdline in 10_linux) so that "ostree admin instutil grub2-generate" can work properly

Comment 23 Javier Martinez Canillas 2020-07-08 17:36:26 UTC
(In reply to Davide Cavalca from comment #20)
> Ok, so I think the reason the rootflags isn't getting propagated down to the
> linux16 call is because update_bls_cmdline (in 10_linux) doesn't do
> anything, because it loops over the output of get_sorted_bls (also in
> 10_linux), which is empty. Now, *that* happens because it loops over BLS
> configs like this:
> 
>     files=($(for bls in ${blsdir}/${machine_id}-*.conf; do
>         echo "$bls"
>         if ! [[ -e "${bls}" ]] ; then
>             echo "bad $bls"
>             continue
>         fi
>         bls="${bls%.conf}"
>         bls="${bls##*/}"
>         echo "${bls}"
>     done | ${kernel_sort} 2>/dev/null | tac)) || :
> 
> leading it to try and read stuff like
> /boot/loader/entries/1920fd7c72174e0c8d44a615d6897381-*.conf which isn't a

That's the correct behavior I think. The reason why that function exists is to allow
users to modify the cmdline by changing GRUB_CMDLINE_LINUX in /etc/default/grub and
run grub2-mkconfig.

So even when it is not necessary to re-generate the grub.cfg on a BLS configuration,
since the information about the menu entries is not in that file, users expect that
a call to grub2-mkconfig will honor GRUB_CMDLINE_LINUX and update accordingly.

But that's not the case for Silverblue, since the BLS snippets are not managed by
the GRUB scripts at all and instead are managed by ostree. In fact if it does, it
will strip the ostree param as you found in your test.

> thing on silverblue (we have /boot/loader/entries/ostree-1-fedora.conf
> instead). If I hack this to read the right file, grub2-mkconfig then fails
> with:
> 
>   error: No ostree= kernel argument found
> 
> in 15_ostree. Specifically, that seems to come out of ostree admin instutil
> grub2-generate.

Yes, ostree admin instutil grub2-generate calls grub2-mkconfig which in turn will
call /etc/grub.d/15_ostree to add the menuentry commands using the information in
the /boot/loader/entries/ snippets.

But that's really not needed since as mentioned before GRUB now has the blscfg
module to parse the BLS snippets directly.

The bug I think is that there is no rootflags param in the BLS snippets generated
by ostree. As mentioned I don't know if that is a problem of ostree or Anaconda.
Whatever component fills the value that is in the options field should take into
account that / is a btrfs filesystem and add that cmdline param.

That's for the missing rootflags; then there is the other bug that the value isn't
correct, and I don't know whether that is a bug in grub2-mkrelpath or a result of
ostree entering a chroot, which confuses grub2-mkrelpath.

Comment 24 Davide Cavalca 2020-07-08 18:28:28 UTC
> That's the correct behavior I think. The reason why that function exists is to allow
> users to modify the cmdline by changing GRUB_CMDLINE_LINUX in /etc/default/grub and
> run grub2-mkconfig.

True, but that loop still needs to read from the right configs.

With this patch:

--- 10_linux	2020-07-08 11:06:08.598586306 -0700
+++ /etc/grub.d/10_linux	2020-07-08 11:13:58.170256686 -0700
@@ -69,7 +69,11 @@
 	if [ "x${SUSE_BTRFS_SNAPSHOT_BOOTING}" = "xtrue" ]; then
 	GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} \${extra_cmdline}"
 	else
-	rootsubvol="`make_system_path_relative_to_its_root /`"
+        if [ -d /ostree/repo ] && [ -d /sysroot ]; then
+  		rootsubvol="`make_system_path_relative_to_its_root /sysroot`"
+	else
+  		rootsubvol="`make_system_path_relative_to_its_root /`"
+	fi
 	rootsubvol="${rootsubvol#/}"
 	if [ "x${rootsubvol}" != x ]; then
 	    GRUB_CMDLINE_LINUX="rootflags=subvol=${rootsubvol} ${GRUB_CMDLINE_LINUX}"
@@ -138,13 +142,19 @@
 
 get_sorted_bls()
 {
-    if ! [ -d "${blsdir}" ] || ! [ -e /etc/machine-id ]; then
+    if ! [ -d "${blsdir}" ]; then
         return
     fi
 
-    read machine_id < /etc/machine-id
-    if [ -z "${machine_id}" ]; then
-        return
+    if [ -d /ostree/repo ]; then
+        machine_id=ostree
+    elif ! [ -e /etc/machine-id ]; then
+       return
+    else
+       read machine_id < /etc/machine-id
+       if [ -z "${machine_id}" ]; then
+           return
+       fi
     fi
 
     local IFS=$'\n'

and deleting /etc/grub.d/15_ostree it *almost* works -- but the BLS config is still missing the ostree= entry as you mentioned. Do you happen to know how to generate that?

Comment 25 Javier Martinez Canillas 2020-07-08 18:47:39 UTC
(In reply to Davide Cavalca from comment #24)
> > That's the correct behavior I think. The reason why that function exists is to allow
> > users to modify the cmdline by changing GRUB_CMDLINE_LINUX in /etc/default/grub and
> > run grub2-mkconfig.
> 
> True, but that loop still needs to read from the right configs.
> 

No, because everything that's in the 10_linux script about BLS snippets is only
valid for the non-ostree case. For ostree-based distros the BLS snippets should only
be modified by ostree.

> With this patch:
> 
> --- 10_linux	2020-07-08 11:06:08.598586306 -0700
> +++ /etc/grub.d/10_linux	2020-07-08 11:13:58.170256686 -0700
> @@ -69,7 +69,11 @@
>  	if [ "x${SUSE_BTRFS_SNAPSHOT_BOOTING}" = "xtrue" ]; then
>  	GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX} \${extra_cmdline}"
>  	else
> -	rootsubvol="`make_system_path_relative_to_its_root /`"
> +        if [ -d /ostree/repo ] && [ -d /sysroot ]; then
> +  		rootsubvol="`make_system_path_relative_to_its_root /sysroot`"
> +	else
> +  		rootsubvol="`make_system_path_relative_to_its_root /`"
> +	fi

This part might be correct, but I still need to dig how the cmdline is set for
the ostree BLS snippets with Silverblue.

>  	rootsubvol="${rootsubvol#/}"
>  	if [ "x${rootsubvol}" != x ]; then
>  	    GRUB_CMDLINE_LINUX="rootflags=subvol=${rootsubvol}
> ${GRUB_CMDLINE_LINUX}"
> @@ -138,13 +142,19 @@
>  
>  get_sorted_bls()
>  {
> -    if ! [ -d "${blsdir}" ] || ! [ -e /etc/machine-id ]; then
> +    if ! [ -d "${blsdir}" ]; then
>          return
>      fi
>  
> -    read machine_id < /etc/machine-id
> -    if [ -z "${machine_id}" ]; then
> -        return
> +    if [ -d /ostree/repo ]; then
> +        machine_id=ostree
> +    elif ! [ -e /etc/machine-id ]; then
> +       return
> +    else
> +       read machine_id < /etc/machine-id
> +       if [ -z "${machine_id}" ]; then
> +           return
> +       fi
>      fi
>  
>      local IFS=$'\n'
> 
> and deleting /etc/grub.d/15_ostree it *almost* works -- but the BLS config is
> still missing the ostree= entry as you mentioned. Do you happen to know how
> to generate that?

That's why I mentioned that the 10_linux script shouldn't modify the BLS snippets
generated by ostree. Since the ostree param is something that only ostree knows
how to set, because it has the information about the different ostree deployments.

So I think that instead of trying to set the root + GRUB_CMDLINE_LINUX + rootflags
+ ostree from the 10_linux script, what we should do is to make ostree set the
rootflags (or Anaconda if ostree gets a cmdline that gets carried over deployments
from the installer).

Comment 26 Davide Cavalca 2020-07-08 20:03:41 UTC
Looks like anaconda does it here: https://github.com/rhinstaller/anaconda/blob/b6e5205560f9544780f7f8540ad478958665e9d0/pyanaconda/payload/rpmostreepayload.py#L450-L453 by calling ostree set_kargs_args. However, I can't find any reference to that command in the ostree repo...

Comment 27 Davide Cavalca 2020-07-08 20:09:51 UTC
Oh, I can't read. The actual command run by anaconda is ostree admin instutil set-kargs, which is totally a thing: https://github.com/ostreedev/ostree/blob/be2572bf68090a5e277338d2613d3c7d53b0c9e8/src/ostree/ot-admin-instutil-builtin-set-kargs.c

Comment 28 Davide Cavalca 2020-07-08 20:12:03 UTC
And further down the rabbit hole, it looks like https://github.com/ostreedev/ostree/blob/2ca2b88f51c3131c3aa2322fe26bae2cee7e76fa/src/libostree/ostree-bootconfig-parser.c#L143 is where the actual logic for reading and writing BLS configs lives

Comment 29 Jonathan Lebon 2020-07-08 20:21:49 UTC
`ostree admin instutil set-kargs` is documented here: https://github.com/ostreedev/ostree/blob/cb2ecd1459605335b3a48c79d03225d7a1c2cc65/man/ostree-admin-instutil.xml#L79-L89

But yes, for Silverblue the kargs are set by Anaconda at install time. So I think the logic in 10_linux related to rootflags should be moved there.

Note also that it's possible to configure OSTree-based systems to not use grub2-mkconfig at all now that GRUB has native support for BLS. E.g. for FCOS, we use bootloader=none, which means that OSTree simply just writes the BLS configs:

https://github.com/ostreedev/ostree/pull/1814
https://github.com/coreos/coreos-assembler/blob/293ed6dbf24fab25db6349418346b209f09065e3/src/create_disk.sh#L315

This avoids the notorious os-prober code as well, which has caused problems in the past.

Comment 30 Davide Cavalca 2020-07-08 20:24:12 UTC
Put up https://github.com/rhinstaller/anaconda/pull/2720 as a tentative fix to set rootflags properly in Anaconda.

Comment 31 Javier Martinez Canillas 2020-07-08 22:05:06 UTC
(In reply to Davide Cavalca from comment #30)
> Put up https://github.com/rhinstaller/anaconda/pull/2720 as a tentative fix
> to set rootflags properly in Anaconda.

Thanks for digging into this. Your changes for Anaconda make sense to me.

Comment 32 Tomas Kovar 2020-07-13 18:26:27 UTC
Hello all,

I tested several different installs of Silverblue on btrfs. They all have in common that they were made from `Fedora-Silverblue-ostree-x86_64-32-1.6.iso`, using UEFI, a GPT partition scheme, and the en-us locale for both the installer and the installed system.

1. /boot on separate ext4 partition, / on main btrfs volume
===========================================================
layout:

sda1  vfat  /boot/efi
sda2  ext4  /boot
sda3  btrfs /
  subvolume home  /var/home

results:

* Installation: OK
* GRUB grub2-mkconfig entry: boots, no `rootflags=subvol=` entry (not needed in this scenario)
* GRUB BLS entry: boots, no `rootflags=subvol=` entry (not needed in this scenario)
* Root switch: OK
* Conclusion: System installs, boots and is fully usable.
* To fix: Nothing, everything is fine

2. /boot on separate ext4 partition, / on root subvolume
========================================================
layout:

sda1  vfat  /boot/efi
sda2  ext4  /boot
sda3  btrfs (not-mounted)
  subvolume root  /
  subvolume home  /var/home

results:

* Installation: OK
* GRUB grub2-mkconfig entry: boots, no `rootflags=subvol=` entry
* GRUB BLS entry: boots, no `rootflags=subvol=` entry
* Root switch: failed:

  ostree-prepare-root[1505]: ostree-prepare-root: Couldn't find specified OSTree root '/sysroot/ostree/boot.0/fedora/df242220d53f150e3ee8e641879667329d200c2361cc79d02e8d7fa7e346823b/0': No such file or directory

* Conclusion: System installs, bootloader loads kernel and initrd, but while booting, it fails to switch root. Can be made to fully boot by manually adding `rootflags=subvol=root` kernel parameter
* To fix: Davide's fix in comment 30 should fix this
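
A minimal sketch of the manual workaround from scenario 2, for illustration only: it takes a kernel command line as a string and appends `rootflags=subvol=root` when no subvolume is selected yet. The function name `ensure_rootflags` and the sample command lines are hypothetical and not taken from Anaconda or ostree; the actual fix is the Anaconda change referenced in comment 30.

```python
def ensure_rootflags(cmdline: str, subvol: str = "root") -> str:
    """Return cmdline with rootflags=subvol=<subvol> added if none is set.

    Hypothetical helper: it only edits a string and does not touch
    grub.cfg, BLS entries, or any ostree state.
    """
    args = cmdline.split()
    for i, arg in enumerate(args):
        if arg.startswith("rootflags="):
            # Split the existing mount options, dropping empty pieces.
            flags = [f for f in arg[len("rootflags="):].split(",") if f]
            if any(f.startswith(("subvol=", "subvolid=")) for f in flags):
                return cmdline  # a subvolume is already selected
            flags.append(f"subvol={subvol}")
            args[i] = "rootflags=" + ",".join(flags)
            return " ".join(args)
    # No rootflags= argument at all: append one.
    args.append(f"rootflags=subvol={subvol}")
    return " ".join(args)


if __name__ == "__main__":
    # Illustrative command line; the UUID is a placeholder.
    print(ensure_rootflags("root=UUID=abcd ro rhgb quiet"))
```

Existing `rootflags=` options are kept and the subvolume is appended to them, since the kernel command line carries mount options as one comma-separated list.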

3. /boot as subdirectory, / on main btrfs volume 
================================================
layout:

sda1  vfat  /boot/efi
sda2  btrfs /
  subvolume home  /var/home

results:

* Installation: Installer won't allow to proceed:

  /boot file system cannot be of type btrfs volume

* GRUB grub2-mkconfig entry: N/A
* GRUB BLS entry: N/A
* Root switch: N/A
* Conclusion: Installer won't allow to proceed in this configuration.
* To fix: decide whether this should be an allowed configuration; if yes, allow this scenario in anaconda; if no, document it as a non-allowed configuration?

4. /boot as subdirectory, / on root subvolume
=============================================
layout:

sda1  vfat  /boot/efi
sda2  btrfs (not-mounted)
  subvolume root  /
  subvolume home  /var/home

results:

* Installation: failed with error:

  The following error occured while installing. This is a fatal error and installation will be aborted.
  
  mount ['--bind', '/mnt/sysimage/boot/efi', '/mnt/sysroot/boot/efi'] exited with code 32

  Error in anaconda program-log:

  Running... mount --bind /mnt/sysimage/boot/efi /mnt/sysroot/boot/efi
  mount: /mnt/sysroot/boot/efi: mount point does not exist.
  Return code: 32

* GRUB grub2-mkconfig entry: N/A
* GRUB BLS entry: N/A
* Root switch: N/A
* Conclusion: System failed to install.
* To fix: Anaconda?

This scenario would probably be the most interesting for those who want to run Silverblue on btrfs with a non-encrypted root.

5. /boot as subvolume, / on main btrfs volume
=============================================
layout:

sda1  vfat  /boot/efi
sda2  btrfs /
  subvolume boot  /boot
  subvolume home  /var/home

results:

* Installation: OK
* GRUB grub2-mkconfig entry: incorrect path to kernel and initrd, missing the subvolume path: `($root)/ostree/...` -> `($root)/boot/ostree/...`, no `rootflags=subvol=` entry (not needed in this scenario). (In the changed path, /boot/ is the subvolume name relative to the root of the volume, not the /boot mount point.)
* GRUB BLS entry: incorrect path to kernel and initrd, missing the subvolume path: `/ostree/...` -> `/boot/ostree/...`, no `rootflags=subvol=` entry (not needed in this scenario)
* Root switch: OK
* Conclusion: Can be installed, can be made to boot, by manually editing grub.cfg/BLS entry paths. Paths won't be preserved by updates/running `grub2-mkconfig` or `ostree admin instutil grub2-generate`.
* To fix: `ostree admin instutil grub2-generate` and `/etc/grub.d/10_linux` should find correct path relative to the root of the boot device.
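
A rough sketch of the manual path edit described in scenario 5, for illustration only: it prefixes the `linux` and `initrd` paths of a BLS entry with the subvolume name. The function `prefix_bls_paths` and the sample entry are hypothetical, and the sketch assumes each key carries a single absolute path. As noted above, such manual edits are not preserved by updates, so this is no substitute for a fix in `ostree admin instutil grub2-generate` or `/etc/grub.d/10_linux`.

```python
def prefix_bls_paths(entry: str, subvol: str = "boot") -> str:
    """Prefix linux/initrd paths in a BLS entry with /<subvol>.

    Hypothetical helper; assumes one absolute path per linux/initrd line.
    """
    prefix = "/" + subvol.strip("/")
    out = []
    for line in entry.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in ("linux", "initrd"):
            path = parts[1].strip()
            # Leave paths alone if they already carry the prefix.
            if not path.startswith(prefix + "/"):
                path = prefix + path
            line = f"{parts[0]} {path}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Illustrative entry; the deployment directory name is a placeholder.
    sample = (
        "title Fedora\n"
        "linux /ostree/fedora-abc/vmlinuz\n"
        "initrd /ostree/fedora-abc/initramfs.img\n"
        "options root=UUID=abcd"
    )
    print(prefix_bls_paths(sample))
```

The check for an existing prefix makes the rewrite safe to run more than once on the same entry.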

6. /boot as subvolume, / on root subvolume
==========================================
layout:

sda1  vfat  /boot/efi
sda2  btrfs (not-mounted)
  subvolume root  /
  subvolume boot  /boot
  subvolume home  /var/home

results:

* Installation: OK
* GRUB grub2-mkconfig entry: incorrect path to kernel and initrd, missing the subvolume path: `($root)/ostree/...` -> `($root)/boot/ostree/...`, no `rootflags=subvol=` entry
* GRUB BLS entry: incorrect path to kernel and initrd, missing the subvolume path: `/ostree/...` -> `/boot/ostree/...`, no `rootflags=subvol=` entry
* Root switch: failed:

  ostree-prepare-root[1505]: ostree-prepare-root: Couldn't find specified OSTree root '/sysroot/ostree/boot.0/fedora/df242220d53f150e3ee8e641879667329d200c2361cc79d02e8d7fa7e346823b/0': No such file or directory

* Conclusion: Can be installed, can be made to boot, by manually editing grub.cfg/BLS entry paths and by adding `rootflags=subvol=root` kernel parameter. Paths won't be preserved by updates/running `grub2-mkconfig` or `ostree admin instutil grub2-generate`.
* To fix: same as #2 + #5

This scenario would probably be the most interesting for those who want to run Silverblue on btrfs with an encrypted root and a non-encrypted boot.

Comment 33 Chris Murphy 2020-07-13 20:05:28 UTC
Tomas, this is super useful information. Is this in the Custom interface, or the Advanced-Custom (blivet-gui) interface?

I'd say in the current paradigm, both custom and advanced-custom need to enforce creation of subvolumes for any user-defined mountpoints. Otherwise it rapidly gets impossibly complicated. Therefore 1, 3, 5 being allowed suggests a need for additional safeguarding. My suggestion is a new bug report, based on a to-be-determined Silverblue Rawhide build once all changes to date have actually landed in a compose, using just the simplest of the three examples for the bug report, with Anaconda logs attached.

4 and 6 might already be fixed in Rawhide due to other fixes related to /boot on Btrfs.

Comment 34 Tomas Kovar 2020-07-14 20:47:29 UTC
They were all configured using the Advanced-Custom (blivet-gui) interface.

I also retried all of them with the latest Rawhide build available (20200607), with exactly the same results.

Comment 35 Neal Gompa 2020-07-14 23:15:53 UTC
The PR that Davide made has landed in anaconda-33.22-1.fc33, which should be in tonight's Rawhide compose: https://koji.fedoraproject.org/koji/buildinfo?buildID=1541717

Comment 36 Chris Murphy 2020-07-16 05:51:51 UTC
I don't think this is related, but I'm not certain.
https://bugzilla.redhat.com/show_bug.cgi?id=1857530

Comment 37 kxra 2020-07-24 15:10:59 UTC
It should be noted that in the custom interface, choosing to automatically configure a btrfs partitioning layout creates a scheme like #2. It was a little difficult to track down this ticket for a manual workaround.

Comment 38 Chris Murphy 2020-07-31 23:49:06 UTC
This is fixed in Fedora-Silverblue-ostree-x86_64-Rawhide-20200731.n.0.iso. A clean default install succeeds and reboots without user intervention, and I was able to use 'rpm-ostree install' to layer a package and reboot.

Comment 39 Chris Murphy 2020-07-31 23:54:08 UTC
BTW I'm not sure if it's possible to backport this fix to Fedora 32? Or if folks who rebase Silverblue to 33 inherit this fix? Or how that part of this works.

Comment 40 Junjie Yuan 2020-08-01 12:17:37 UTC
*** Bug 1829682 has been marked as a duplicate of this bug. ***

Comment 41 Tomas Kovar 2020-08-02 13:16:51 UTC
With Rawhide 2020-07-31, the results are as follows:

#1 and #2 install fine.

#3 and #4 fail at installation. A new bug was filed: https://bugzilla.redhat.com/show_bug.cgi?id=1862784

#5 and #6 fail at reboot, due to incorrect BLS/grub.cfg paths. A new bug was filed: https://bugzilla.redhat.com/show_bug.cgi?id=1862783

