2256843 – Booting from local ISO file no longer works in Fedora 39

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	39
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	2184978

Reported:	2024-01-04 18:34 UTC by Jonathan Billings
Modified:	2024-11-27 22:34 UTC
CC List:	25 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2024-11-27 22:34:04 UTC
Type:	---
Embargoed:
Dependent Products:

Description Jonathan Billings 2024-01-04 18:34:53 UTC

Hello,

If I extract the vmlinuz and initrd from a Fedora minimal boot ISO, I can then set up a boot entry that has kernel parameters that look like this:

inst.repo=hd:UUID=whatever:/fedora.iso



Reproducible: Always

Steps to Reproduce:
1. On an existing Linux system with an XFS or ext4 /boot, download a Fedora minimal ISO, extract vmlinuz and initrd.img, and place them in /boot along with the ISO (call it fedora.iso).
2. Identify the UUID of the /boot volume (let's call it $BOOTUUID).
3. Create a boot entry (potentially in /boot/loader/entries/) that boots that vmlinuz and initrd, and add the kernel parameter: inst.repo=hd:UUID=$BOOTUUID:/fedora.iso
4. Boot the new boot entry.
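
A minimal sketch of how the kernel parameter from step 3 could be assembled. The helper name inst_repo_arg is hypothetical and is not part of any Fedora tooling; it only formats the hd:UUID=...:/path syntax shown above.

```shell
# inst_repo_arg UUID ISO_PATH   (hypothetical helper, for illustration only)
# Print the installer argument for an ISO stored on the filesystem with the
# given UUID. ISO_PATH is relative to the root of that filesystem.
inst_repo_arg() {
    uuid=$1
    iso_path=${2#/}    # drop one leading slash; it is re-added below
    printf 'inst.repo=hd:UUID=%s:/%s\n' "$uuid" "$iso_path"
}
```

For example, inst_repo_arg "$(findmnt -no UUID /boot)" fedora.iso would print the parameter for an ISO placed directly in /boot.
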
Actual Results:  
In Fedora 38, this would pull down the stage2 installer and start the boot from there.

In Fedora 39, I get this error:
[    5.213316] dracut-initqueue[1311]: mount: /: not mount point or bad option.
[    5.213347] dracut-initqueue[1311]:        dmesg(1) may have more information after failed mount system call.
[    5.214527] dracut-initqueue[1312]: mount: /run/install/isodir: bad option; moving a mount residing under a shared mount is unsupported.


Expected Results:  
Booting into the Fedora installer with no issues.

I have a script to automate migrating people's laptops from RHEL to Fedora (reloading in place) and it's been working fine for over 6 months using Fedora 37 and 38.  Fedora 39 seems to be when it stopped working.

Comment 1 Lukáš Nykrýn 2024-01-23 11:53:44 UTC

Can you please add rd.debug to the kernel cmdline, reproduce the issue, and post the logs here? Ideally from both the working and the broken setup.

Comment 2 Jonathan Billings 2024-01-23 16:16:43 UTC

I've created a serial console on my VM and dumped the output to a file.  Some additional information that I didn't realize was pertinent: I have a kickstart file on the same device and filesystem as the ISO image referenced by inst.stage2.


I followed these steps for both the Fedora 38 and Fedora 39 netinst ISOs on a CentOS 9 VM:

1.) Download Fedora-Everything-netinst-x86_64-38-1.6.iso and Fedora-Everything-netinst-x86_64-39-1.5.iso
2.) Install the 'libcdio' package (which includes /usr/bin/iso-read)
3.) Copy the ISO I'm testing to /boot/fedora.iso
4.) Run: iso-read -i /boot/fedora.iso --extract /images/pxeboot/vmlinuz --output-file /boot/vmlinuz
5.) Run: iso-read -i /boot/fedora.iso --extract /images/pxeboot/initrd.img --output-file /boot/initrd.img
6.) Copy a kickstart file to /boot/kickstart.cfg.  I intentionally put one with a syntax error so the installer errors out before loading. This is fine for the test because the error in Fedora 39 happens during dracut-initqueue, well before we start parsing the kickstart.
7.) Create a BLS entry so grub2 can load the new install:
# Get the machine-id
MACHINE_ID=$(cat /etc/machine-id)
# Get UUID of /boot
BOOT_UUID=$( findmnt -no UUID /boot )
# Write boot entry
cat > /boot/loader/entries/${MACHINE_ID}-99-fedora.conf <<EOF
title Install Fedora
version 1.0
linux /vmlinuz
initrd /initrd.img
options inst.stage2=hd:UUID=${BOOT_UUID}:/fedora.iso inst.ks=hd:UUID=${BOOT_UUID}:/kickstart.cfg rd.debug console=ttyS1
id fedora-test
grub_users \$grub_users
grub_arg --unrestricted
grub_class kernel
EOF
8.) Add a serial device (in this example, ttyS1) that writes to a file.
9.) Reboot into the "Install Fedora" boot entry in GRUB2.
10.) Capture the serial output.

I will attach the two serial log outputs.
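
Before rebooting, it may help to confirm that every file named in the entry's options line actually exists on /boot. The following is a rough sketch, not part of the original procedure; bls_hd_paths is a hypothetical helper that only understands the hd:UUID=...:/path form used in the entry above.

```shell
# bls_hd_paths ENTRY_FILE   (hypothetical helper, for illustration only)
# Print the path component of every NAME=hd:DEVICE:/path value found on the
# "options" line of a BLS entry, one path per line.
bls_hd_paths() {
    sed -n 's/^options[[:space:]]*//p' "$1" |
        tr -s '[:space:]' '\n' |
        sed -n 's|^[^=]*=hd:[^:]*:\(/.*\)$|\1|p'
}
```

Each printed path can then be checked relative to /boot, for instance with: bls_hd_paths ENTRY | while read -r p; do [ -e "/boot$p" ] || echo "missing: /boot$p"; done
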

Comment 3 Jonathan Billings 2024-01-23 16:18:02 UTC

Created attachment 2009941 [details]
Fedora 38 netinst boot with kickstart

Comment 4 Jonathan Billings 2024-01-23 16:18:33 UTC

Created attachment 2009942 [details]
Fedora 39 netinst with kickstart

Comment 5 Jonathan Billings 2024-01-23 16:22:50 UTC

You can see that on line 4176 of the Fedora 38 boot log it runs 'mount --make-rprivate /' with no error, but on line 4205 of the Fedora 39 boot log it runs 'mount --make-rprivate /' and mount errors out; the 'mount --move' that follows then fails with: mount: /run/install/isodir: bad option; moving a mount residing under a shared mount is unsupported.
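
For anyone comparing the two attached logs, a small filter along these lines can pull out just the traced mount commands and their error messages. This is a sketch that assumes the log format quoted elsewhere in this report ("dracut-initqueue[PID]: + mount ..." for traced commands and "dracut-initqueue[PID]: mount: ..." for errors); mount_trace is a hypothetical name.

```shell
# mount_trace LOGFILE   (hypothetical helper, for illustration only)
# Print, with line numbers, the traced mount commands ("+ mount ...") and any
# "mount: ..." error lines from a dracut rd.debug console log.
mount_trace() {
    grep -nE 'dracut-initqueue\[[0-9]+\]: (\+ mount |mount: )' "$1"
}
```

Running it on both logs and comparing the output side by side should show the first mount call whose result differs between Fedora 38 and Fedora 39.
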

Comment 6 Lukáš Nykrýn 2024-01-24 15:29:18 UTC

I've pinged the util-linux maintainer to look at that. But honestly, I have a feeling that this is a red herring; rprivate is the default. Also, I know nothing about that part of the code, since it is probably called from the anaconda dracut module.

Comment 7 Lukáš Nykrýn 2024-01-24 15:51:56 UTC

OK, I was wrong; this is where things go south.

[    7.505027] dracut-initqueue[1137]: + mount --make-rprivate /
[    7.611779] loop: module loaded
[    7.505093] dracut-initqueue[1178]: mount: /: not mount point or bad option.
[    7.505104] dracut-initqueue[1178]:        dmesg(1) may have more information after failed mount system call.
[    7.505118] dracut-initqueue[1137]: + mount --move /run/install/repo /run/install/isodir
[    7.506342] dracut-initqueue[1179]: mount: /run/install/isodir: bad option; moving a mount residing under a shared mount is unsupported.
[    7.506360] dracut-initqueue[1179]:        dmesg(1) may have more information after failed mount system call.
[    7.506375] dracut-initqueue[1137]: + iso=/run/install/isodir//fedora.iso
[    7.506387] dracut-initqueue[1137]: + mount -o loop,ro /run/install/isodir//fedora.iso /run/install/repo
[    7.518671] dracut-initqueue[1180]: mount: /run/install/repo: failed to setup loop device for /run/install/isodir//fedora.iso.

I will need some help from Karel; let's move it to util-linux

Comment 8 Lukáš Nykrýn 2024-01-24 16:28:29 UTC

Btw, I was partly wrong about private being the default. Systemd remounts the root to be shared:

https://github.com/systemd/systemd/blob/main/src/shared/mount-setup.c#L553
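
To check which propagation mode a mount point currently has without relying on findmnt, the optional fields in /proc/self/mountinfo can be inspected. The sketch below assumes the documented mountinfo layout (mount point in field 5, optional fields from field 7 up to a lone "-"); mount_propagation is a hypothetical helper, and its second argument exists only so the parser can be pointed at a saved copy of the file.

```shell
# mount_propagation MOUNTPOINT [MOUNTINFO_FILE]   (hypothetical helper)
# Print the propagation type of MOUNTPOINT, derived from the optional
# fields of a mountinfo file (default: /proc/self/mountinfo).
mount_propagation() {
    awk -v mp="$1" '
        $5 == mp {
            prop = ""
            # Optional fields start at field 7 and end at a lone "-".
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)         prop = prop ",shared"
                else if ($i ~ /^master:/)    prop = prop ",slave"
                else if ($i == "unbindable") prop = prop ",unbindable"
            }
            # No propagation tag at all means the mount is private.
            result = (prop == "") ? "private" : substr(prop, 2)
            found = 1
        }
        END { if (found) print result }
    ' "${2:-/proc/self/mountinfo}"
}
```

If several mounts are stacked on the same mount point, this reports the last one listed.
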

Comment 9 Karel Zak 2024-01-24 21:26:30 UTC

It would be nice to have strace output from the mount call (--make-rprivate), or define LIBMOUNT_DEBUG=all for the script ;-)

Comment 10 Karel Zak 2024-01-26 10:14:46 UTC

OK, I'm able to reproduce the problem. The problem is the mount_setattr() syscall, which fails with EINVAL. In the same situation, mount(2) is successful ... not sure why.

A simple workaround is to call mount(8) with "LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate /". The variable disables the new mount kernel API.
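
As a rough sketch of how a script could apply this workaround only when needed, the function below tries the default mount(8) path first and retries with LIBMOUNT_FORCE_MOUNT2=always on failure. The name make_rprivate is hypothetical, and this is not presented as what dracut or anaconda actually do.

```shell
# make_rprivate [TARGET]   (hypothetical helper, for illustration only)
# Make TARGET (default: /) recursively private. If the first attempt fails,
# retry with libmount forced onto the classic mount(2) syscall.
make_rprivate() {
    target=${1:-/}
    if mount --make-rprivate "$target"; then
        return 0
    fi
    echo "retrying with LIBMOUNT_FORCE_MOUNT2=always" >&2
    LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate "$target"
}
```

The fallback keeps the new mount API in use on kernels where it works and only switches to mount(2) when the first call fails.
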

Comment 11 Karel Zak 2024-01-30 14:48:38 UTC

Just for the record.

The simplest way to reproduce the problem is to reboot an arbitrary Fedora 39 system and add "rd.break" to the kernel command line. Booting will stop before the real system root is mounted; you can then run "mount --make-rprivate /" to see the problem.

Example (with strace):

# mount --make-rprivate /
   
open_tree(AT_FDCWD, "/", OPEN_TREE_CLOEXEC) = 3     
mount_setattr(-1, NULL, 0, NULL, 0)     = -1 EINVAL (Invalid argument)
mount_setattr(3, "", AT_EMPTY_PATH|AT_RECURSIVE, {attr_set=0, attr_clr=0, propagation=MS_PRIVATE, userns_
   
mount: /: not mount point or bad option.       
       dmesg(1) may have more information after failed mount system call.
+++ exited with 32 +++
   

The same situation but with mount(2) syscall:
   
# LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate /
   
mount("none", "/", NULL, MS_REC|MS_PRIVATE, NULL) = 0
+++ exited with 0 +++ 
   
# findmnt -o+PROPAGATION
TARGET                   SOURCE           FSTYPE     OPTIONS                                                            PROPAGATION
/                        rootfs           rootfs     rw                                                                 private
|-/proc                  proc             proc       rw,nosuid,nodev,noexec,relatime                                    private
|-/sys                   sysfs            sysfs      rw,nosuid,nodev,noexec,relatime                                    private
| |-/sys/kernel/security securityfs       securityfs rw,nosuid,nodev,noexec,relatime                                    private
| |-/sys/fs/cgroup       cgroup2          cgroup2    rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot    private
| |-/sys/fs/pstore       pstore           pstore     rw,nosuid,nodev,noexec,relatime                                    private
| |-/sys/fs/bpf          bpf              bpf        rw,nosuid,nodev,noexec,relatime,mode=700                           private
| `-/sys/kernel/config   configfs         configfs   rw,nosuid,nodev,noexec,relatime                                    private
|-/dev                   devtmpfs         devtmpfs   rw,nosuid,size=4096k,nr_inodes=246475,mode=755,inode64             private
| |-/dev/shm             tmpfs            tmpfs      rw,nosuid,nodev,inode64                                            private
| `-/dev/pts             devpts           devpts     rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000              private
|-/run                   tmpfs            tmpfs      rw,nosuid,nodev,size=400108k,nr_inodes=819200,mode=755,inode64     private
`-/sysroot               /dev/vda3[/root] btrfs      ro,relatime,discard=async,space_cache=v2,subvolid=257,subvol=/root private


I've tested it with ext4 and btrfs, and the result is the same, as expected.

Comment 12 Karel Zak 2024-01-30 14:58:35 UTC

Ah... the strace output without truncation:

open_tree(AT_FDCWD, "/", OPEN_TREE_CLOEXEC) = 3
mount_setattr(-1, NULL, 0, NULL, 0)     = -1 EINVAL (Invalid argument)
mount_setattr(3, "", AT_EMPTY_PATH|AT_RECURSIVE, {attr_set=0, attr_clr=0, propagation=MS_PRIVATE, userns_fd=0}, 32) = -1 EINVAL (Invalid argument)

Note that the first mount_setattr(-1, ...) call is just a libmount test to verify that the kernel supports the new mount API.
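
When reading a saved strace log, the probe call can be filtered out so that only the real failures remain. This is a sketch that assumes strace was run without -f (so lines begin with the syscall name) and that the probe is the call whose first argument is -1, as in the output above; failed_mount_setattr is a hypothetical name.

```shell
# failed_mount_setattr STRACE_FILE   (hypothetical helper)
# Print mount_setattr() calls that returned an error, skipping libmount's
# feature probe, which is recognisable by its -1 file descriptor.
failed_mount_setattr() {
    grep -E '^mount_setattr\(' "$1" |
        grep -vE '^mount_setattr\(-1, ' |
        grep -E '= -1 [A-Z]+'
}
```

Applied to the three lines above, this should leave only the mount_setattr(3, ...) call that returns EINVAL.
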

Comment 13 Christian Brauner 2024-02-05 13:52:23 UTC

So the only reason I can currently see for this is that check_mnt() fails. And for that to be the case the caller must be in a different mount namespace than the mount.
So when that script runs does it somehow unshare or create a mount namespace?
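
One way to answer that question from inside the initramfs is to compare the mount namespace links under /proc. The sketch below assumes /proc is mounted and that the caller may read the ns links of both processes (root can); same_mntns is a hypothetical helper.

```shell
# same_mntns PID_A PID_B   (hypothetical helper, for illustration only)
# Succeed if both processes are in the same mount namespace.
# Returns 2 if either namespace link cannot be read.
same_mntns() {
    a=$(readlink "/proc/$1/ns/mnt") || return 2
    b=$(readlink "/proc/$2/ns/mnt") || return 2
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Calling same_mntns 1 $$ from the failing script would show whether it runs in the same mount namespace as PID 1.
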

Comment 14 Christian Brauner 2024-02-06 10:34:01 UTC

OK, figured it out, AFAICT: https://lore.kernel.org/all/20240206-vfs-mount-rootfs-v1-1-19b335eee133@kernel.org

Comment 15 Karel Zak 2024-02-07 14:03:07 UTC

VFS issue, moving to the kernel.

Comment 16 Aoife Moloney 2024-11-13 10:24:05 UTC

This message is a reminder that Fedora Linux 39 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 39 on 2024-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '39'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 39 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 17 Aoife Moloney 2024-11-27 22:34:04 UTC

Fedora Linux 39 entered end-of-life (EOL) status on 2024-11-26.

Fedora Linux 39 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.
