Hello,

If I extract the vmlinuz and initrd from a Fedora minimal boot ISO, I can then set up a boot entry with kernel parameters that look like this:

inst.repo=hd:UUID=whatever:/fedora.iso

Reproducible: Always

Steps to Reproduce:
1. On an existing Linux system with an XFS or ext4 /boot, download a Fedora minimal ISO, extract the vmlinuz and initrd.img, and place them in /boot along with the ISO (call it fedora.iso).
2. Identify the UUID of the /boot volume (let's just call it $BOOTUUID).
3. Create a boot entry (potentially in /boot/loader/entries/) that boots that vmlinuz and initrd, and add the kernel parameter: inst.repo=hd:UUID=$BOOTUUID:/fedora.iso
4. Boot the new boot entry.

Actual Results:
In Fedora 38, this would pull down the stage2 installer and start the boot from there. In Fedora 39, I get this error:

[ 5.213316] dracut-initqueue[1311]: mount: /: not mount point or bad option.
[ 5.213347] dracut-initqueue[1311]: dmesg(1) may have more information after failed mount system call.
[ 5.214527] dracut-initqueue[1312]: mount: /run/install/isodir: bad option: moving from a mount residing under a shared mount is unsupported.

Expected Results:
Booting into the Fedora installer with no issues.

I have a script to automate migrating people's laptops from RHEL to Fedora (reloading in place), and it has been working fine for over 6 months using Fedora 37 and 38. Fedora 39 seems to be when it stopped working.
Can you please add rd.debug to the kernel cmdline, reproduce the issue, and post the logs here? Ideally from both the working and the broken setup.
I've created a serial console on my VM and dumped the output to a file.

Some additional information which I didn't realize was pertinent: I have a kickstart on the same device and filesystem as the ISO image that is read by inst.stage2.

I followed these steps for both the Fedora 38 and Fedora 39 netinst ISOs on a CentOS 9 VM:

1.) Download Fedora-Everything-netinst-x86_64-38-1.6.iso and Fedora-Everything-netinst-x86_64-39-1.5.iso
2.) Install the 'libcdio' package (which includes /usr/bin/iso-read)
3.) Copy the ISO I'm testing to /boot/fedora.iso
4.) Run: iso-read -i /boot/fedora.iso --extract /images/pxeboot/vmlinuz --output-file /boot/vmlinuz
5.) Run: iso-read -i /boot/fedora.iso --extract /images/pxeboot/initrd.img --output-file /boot/initrd.img
6.) Copy a kickstart file to /boot/kickstart.cfg. I intentionally used one with a syntax error so the installer errors out before loading. This is fine for the test because the error in Fedora 39 happens during dracut-initqueue, well before we start parsing the kickstart.
7.) Create a BLS entry so grub2 can load the new install:

# Get the machine-id
MACHINE_ID=$(cat /etc/machine-id)
# Get UUID of /boot
BOOT_UUID=$( findmnt -no UUID /boot )
# Write boot entry
cat > /boot/loader/entries/${MACHINE_ID}-99-fedora.conf <<EOF
title Install Fedora
version 1.0
linux /vmlinuz
initrd /initrd.img
options inst.stage2=hd:UUID=${BOOT_UUID}:/fedora.iso inst.ks=hd:UUID=${BOOT_UUID}:/kickstart.cfg rd.debug console=ttyS1
id fedora-test
grub_users \$grub_users
grub_arg --unrestricted
grub_class kernel
EOF

8.) Add a serial device (in this example, ttyS1) that writes to a file.
9.) Reboot into the "Install Fedora" boot entry in GRUB2.
10.) Capture the serial output.

I will attach the two serial log outputs.
Created attachment 2009941: Fedora 38 netinst boot with kickstart
Created attachment 2009942: Fedora 39 netinst with kickstart
On line 4176 of the Fedora 38 boot log, it runs 'mount --make-rprivate /' with no error, but on line 4205 of the Fedora 39 boot log, it runs 'mount --make-rprivate /' and mount errors out with:

mount: /run/install/isodir: bad option; moving a mount residing under a shared mount is unsupported.
I've pinged the util-linux maintainer to look at that. But honestly I have a feeling that this is a red herring; rprivate is the default. Also, I know nothing about that part of the code, since it is probably called from the anaconda dracut module.
Ok, I was wrong; it is where things go south.

[ 7.505027] dracut-initqueue[1137]: + mount --make-rprivate /
[ 7.611779] loop: module loaded
[ 7.505093] dracut-initqueue[1178]: mount: /: not mount point or bad option.
[ 7.505104] dracut-initqueue[1178]: dmesg(1) may have more information after failed mount system call.
[ 7.505118] dracut-initqueue[1137]: + mount --move /run/install/repo /run/install/isodir
[ 7.506342] dracut-initqueue[1179]: mount: /run/install/isodir: bad option; moving a mount residing under a shared mount is unsupported.
[ 7.506360] dracut-initqueue[1179]: dmesg(1) may have more information after failed mount system call.
[ 7.506375] dracut-initqueue[1137]: + iso=/run/install/isodir//fedora.iso
[ 7.506387] dracut-initqueue[1137]: + mount -o loop,ro /run/install/isodir//fedora.iso /run/install/repo
[ 7.518671] dracut-initqueue[1180]: mount: /run/install/repo: failed to setup loop device for /run/install/isodir//fedora.iso.

I will need some help from Karel; let's move it to util-linux.
Btw, I was partly wrong about private being the default. Systemd remounts it to be shared: https://github.com/systemd/systemd/blob/main/src/shared/mount-setup.c#L553
It would be nice to have strace output from the mount call (--make-rprivate), or define LIBMOUNT_DEBUG=all for the script ;-)
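A minimal sketch of how the requested libmount debug output could be captured for a single command. The helper name run_with_libmount_debug and the log paths are hypothetical; only the LIBMOUNT_DEBUG=all setting comes from the comment above. It assumes libmount writes its debug messages to stderr and that a POSIX shell is available in the environment where the mount runs.

```sh
# Run a command with libmount debugging enabled and save its stderr.
# Usage: run_with_libmount_debug LOGFILE COMMAND [ARGS...]
# (hypothetical helper name; not part of dracut or util-linux)
run_with_libmount_debug() {
    _log=$1
    shift
    # LIBMOUNT_DEBUG=all is set only for this one command.
    LIBMOUNT_DEBUG=all "$@" 2>"$_log"
}

# Example (needs root, shown as a comment so nothing runs on load):
#   run_with_libmount_debug /run/mount-debug.log mount --make-rprivate /
# If strace happens to be available, a syscall trace could be saved with:
#   strace -f -o /run/mount.strace mount --make-rprivate /
```

The helper returns the exit status of the wrapped command, so a failing mount is still visible to the caller.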
OK, I'm able to reproduce the problem. The problem is the mount_setattr() syscall, which fails with EINVAL. In the same situation, mount(2) succeeds ... not sure why. A simple workaround is to call mount(8) as "LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate /". The variable disables the new kernel mount API.
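The workaround above could be wrapped so that a script tries the default path first and only then falls back to the legacy syscall. This is a sketch, not what the anaconda dracut module does; the function name make_root_rprivate is hypothetical, and the only setting taken from this report is LIBMOUNT_FORCE_MOUNT2=always.

```sh
# Make / and everything below it private.
# Try the default mount(8) behaviour first; if that fails, retry with the
# new mount API disabled, as suggested in this report.
# (hypothetical helper name)
make_root_rprivate() {
    if mount --make-rprivate /; then
        return 0
    fi
    # Fallback: force libmount to use the classic mount(2) syscall.
    LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate /
}

# Example (needs root, shown as a comment so nothing runs on load):
#   make_root_rprivate || echo "could not make / rprivate" >&2
```

The error message from the first attempt is left on stderr on purpose, so the original failure still shows up in the boot log.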
Just for the record. The simplest way to reproduce the problem is to reboot an arbitrary Fedora 39 system and add "rd.break" to the kernel command line. It will stop booting before the real system root is mounted; then you can use "mount --make-rprivate /" to see the problem.

Example (with strace):

# mount --make-rprivate /
open_tree(AT_FDCWD, "/", OPEN_TREE_CLOEXEC) = 3
mount_setattr(-1, NULL, 0, NULL, 0) = -1 EINVAL (Invalid argument)
mount_setattr(3, "", AT_EMPTY_PATH|AT_RECURSIVE, {attr_set=0, attr_clr=0, propagation=MS_PRIVATE, userns_
mount: /: not mount point or bad option.
dmesg(1) may have more information after failed mount system call.
+++ exited with 32 +++

The same situation but with the mount(2) syscall:

# LIBMOUNT_FORCE_MOUNT2=always mount --make-rprivate /
mount("none", "/", NULL, MS_REC|MS_PRIVATE, NULL) = 0
+++ exited with 0 +++

# findmnt -o+PROPAGATION
TARGET                   SOURCE           FSTYPE     OPTIONS                                                             PROPAGATION
/                        rootfs           rootfs     rw                                                                  private
|-/proc                  proc             proc       rw,nosuid,nodev,noexec,relatime                                     private
|-/sys                   sysfs            sysfs      rw,nosuid,nodev,noexec,relatime                                     private
| |-/sys/kernel/security securityfs       securityfs rw,nosuid,nodev,noexec,relatime                                     private
| |-/sys/fs/cgroup       cgroup2          cgroup2    rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot     private
| |-/sys/fs/pstore       pstore           pstore     rw,nosuid,nodev,noexec,relatime                                     private
| |-/sys/fs/bpf          bpf              bpf        rw,nosuid,nodev,noexec,relatime,mode=700                            private
| `-/sys/kernel/config   configfs         configfs   rw,nosuid,nodev,noexec,relatime                                     private
|-/dev                   devtmpfs         devtmpfs   rw,nosuid,size=4096k,nr_inodes=246475,mode=755,inode64              private
| |-/dev/shm             tmpfs            tmpfs      rw,nosuid,nodev,inode64                                             private
| `-/dev/pts             devpts           devpts     rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000               private
|-/run                   tmpfs            tmpfs      rw,nosuid,nodev,size=400108k,nr_inodes=819200,mode=755,inode64      private
`-/sysroot               /dev/vda3[/root] btrfs      ro,relatime,discard=async,space_cache=v2,subvolid=257,subvol=/root private

I've tested it with ext4 and btrfs, and the result is the same as expected.
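The PROPAGATION column in the findmnt output above can also be read from /proc/self/mountinfo, where a shared mount carries a "shared:N" tag in the optional fields that precede the " - " separator. The sketch below is one possible way to check that from a script; the function name propagation_of is hypothetical, it assumes awk is available (which may not be true in every initramfs), and it only distinguishes shared from private, ignoring slave and unbindable.

```sh
# Print "shared" or "private" for a mount point, based on mountinfo.
# Usage: propagation_of MOUNTPOINT [MOUNTINFO_FILE]
# (hypothetical helper; reports only the shared/private distinction)
propagation_of() {
    awk -v target="$1" '
        $5 == target {
            state = "private"
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++)
                if ($i ~ /^shared:/) state = "shared"
            result = state      # the last matching entry wins
        }
        END {
            if (result == "") exit 1    # mount point not found
            print result
        }
    ' "${2:-/proc/self/mountinfo}"
}

# Example:
#   propagation_of /
```

The second argument exists so the parser can be tried against a saved copy of mountinfo, for example one captured from the rd.break shell.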
Ah... the strace output without truncation:

open_tree(AT_FDCWD, "/", OPEN_TREE_CLOEXEC) = 3
mount_setattr(-1, NULL, 0, NULL, 0) = -1 EINVAL (Invalid argument)
mount_setattr(3, "", AT_EMPTY_PATH|AT_RECURSIVE, {attr_set=0, attr_clr=0, propagation=MS_PRIVATE, userns_fd=0}, 32) = -1 EINVAL (Invalid argument)

Note that the first mount_setattr(-1, ...) call is just a libmount test to verify that the kernel supports the new mount API.
So the only reason I can currently see for this is that check_mnt() fails. And for that to be the case the caller must be in a different mount namespace than the mount. So when that script runs does it somehow unshare or create a mount namespace?
Ok, figured it out, afaict: https://lore.kernel.org/all/20240206-vfs-mount-rootfs-v1-1-19b335eee133@kernel.org
VFS issue, moving to the kernel.