
Bug 2007085

Summary: Intermittent failure mounting /run/media/iso when booting live ISO from USB stick
Product: OpenShift Container Platform
Reporter: Benjamin Gilbert <bgilbert>
Component: RHCOS
Assignee: Benjamin Gilbert <bgilbert>
Status: CLOSED ERRATA
QA Contact: Michael Nguyen <mnguyen>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.10
CC: dornelas, hhei, jligon, miabbott, mrussell, nstielau, sasha
Target Milestone: ---
Target Release: 4.10.0
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2007089
Environment:
Last Closed: 2022-03-10 16:12:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2004596
Bug Blocks: 2007089

Description Benjamin Gilbert 2021-09-23 05:57:11 UTC
When booting the RHCOS live ISO image from a USB stick, boot sometimes fails with the following error:

[    9.410095] systemd[1]: Reached target Initrd Root Device.
[    9.415226] systemd[1]: Mounting /run/media/iso...
[    9.498191] ISOFS: Unable to identify CD-ROM format.
[    9.499552] mount[722]: mount: /run/media/iso: wrong fs type, bad option, bad superblock on /dev/sda2, missing codepage or helper program, or other error.
Failed to mount /run/media/iso.
See 'systemctl status run-media-iso.mount' for details.

35coreos-live/live-generator creates run-media-iso.mount, which mounts the ISO from /dev/disk/by-label/rhcos-<version>.  The latter should point to /dev/sda1, but in this case it's pointing to /dev/sda2, which is efiboot.img.  /dev/sda2 does not have a filesystem label, but because of https://github.com/systemd/systemd/issues/14408, 60-persistent-storage.rules falls back to using the filesystem label from /dev/sda (the whole-disk device) when evaluating symlinks for /dev/sda2.  As a result, /dev/sda1 and /dev/sda2 race to own the /dev/disk/by-label symlink, and if sda2 wins, the mount fails.
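
The race described above can be illustrated with a toy model. This is only a sketch, not udev's actual logic: the device table, the `inherit_from_parent` switch, and both function names are hypothetical, and only partitions are considered as claimants, matching the two contenders named in this report.

```python
from typing import Dict, List, Optional

# Hypothetical device table for the layout in this report: the whole-disk
# device and the ISO partition carry the ISO label, while the partition
# holding efiboot.img has no filesystem label of its own.
DEVICES: Dict[str, dict] = {
    "sda": {"parent": None, "label": "rhcos-<version>"},
    "sda1": {"parent": "sda", "label": "rhcos-<version>"},
    "sda2": {"parent": "sda", "label": None},
}


def effective_label(name: str, devices: Dict[str, dict],
                    inherit_from_parent: bool) -> Optional[str]:
    """Return the label used when evaluating symlinks for `name`.

    With inherit_from_parent=True, an unlabeled partition falls back to the
    label of its whole-disk parent, which models the buggy behavior.
    """
    dev = devices[name]
    if dev["label"] is None and inherit_from_parent and dev["parent"]:
        return devices[dev["parent"]]["label"]
    return dev["label"]


def symlink_claimants(devices: Dict[str, dict], label: str,
                      inherit_from_parent: bool) -> List[str]:
    """List the partitions that would claim /dev/disk/by-label/<label>."""
    return sorted(
        name for name, dev in devices.items()
        if dev["parent"] is not None
        and effective_label(name, devices, inherit_from_parent) == label
    )
```

With the fallback enabled, both `sda1` and `sda2` claim the symlink, so whichever is processed last owns it. With the fallback disabled, only `sda1` claims it.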

The race occurs on BIOS, UEFI El Torito, and UEFI GPT boots; kola testiso iso-as-disk has reproduced this on both BIOS and UEFI.  It does not affect CD or virtual CD boot because CD-ROM devices are not partitionable.  It does not affect Fedora CoreOS because the udev rules on FCOS include https://github.com/systemd/systemd/pull/14485.

The problem affects RHCOS 4.7 and up, but not the existing 4.6 ISO because the hybrid partition table in that release does not include a partition for efiboot.img.  However, the fix for bug 2004679 (https://github.com/coreos/coreos-assembler/pull/2439) backports the hybrid ESP to 4.6, so currently the next bootimage bump will cause 4.6 to regress.

To verify a fix for this bug, boot with a `udev.log_priority=debug` karg and ensure `journalctl -u systemd-udevd.service` has no mention of a `/dev/disk/by-label/rhcos-*` symlink being applied to `/dev/vda2` or `/dev/sda2`.
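
The check above could be automated roughly as follows. This is a minimal sketch: it assumes the journal output has already been captured as text, makes no assumption about the exact wording of udev's debug messages, and simply flags any line that mentions both a `/dev/disk/by-label/rhcos-` path and a second-partition device name such as `sda2` or `vda2`. The function name and the regular expressions are illustrative choices, not part of any tool.

```python
import re
from typing import List

# Any by-label symlink whose label starts with "rhcos-".
SYMLINK_RE = re.compile(r"/dev/disk/by-label/rhcos-")
# Second partition of an sdX or vdX disk (sda2, sdb2, vda2, ...).
SECOND_PARTITION_RE = re.compile(r"\b[sv]d[a-z]+2\b")


def suspicious_lines(journal_text: str) -> List[str]:
    """Return journal lines mentioning both an rhcos by-label symlink
    and a second-partition device. An empty list means the check passed."""
    return [
        line for line in journal_text.splitlines()
        if SYMLINK_RE.search(line) and SECOND_PARTITION_RE.search(line)
    ]
```

One way to use it is to save the output of `journalctl -u systemd-udevd.service --no-pager` to a file, read that file into a string, and confirm that `suspicious_lines` returns an empty list.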

Comment 1 Benjamin Gilbert 2021-09-23 05:58:46 UTC
*** Bug 1996882 has been marked as a duplicate of this bug. ***

Comment 2 Benjamin Gilbert 2021-09-24 22:56:57 UTC
Fixed in 410.84.202109241829-0 on x86_64; other architectures are not affected.

Comment 3 RHCOS Bug Bot 2021-09-24 22:57:18 UTC
This bug has been reported fixed in a new RHCOS build.  Do not move this bug to MODIFIED until the fix has landed in a new bootimage.

Comment 4 HuijingHei 2021-10-12 07:29:57 UTC
PreVerified on rhcos 410.84.202110081440-0

Downloaded the live ISO and wrote it to a USB drive with
`dd if=rhcos-410.84.202110081440-0-live.x86_64.iso of=/dev/sdb oflag=sync status=progress`

Then booted from the USB drive and edited the kernel arguments to include
`udev.log_priority=debug`

`journalctl -u systemd-udevd.service --no-pager | grep rhcos` returned no symlinks pointing to /dev/sdb2 or /dev/vdb2

Comment 5 RHCOS Bug Bot 2021-10-20 17:53:45 UTC
The fix for this bug has landed in a bootimage bump, as tracked in bug 2004596 (now in status MODIFIED).  Moving this bug to MODIFIED.

Comment 7 Michael Nguyen 2021-10-25 14:23:55 UTC
Moving to verified, as the boot image has landed and been verified in https://bugzilla.redhat.com/show_bug.cgi?id=2004596

Comment 10 errata-xmlrpc 2022-03-10 16:12:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056