Bug 2007085 - Intermittent failure mounting /run/media/iso when booting live ISO from USB stick
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.10
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Benjamin Gilbert
QA Contact: Michael Nguyen
URL:
Whiteboard:
Duplicates: 1996882
Depends On: 2004596
Blocks: 2007089
Reported: 2021-09-23 05:57 UTC by Benjamin Gilbert
Modified: 2022-03-10 16:13 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2007089
Environment:
Last Closed: 2022-03-10 16:12:52 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github coreos coreos-assembler pull 2458 0 None open buildextend-live: fix `/run/media/iso` mount flake booting from disk 2021-09-23 06:38:39 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:13:13 UTC

Description Benjamin Gilbert 2021-09-23 05:57:11 UTC
When booting the RHCOS live ISO image from a USB stick, boot sometimes fails with the following error:

[    9.410095] systemd[1]: Reached target Initrd Root Device.
[    9.415226] systemd[1]: Mounting /run/media/iso...
[    9.498191] ISOFS: Unable to identify CD-ROM format.
[    9.499552] mount[722]: mount: /run/media/iso: wrong fs type, bad option, bad superblock on /dev/sda2, missing codepage or helper program, or other error.
Failed to mount /run/media/iso.
See 'systemctl status run-media-iso.mount' for details.

35coreos-live/live-generator creates run-media-iso.mount, which mounts the ISO from /dev/disk/by-label/rhcos-<version>.  The latter should point to /dev/sda1, but in this case it's pointing to /dev/sda2, which is efiboot.img.  /dev/sda2 does not have a filesystem label, but because of https://github.com/systemd/systemd/issues/14408, 60-persistent-storage.rules falls back to using the filesystem label from /dev/sda (the whole-disk device) when evaluating symlinks for /dev/sda2.  As a result, /dev/sda1 and /dev/sda2 race to own the /dev/disk/by-label symlink, and if sda2 wins, the mount fails.
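
To see which side of the race won on a given boot, one option is to resolve the by-label symlink and look at the partition it lands on. The following is a minimal sketch, not part of the fix: the function names are made up for illustration, and the partition-number check is a heuristic that assumes sdX/vdX-style device names (a whole-disk node such as nvme0n1 would be misread).

```python
import glob
import os
import re


def resolve_label_links(by_label_dir="/dev/disk/by-label", prefix="rhcos-"):
    """Map each by-label symlink starting with `prefix` to its resolved target."""
    pattern = os.path.join(glob.escape(by_label_dir), glob.escape(prefix) + "*")
    return {link: os.path.realpath(link) for link in sorted(glob.glob(pattern))}


def partition_number(device_path):
    """Return the trailing number of a device node name, or None if there is none.

    Heuristic only: assumes names like sda1 or vda2.
    """
    match = re.search(r"(\d+)$", os.path.basename(device_path))
    return int(match.group(1)) if match else None


def main():
    # The ISO filesystem is expected on partition 1; partition 2 is efiboot.img.
    for link, target in resolve_label_links().items():
        verdict = "ok" if partition_number(target) == 1 else "unexpected"
        print(f"{link} -> {target} ({verdict})")
```

Calling `main()` from an emergency shell after a failed boot would be expected to report partition 2 as the target; on a successful boot it should report partition 1.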

The race occurs on BIOS, UEFI El Torito, and UEFI GPT boots; kola testiso iso-as-disk has reproduced this on both BIOS and UEFI.  It does not affect CD or virtual CD boot because CD-ROM devices are not partitionable.  It does not affect Fedora CoreOS because the udev rules on FCOS include https://github.com/systemd/systemd/pull/14485.

The problem affects RHCOS 4.7 and up, but not the existing 4.6 ISO because the hybrid partition table in that release does not include a partition for efiboot.img.  However, the fix for bug 2004679 (https://github.com/coreos/coreos-assembler/pull/2439) backports the hybrid ESP to 4.6, so currently the next bootimage bump will cause 4.6 to regress.

To verify a fix for this bug, boot with a `udev.log_priority=debug` karg and ensure `journalctl -u systemd-udevd.service` has no mention of a `/dev/disk/by-label/rhcos-*` symlink being applied to `/dev/vda2` or `/dev/sda2`.
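
The verification step above can be scripted roughly as follows. This is a sketch under assumptions: it does not rely on the exact wording of udev's debug messages and simply flags any journal line that mentions both an rhcos by-label path and one of the partition-2 device names. The function names and the default device list are illustrative, not taken from any existing tool.

```python
import re
import subprocess

# Matches both "/dev/disk/by-label/rhcos-..." and "disk/by-label/rhcos-...".
LABEL_RE = re.compile(r"by-label/rhcos-")


def find_suspect_lines(lines, devices=("sda2", "vda2")):
    """Return lines that mention an rhcos by-label path and one of `devices`."""
    dev_re = re.compile(r"\b(?:" + "|".join(map(re.escape, devices)) + r")\b")
    return [
        line.rstrip("\n")
        for line in lines
        if LABEL_RE.search(line) and dev_re.search(line)
    ]


def read_udev_journal():
    """Collect the systemd-udevd journal as a list of lines."""
    result = subprocess.run(
        ["journalctl", "-u", "systemd-udevd.service", "--no-pager"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main():
    # An empty result is the expected outcome on a fixed image.
    suspects = find_suspect_lines(read_udev_journal())
    for line in suspects:
        print(line)
    return 1 if suspects else 0
```

Run `main()` on a system booted with `udev.log_priority=debug`; pass a different `devices` tuple (for example `("sdb2",)`) if the USB stick enumerates under another name.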

Comment 1 Benjamin Gilbert 2021-09-23 05:58:46 UTC
*** Bug 1996882 has been marked as a duplicate of this bug. ***

Comment 2 Benjamin Gilbert 2021-09-24 22:56:57 UTC
Fixed in 410.84.202109241829-0 on x86_64, and other arches are not affected.

Comment 3 RHCOS Bug Bot 2021-09-24 22:57:18 UTC
This bug has been reported fixed in a new RHCOS build.  Do not move this bug to MODIFIED until the fix has landed in a new bootimage.

Comment 4 HuijingHei 2021-10-12 07:29:57 UTC
PreVerified on rhcos 410.84.202110081440-0

Downloaded the live ISO and wrote it to a USB drive with
`dd if=rhcos-410.84.202110081440-0-live.x86_64.iso of=/dev/sdb oflag=sync status=progress`

Then booted USB drive and edited the kernel arguments to include
`udev.log_priority=debug`

`journalctl -u systemd-udevd.service --no-pager | grep rhcos` returned no symlinks pointing to /dev/sdb2 or /dev/vdb2

Comment 5 RHCOS Bug Bot 2021-10-20 17:53:45 UTC
The fix for this bug has landed in a bootimage bump, as tracked in bug 2004596 (now in status MODIFIED).  Moving this bug to MODIFIED.

Comment 7 Michael Nguyen 2021-10-25 14:23:55 UTC
Moving to VERIFIED as the boot image has landed and been verified in https://bugzilla.redhat.com/show_bug.cgi?id=2004596

Comment 10 errata-xmlrpc 2022-03-10 16:12:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

