Bug 1909455
Summary: | Boot disk RAID will not boot if the primary disk enumerates but fails I/O | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Benjamin Gilbert <bgilbert> |
Component: | RHCOS | Assignee: | Benjamin Gilbert <bgilbert> |
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.7 | CC: | bbreard, imcleod, jligon, nstielau |
Target Milestone: | --- | ||
Target Release: | 4.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | No Doc Update |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-02-24 15:47:16 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1915617 |
Description
Benjamin Gilbert
2020-12-20 05:07:30 UTC
Unable to simulate disk I/O error so just verified RAID /boot and grub.cfg file contains the correct bits.

    [core@cosa-devsh ~]$ rpm-ostree status
    State: idle
    Deployments:
    * ostree://8e87a86b9444784ab29e7917fa82e00d5e356f18b19449946b687ee8dc27c51a
                       Version: 47.83.202101161239-0 (2021-01-16T12:43:01Z)
    [core@cosa-devsh ~]$ lsblk -f
    NAME      FSTYPE            LABEL       UUID                                 MOUNTPOINT
    sr0
    vda
    |-vda1
    |-vda2    vfat              esp-1       925B-A4E7
    |-vda3    linux_raid_member any:md-boot 719af5c2-ad77-c76d-5bf7-386f2615494c
    | `-md127 ext4              boot        7b8a382d-3039-4910-bc03-82b2775c2a64 /boot
    `-vda4    linux_raid_member any:md-root fc5fb428-9c1c-15ee-c51e-45258bc646fe
      `-md126 xfs               root        0f752e48-64b5-4db0-907a-e736e1d2313e /sysroot
    vdb
    |-vdb1
    |-vdb2    vfat              esp-2       925B-FC96
    |-vdb3    linux_raid_member any:md-boot 719af5c2-ad77-c76d-5bf7-386f2615494c
    | `-md127 ext4              boot        7b8a382d-3039-4910-bc03-82b2775c2a64 /boot
    `-vdb4    linux_raid_member any:md-root fc5fb428-9c1c-15ee-c51e-45258bc646fe
      `-md126 xfs               root        0f752e48-64b5-4db0-907a-e736e1d2313e /sysroot
    vdc
    |-vdc1
    |-vdc2    vfat              EFI-SYSTEM  F811-ED3D
    |-vdc3    ext4              boot        07ca1891-f27a-421d-a2f9-70326ca46858
    `-vdc4    xfs               root        910678ff-f77e-4a7d-8d53-86f2ac47a823
    [core@cosa-devsh ~]$ cat /boot/grub2/grub.cfg
    set pager=1
    # petitboot doesn't support -e and doesn't support an empty path part
    if [ -d (md/md-boot)/grub2 ]; then
      # fcct currently creates /boot RAID with superblock 1.0, which allows
      # component partitions to be read directly as filesystems. This is
      # necessary because transposefs doesn't yet rerun grub2-install on BIOS,
      # so GRUB still expects /boot to be a partition on the first disk.
      #
      # There are two consequences:
      # 1. On BIOS and UEFI, the search command might pick an individual RAID
      #    component, but we want it to use the full RAID in case there are bad
      #    sectors etc. The undocumented --hint option is supposed to support
      #    this sort of override, but it doesn't seem to work, so we set $boot
      #    directly.
      # 2. On BIOS, the "normal" module has already been loaded from an
      #    individual RAID component, and $prefix still points there. We want
      #    future module loads to come from the RAID, so we reset $prefix.
      #    (On UEFI, the stub grub.cfg has already set $prefix properly.)
      set boot=md/md-boot
      set prefix=($boot)/grub2
    else
      search --label boot --set boot
    fi
    set root=$boot

    if [ -f ${config_directory}/grubenv ]; then
      load_env -f ${config_directory}/grubenv
    elif [ -s $prefix/grubenv ]; then
      load_env
    fi

    if [ x"${feature_menuentry_id}" = xy ]; then
      menuentry_id_option="--id"
    else
      menuentry_id_option=""
    fi

    function load_video {
      if [ x$feature_all_video_module = xy ]; then
        insmod all_video
      else
        insmod efi_gop
        insmod efi_uga
        insmod ieee1275_fb
        insmod vbe
        insmod vga
        insmod video_bochs
        insmod video_cirrus
      fi
    }

    serial --speed=115200
    terminal_input serial console
    terminal_output serial console
    if [ x$feature_timeout_style = xy ] ; then
      set timeout_style=menu
      set timeout=1
    # Fallback normal timeout code in case the timeout_style feature is
    # unavailable.
    else
      set timeout=1
    fi

    # Determine if this is a first boot and set the ${ignition_firstboot} variable
    # which is used in the kernel command line.
    set ignition_firstboot=""
    if [ -f "/ignition.firstboot" ]; then
        # Default networking parameters to be used with ignition.
        set ignition_network_kcmdline=''

        # Source in the `ignition.firstboot` file which could override the
        # above $ignition_network_kcmdline with static networking config.
        # This override feature is also by coreos-installer to persist static
        # networking config provided during install to the first boot of the machine.
        source "/ignition.firstboot"

        set ignition_firstboot="ignition.firstboot ${ignition_network_kcmdline}"
    fi

    blscfg
    [core@cosa-devsh ~]$ rpm-ostree status
    State: idle
    Deployments:
    * ostree://8e87a86b9444784ab29e7917fa82e00d5e356f18b19449946b687ee8dc27c51a
                       Version: 47.83.202101161239-0 (2021-01-16T12:43:01Z)
    [core@cosa-devsh ~]$

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
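The manual verification in the comment above (checking that /boot sits on an md device and that grub.cfg takes the `md/md-boot` branch) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `check_grub_cfg` and `is_md_device` are hypothetical and not part of RHCOS, and the strings it greps for are taken from the grub.cfg shown in this report, so they may not match other releases. Like the manual check, it does not simulate a disk I/O failure.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not shipped with RHCOS.

# check_grub_cfg FILE
# Succeeds if FILE contains the RAID branch seen in this report: the test
# for (md/md-boot)/grub2 and the lines that pin $boot and $prefix to the array.
check_grub_cfg() {
    # -F treats each pattern as a fixed string, so "(", "$" and "[" are literal.
    grep -qF '(md/md-boot)/grub2' "$1" || return 1
    grep -qF 'set boot=md/md-boot' "$1" || return 1
    grep -qF 'set prefix=($boot)/grub2' "$1" || return 1
    return 0
}

# is_md_device DEVICE
# Succeeds if DEVICE looks like an assembled md array node, e.g. /dev/md127.
is_md_device() {
    case "$1" in
        /dev/md*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running node, one possible invocation (assuming `findmnt` is available and reports the md node as the source of /boot) would be `check_grub_cfg /boot/grub2/grub.cfg && is_md_device "$(findmnt -n -o SOURCE /boot)" && echo "boot RAID config looks OK"`.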