Bug 1909455 - Boot disk RAID will not boot if the primary disk enumerates but fails I/O
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Benjamin Gilbert
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks: 1915617
Reported: 2020-12-20 05:07 UTC by Benjamin Gilbert
Modified: 2021-02-24 15:47 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:47:16 UTC
Target Upstream Version:
Embargoed:


Links
System                 | ID                                | Private | Priority | Status | Summary                                 | Last Updated
Github                 | coreos coreos-assembler pull 1979 | 0       | None     | closed | grub: read from md/md-boot if it exists | 2021-01-25 15:22:21 UTC
Red Hat Product Errata | RHSA-2020:5633                    | 0       | None     | None   | None                                    | 2021-02-24 15:47:59 UTC

Description Benjamin Gilbert 2020-12-20 05:07:30 UTC
The GRUB configuration for boot disk RAID assumes that if the primary disk fails, it'll drop off the bus entirely.  If the disk enumerates but fails I/O, GRUB will fail when reading data from the first disk.

We need to reconfigure GRUB to treat /boot as a RAID, rather than reading directly from the first replica.  We can entirely fix this on UEFI and reduce the exposure window on BIOS.  Fixing it entirely on BIOS will require bootupd to support reinstalling BIOS GRUB, and is out of scope for this bug.

Comment 2 Michael Nguyen 2021-01-25 17:50:29 UTC
I was unable to simulate a disk I/O error, so I only verified that /boot is on RAID and that grub.cfg contains the expected changes.

[core@cosa-devsh ~]$ rpm-ostree status
State: idle
Deployments:
* ostree://8e87a86b9444784ab29e7917fa82e00d5e356f18b19449946b687ee8dc27c51a
                   Version: 47.83.202101161239-0 (2021-01-16T12:43:01Z)

[core@cosa-devsh ~]$ lsblk -f
NAME      FSTYPE            LABEL       UUID                                 MOUNTPOINT
sr0                                                                          
vda                                                                          
|-vda1                                                                       
|-vda2    vfat              esp-1       925B-A4E7                            
|-vda3    linux_raid_member any:md-boot 719af5c2-ad77-c76d-5bf7-386f2615494c 
| `-md127 ext4              boot        7b8a382d-3039-4910-bc03-82b2775c2a64 /boot
`-vda4    linux_raid_member any:md-root fc5fb428-9c1c-15ee-c51e-45258bc646fe 
  `-md126 xfs               root        0f752e48-64b5-4db0-907a-e736e1d2313e /sysroot
vdb                                                                          
|-vdb1                                                                       
|-vdb2    vfat              esp-2       925B-FC96                            
|-vdb3    linux_raid_member any:md-boot 719af5c2-ad77-c76d-5bf7-386f2615494c 
| `-md127 ext4              boot        7b8a382d-3039-4910-bc03-82b2775c2a64 /boot
`-vdb4    linux_raid_member any:md-root fc5fb428-9c1c-15ee-c51e-45258bc646fe 
  `-md126 xfs               root        0f752e48-64b5-4db0-907a-e736e1d2313e /sysroot
vdc                                                                          
|-vdc1                                                                       
|-vdc2    vfat              EFI-SYSTEM  F811-ED3D                            
|-vdc3    ext4              boot        07ca1891-f27a-421d-a2f9-70326ca46858 
`-vdc4    xfs               root        910678ff-f77e-4a7d-8d53-86f2ac47a823 

[core@cosa-devsh ~]$ cat /boot/grub2/grub.cfg 
set pager=1
# petitboot doesn't support -e and doesn't support an empty path part
if [ -d (md/md-boot)/grub2 ]; then
  # fcct currently creates /boot RAID with superblock 1.0, which allows
  # component partitions to be read directly as filesystems.  This is
  # necessary because transposefs doesn't yet rerun grub2-install on BIOS,
  # so GRUB still expects /boot to be a partition on the first disk.
  #
  # There are two consequences:
  # 1. On BIOS and UEFI, the search command might pick an individual RAID
  #    component, but we want it to use the full RAID in case there are bad
  #    sectors etc.  The undocumented --hint option is supposed to support
  #    this sort of override, but it doesn't seem to work, so we set $boot
  #    directly.
  # 2. On BIOS, the "normal" module has already been loaded from an
  #    individual RAID component, and $prefix still points there.  We want
  #    future module loads to come from the RAID, so we reset $prefix.
  #    (On UEFI, the stub grub.cfg has already set $prefix properly.)
  set boot=md/md-boot
  set prefix=($boot)/grub2
else
  search --label boot --set boot
fi
set root=$boot

if [ -f ${config_directory}/grubenv ]; then
  load_env -f ${config_directory}/grubenv
elif [ -s $prefix/grubenv ]; then
  load_env
fi

if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi

function load_video {
  if [ x$feature_all_video_module = xy ]; then
    insmod all_video
  else
    insmod efi_gop
    insmod efi_uga
    insmod ieee1275_fb
    insmod vbe
    insmod vga
    insmod video_bochs
    insmod video_cirrus
  fi
}

serial --speed=115200
terminal_input serial console
terminal_output serial console
if [ x$feature_timeout_style = xy ] ; then
  set timeout_style=menu
  set timeout=1
# Fallback normal timeout code in case the timeout_style feature is
# unavailable.
else
  set timeout=1
fi

# Determine if this is a first boot and set the ${ignition_firstboot} variable
# which is used in the kernel command line.
set ignition_firstboot=""
if [ -f "/ignition.firstboot" ]; then
    # Default networking parameters to be used with ignition.
    set ignition_network_kcmdline=''

    # Source in the `ignition.firstboot` file which could override the
    # above $ignition_network_kcmdline with static networking config.
    # This override feature is also by coreos-installer to persist static
    # networking config provided during install to the first boot of the machine.
    source "/ignition.firstboot"

    set ignition_firstboot="ignition.firstboot ${ignition_network_kcmdline}"
fi

blscfg
[core@cosa-devsh ~]$ rpm-ostree status
State: idle
Deployments:
* ostree://8e87a86b9444784ab29e7917fa82e00d5e356f18b19449946b687ee8dc27c51a
                   Version: 47.83.202101161239-0 (2021-01-16T12:43:01Z)
[core@cosa-devsh ~]$
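
The manual check above could be scripted roughly as follows. This is only a sketch: the function names grub_cfg_uses_raid and is_md_device are made up for illustration and are not part of any RHCOS tooling, and the patterns assume the grub.cfg layout shown in this comment. Like the manual check, it only inspects configuration and does not exercise the failing-I/O case.

```shell
#!/bin/sh
# Sketch: confirm that grub.cfg prefers the assembled /boot RAID.
# Function names are hypothetical; patterns assume the grub.cfg shown above.

# Succeeds if the given grub.cfg points both $boot and $prefix at md/md-boot.
grub_cfg_uses_raid() {
    cfg=$1
    grep -Eq '^[[:space:]]*set boot=md/md-boot[[:space:]]*$' "$cfg" &&
        grep -Fq 'set prefix=($boot)/grub2' "$cfg"
}

# Succeeds if the given device path looks like an md array,
# e.g. /dev/md127 or /dev/md/md-boot.
is_md_device() {
    case $1 in
        /dev/md[0-9]*|/dev/md/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a running node, one possible invocation is:
grub_cfg_uses_raid /boot/grub2/grub.cfg && is_md_device "$(findmnt -n -o SOURCE /boot)" && echo OK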

Comment 5 errata-xmlrpc 2021-02-24 15:47:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

