Bug 1968513 - Booting rhcos 4.8.0-fc.5 on a Dell R740 resulted in a grub failure
Summary: Booting rhcos 4.8.0-fc.5 on a Dell R740 resulted in a grub failure
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Bare Metal Hardware Provisioning
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.9.0
Assignee: Derek Higgins
QA Contact: Amit Ugol
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-07 13:24 UTC by Bob Fournier
Modified: 2021-07-22 15:14 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-22 15:14:11 UTC
Target Upstream Version:
Embargoed:


Attachments
Grub failure console screenshot (173.49 KB, image/png)
2021-06-07 13:24 UTC, Bob Fournier

Description Bob Fournier 2021-06-07 13:24:44 UTC
Created attachment 1789221
Grub failure console screenshot

Description of problem:

We have a cluster of 5 R740s (3 masters and 2 workers) in a Baremetal IPI setup. One of the workers failed to boot with the grub error shown in the attached screenshot.

Version-Release number of selected component (if applicable):

bootstrapOSImage: rhcos-48.84.202105190318-0-qemu.x86_64.qcow2.gz?sha256=84683a75c0e3d164c1d4a95448e142490a0bf91ff07076bff2b3bbc209c6c368#
clusterOSImage: rhcos-48.84.202105190318-0-openstack.x86_64.qcow2.gz?sha256=37a156f9f2b0efded45cb3cd5688aa2d42c26873a534951484e96f546a6b2c84#

How reproducible:

Occurred on 1 of 5 systems. We are retrying the deployment and will update the results here.

Comment 1 Derek Higgins 2021-06-09 10:40:25 UTC
This isn't happening on all reboots, but when it does, it appears as though
input is being sent to the grub menu screen, causing it to enter the grub console.

I can then scroll through this text in the grub console history with the up arrow.

I've rebooted iDRAC to see if it is somehow responsible for sending this text to the grub menu.
I haven't seen the problem occur since the reboot. I'll update here once I'm sure the problem
isn't coming back.

Comment 2 Tomas Sedovic 2021-06-11 11:47:27 UTC
Moving back from RHEL to OCP/Bare Metal/Ironic for now.

We've discovered this issue while investigating https://bugzilla.redhat.com/show_bug.cgi?id=1966129. The workaround we plan to use (https://review.opendev.org/c/openstack/ironic-python-agent/+/795862) might resolve the issue or change the behaviour.

We will take a look again once it's merged and investigate further.

Comment 3 Derek Higgins 2021-07-22 15:14:11 UTC
Closing this; it looks likely to be an iDRAC issue.
We've seen it occur on another Dell R740 (same symptoms in grub), and again restarting iDRAC made the problem go away.
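
For anyone who needs to apply the same workaround remotely, the following is a minimal sketch of how an iDRAC restart could be requested over Redfish. It is not the procedure used in this bug: the manager ID `iDRAC.Embedded.1`, the `Manager.Reset` action path, and the `GracefulRestart` reset type are assumed from Dell's usual Redfish layout, and the function names, host, and credentials are hypothetical. Certificate handling for iDRACs with self-signed certificates is left out.

```python
import base64
import json
import urllib.request

# Assumption: the usual Dell Redfish path for the iDRAC manager reset action.
RESET_PATH = "/redfish/v1/Managers/iDRAC.Embedded.1/Actions/Manager.Reset"


def build_idrac_reset_request(host, user, password):
    """Build (but do not send) a Redfish request asking the iDRAC to restart."""
    body = json.dumps({"ResetType": "GracefulRestart"}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{host}{RESET_PATH}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def restart_idrac(host, user, password, timeout=30):
    """Send the reset request and return the HTTP status code."""
    req = build_idrac_reset_request(host, user, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Calling `restart_idrac("idrac.example.com", "root", "secret")` would send the request; the iDRAC is unreachable for a few minutes while it restarts, and the host operating system is not rebooted by this action.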

