Bug 2021650

Summary: Installation failed - worker is not able to boot
Product: OpenShift Container Platform
Reporter: Pierre Blanc <pblanc>
Component: Bare Metal Hardware Provisioning
Sub component: ironic
Assignee: Tomas Sedovic <tsedovic>
QA Contact: Amit Ugol <augol>
Status: CLOSED WORKSFORME
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: bfournie, fdaencar
Version: 4.8   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-15 10:50:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Openshift install log none

Description Pierre Blanc 2021-11-09 20:32:39 UTC
Created attachment 1840942
Openshift install log

Version:
4.8.19

Platform:
OpenShift 4.8.19 with DCI on Intel HW.

What happened?
When we deploy OCP version 4.8.19, the worker node fails to boot and stays on a black screen.
With version 4.8.18, we do not have this issue.

This deployment uses the same hardware as this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=2007040
We fixed bug 2007040 by rebuilding the image.

Comment 2 Pierre Blanc 2021-11-09 20:35:04 UTC
Comment on attachment 1840942
Openshift install log

You can ignore this empty file.

Comment 3 Dmitry Tantsur 2021-11-10 10:58:12 UTC
Please provide a complete must-gather; there is nothing we can deduce from very high-level installer messages alone.

When you say "can't boot", do you mean that it 1) fails to PXE boot to load the service ramdisk, 2) fails to select the correct boot device when loading the instance, 3) crashes when booting CoreOS, or 4) something else?

I'm not sure I fully understand your statement about bug 2007040. Has it affected you? Are you sure it's not the same issue?

Comment 5 Pierre Blanc 2021-11-10 15:21:03 UTC
The node is able to get the image over PXE and the image is installed, but CoreOS crashes during boot, just after the GRUB selection.

Yes, bug 2007040 affected me; that is why I rebuilt the CoreOS image with the new shim.

Comment 9 Pierre Blanc 2021-11-12 14:53:30 UTC
We tried deploying 4.8.20 and 4.9, and both work perfectly.

So we had the issue only on 4.8.19.

Comment 10 Dmitry Tantsur 2021-11-15 10:50:31 UTC
> We tried deploying 4.8.20 and 4.9, and both work perfectly.

Thank you for testing. I assume we can now close this bug, since the latest versions are fixed.

Comment 11 Pierre Blanc 2021-11-16 15:56:38 UTC
Yes, it is fixed in the latest version.

Thank you for your help.