Bug 1955747

Summary: minimal ISO doesn't boot on Dell PowerEdge R340
Product: OpenShift Container Platform Reporter: Nick Carboni <ncarboni>
Component: assisted-installer Assignee: Nick Carboni <ncarboni>
assisted-installer sub component: assisted-service QA Contact: Yuri Obshansky <yobshans>
Status: CLOSED NOTABUG Docs Contact:
Severity: unspecified    
Priority: high CC: aos-bugs, uschlute
Version: 4.8   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: AI-Team-Core
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-05-07 13:58:55 UTC Type: Bug

Description Nick Carboni 2021-04-30 18:04:26 UTC
Description of problem:
The minimal iso created during a cluster install fails to boot with no visible error printed to the console.

Assisted service was deployed using the community operator from operator hub, and a cluster installation was started using the kube native API.

Version-Release number of selected component (if applicable):
      SERVICE_IMAGE:         quay.io/ocpmetal/assisted-service@sha256:330245cb266d2cc53fa0009fe949b0667cbaef3f72a548735734caf5c553eec6
      DATABASE_IMAGE:        quay.io/ocpmetal/postgresql-12-centos7@sha256:94727d70e0afbf4e167e078744f3a10ac9d82edc553d57b0ecbb5443264f07e1
      AGENT_IMAGE:           quay.io/ocpmetal/assisted-installer-agent@sha256:d5da56e329a87d8729c80fa9a7ddd221c10a9fddf39df6e909bb99dfee79fa2c
      CONTROLLER_IMAGE:      quay.io/ocpmetal/assisted-installer-controller@sha256:11094031bf79dce9d21ed6efed12af4dab9f7002bdac14b68283a29811048756
      INSTALLER_IMAGE:       quay.io/ocpmetal/assisted-installer@sha256:badacb6c9764c3d79cb4067941128601954edeade8f86e65c33c68bf5c7d4d18
      OPENSHIFT_VERSIONS:    {"4.6":{"display_name":"4.6.16","release_image":"quay.io/openshift-release-dev/ocp-release:4.6.16-x86_64","rhcos_image":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-4.6.8-x86_64-live.x86_64.iso","rhcos_version":"46.82.202012051820-0","support_level":"production"},"4.7":{"display_name":"4.7.5","release_image":"quay.io/openshift-release-dev/ocp-release:4.7.5-x86_64","rhcos_image":"https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.0/rhcos-4.7.0-x86_64-live.x86_64.iso","rhcos_version":"47.83.202102090044-0","support_level":"production"}}
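For readability, the OPENSHIFT_VERSIONS value above can be reduced to the RHCOS live ISO that each configured version points at. The following is only a sketch: the JSON is copied from this report and trimmed to three fields, and the helper name `summarize_versions` is made up for illustration. It is not part of assisted-service.

```python
import json

# OPENSHIFT_VERSIONS as reported above, trimmed to the fields used here.
OPENSHIFT_VERSIONS = """
{"4.6": {"display_name": "4.6.16",
         "rhcos_image": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.6/4.6.8/rhcos-4.6.8-x86_64-live.x86_64.iso",
         "rhcos_version": "46.82.202012051820-0"},
 "4.7": {"display_name": "4.7.5",
         "rhcos_image": "https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.0/rhcos-4.7.0-x86_64-live.x86_64.iso",
         "rhcos_version": "47.83.202102090044-0"}}
"""


def summarize_versions(raw):
    """Hypothetical helper: one (key, display_name, rhcos_version, iso_file) tuple per entry."""
    versions = json.loads(raw)
    rows = []
    # Keys are sorted as plain strings, which is sufficient for "4.6" and "4.7".
    for minor, info in sorted(versions.items()):
        iso_file = info["rhcos_image"].rsplit("/", 1)[-1]
        rows.append((minor, info["display_name"], info["rhcos_version"], iso_file))
    return rows


for row in summarize_versions(OPENSHIFT_VERSIONS):
    print(*row)
```

Note that the configuration lists only 4.6 and 4.7 entries, while the Version field of this bug is 4.8.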


How reproducible:
100% on this specific hardware. Not reproducible using a UEFI VM.

Steps to Reproduce:
1. Deploy assisted service
2. Create clusterdeployment, infraenv
3. Download the minimal ISO
4. Boot the ISO

Actual results:
Boot from CD fails and host proceeds to boot from a separate bootable drive.

Expected results:
Boot into live iso.

Additional info:

Hardware info:
"PowerEdge R340 , 64 GB memory , 223 x 2 disks . IDRAC version Firmware version 4.40.00.00 and I’m mounting the iso through the Virtual Media feature as remote file share as CD"

Comment 2 Ulrich Schlueter 2021-05-06 10:23:45 UTC
Problem also occurs with one ironic provided ISO, but things work properly when changing the ISO_IMAGE_TYPE to full-iso in the assisted-service.

Comment 3 Nick Carboni 2021-05-06 13:00:12 UTC
(In reply to Ulrich Schlueter from comment #2)
> Problem also occurs with one ironic provided ISO, but things work properly
> when changing the ISO_IMAGE_TYPE to full-iso in the assisted-service.

Can you give some more details about the "ironic provided ISO"?
How was it created? Is that something we can get access to?

Comment 4 Ulrich Schlueter 2021-05-06 15:02:38 UTC
Looks like this was server related after all. The server showed additional (corrupted) entries in the UEFI boot sequence that most likely confused the boot process (trying to boot from a non-existing virtual CD, not the one shown in the UI). Removing these entries helped with the ironic case.

Next step: retesting with the minimal ISO.
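For anyone hitting similar symptoms, a boot-entry listing can be checked for the kind of stale or duplicated entries described above. The sketch below assumes text in the style printed by `efibootmgr` on a Linux host. The thread does not say how the entries on this server were inspected or what they looked like, so the sample listing, its labels, and the helper name `audit_boot_entries` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical listing in the style of efibootmgr output; not taken from the affected server.
SAMPLE = """\
BootCurrent: 0003
BootOrder: 0001,0004,0003,0007
Boot0001* Virtual Optical Drive
Boot0003* Integrated RAID Controller
Boot0004* Virtual Optical Drive
"""

# Matches lines such as "Boot0001* Some Label"; BootCurrent/BootOrder do not match
# because the four characters after "Boot" must be hex digits.
ENTRY_RE = re.compile(r"^Boot([0-9A-Fa-f]{4})(\*?)\s+(.*)$")


def audit_boot_entries(text):
    """Return (missing, duplicates).

    missing:    entry numbers listed in BootOrder that have no Boot#### line.
    duplicates: labels that appear under more than one entry number.
    """
    order = []
    labels = {}
    for line in text.splitlines():
        if line.startswith("BootOrder:"):
            raw = line.split(":", 1)[1]
            order = [n.strip().upper() for n in raw.split(",") if n.strip()]
            continue
        match = ENTRY_RE.match(line)
        if match:
            # In verbose listings a tab separates the label from the device path.
            labels[match.group(1).upper()] = match.group(3).split("\t")[0].strip()

    missing = [n for n in order if n not in labels]

    by_label = defaultdict(list)
    for number, label in sorted(labels.items()):
        by_label[label].append(number)
    duplicates = {label: nums for label, nums in by_label.items() if len(nums) > 1}
    return missing, duplicates


missing, duplicates = audit_boot_entries(SAMPLE)
print("In BootOrder but undefined:", missing)
print("Labels used by several entries:", duplicates)
```

A duplicated virtual CD label or an undefined BootOrder number is only a hint that something is off. Whether an entry actually points at a non-existing device still has to be confirmed in the firmware setup or the BMC.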

Comment 6 Ulrich Schlueter 2021-05-06 15:30:01 UTC
Re-testing with the minimal ISO worked. 

So the corruption in the boot sequence seems to have caused the observed symptoms. Two things remain unexplained: 1) why the boot process worked with the full ISO, and 2) what created the rogue entries.

Anyway, from my perspective this BZ can be closed.

Comment 7 Nick Carboni 2021-05-07 13:58:55 UTC
Closing this as it was due to UEFI configuration issues on the host.