Bug 1960696

Summary: Scaling a remote worker - loading the boot image takes a long time (more than 7 minutes) and it appears as the boot is stuck.
Product: OpenShift Container Platform Reporter: Alexander Chuzhoy <sasha>
Component: Documentation Assignee: Tony Mulqueen <tmulquee>
Status: CLOSED CURRENTRELEASE QA Contact: Polina Rabinovich <prabinov>
Severity: high Docs Contact: Tomas 'Sheldon' Radej <tradej>
Priority: low    
Version: 4.8 CC: aos-bugs, jokerman, ohochman, pablo.iranzo, prabinov, rpittau, sdasu, tmulquee, tradej
Target Milestone: --- Keywords: Triaged, UserExperience
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-26 09:11:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Bug Depends On:    
Bug Blocks: 1831748    
Attachments:
Description: This is what the user gets for that long period of time during boot.
Flags: none

Description Alexander Chuzhoy 2021-05-14 15:42:36 UTC
Version:
4.8.0-0.nightly-2021-05-13-002125

used hardware
Dell PowerEdge R640
Bios version 1.6.13
idrac version: 4.40.10.00
iDRAC 9


Scenario: scaling a worker in remote subnet.

Observation: loading the boot image takes a long time (7 minutes) and it appears as if the boot is stuck. The image is attached.
Bad user experience.

Comment 1 Alexander Chuzhoy 2021-05-14 16:52:16 UTC
Created attachment 1783261 [details]
This is what the user gets for that long period of time during boot.

Comment 2 Alexander Chuzhoy 2021-05-14 20:33:13 UTC
The same is expected with SNO.

Comment 3 Omri Hochman 2021-05-14 20:36:56 UTC
When we say "remote", we mean a setup with some latency. Can we check the latency to the "remote setup"?
Also, I would like to know how that impacts SNO deployment.

(I think most real-life customer scenarios could be considered remote.)
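
To put a number on the latency question above, a rough measurement could look like the sketch below. It times TCP handshakes from the provisioning host to a reachable port on the remote side (for example the node's BMC). The function name, the sample count, and the use of TCP connect time as a latency proxy are assumptions for illustration; none of them come from this bug.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=3.0):
    """Return the median TCP connect time to host:port in milliseconds.

    Hypothetical helper: connect time is only a rough proxy for
    round-trip latency, but it needs no special privileges.
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns only after the TCP handshake completes.
        with socket.create_connection((host, port), timeout=timeout):
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Usage would be along the lines of `tcp_connect_latency_ms("bmc.example.com", 443)`, where the host and port are placeholders for the remote worker's BMC address.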

Comment 5 Tomas Sedovic 2021-05-17 08:20:47 UTC
Does the boot succeed eventually (despite taking a long time), or is it completely stuck?

Comment 6 Dmitry Tantsur 2021-05-17 10:28:56 UTC
I assume you're using virtual media? Do you have other hardware vendors/models to try?

Comment 7 Alexander Chuzhoy 2021-05-17 13:48:29 UTC
The boot succeeds after 7 minutes. We want to ensure that users don't reboot, assuming it's stuck.
Using virtual media.

Comment 8 sdasu 2021-05-18 16:32:44 UTC
1. Boot messages are controlled by the hardware vendor; they are not something we have control over.
2. Try downgrading the firmware to a version that is listed as supported in the OpenShift documentation. [https://docs.openshift.com/container-platform/4.7/installing/installing_bare_metal_ipi/ipi-install-prerequisites.html#ipi-install-firmware-requirements-for-installing-with-virtual-media_ipi-install-prerequisites]
3. The documentation needs an update to say that boot times can be expected to be long, with no indication of progress.

Comment 11 Tony Mulqueen 2021-08-24 11:55:16 UTC
Made the changes in the following PR: https://github.com/openshift/openshift-docs/pull/35750

Comment 13 Tony Mulqueen 2021-09-29 13:32:56 UTC
Made the changes in the following PR: https://bugzilla.redhat.com/show_bug.cgi?id=1960696

Comment 18 Polina Rabinovich 2021-10-03 09:13:09 UTC
Verified - The warning was added with the needed message.
https://github.com/openshift/openshift-docs/pull/35794#issuecomment-905664647

Comment 19 Tony Mulqueen 2021-10-08 12:24:16 UTC
I had some discussion with a UK colleague and he suggested more explicit feedback.
I have staged, but not committed the following text:

Boot times can take up to 10 minutes without any indication of progress. Do not assume the boot is stuck, or abort or reboot during this installation, as this could require troubleshooting on the next installation.

The current wording is at https://deploy-preview-37233--osdocs.netlify.app/openshift-enterprise/latest/installing/installing_bare_metal_ipi/ipi-install-prerequisites?utm_source=github&utm_campaign=bot_dp#ipi-install-firmware-requirements-for-installing-with-virtual-media_ipi-install-prerequisites 
@ale
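
For automation around this step, the "do not assume the boot is stuck" guidance above could be expressed as a bounded wait rather than a manual judgment call. This is a minimal sketch: `wait_for_boot` and the `is_ready` callback are hypothetical names, and how readiness is actually detected (SSH reachability, node status, and so on) is left to the caller. The 600-second default mirrors the 10-minute figure in the staged wording.

```python
import time


def wait_for_boot(is_ready, timeout_s=600.0, interval_s=15.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns the elapsed seconds on success, or None on timeout.
    The sleep and clock parameters exist so the loop can be tested
    without real waiting.
    """
    start = clock()
    while True:
        if is_ready():
            return clock() - start
        # Give up only after the full window has passed, not earlier.
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

A caller would treat `None` as the point where troubleshooting starts, and anything before that as a boot still in progress.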

Comment 20 Alexander Chuzhoy 2021-10-08 13:16:58 UTC
(In reply to Tony Mulqueen from comment #19)
> I had some discussion with a UK colleague and he suggested more explicit
> feedback.
> I have staged, but not committed the following text:
> 
> Boot times can take up to 10 minutes without any indication of progress. Do
> not assume the boot is stuck, or abort or reboot during this installation,
> as this could require troubleshooting on the next installation.
> 
> The current wording is at
> https://deploy-preview-37233--osdocs.netlify.app/openshift-enterprise/latest/
> installing/installing_bare_metal_ipi/ipi-install-
> prerequisites?utm_source=github&utm_campaign=bot_dp#ipi-install-firmware-
> requirements-for-installing-with-virtual-media_ipi-install-prerequisites 
> @ale

The new wording is better IMHO. This way the person(s) deploying won't wait for hours/days before starting to debug.
Thanks!

Comment 24 Polina Rabinovich 2021-10-20 19:55:03 UTC
Can you provide a link to the new PR?

Comment 25 Tony Mulqueen 2021-10-21 09:26:35 UTC
@Polina: here is the link to the PR: https://github.com/openshift/openshift-docs/pull/37233

Sheldon moved this back to ON_QA because we made some further changes after VERIFIED was approved by QE. The changes were made to improve the text, as agreed by Alexander, but we need them VERIFIED once more.

If the changes meet with your approval, please move the bug back to VERIFIED and I will issue a merge request.

Comment 26 Polina Rabinovich 2021-10-21 10:12:31 UTC
I'm moving back to VERIFIED. I checked the changes.

Comment 27 Tony Mulqueen 2021-10-21 14:29:53 UTC
Received notice of merge: 
Michael Burke <notifications>
2:45 PM (42 minutes ago)
to State, openshift/openshift-docs, me

Merged #35583 into enterprise-4.6.

Comment 28 Red Hat Bugzilla 2023-09-15 01:06:37 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days