Bug 2090816

Summary: OCP 4.8 Baremetal IPI installation failure: "Bootstrap failed to complete: timed out waiting for the condition"
Product: OpenShift Container Platform Reporter: Kaushal Sathe <ksathe>
Component: Installer Assignee: Honza Pokorny <hpokorny>
Installer sub component: OpenShift on Bare Metal IPI QA Contact: Adina Wolff <awolff>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: high CC: agawand, awolff, cgaynor, derekh, dhellmann, dwest, ealcaniz, eglottma, hpokorny, jhajyahy, kurathod, openshift-bugs-escalate, pibanezr, pmannidi, racedoro, rpittau, shardy, tsedovic, vkochuku
Version: 4.8 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2097753 Environment:
Last Closed: 2022-08-10 11:14:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2097753    

Comment 24 Honza Pokorny 2022-06-15 14:24:10 UTC
Setting back to POST because we have another PR

Comment 28 Jad Haj Yahya 2022-06-20 09:16:27 UTC
Deployment using the 4.11.0-0.nightly-2022-06-17-154644 build failed with the errors mentioned in this BZ:

06-20 11:59:03.473  level=error msg=Bootstrap failed to complete: timed out waiting for the condition
06-20 11:59:03.473  level=error msg=Failed to wait for bootstrapping to complete.

Comment 29 Adina Wolff 2022-06-20 09:20:41 UTC
@jhajyahy Can you access a few more log lines above the ones you pasted, so that we can see how long the bootstrapping timeout was?

Comment 30 Jad Haj Yahya 2022-06-20 09:34:22 UTC
time="2022-06-20T03:58:57-04:00" level=info msg="API v1.24.0+25f9057 up"
time="2022-06-20T03:58:57-04:00" level=debug msg="Loading Install Config..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading SSH Key..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading Base Domain..."
time="2022-06-20T03:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading Cluster Name..."
time="2022-06-20T03:58:57-04:00" level=debug msg="    Loading Base Domain..."
time="2022-06-20T03:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading Networking..."
time="2022-06-20T03:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading Pull Secret..."
time="2022-06-20T03:58:57-04:00" level=debug msg="  Loading Platform..."
time="2022-06-20T03:58:57-04:00" level=debug msg="Using Install Config loaded from state file"
time="2022-06-20T03:58:57-04:00" level=info msg="Waiting up to 1h0m0s (until 4:58AM) for bootstrapping to complete..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Fetching Bootstrap SSH Key Pair..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Loading Bootstrap SSH Key Pair..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Using Bootstrap SSH Key Pair loaded from state file"
time="2022-06-20T04:58:57-04:00" level=debug msg="Reusing previously-fetched Bootstrap SSH Key Pair"
time="2022-06-20T04:58:57-04:00" level=debug msg="Fetching Install Config..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Loading Install Config..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading SSH Key..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading Base Domain..."
time="2022-06-20T04:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading Cluster Name..."
time="2022-06-20T04:58:57-04:00" level=debug msg="    Loading Base Domain..."
time="2022-06-20T04:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading Networking..."
time="2022-06-20T04:58:57-04:00" level=debug msg="    Loading Platform..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading Pull Secret..."
time="2022-06-20T04:58:57-04:00" level=debug msg="  Loading Platform..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Using Install Config loaded from state file"
time="2022-06-20T04:58:57-04:00" level=debug msg="Reusing previously-fetched Install Config"
time="2022-06-20T04:58:57-04:00" level=error msg="Attempted to gather debug logs after installation failure: bootstrap host address and at least one control plane host address must be provided"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator cloud-controller-manager TrustedCABundleControllerControllerAvailable is True with AsExpected: Trusted CA Bundle Controller works as expected"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator cloud-controller-manager TrustedCABundleControllerControllerDegraded is False with AsExpected: Trusted CA Bundle Controller works as expected"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator cloud-controller-manager CloudConfigControllerAvailable is True with AsExpected: Cloud Config Controller works as expected"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator cloud-controller-manager CloudConfigControllerDegraded is False with AsExpected: Cloud Config Controller works as expected"
time="2022-06-20T04:58:57-04:00" level=error msg="Cluster operator network Degraded is True with RolloutHung: DaemonSet \"/openshift-ovn-kubernetes/ovnkube-node\" rollout is not making progress - last change 2022-06-20T08:09:08Z\nDaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-nqt74 is in CrashLoopBackOff State\nDaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-tztjq is in CrashLoopBackOff State\nDaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-xmfns is in CrashLoopBackOff State\nDaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - last change 2022-06-20T08:09:07Z"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator network ManagementStateDegraded is False with : "
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator network Progressing is True with Deploying: DaemonSet \"/openshift-ovn-kubernetes/ovnkube-node\" is not available (awaiting 3 nodes)\nDaemonSet \"/openshift-network-diagnostics/network-check-target\" is waiting for other operators to become ready\nDaemonSet \"/openshift-multus/network-metrics-daemon\" is waiting for other operators to become ready\nDaemonSet \"/openshift-multus/multus-admission-controller\" is waiting for other operators to become ready\nDaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" is not available (awaiting 3 nodes)\nDeployment \"/openshift-network-diagnostics/network-check-source\" is waiting for other operators to become ready"
time="2022-06-20T04:58:57-04:00" level=info msg="Cluster operator network Available is False with Startup: The network is starting up"
time="2022-06-20T04:58:57-04:00" level=error msg="Bootstrap failed to complete: timed out waiting for the condition"
time="2022-06-20T04:58:57-04:00" level=error msg="Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane."
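
The network operator's Degraded message in the log above packs several DaemonSet conditions into a single line, separated by escaped newlines. As a reading aid, here is a minimal Python sketch that splits such a message and lists the pods reported in CrashLoopBackOff. The helper name `crashlooping_pods` is hypothetical and not part of the installer; the sample string is copied from the log above.

```python
import re
from typing import List


def crashlooping_pods(msg: str) -> List[str]:
    """Return pod names reported as CrashLoopBackOff in an operator message.

    Assumes the message is in the form the installer logs it: inner quotes
    appear as backslash-quote and entries are joined by a literal
    backslash-n sequence.
    """
    entries = msg.replace('\\"', '"').split('\\n')
    pods = []
    for entry in entries:
        match = re.search(r"pod (\S+) is in CrashLoopBackOff", entry)
        if match:
            pods.append(match.group(1))
    return pods


# Message text of the "Cluster operator network Degraded" line above.
sample = (
    r'DaemonSet \"/openshift-ovn-kubernetes/ovnkube-node\" rollout is not making progress - last change 2022-06-20T08:09:08Z\n'
    r'DaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-nqt74 is in CrashLoopBackOff State\n'
    r'DaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-tztjq is in CrashLoopBackOff State\n'
    r'DaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - pod ovnkube-master-xmfns is in CrashLoopBackOff State\n'
    r'DaemonSet \"/openshift-ovn-kubernetes/ovnkube-master\" rollout is not making progress - last change 2022-06-20T08:09:07Z'
)

pods = crashlooping_pods(sample)
print(pods)
```

For this log, the three ovnkube-master pods are the ones listed, which matches the "awaiting 3 nodes" status in the Progressing message.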

Comment 31 Jad Haj Yahya 2022-06-20 10:05:45 UTC
Here is the message indicating that the wait time was increased:

time="2022-06-20T03:58:57-04:00" level=info msg="Waiting up to 1h0m0s (until 4:58AM) for bootstrapping to complete..."
time="2022-06-20T04:58:57-04:00" level=debug msg="Fetching Bootstrap SSH Key Pair..."
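
To check the timeout from the two lines above without reading timestamps by eye, a small Python sketch along these lines could be used. It assumes the `time="..." level=... msg="..."` layout seen in this log and a duration written as hours, minutes, and seconds (for example `1h0m0s`); the names `parse_line` and `announced_timeout` are hypothetical helpers, not installer APIs.

```python
import re
from datetime import datetime, timedelta

# Matches lines of the form: time="..." level=info msg="..."
LINE_RE = re.compile(r'^time="(?P<time>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>.*)"$')
# Matches the duration in the wait message, e.g. "1h0m0s".
WAIT_RE = re.compile(r"Waiting up to (?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)? \(until")


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return datetime.fromisoformat(m["time"]), m["level"], m["msg"]


def announced_timeout(msg):
    """Return the announced wait as a timedelta, or None if msg is not a wait message."""
    m = WAIT_RE.search(msg)
    if m is None:
        return None
    hours, minutes, seconds = (int(g or 0) for g in m.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The two lines quoted in this comment.
log = [
    'time="2022-06-20T03:58:57-04:00" level=info msg="Waiting up to 1h0m0s (until 4:58AM) for bootstrapping to complete..."',
    'time="2022-06-20T04:58:57-04:00" level=debug msg="Fetching Bootstrap SSH Key Pair..."',
]

(start, _, wait_msg), (end, _, _) = (parse_line(line) for line in log)
announced = announced_timeout(wait_msg)
elapsed = end - start
print(f"announced timeout: {announced}, elapsed before next log line: {elapsed}")
```

Both values come out to one hour here, so the installer waited for the full announced timeout before gathering logs and reporting the failure.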

Comment 33 Jad Haj Yahya 2022-06-21 08:32:48 UTC
IPI on BM job passed with build 4.11.0-0.nightly-2022-06-21-040754:

https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install/113830/

BZ can be closed as verified

Comment 34 Adina Wolff 2022-06-21 10:37:04 UTC
Tested successfully with 4.11.0-0.nightly-2022-06-21-040754 (not an official build).

Comment 36 errata-xmlrpc 2022-08-10 11:14:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

Comment 37 Red Hat Bugzilla 2023-09-15 01:55:15 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days