Bug 1702104

Summary: failed to bootstrap the cluster because control plane components are not ready
Product: OpenShift Container Platform Reporter: Erica von Buelow <evb>
Component: Master Assignee: David Eads <deads>
Status: CLOSED ERRATA QA Contact: Xingxing Xia <xxia>
Severity: medium Docs Contact:
Priority: high    
Version: 4.1.0 CC: aos-bugs, deads, jokerman, mmccomas, sdodson, yinzhou
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:47:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Erica von Buelow 2019-04-23 00:02:17 UTC
Description of problem:

Bootstrap logs for this cluster:
https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/591/pull-ci-openshift-machine-config-operator-master-e2e-aws-op/1678/artifacts/e2e-aws-op/installer/bootstrap-logs.tar.gz

"Note the error in our operator pod.  We need to wait for all the required, non-revision configmaps and secrets from starter.go in each operator.  It's racy and sometimes we lose."
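The race described in the quote can be illustrated with a minimal sketch. This is not the fix that landed in the operators; it only shows the general idea of blocking until every required resource exists before starting. The function name `wait_for_required`, the `exists` callback, and the resource names in the usage note are all hypothetical.

```python
import time
from typing import Callable, Iterable, List


def wait_for_required(
    required: Iterable[str],
    exists: Callable[[str], bool],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Poll until every name in `required` exists or the timeout expires.

    `exists` is a caller-supplied lookup (hypothetical here; in a real
    operator it would query the API server or an informer cache).
    Returns the names that are still missing; an empty list means success.
    """
    missing = list(required)
    deadline = clock() + timeout_s
    while True:
        # Keep only the resources that have not shown up yet.
        missing = [name for name in missing if not exists(name)]
        if not missing or clock() >= deadline:
            return missing
        sleep(interval_s)
```

A caller would pass the list of required configmaps and secrets, for example `wait_for_required(["configmap/example", "secret/example"], lookup)`, and refuse to start its controllers while the returned list is non-empty.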



Comment 4 zhou ying 2019-04-29 09:40:53 UTC
Hi Erica von Buelow:

    Could you please give more information about the failure? I can't reproduce the issue with any of the payloads after 4.1.0-0.nightly-2019-04-25-121505. Thanks a lot.

Comment 5 Xingxing Xia 2019-05-08 09:29:41 UTC
First, I checked that the above PR landed in payloads whose build is >= 4.1.0-0.nightly-2019-05-02-004418.
(In reply to Erica von Buelow from comment #0)
> It's racy and sometimes we lose.
Per this, I checked all 4.1.0-0.nightly payloads >= 4.1.0-0.nightly-2019-05-02-004418 listed at https://openshift-release.svc.ci.openshift.org/ as of now. Most are "Accepted", meaning they did not hit this bug. Only a few are "Rejected". I checked artifacts/e2e-aws/bootstrap/bootkube.service for each of the "Rejected" ones and did not see an error like the one in comment 1:
...
Error: error while checking pod status: timed out waiting for the condition
Apr 22 21:31:43 ip-10-0-1-229 bootkube.sh[1523]: Tearing down temporary bootstrap control plane
...

We have not hit the above error in our many daily cluster creations either.
So moving to VERIFIED.
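
The manual check above (looking through each bootkube.service log for the error) could be scripted roughly as follows. This is a sketch that assumes the log has already been downloaded and read into a string; the function name `find_signature` is hypothetical, and the signature text is copied from the excerpt above.

```python
from typing import List

# Error text taken from the bootkube.service excerpt quoted in this comment.
ERROR_SIGNATURE = "error while checking pod status: timed out waiting for the condition"


def find_signature(log_text: str, signature: str = ERROR_SIGNATURE) -> List[int]:
    """Return the 1-based numbers of the log lines that contain `signature`."""
    return [
        lineno
        for lineno, line in enumerate(log_text.splitlines(), start=1)
        if signature in line
    ]
```

An empty result for a payload's log means the signature was not found there, which matches the outcome reported for the "Rejected" payloads.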

Comment 7 errata-xmlrpc 2019-06-04 10:47:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758