Bug 1777036
Summary: | Recent builds booting to Emergency mode during install on AWS: e.g. 4.3.0-0.nightly-2019-11-21-122827 and 4.3.0-0.nightly-2019-11-22-050018 | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Abhinav Dahiya <adahiya> |
Component: | RHCOS | Assignee: | Colin Walters <walters> |
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.3.0 | CC: | adahiya, bbreard, behoward, chris.liles, dustymabe, imcleod, jialiu, jligon, miabbott, mifiedle, mnguyen, nstielau, walters, xxia |
Target Milestone: | --- | Keywords: | TestBlocker |
Target Release: | 4.3.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1775728 | Environment: | |
Last Closed: | 2020-01-23 11:14:35 UTC | Type: | --- |
Bug Depends On: | 1775728 | ||
Description
Abhinav Dahiya
2019-11-26 19:31:00 UTC
Agree, e.g. this installation https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Launch%20Environment%20Flexy/72806/console where it failed with:

...
level=debug msg="Still waiting for the Kubernetes API: Get https://...:6443/version?timeout=32s: dial tcp 52.78.20.8:6443: connect: connection refused"
level=debug msg="Still waiting for the Kubernetes API: Get https://...:6443/version?timeout=32s: dial tcp 52.78.20.8:6443: connect: connection refused"
level=error msg="Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get https://...:6443/apis/config.openshift.io/v1/clusteroperators: dial tcp ...:6443: connect: connection refused"
...
level=info msg="Pulling debug logs from the bootstrap machine"
level=error msg="Attempted to gather debug logs after installation failure:...
level=fatal msg="Bootstrap failed to complete: waiting for Kubernetes API: context deadline exceeded"
...

I checked the bootstrap VM's "System Log" in the AWS web console; it is the same as in the Description above:

...
[ 64.912104] systemd[1]: Started Emergency Shell.
[ 64.919157] systemd[1]: Reached target Emergency Mode.

The cherry-pick PR to 4.3 is still waiting to be merged.
- https://github.com/openshift/installer/pull/2724

I installed 4.3.0-0.nightly-2019-12-04-054458 with no issues. The version of RHCOS in the bump is present in the build. If anyone else is still seeing this issue, please respond. Otherwise I will close the BZ as verified.

$ oc get nodes
NAME                           STATUS   ROLES    AGE   VERSION
ip-10-0-138-104.ec2.internal   Ready    master   22m   v1.16.2
ip-10-0-141-21.ec2.internal    Ready    worker   15m   v1.16.2
ip-10-0-152-248.ec2.internal   Ready    worker   15m   v1.16.2
ip-10-0-158-38.ec2.internal    Ready    master   22m   v1.16.2
ip-10-0-163-90.ec2.internal    Ready    master   22m   v1.16.2
ip-10-0-173-72.ec2.internal    Ready    worker   14m   v1.16.2

$ oc debug node/ip-10-0-138-104.ec2.internal
Starting pod/ip-10-0-138-104ec2internal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:aa2271b7cdf177aa0368fdf854027a5c54d03b90f089701190b2533147d4469d
    CustomOrigin: Managed by machine-config-operator
    Version: 43.81.201912040340.0 (2019-12-04T03:45:20Z)

  ostree://e884477421640d1285c07a6dd9aaf01c9e125038ebbe6290a5e341eb3695a4d1
    Version: 43.81.201911221453.0 (2019-11-22T14:58:44Z)
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2019-12-04-054458   True        False         5m18s   Cluster version is 4.3.0-0.nightly-2019-12-04-054458

This is working for me now. I think we can mark VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
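
For anyone triaging a similar bootstrap failure, the manual check of the bootstrap VM's "System Log" described in this report can be sketched as a small script. This is only an illustration, not part of the installer or any Red Hat tooling: the helper names `find_emergency_lines` and `booted_to_emergency_mode` are hypothetical, and the pattern assumes the console lines look like the two systemd messages quoted in this bug.

```python
import re
from typing import List, Tuple

# Hypothetical helper for illustration only.
# Assumes console lines shaped like the ones quoted in this bug, e.g.:
#   [ 64.919157] systemd[1]: Reached target Emergency Mode.
EMERGENCY_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+systemd\[1\]:\s+"
    r"(?P<msg>Started Emergency Shell|Reached target Emergency Mode)\."
)


def find_emergency_lines(console_log: str) -> List[Tuple[float, str]]:
    """Return (timestamp, message) pairs for emergency-related systemd lines."""
    hits = []
    for line in console_log.splitlines():
        match = EMERGENCY_RE.match(line.strip())
        if match:
            hits.append((float(match.group("ts")), match.group("msg")))
    return hits


def booted_to_emergency_mode(console_log: str) -> bool:
    """True if the log shows systemd reaching the Emergency Mode target."""
    return any(
        msg == "Reached target Emergency Mode"
        for _, msg in find_emergency_lines(console_log)
    )


if __name__ == "__main__":
    # Sample text copied from the System Log excerpt in this report.
    sample = (
        "[   64.912104] systemd[1]: Started Emergency Shell.\n"
        "[   64.919157] systemd[1]: Reached target Emergency Mode.\n"
    )
    print(booted_to_emergency_mode(sample))
```

The input is expected to be the console text saved from the AWS web console; how that text is fetched is left out on purpose.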