Bug 1690153
Summary: clusteroperator/kube-scheduler changed Failing to True: NodeInstallerFailing: NodeInstallerFailing: 0 nodes are failing on revision

| Field | Value |
|---|---|
| Product | OpenShift Container Platform |
| Component | Node |
| Version | 4.1.0 |
| Target Release | 4.1.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Type | Bug |
| Reporter | Ben Parees <bparees> |
| Assignee | ravig <rgudimet> |
| QA Contact | Weinan Liu <weinliu> |
| CC | aos-bugs, gblomqui, jiajliu, jokerman, mmccomas, rcook, rgudimet, sjenning, weinliu, wsun |
| Doc Type | If docs needed, set a value |
| Last Closed | 2019-06-04 10:46:02 UTC |
Description
Ben Parees 2019-03-18 21:35:03 UTC
Hit this issue when upgrading from 4.0.0-0.nightly-2019-03-19-004004 to 4.0.0-0.nightly-2019-03-20-153904; the upgrade failed:

```json
{
    "lastTransitionTime": "2019-03-21T07:12:36Z",
    "message": "Cluster operator kube-scheduler is reporting a failure: NodeInstallerFailing: 0 nodes are failing on revision 6:\nNodeInstallerFailing: pods \"installer-6-ip-10-0-131-197.us-east-2.compute.internal\" not found",
    "reason": "ClusterOperatorFailing",
    "status": "True",
    "type": "Failing"
},
{
    "lastTransitionTime": "2019-03-21T06:01:05Z",
    "message": "Unable to apply 4.0.0-0.nightly-2019-03-20-153904: the cluster operator kube-scheduler is failing",
    "reason": "ClusterOperatorFailing",
    "status": "True",
    "type": "Progressing"
},
```

Created attachment 1546777 [details]: occurrences of this error in CI from 2019-03-19T12:28 to 2019-03-21T20:06 UTC. This occurred in 15 of our 861 failures in *-e2e-aws* jobs across the whole CI system over the past 55 hours. Generated with [1]:

```
$ deck-build-log-plot 'clusteroperator/kube-scheduler .* NodeInstallerFailing: 0 nodes are failing on revision'
```

[1]: https://github.com/wking/openshift-release/tree/debug-scripts/deck-build-log

*** Bug 1691600 has been marked as a duplicate of this bug. ***

There have been some recent changes to library-go in this area:

https://github.com/openshift/library-go/pull/313
https://github.com/openshift/library-go/pull/312

A bump of library-go could fix this.

I believe this is fixed by https://github.com/openshift/library-go/pull/312. Please check whether it can be verified.

Verified to be fixed.
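The Failing condition reported above can be pulled out of a clusteroperator status programmatically. This is not part of the original report; it is a minimal Python sketch, with the conditions abbreviated from those quoted in the description:

```python
import json

# Conditions as transcribed (and abbreviated) from this bug report.
status_json = """
{
  "conditions": [
    {
      "lastTransitionTime": "2019-03-21T07:12:36Z",
      "message": "Cluster operator kube-scheduler is reporting a failure: NodeInstallerFailing: 0 nodes are failing on revision 6",
      "reason": "ClusterOperatorFailing",
      "status": "True",
      "type": "Failing"
    },
    {
      "lastTransitionTime": "2019-03-21T06:01:05Z",
      "message": "Unable to apply 4.0.0-0.nightly-2019-03-20-153904: the cluster operator kube-scheduler is failing",
      "reason": "ClusterOperatorFailing",
      "status": "True",
      "type": "Progressing"
    }
  ]
}
"""

def failing_message(status):
    """Return the message of the Failing=True condition, or None."""
    for cond in status.get("conditions", []):
        if cond["type"] == "Failing" and cond["status"] == "True":
            return cond.get("message")
    return None

print(failing_message(json.loads(status_json)))
```

In a live cluster the same JSON could be fetched with `oc get clusteroperator kube-scheduler -o json` and fed to the helper above.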
```
[nathan@localhost 0410]$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.11   True        False         7m4s    Cluster version is 4.0.0-0.11
```

(upgraded from 4.0.0-0.9)

The message is enhanced during the upgrade:

```
[nathan@localhost 0410]$ oc logs openshift-kube-scheduler-operator-5476946d7f-j8kdp | grep NodeInstallerFailing
I0410 06:59:47.830854 1 status_controller.go:156] clusteroperator/kube-scheduler diff {"status":{"conditions":[{"lastTransitionTime":"2019-04-10T06:59:17Z","reason":"NodeInstallerFailingInstallerPodFailed","status":"True","type":"Failing"},{"lastTransitionTime":"2019-04-10T06:58:51Z","message":"Progressing: 3 nodes are at revision 7","reason":"Progressing","status":"True","type":"Progressing"},{"lastTransitionTime":"2019-04-10T06:58:37Z","message":"Available: 3 nodes are active; 3 nodes are at revision 7","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2019-04-10T06:58:37Z","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0410 07:00:34.676275 1 status_controller.go:156] clusteroperator/kube-scheduler diff {"status":{"conditions":[{"lastTransitionTime":"2019-04-10T06:59:17Z","reason":"NodeInstallerFailingInstallerPodFailed","status":"True","type":"Failing"},{"lastTransitionTime":"2019-04-10T06:58:51Z","message":"Progressing: 3 nodes are at revision 7","reason":"Progressing","status":"True","type":"Progressing"},{"lastTransitionTime":"2019-04-10T06:58:37Z","message":"Available: 3 nodes are active; 3 nodes are at revision 7","reason":"AsExpected","status":"True","type":"Available"},{"lastTransitionTime":"2019-04-10T06:58:37Z","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0758
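The `deck-build-log-plot` invocation quoted earlier greps CI build logs for a regular expression (the actual script is at the linked repo). A rough Python sketch of that matching step, with illustrative log lines rather than real CI output:

```python
import re

# Regex taken from the reporter's deck-build-log-plot invocation.
pattern = re.compile(
    r"clusteroperator/kube-scheduler .* NodeInstallerFailing: "
    r"0 nodes are failing on revision"
)

# Hypothetical build-log lines, for illustration only.
build_logs = [
    "clusteroperator/kube-scheduler changed Failing to True: "
    "NodeInstallerFailing: 0 nodes are failing on revision 6",
    "clusteroperator/kube-apiserver is healthy",
]

# Count lines that would register as occurrences of this failure.
matches = [line for line in build_logs if pattern.search(line)]
print(len(matches))  # prints 1
```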