Bug 1781141
| Summary: | Upgrade from 4.2.9 to 4.3 failed for MCO: controller version mismatch | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
| Component: | RHCOS | Assignee: | Yu Qi Zhang <jerzhang> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 4.3.0 | CC: | jerzhang, miabbott, smilner |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | 4.3.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-01-23 11:18:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1778904 | ||
| Bug Blocks: | |||
Trying to retrieve MCO logs:

    # oc logs machine-config-operator-5c4c599bc7-7dzh7 > machine-config-operator-5c4c599bc7-7dzh7
    Error from server: Get https://10.0.157.10:10250/containerLogs/openshift-machine-config-operator/machine-config-operator-5c4c599bc7-7dzh7/machine-config-operator: x509: certificate signed by unknown authority

Tried to get logs via oc debug node and that failed too:

    # oc debug node/ip-10-0-157-10.us-west-2.compute.internal
    Starting pod/ip-10-0-157-10us-west-2computeinternal-debug ...
    To use host binaries, run `chroot /host`
    Pod IP: 10.0.157.10
    If you don't see a command prompt, try pressing enter.
    Removing debug pod ...
    Error from server: error dialing backend: x509: certificate signed by unknown authority

Moving back to MODIFIED since it depends on un-merged PR https://github.com/openshift/installer/pull/2777

Verified. I was able to successfully upgrade from 4.2.9 to 4.3.0-0.nightly-2019-12-18-145749.

    $ oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.2.9     True        False         15m     Cluster version is 4.2.9

    $ oc adm upgrade --force --to-image=registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749
    Updating to release image registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749

    $ oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.2.9     True        False         15m     Cluster version is 4.2.9

    $ oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.2.9     True        True          4s      Working towards registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749: downloading update

    $ oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.2.9     True        True          7m39s   Working towards 4.3.0-0.nightly-2019-12-18-145749: 65% complete

    $ oc get clusterversion
    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.2.9     True        True          30m     Working towards 4.3.0-0.nightly-2019-12-18-145749: 84% complete

    $ oc get clusterversion
    NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.3.0-0.nightly-2019-12-18-145749   True        False         2m9s    Cluster version is 4.3.0-0.nightly-2019-12-18-145749

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062
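The verification above was done by re-running `oc get clusterversion` by hand and reading the table. As a rough sketch only, the same check could be scripted against that table output. The helper name `upgrade_complete` is hypothetical (it is not part of `oc`), and the sketch assumes the default column layout shown in this report (NAME, VERSION, AVAILABLE, PROGRESSING, SINCE, STATUS).

```shell
#!/bin/sh
# Hypothetical helper, not part of oc.
# Reads `oc get clusterversion` table output on stdin and exits 0 only if
# the "version" row reports the target version (first argument) with
# AVAILABLE=True and PROGRESSING=False. Exits 1 otherwise.
# Assumes the default column order shown in this bug report.
upgrade_complete() {
  awk -v target="$1" '
    $1 == "version" && $2 == target && $3 == "True" && $4 == "False" { ok = 1 }
    END { exit ok ? 0 : 1 }
  '
}
```

A possible use is `oc get clusterversion | upgrade_complete 4.3.0-0.nightly-2019-12-18-145749`, repeated until it succeeds. Parsing the human-readable table is fragile; it is used here only because it mirrors the output quoted in the comment.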
Description of problem:

Upgrading 4.2.9 to 4.3.0-0.nightly-2019-12-06-094536 fails:

    message: 'Unable to apply 4.3.0-0.nightly-2019-12-06-094536: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for rendered-master-348056f16abd630d3dced666f8bc9080 expected 2789973d61a0011415e2d019c09bbcb0f1bd3383 has d780d197a9c5848ba786982c0c4aaa7487297046,

oc adm must-gather fails in this cluster:

    [root@ip-172-31-53-199 must-gather]# oc adm must-gather
    [must-gather ] OUT Using must-gather plugin-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a5f927c199fa9fa90a5793d012eda90cd4163d4d2ff4d0ad04534401faba5b24
    [must-gather ] OUT namespace/openshift-must-gather-z27xz created
    [must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-97k45 created
    [must-gather ] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a5f927c199fa9fa90a5793d012eda90cd4163d4d2ff4d0ad04534401faba5b24 created
    [must-gather-ftwtk] OUT gather did not start: timed out waiting for the condition
    [must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-97k45 deleted
    [must-gather ] OUT namespace/openshift-must-gather-z27xz deleted
    error: gather did not start for pod must-gather-ftwtk: timed out waiting for the condition

I'll grab the MCO logs.

Version-Release number of selected component (if applicable):
Upgrading 4.2.9 to 4.3.0-0.nightly-2019-12-06-094536

How reproducible:
Unknown
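The failure message in the description embeds three useful values: the rendered MachineConfig name, the controller version that was expected, and the one actually found. A minimal sketch for pulling them out of such a message when triaging is shown here. The function name `parse_controller_mismatch` is hypothetical, and the pattern is based only on the message wording quoted in this report, so it may not match other releases.

```shell
#!/bin/sh
# Hypothetical helper, not part of oc or the MCO.
# Reads status text on stdin. For each line containing a
# "controller version mismatch" message, prints:
#   <rendered-config> <expected-hash> <actual-hash>
# Prints nothing for lines that do not match.
# The pattern assumes the exact wording quoted in this bug report.
parse_controller_mismatch() {
  sed -n 's/.*controller version mismatch for \([^ ]*\) expected \([0-9a-f]*\) has \([0-9a-f]*\).*/\1 \2 \3/p'
}
```

Fed the message from this report, it prints the rendered config name followed by the expected and actual hashes on one line, which makes it easier to compare them against the running machine-config-controller.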