Description of problem:
Upgrading 4.2.9 to 4.3.0-0.nightly-2019-12-06-094536 fails:

message: 'Unable to apply 4.3.0-0.nightly-2019-12-06-094536: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for rendered-master-348056f16abd630d3dced666f8bc9080 expected 2789973d61a0011415e2d019c09bbcb0f1bd3383 has d780d197a9c5848ba786982c0c4aaa7487297046

oc adm must-gather fails in this cluster:

[root@ip-172-31-53-199 must-gather]# oc adm must-gather
[must-gather      ] OUT Using must-gather plugin-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a5f927c199fa9fa90a5793d012eda90cd4163d4d2ff4d0ad04534401faba5b24
[must-gather      ] OUT namespace/openshift-must-gather-z27xz created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-97k45 created
[must-gather      ] OUT pod for plug-in image quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a5f927c199fa9fa90a5793d012eda90cd4163d4d2ff4d0ad04534401faba5b24 created
[must-gather-ftwtk] OUT gather did not start: timed out waiting for the condition
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-97k45 deleted
[must-gather      ] OUT namespace/openshift-must-gather-z27xz deleted
error: gather did not start for pod must-gather-ftwtk: timed out waiting for the condition

I'll grab the MCO logs.

Version-Release number of selected component (if applicable):
Upgrading 4.2.9 to 4.3.0-0.nightly-2019-12-06-094536

How reproducible:
Unknown
Trying to retrieve MCO logs:

# oc logs machine-config-operator-5c4c599bc7-7dzh7 > machine-config-operator-5c4c599bc7-7dzh7
Error from server: Get https://10.0.157.10:10250/containerLogs/openshift-machine-config-operator/machine-config-operator-5c4c599bc7-7dzh7/machine-config-operator: x509: certificate signed by unknown authority
Tried to get logs via oc debug node, and that failed too:

# oc debug node/ip-10-0-157-10.us-west-2.compute.internal
Starting pod/ip-10-0-157-10us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.157.10
If you don't see a command prompt, try pressing enter.
Removing debug pod ...
Error from server: error dialing backend: x509: certificate signed by unknown authority
Moving back to MODIFIED since it depends on un-merged PR https://github.com/openshift/installer/pull/2777
Verified. I was able to successfully upgrade from 4.2.9 to 4.3.0-0.nightly-2019-12-18-145749.

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        False         15m     Cluster version is 4.2.9

$ oc adm upgrade --force --to-image=registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        False         15m     Cluster version is 4.2.9

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        True          4s      Working towards registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-18-145749: downloading update

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        True          7m39s   Working towards 4.3.0-0.nightly-2019-12-18-145749: 65% complete

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        True          30m     Working towards 4.3.0-0.nightly-2019-12-18-145749: 84% complete

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-2019-12-18-145749   True        False         2m9s    Cluster version is 4.3.0-0.nightly-2019-12-18-145749
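The repeated `oc get clusterversion` checks above could be scripted. The following is only a sketch, not part of the original verification: `upgrade_finished` is a hypothetical helper name, and it assumes the default column order shown in this comment (NAME, VERSION, AVAILABLE, PROGRESSING, SINCE, STATUS).

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get clusterversion` output on stdin and
# succeeds only when a row reports the target version (passed as $1) with
# AVAILABLE=True and PROGRESSING=False.
# Assumes the default column order: NAME VERSION AVAILABLE PROGRESSING SINCE STATUS.
upgrade_finished() {
  awk -v target="$1" '
    NR > 1 && $2 == target && $3 == "True" && $4 == "False" { found = 1 }
    END { exit !found }
  '
}

# Example polling loop (commented out; it needs a logged-in oc client):
# until oc get clusterversion | upgrade_finished "4.3.0-0.nightly-2019-12-18-145749"; do
#   sleep 60
# done
```

The helper only inspects the table text, so it can be tried against saved output without cluster access. It does not detect a failed or stalled upgrade; a real script would also want a timeout.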
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0062