Description of problem:
After a chain of upgrades 4.2.36 -> 4.3.40 -> 4.4.33 -> 4.5.28, the machine-config operator is in a DEGRADED state.

Version-Release number of selected component (if applicable):
Cloud AWS (IPI)

# oc get machines -n openshift-machine-api
NAME                                         PHASE     TYPE        REGION      ZONE         AGE
qe-pr-aws421-42pdg-master-0                  Running   m4.xlarge   us-east-2   us-east-2a   8h
qe-pr-aws421-42pdg-master-1                  Running   m4.xlarge   us-east-2   us-east-2b   8h
qe-pr-aws421-42pdg-master-2                  Running   m4.xlarge   us-east-2   us-east-2c   8h
qe-pr-aws421-42pdg-worker-us-east-2a-h2fft   Running   m4.large    us-east-2   us-east-2a   8h
qe-pr-aws421-42pdg-worker-us-east-2b-7r2fj   Running   m4.large    us-east-2   us-east-2b   8h
qe-pr-aws421-42pdg-worker-us-east-2c-wvdvz   Running   m4.large    us-east-2   us-east-2c   8h

How reproducible:
Unsure

Steps to Reproduce:
1. Install an AWS cluster using quay.io/openshift-release-dev/ocp-release:4.2.36-x86_64 with 3 worker nodes
2. Upgrade the cluster:
   oc adm upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.3.40-x86_64 --force --allow-explicit-upgrade --> PASSED
   oc adm upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.4.33-x86_64 --force --allow-explicit-upgrade --> PASSED
   oc adm upgrade --to-image quay.io/openshift-release-dev/ocp-release:4.5.28-x86_64 --force --allow-explicit-upgrade --> FAILED

Actual results:
# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.33    True        True          4h10m   Working towards 4.5.28: 29% complete

# oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.28    True        False         False      8h
cloud-credential                           4.5.28    True        False         False      8h
cluster-autoscaler                         4.5.28    True        False         False      8h
config-operator                            4.5.28    True        False         False      3h50m
console                                    4.5.28    True        False         False      3h20m
csi-snapshot-controller                    4.5.28    True        False         False      3h42m
dns                                        4.5.28    True        False         False      8h
etcd                                       4.5.28    True        False         False      5h7m
image-registry                             4.5.28    True        False         False      3h19m
ingress                                    4.5.28    True        False         False      4h22m
insights                                   4.5.28    True        False         False      8h
kube-apiserver                             4.5.28    True        False         False      5h5m
kube-controller-manager                    4.5.28    True        False         False      5h3m
kube-scheduler                             4.5.28    True        False         False      5h4m
kube-storage-version-migrator              4.5.28    True        False         False      3h22m
machine-api                                4.5.28    True        False         False      8h
machine-approver                           4.5.28    True        False         False      3h45m
machine-config                             4.4.33    False       True          True       3h16m
marketplace                                4.5.28    True        False         False      3h44m
monitoring                                 4.5.28    True        False         False      3h43m
network                                    4.5.28    True        False         False      8h
node-tuning                                4.5.28    True        False         False      3h45m
openshift-apiserver                        4.5.28    True        False         True       3h24m
openshift-controller-manager               4.5.28    True        False         False      3h45m
openshift-samples                          4.5.28    True        False         False      3h45m
operator-lifecycle-manager                 4.5.28    True        False         False      8h
operator-lifecycle-manager-catalog         4.5.28    True        False         False      8h
operator-lifecycle-manager-packageserver   4.5.28    True        False         False      3h20m
service-ca                                 4.5.28    True        False         False      8h
service-catalog-apiserver                  4.4.33    True        False         False      8h
service-catalog-controller-manager         4.4.33    True        False         False      8h
storage                                    4.5.28    True        False         False      3h45m

Expected results:
The cluster upgrades to 4.5.28 with no degraded cluster operators.

Additional info:
# oc describe co machine-config
Name:         machine-config
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2021-02-03T15:14:52Z
  Generation:          1
  Resource Version:    211644
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/machine-config
  UID:                 907cec28-6632-11eb-bb77-02af3b724eee
Spec:
Status:
  Conditions:
    Last Transition Time:  2021-02-03T20:36:22Z
    Message:               Cluster not available for 4.5.28
    Status:                False
    Type:                  Available
    Last Transition Time:  2021-02-03T20:22:40Z
    Message:               Working towards 4.5.28
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2021-02-03T20:36:22Z
    Message:               Unable to apply 4.5.28: timed out waiting for the condition during syncRequiredMachineConfigPools: pool master has not progressed to latest configuration: controller version mismatch for rendered-master-8717be8dde8d46fe6b908649d1d93068 expected 68dff1c13317ca2756c490c520d029dc67994224 has c96f5b0bfa95eabf4e4fe64068b14eef965f5e22, retrying
    Reason:                RequiredPoolsFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-03T15:15:58Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
....

# oc describe co openshift-apiserver
Name:         openshift-apiserver
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2021-02-03T15:15:12Z
  Generation:          1
  Resource Version:    156958
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/openshift-apiserver
  UID:                 9c6fb66c-6632-11eb-bb77-02af3b724eee
Spec:
Status:
  Conditions:
    Last Transition Time:  2021-02-03T20:47:47Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-03T20:02:32Z
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-02-03T20:27:51Z
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2021-02-03T15:15:12Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:  <nil>
....
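
For triage, the operators that need attention in an `oc get co` listing like the one above can be picked out mechanically. The following is a minimal sketch, not part of the original report: the function name `unhealthy_operators` is hypothetical, and it assumes the default column order shown in this report (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED as the first five columns).

```shell
# Hypothetical helper (not from the original report).
# Reads `oc get co` table output on stdin and prints every operator that is
# not at the target version, is not Available, or is Degraded.
# Assumes the first five columns are NAME VERSION AVAILABLE PROGRESSING DEGRADED.
unhealthy_operators() {
  awk -v target="$1" '
    NR > 1 && ($2 != target || $3 != "True" || $5 != "False") {
      print $1, $2, "available=" $3, "degraded=" $5
    }'
}
```

Usage would be `oc get co | unhealthy_operators 4.5.28`. Against the listing in this report, that selects machine-config, openshift-apiserver, and the two service-catalog operators still at 4.4.33.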
I'm unable to reproduce this. I went through the supplied upgrade sequence with two AWS IPI clusters and both of them upgraded flawlessly. Given the age of this bug and that there is no additional information that can be gathered at this time, I'm going to close it. If you do manage to reproduce this, I'd very much appreciate it if you would take a must-gather and re-open this bug. Thank you.