Created attachment 1790367
must-gather

Description of problem:
Upgrade from 4.7.14 -> 4.8.0-fc.8 works. Then downgrading back to 4.7.14 fails.

Version-Release number of selected component (if applicable):
4.7.14

How reproducible:
always (2 out of 2 tries)

Steps to Reproduce:
1. Install IPI on GCP
2. Upgrade to 4.8.0-fc.8 works
3. Downgrade back to 4.7.14 fails

Actual results:

#1 ./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.7.14: an unknown error has occurred: MultipleErrors

Upgradeable=False

  Reason: AuthenticationConfig_WebhookTokenAuthenticatorConfigured
  Message: Cluster operator kube-apiserver cannot be upgraded between minor versions: AuthenticationConfigUpgradeable: upgrades are not allowed when authentication.config/cluster .spec.WebhookTokenAuthenticator is set

#2 ./oc get co
NAME                                      VERSION     AVAILABLE  PROGRESSING  DEGRADED  SINCE
authentication                            4.7.14      True       False        False     48m
baremetal                                 4.7.14      True       False        False     5h3m
cloud-credential                          4.7.14      True       False        False     5h8m
cluster-autoscaler                        4.7.14      True       False        False     5h2m
config-operator                           4.7.14      True       False        False     5h3m
console                                   4.7.14      True       False        False     48m
csi-snapshot-controller                   4.7.14      True       False        True      5h3m
dns                                       4.8.0-fc.8  True       False        False     128m
etcd                                      4.7.14      True       False        False     5h2m
image-registry                            4.7.14      True       False        False     4h54m
ingress                                   4.7.14      True       False        True      48m
insights                                  4.7.14      True       False        False     4h55m
kube-apiserver                            4.7.14      True       False        False     5h
kube-controller-manager                   4.7.14      True       False        False     5h1m
kube-scheduler                            4.7.14      True       False        False     5h1m
kube-storage-version-migrator             4.7.14      True       False        False     115m
machine-api                               4.7.14      True       False        False     4h52m
machine-approver                          4.7.14      True       False        False     5h2m
machine-config                            4.8.0-fc.8  True       False        False     101m
marketplace                               4.7.14      True       False        False     48m
monitoring                                4.7.14      True       False        False     47m
network                                   4.8.0-fc.8  True       False        False     5h3m
node-tuning                               4.8.0-fc.8  True       False        False     136m
openshift-apiserver                       4.7.14      True       False        False     101m
openshift-controller-manager              4.7.14      True       False        False     4h53m
openshift-samples                         4.7.14      True       False        False     49m
operator-lifecycle-manager                4.7.14      True       False        False     5h2m
operator-lifecycle-manager-catalog        4.7.14      True       False        False     5h2m
operator-lifecycle-manager-packageserver  4.7.14      True       False        False     110m
service-ca                                4.8.0-fc.8  True       False        False     5h3m
storage                                   4.7.14      True       False        False     106m

#3 Must gather shows this
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information.
ClusterID: 35043d9a-d8c7-4fc5-b6ec-0937c9d3230e
ClusterVersion: Updating to "4.7.14" from "4.8.0-fc.8" for About an hour: Unable to apply 4.7.14: an unknown error has occurred: MultipleErrors
ClusterOperators:
	clusteroperator/csi-snapshot-controller is degraded because CSISnapshotStaticResourceControllerDegraded: "csi_controller_deployment_pdb.yaml" (string): the server could not find the requested resource
CSISnapshotStaticResourceControllerDegraded: "webhook_deployment_pdb.yaml" (string): the server could not find the requested resource
CSISnapshotStaticResourceControllerDegraded:
	clusteroperator/ingress is degraded because Some ingresscontrollers are degraded: ingresscontroller "default" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
	clusteroperator/kube-apiserver is not upgradeable because AuthenticationConfigUpgradeable: upgrades are not allowed when authentication.config/cluster .spec.WebhookTokenAuthenticator is set

Expected results:

Additional info:
Previous roll back defect: https://bugzilla.redhat.com/show_bug.cgi?id=1860560
Looks like https://github.com/openshift/cluster-authentication-operator/pull/418 was completely forgotten about.
I tried with 4.7.0-0.nightly-2021-06-16-202127 -> 4.8.0-rc.0 -> 4.7.0-0.nightly-2021-06-16-202127

Downgrade failed with:
Multiple errors are preventing progress:
* Cluster operator csi-snapshot-controller is degraded
* Cluster operator ingress is degraded
* Could not update customresourcedefinition "operatorconditions.operators.coreos.com" (484 of 669): the object is invalid, possibly due to local cluster configuration

./oc get co
NAME                                      VERSION                            AVAILABLE  PROGRESSING  DEGRADED  SINCE
authentication                            4.7.0-0.nightly-2021-06-16-202127  True       False        False     6h56m
baremetal                                 4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
cloud-credential                          4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
cluster-autoscaler                        4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
config-operator                           4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
console                                   4.7.0-0.nightly-2021-06-16-202127  True       False        False     6h53m
csi-snapshot-controller                   4.7.0-0.nightly-2021-06-16-202127  True       False        True      9h
dns                                       4.8.0-rc.0                         True       False        False     8h
etcd                                      4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
image-registry                            4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
ingress                                   4.7.0-0.nightly-2021-06-16-202127  True       False        True      6h57m
insights                                  4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
kube-apiserver                            4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
kube-controller-manager                   4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
kube-scheduler                            4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
kube-storage-version-migrator             4.7.0-0.nightly-2021-06-16-202127  True       False        False     8h
machine-api                               4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
machine-approver                          4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
machine-config                            4.8.0-rc.0                         True       False        False     8h
marketplace                               4.7.0-0.nightly-2021-06-16-202127  True       False        False     6h56m
monitoring                                4.7.0-0.nightly-2021-06-16-202127  True       False        False     6h52m
network                                   4.8.0-rc.0                         True       False        False     9h
node-tuning                               4.8.0-rc.0                         True       False        False     8h
openshift-apiserver                       4.7.0-0.nightly-2021-06-16-202127  True       False        False     8h
openshift-controller-manager              4.7.0-0.nightly-2021-06-16-202127  True       False        False     9h
openshift-samples                         4.7.0-0.nightly-2021-06-16-202127  True       False        False     6h57m
operator-lifecycle-manager                4.8.0-rc.0                         True       False        False     9h
operator-lifecycle-manager-catalog        4.8.0-rc.0                         True       False        False     9h
operator-lifecycle-manager-packageserver  4.8.0-rc.0                         True       False        False     9h
service-ca                                4.8.0-rc.0                         True       False        False     9h
storage                                   4.7.0-0.nightly-2021-06-16-202127  True       False        False     8h

must gather shows
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information.
ClusterID: 823226ca-6307-48d1-b113-e1e905971e1c
ClusterVersion: Updating to "4.7.0-0.nightly-2021-06-16-202127" from "4.8.0-rc.0" for 8 hours: Unable to apply 4.7.0-0.nightly-2021-06-16-202127: an unknown error has occurred: MultipleErrors
ClusterOperators:
	clusteroperator/csi-snapshot-controller is degraded because CSISnapshotStaticResourceControllerDegraded: "csi_controller_deployment_pdb.yaml" (string): the server could not find the requested resource
CSISnapshotStaticResourceControllerDegraded: "webhook_deployment_pdb.yaml" (string): the server could not find the requested resource
CSISnapshotStaticResourceControllerDegraded:
	clusteroperator/ingress is degraded because Some ingresscontrollers are degraded: ingresscontroller "default" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)
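
For anyone triaging a similar stuck downgrade, here is a rough sketch of how the `oc get co` listings in this bug can be summarized: which operators have not reached the target version, and which report DEGRADED=True. It assumes the six-column plain-text layout shown in this report (NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE) and skips any row that does not split into exactly six fields. The helper names `parse_co_table` and `summarize` are hypothetical, and the sample rows are copied from the 4.7.14 attempt in the description.

```python
# Sketch only: summarize plain-text `oc get co` output.
# Assumes exactly six whitespace-separated columns per row, as in this report.

SAMPLE = """\
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
csi-snapshot-controller 4.7.14 True False True 5h3m
dns 4.8.0-fc.8 True False False 128m
ingress 4.7.14 True False True 48m
kube-apiserver 4.7.14 True False False 5h
"""


def parse_co_table(text):
    """Turn the table text into a list of dicts, one per operator row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and rows with an unexpected shape.
        if len(fields) != 6 or fields[0] == "NAME":
            continue
        name, version, available, progressing, degraded, since = fields
        rows.append({
            "name": name,
            "version": version,
            "available": available == "True",
            "progressing": progressing == "True",
            "degraded": degraded == "True",
            "since": since,
        })
    return rows


def summarize(rows, target_version):
    """Return (operators not on target_version, operators reporting degraded)."""
    lagging = [r["name"] for r in rows if r["version"] != target_version]
    degraded = [r["name"] for r in rows if r["degraded"]]
    return lagging, degraded


if __name__ == "__main__":
    lagging, degraded = summarize(parse_co_table(SAMPLE), "4.7.14")
    print("not on target version:", lagging)
    print("degraded:", degraded)
```

Run against the full listings above, this would separate the operators still on the 4.8 pre-release from the two degraded ones (csi-snapshot-controller and ingress), which is the split the must-gather summary describes.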
In the attempt from Comment 6, the 'clusteroperator/kube-apiserver is not upgradeable' issue did not reproduce.
Follow-up BZs:
https://bugzilla.redhat.com/show_bug.cgi?id=1973986
https://bugzilla.redhat.com/show_bug.cgi?id=1973983
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.18 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2502