Description of problem:
Upgrade from 4.7 nightly -> 4.8.0-fc.8 works. Then downgrading back to 4.7 nightly fails.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2021-06-17-173140

How reproducible:
always (2 out of 2 tries)

Steps to Reproduce:
1. Install IPI on GCP
2. Upgrade to 4.8.0-fc.8 works
3. Downgrade back to 4.7 nightly fails

OpenShift release version:
4.7.0-0.nightly-2021-06-17-173140

Cluster Platform:
GCP

Actual results:
$ ./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.7.0-0.nightly-2021-06-17-173140: an unknown error has occurred: MultipleErrors

$ ./oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.0-0.nightly-2021-06-17-173140   True        False         False      155m
baremetal                                  4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
cloud-credential                           4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
cluster-autoscaler                         4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
config-operator                            4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
console                                    4.7.0-0.nightly-2021-06-17-173140   True        False         False      23h
csi-snapshot-controller                    4.7.0-0.nightly-2021-06-17-173140   True        False         True       27h
dns                                        4.8.0-0.nightly-2021-06-18-055840   True        False         False      25h
etcd                                       4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
image-registry                             4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
ingress                                    4.7.0-0.nightly-2021-06-17-173140   True        False         True       23h
insights                                   4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
kube-apiserver                             4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
kube-controller-manager                    4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
kube-scheduler                             4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
kube-storage-version-migrator              4.7.0-0.nightly-2021-06-17-173140   True        False         False      24h
machine-api                                4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
machine-approver                           4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
machine-config                             4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
marketplace                                4.7.0-0.nightly-2021-06-17-173140   True        False         False      23h
monitoring                                 4.7.0-0.nightly-2021-06-17-173140   True        False         False      23h
network                                    4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
node-tuning                                4.8.0-0.nightly-2021-06-18-055840   True        False         False      25h
openshift-apiserver                        4.7.0-0.nightly-2021-06-17-173140   True        False         False      24h
openshift-controller-manager               4.7.0-0.nightly-2021-06-17-173140   True        False         False      27h
openshift-samples                          4.7.0-0.nightly-2021-06-17-173140   True        False         False      23h
operator-lifecycle-manager                 4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
operator-lifecycle-manager-catalog         4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
operator-lifecycle-manager-packageserver   4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
service-ca                                 4.8.0-0.nightly-2021-06-18-055840   True        False         False      27h
storage                                    4.7.0-0.nightly-2021-06-17-173140   True        False         False      24h

Expected results:
Downgrade succeeds. While downgrade may not be officially supported, it has been working for the last few releases.

Impact of the problem:
Downgrade fails

Additional info:
must-gather shows:

When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information.
ClusterID: 9d668a63-310a-45b1-b5f6-0af9fe23caab
ClusterVersion: Updating to "4.7.0-0.nightly-2021-06-17-173140" from "4.8.0-0.nightly-2021-06-18-055840" for 4 hours: Unable to apply 4.7.0-0.nightly-2021-06-17-173140: an unknown error has occurred: MultipleErrors
ClusterOperators:
    clusteroperator/csi-snapshot-controller is degraded because CSISnapshotStaticResourceControllerDegraded: "csi_controller_deployment_pdb.yaml" (string): the server could not find the requested resource
    CSISnapshotStaticResourceControllerDegraded: "webhook_deployment_pdb.yaml" (string): the server could not find the requested resource
    CSISnapshotStaticResourceControllerDegraded:
    clusteroperator/ingress is degraded because Some ingresscontrollers are degraded: ingresscontroller "default" is degraded: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing)

** Please do not disregard the report template; filling the template out as much as possible will allow us to help you. Please consider attaching a must-gather archive (via `oc adm must-gather`). Please review must-gather contents for sensitive information before attaching any must-gathers to a bugzilla report. You may also mark the bug private if you wish.
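For anyone triaging a similar stuck downgrade, the operator table is easier to read once it is filtered down to the rows that matter. The following is a minimal sketch, not part of the original report: `co_stragglers` is a hypothetical helper name, and it assumes the six-column `oc get co` layout shown in this report (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE) with a non-empty VERSION on every row.

```shell
# Hypothetical helper: reads `oc get co` output on stdin and prints every
# operator that is degraded or is not yet at the target version.
# Assumes the column order NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE.
co_stragglers() {
  local target="$1"
  awk -v target="$target" '
    NR == 1 && $1 == "NAME" { next }             # skip the header row
    NF < 6                  { next }             # skip blank or truncated rows
    $5 == "True"            { print $1 ": degraded" }
    $2 != target            { print $1 ": at " $2 }
  '
}
```

It could be used as `./oc get co | co_stragglers 4.7.0-0.nightly-2021-06-17-173140`. An operator that is both degraded and at the wrong version produces two lines.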
Previous downgrade bug: https://bugzilla.redhat.com/show_bug.cgi?id=1971087
The must-gather archive is bigger than the allowed attachment size, but it is available to share. Please let me know with whom I should share it.
Shared must-gather with you @sgreene @mmasters
I have identified the problem and posted a fix: https://github.com/openshift/cluster-ingress-operator/pull/627

The manual workaround would be to delete the command values set on the canary daemonset.
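A rough sketch of what that manual workaround might look like with `oc patch` follows. The resource names are assumptions and do not come from this report: the namespace `openshift-ingress-canary`, the daemonset `ingress-canary`, and the canary container sitting at index 0 of the pod template. The function names are hypothetical as well. Check the daemonset spec on the affected cluster before applying anything.

```shell
# Assumed names, not taken from this report: namespace "openshift-ingress-canary",
# daemonset "ingress-canary", canary container at index 0 of the pod template.

# Prints a JSON Patch (RFC 6902) that removes the container-level command override.
canary_command_patch() {
  printf '%s' '[{"op":"remove","path":"/spec/template/spec/containers/0/command"}]'
}

# Applies the patch to the canary daemonset. Needs a logged-in `oc` session.
# The "remove" op is rejected if the container has no command field set.
remove_canary_command() {
  oc -n openshift-ingress-canary patch daemonset/ingress-canary \
    --type=json -p "$(canary_command_patch)"
}
```

Running `remove_canary_command` once after the downgrade should be enough if the assumptions hold; the fix in the linked PR is the proper resolution.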
Upgraded from 4.7.0-0.nightly-2021-06-30-221453 to 4.8.0-0.nightly-2021-07-01-185624, then downgraded to 4.7.0-0.nightly-2021-06-30-221453. The co/ingress is not degraded.

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.7.0-0.nightly-2021-06-30-221453   True        False         False      80m
baremetal                                  4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h10m
cloud-credential                           4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h15m
cluster-autoscaler                         4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h9m
config-operator                            4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h10m
console                                    4.7.0-0.nightly-2021-06-30-221453   True        False         False      81m
csi-snapshot-controller                    4.7.0-0.nightly-2021-06-30-221453   True        False         True       6h9m
dns                                        4.8.0-0.nightly-2021-07-01-185624   True        False         False      3h25m
etcd                                       4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h9m
image-registry                             4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h
ingress                                    4.7.0-0.nightly-2021-06-30-221453   True        False         False      82m
insights                                   4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h2m
kube-apiserver                             4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h6m
kube-controller-manager                    4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h8m
kube-scheduler                             4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h7m
kube-storage-version-migrator              4.7.0-0.nightly-2021-06-30-221453   True        False         False      3h9m
machine-api                                4.7.0-0.nightly-2021-06-30-221453   True        False         False      5h58m
machine-approver                           4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h10m
machine-config                             4.8.0-0.nightly-2021-07-01-185624   True        False         False      178m
marketplace                                4.7.0-0.nightly-2021-06-30-221453   True        False         False      81m
monitoring                                 4.7.0-0.nightly-2021-06-30-221453   True        False         False      81m
network                                    4.8.0-0.nightly-2021-07-01-185624   True        False         False      6h10m
node-tuning                                4.7.0-0.nightly-2021-06-30-221453   True        False         False      82m
openshift-apiserver                        4.7.0-0.nightly-2021-06-30-221453   True        False         False      3h
openshift-controller-manager               4.7.0-0.nightly-2021-06-30-221453   True        False         False      3h31m
openshift-samples                          4.7.0-0.nightly-2021-06-30-221453   True        False         False      82m
operator-lifecycle-manager                 4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h9m
operator-lifecycle-manager-catalog         4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h9m
operator-lifecycle-manager-packageserver   4.7.0-0.nightly-2021-06-30-221453   True        False         False      6h2m
service-ca                                 4.8.0-0.nightly-2021-07-01-185624   True        False         False      6h10m
storage                                    4.7.0-0.nightly-2021-06-30-221453   True        False         False      3h3m

Please note: the downgrade is still stuck on co/csi-snapshot-controller, and that issue is tracked by another BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1973986

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-07-01-185624   True        True          102m    Unable to apply 4.7.0-0.nightly-2021-06-30-221453: wait has exceeded 40 minutes for these operators: csi-snapshot-controller
OCP engineering has decided not to ship 4.7.20 due to a blocker. The fix for this bug will ship as part of the next z-stream release, 4.7.21, planned for July 27th.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.21 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:2762