Description of problem:
Upgrade from 4.5 to 4.6 and then back to 4.5 doesn't complete. The dns, machine-config, network, and storage operators stay at 4.6.

How reproducible:
Always

Steps to Reproduce:
1. Start with a good 4.5 installation (I used 4.5.0-0.nightly-2020-09-04-102546)
2. Upgrade to 4.6 nightly (I used 4.6.0-fc.4)
   ./oc adm upgrade --to-image=quay.io/openshift-release-dev/ocp-release:4.6.0-fc.4-x86_64 --force --allow-explicit-upgrade
3. Check that the upgrade finishes without problems
4. Downgrade again to 4.5
   ./oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.5.0-0.nightly-2020-09-04-102546 --force --allow-explicit-upgrade

Actual results:
./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.5.0-0.nightly-2020-09-04-102546: the cluster operator storage has not yet successfully rolled out

warning: Cannot display available updates:
  Reason: VersionNotFound
  Message: Unable to retrieve available updates: currently installed version 4.5.0-0.nightly-2020-09-04-102546 not found in the "stable-4.5" channel

./oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-0.nightly-2020-09-04-102546   True        False         False      123m
cloud-credential                           4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h32m
cluster-autoscaler                         4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h23m
config-operator                            4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h24m
console                                    4.5.0-0.nightly-2020-09-04-102546   True        False         False      97m
csi-snapshot-controller                    4.5.0-0.nightly-2020-09-04-102546   True        False         False      125m
dns                                        4.6.0-fc.4                          True        False         False      142m
etcd                                       4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h28m
image-registry                             4.5.0-0.nightly-2020-09-04-102546   True        False         False      131m
ingress                                    4.5.0-0.nightly-2020-09-04-102546   True        False         False      98m
insights                                   4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h25m
kube-apiserver                             4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h27m
kube-controller-manager                    4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h28m
kube-scheduler                             4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h28m
kube-storage-version-migrator              4.5.0-0.nightly-2020-09-04-102546   True        False         False      131m
machine-api                                4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h22m
machine-approver                           4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h27m
machine-config                             4.6.0-fc.4                          True        False         False      122m
marketplace                                4.5.0-0.nightly-2020-09-04-102546   True        False         False      97m
monitoring                                 4.5.0-0.nightly-2020-09-04-102546   True        False         False      124m
network                                    4.6.0-fc.4                          True        False         False      3h30m
node-tuning                                4.5.0-0.nightly-2020-09-04-102546   True        False         False      99m
openshift-apiserver                        4.5.0-0.nightly-2020-09-04-102546   True        False         False      99m
openshift-controller-manager               4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h24m
openshift-samples                          4.5.0-0.nightly-2020-09-04-102546   True        False         False      98m
operator-lifecycle-manager                 4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h29m
operator-lifecycle-manager-catalog         4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h29m
operator-lifecycle-manager-packageserver   4.5.0-0.nightly-2020-09-04-102546   True        False         False      98m
service-ca                                 4.5.0-0.nightly-2020-09-04-102546   True        False         False      3h30m
storage                                    4.6.0-fc.4

Expected results:
Downgrade works and all operators are back to 4.5.
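For anyone triaging a similar rollback, the operators left behind can be picked out of the `oc get co` text mechanically rather than by eye. The following is only a minimal Python sketch, assuming the whitespace-separated table layout shown above; the function name `operators_not_at` and the trimmed `SAMPLE` table are illustrative and not part of any OpenShift tooling.

```python
def operators_not_at(table_text, target_version):
    """Return {operator: version} for rows whose VERSION differs from target_version.

    Assumes the plain-text layout printed by `oc get co`: one operator per
    line, whitespace-separated, with NAME and VERSION as the first two columns.
    """
    mismatched = {}
    for line in table_text.strip().splitlines():
        fields = line.split()
        # Skip the header row and any row that has no version column.
        if len(fields) < 2 or fields[0] == "NAME":
            continue
        name, version = fields[0], fields[1]
        if version != target_version:
            mismatched[name] = version
    return mismatched


# Trimmed copy of the table from this report (hypothetical helper input).
SAMPLE = """\
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.5.0-0.nightly-2020-09-04-102546   True        False         False      123m
dns              4.6.0-fc.4                          True        False         False      142m
machine-config   4.6.0-fc.4                          True        False         False      122m
network          4.6.0-fc.4                          True        False         False      3h30m
storage          4.6.0-fc.4
"""

if __name__ == "__main__":
    target = "4.5.0-0.nightly-2020-09-04-102546"
    for name, version in operators_not_at(SAMPLE, target).items():
        print(f"{name}\t{version}")
```

The sketch only compares the VERSION column, so it tolerates rows like the storage one above where the status columns are missing.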
Downgrades are not supported
Vadim is correct that these are not supported, but it's still nice to understand how they break. These are timing out in CI [1], and bugs in the CI tooling mean we don't gather post-test assets when the test times out, so there are no must-gathers there. Hung, can you attach a must-gather from your run or a reproducer? We'll follow up with the test-platform folks about the lack of CI gathers. Without a must-gather, it's hard to move forward.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.5-to-4.6/1304491478306787328
Still no must-gather. [1] will help with the timing-out CI if we can get it working and land it.

[1]: https://github.com/openshift/ci-tools/pull/1257
Sorry. I missed the request for must-gather. Reproduced it today with 4.5.14 -> 4.6.0-fc.9 -> 4.5.14. Have the must-gather. Let me know what I should do.
I reproduced this with 4.5.13 -> 4.6 -> 4.5.13

./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.5.13: the cluster operator storage has not yet successfully rolled out

Updates:

VERSION IMAGE
4.5.14  quay.io/openshift-release-dev/ocp-release@sha256:95cfe9273aecb9a0070176210477491c347f8e69e41759063642edf8bb8aceb6

./oc get co
NAME                      VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
csi-snapshot-controller   4.5.13       True        False         False      101m
dns                       4.6.0-rc.4   True        False         False      124m
machine-config            4.6.0-rc.4   True        False         False      98m
marketplace               4.5.13       True        False         False      66m
monitoring                4.5.13       True        False         False      65m
network                   4.6.0-rc.4   True        False         False      5h45m
storage                   4.6.0-rc.4   True        False         False      143m

./oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": null,
    "image": "quay.io/openshift-release-dev/ocp-release:4.5.13-x86_64",
    "startedTime": "2020-10-15T00:37:13Z",
    "state": "Partial",
    "verified": false,
    "version": "4.5.13"
  },
  {
    "completionTime": "2020-10-15T00:11:58Z",
    "image": "quay.io/openshift-release-dev/ocp-release:4.6.0-rc.4-x86_64",
    "startedTime": "2020-10-14T23:05:55Z",
    "state": "Completed",
    "verified": false,
    "version": "4.6.0-rc.4"
  },
  {
    "completionTime": "2020-10-14T20:27:28Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:8d104847fc2371a983f7cb01c7c0a3ab35b7381d6bf7ce355d9b32a08c0031f0",
    "startedTime": "2020-10-14T20:01:02Z",
    "state": "Completed",
    "verified": false,
    "version": "4.5.13"
  }
]

I have the must-gather, but it is too big to attach here. Please let me know whom it should be shared with. Thanks.
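The history above shows the rollback entry stuck in the "Partial" state with no completion time. As a rough illustration, a few lines of Python can flag such entries from the same `jq` output. This is a sketch under assumptions: the helper name `unfinished_updates` is hypothetical, and `HISTORY_JSON` is an abbreviated copy of the history from this comment, keeping only the fields the check needs.

```python
import json

# Abbreviated copy of .items[0].status.history from this comment.
HISTORY_JSON = """
[
  {"completionTime": null, "startedTime": "2020-10-15T00:37:13Z",
   "state": "Partial", "version": "4.5.13"},
  {"completionTime": "2020-10-15T00:11:58Z", "startedTime": "2020-10-14T23:05:55Z",
   "state": "Completed", "version": "4.6.0-rc.4"},
  {"completionTime": "2020-10-14T20:27:28Z", "startedTime": "2020-10-14T20:01:02Z",
   "state": "Completed", "version": "4.5.13"}
]
"""


def unfinished_updates(history):
    """Return (version, state) for every history entry not marked Completed."""
    return [
        (entry["version"], entry["state"])
        for entry in history
        if entry.get("state") != "Completed"
    ]


if __name__ == "__main__":
    for version, state in unfinished_updates(json.loads(HISTORY_JSON)):
        print(version, state)
```

Entries are listed newest first in the history, so the first tuple returned corresponds to the update currently in progress.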
Closing this as duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1882394 *** This bug has been marked as a duplicate of bug 1882394 ***
I'm experiencing it when upgrading an AWS cluster from 4.6.9 -> 4.7.0-fc.1 -> 4.6.9. Must-gather is available online: https://drive.google.com/file/d/1ykM5ikJb-SwDZ29dJqJcsMk76-AlydlG/view?usp=sharing

$ oc describe co/storage
Status:
  Conditions:
    Last Transition Time:  2021-01-06T04:15:30Z
    Message:               AWSEBSCSIDriverOperatorCRDegraded: ResourceSyncControllerDegraded: configmaps "kube-cloud-config" is forbidden: User "system:serviceaccount:openshift-cluster-csi-drivers:aws-ebs-csi-driver-operator" cannot get resource "configmaps" in API group "" in the namespace "openshift-config-managed"
    Reason:                AWSEBSCSIDriverOperatorCR_ResourceSyncController_Error
    Status:                True
    Type:                  Degraded
Yang, this bug was raised for a different issue, i.e. "the cluster operator storage has not yet successfully rolled out". Please raise a new bug for the error you reported.

*** This bug has been marked as a duplicate of bug 1882394 ***