+++ This bug was initially created as a clone of Bug #1991938 +++

Description of problem:
Upgrading descheduler operator from 4.8 to 4.9 fails with error “CRD removes version v1beta1 that is listed as a stored version on the existing CRD”

Version-Release number of selected component (if applicable):
clusterkubedescheduleroperator.4.9.0-202108050954

How reproducible:
Hit once

Steps to Reproduce:
1. Install 4.6 descheduler operator
2. Upgrade from 4.6 -> 4.7 -> 4.8 -> 4.9

Actual results:
Upgrading from 4.8 to 4.9 fails

[knarra@knarra ~]$ oc get csv -n openshift-kube-descheduler-operator
NAME                                                DISPLAY                            VERSION              REPLACES                                            PHASE
clusterkubedescheduleroperator.4.8.0-202107291502   Kube Descheduler Operator          4.8.0-202107291502   clusterkubedescheduleroperator.4.7.0-202107292319   Replacing
clusterkubedescheduleroperator.4.9.0-202108050954   Kube Descheduler Operator          4.9.0-202108050954   clusterkubedescheduleroperator.4.8.0-202107291502   Pending
elasticsearch-operator.5.2.0-28                     OpenShift Elasticsearch Operator   5.2.0-28             elasticsearch-operator.5.1.1-27                     Succeeded

status:
  bundleLookups:
  - catalogSourceRef:
      name: qe-app-registry
      namespace: openshift-marketplace
    identifier: clusterkubedescheduleroperator.4.9.0-202108050954
    path: registry-proxy.engineering.redhat.com/rh-osbs/openshift-ose-cluster-kube-descheduler-operator-bundle@sha256:f2be2b8652b6e98693cafc4ed311c95d2ce6af9c458bedad41341703dae3e1d1
    properties: '{"properties":[{"type":"olm.gvk","value":{"group":"operator.openshift.io","kind":"KubeDescheduler","version":"v1"}},{"type":"olm.package","value":{"packageName":"cluster-kube-descheduler-operator","version":"4.9.0-202108050954"}}]}'
    replaces: clusterkubedescheduleroperator.4.8.0-202107291502
  catalogSources: []
  conditions:
  - lastTransitionTime: "2021-08-10T10:03:01Z"
    lastUpdateTime: "2021-08-10T10:03:01Z"
    message: 'risk of data loss updating "kubedeschedulers.operator.openshift.io": new CRD removes version v1beta1 that is listed as a stored version on the existing CRD'
    reason: InstallComponentFailed
    status: "False"
    type: Installed
  message: 'risk of data loss updating "kubedeschedulers.operator.openshift.io": new CRD removes version v1beta1 that is listed as a stored version on the existing CRD'
  phase: Failed
  plan:
  - resolving: clusterkubedescheduleroperator.4.9.0-202108050954
    resource:
      group: operators.coreos.com
      kind: ClusterServiceVersion

Expected results:
Upgrade should be successful

Additional info:

Moving the bug back to assigned state as I still hit the same issue.

Conditions:
  Last Transition Time:  2021-08-24T16:21:59Z
  Last Update Time:      2021-08-24T16:21:59Z
  Message:               risk of data loss updating "kubedeschedulers.operator.openshift.io": new CRD removes version v1beta1 that is listed as a stored version on the existing CRD
  Reason:                InstallComponentFailed
  Status:                False
  Type:                  Installed
Message:                 risk of data loss updating "kubedeschedulers.operator.openshift.io": new CRD removes version v1beta1 that is listed as a stored version on the existing CRD
Phase:                   Failed
Plan:
  Resolving:  clusterkubedescheduleroperator.4.9.0-202108221722
  Resource:
    Group:             operators.coreos.com
    Kind:              ClusterServiceVersion
    Manifest:          {"kind":"ConfigMap","name":"bba9d26db310eed6d2f206561382b49752abb4a151e9f83b2ed75ec314da13a","namespace":"openshift-marketplace","catalogSourceName":"qe-app-registry","catalogSourceNamespace":"openshift-marketplace","replaces":"clusterkubedescheduleroperator.4.8.0-202108181331","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"operator.openshift.io\",\"kind\":\"KubeDescheduler\",\"version\":\"v1\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"cluster-kube-descheduler-operator\",\"version\":\"4.9.0-202108221722\"}}]}"}
    Name:              clusterkubedescheduleroperator.4.9.0-202108221722
    Source Name:       qe-app-registry
    Source Namespace:  openshift-marketplace
    Version:           v1alpha1
  Status:     Created
  Resolving:  clusterkubedescheduleroperator.4.9.0-202108221722
  Resource:
    Group:             apiextensions.k8s.io
    Kind:              CustomResourceDefinition
    Manifest:          {"kind":"ConfigMap","name":"bba9d26db310eed6d2f206561382b49752abb4a151e9f83b2ed75ec314da13a","namespace":"openshift-marketplace","catalogSourceName":"qe-app-registry","catalogSourceNamespace":"openshift-marketplace","replaces":"clusterkubedescheduleroperator.4.8.0-202108181331","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"operator.openshift.io\",\"kind\":\"KubeDescheduler\",\"version\":\"v1\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"cluster-kube-descheduler-operator\",\"version\":\"4.9.0-202108221722\"}}]}"}
    Name:              kubedeschedulers.operator.openshift.io
    Source Name:       qe-app-registry
    Source Namespace:  openshift-marketplace
    Version:           v1
  Status:     Unknown
Is this a 4.9 blocker+ because it blocks upgrades of clusters with descheduler installed?
Compare https://github.com/openshift/cluster-kube-descheduler-operator/pull/210#issuecomment-906164149

From the cluster provided by Rama, etcd stores the cluster object as operator.openshift.io/v1, not operator.openshift.io/v1beta1:

etcdctl get /kubernetes.io/operator.openshift.io/kubedeschedulers/openshift-kube-descheduler-operator/cluster
/kubernetes.io/operator.openshift.io/kubedeschedulers/openshift-kube-descheduler-operator/cluster
{"apiVersion":"operator.openshift.io/v1","kind":"KubeDescheduler","metadata":{"creationTimestamp":"2021-08-26T15:06:39Z","generation":2,"managedFields":[{"apiVersion":"operator.openshift.io/v1beta1","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{".":{},"f:deschedulingIntervalSeconds":{},"f:image":{}}},"manager":"Mozilla","operation":"Update","time":"2021-08-26T15:06:39Z"},{"apiVersion":"operator.openshift.io/v1beta1","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:logLevel":{},"f:operatorLogLevel":{}},"f:status":{".":{},"f:readyReplicas":{}}},"manager":"cluster-kube-descheduler-operator","operation":"Update","time":"2021-08-26T15:06:39Z"},{"apiVersion":"operator.openshift.io/v1beta1","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:profiles":{}}},"manager":"kubectl-edit","operation":"Update","time":"2021-08-26T16:22:42Z"},{"apiVersion":"operator.openshift.io/v1","fieldsType":"FieldsV1","fieldsV1":{"f:status":{"f:generations":{}}},"manager":"cluster-kube-descheduler-operator","operation":"Update","subresource":"status","time":"2021-08-27T07:07:42Z"}],"name":"cluster","namespace":"openshift-kube-descheduler-operator","uid":"0e6a6282-3efa-47ad-8e32-8177a1f5ce1b"},"spec":{"deschedulingIntervalSeconds":3600,"logLevel":"Normal","operatorLogLevel":"Normal","profiles":["LifecycleAndUtilization","TopologyAndDuplicates","LifecycleAndUtilization"]},"status":{"generations":[{"group":"apps","hash":"","lastGeneration":4,"name":"cluster","namespace":"openshift-kube-descheduler-operator","resource":"deployments"}],"readyReplicas":0}}
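
Going by the wording of the failure message, the upgrade appears to be blocked by the versions listed as stored on the existing CRD, not by what etcd actually holds. Here is a minimal sketch of that comparison, assuming both version lists are given as space-separated strings. The helper name `removed_stored_versions` is hypothetical, and this is not OLM's actual code:

```shell
# Hypothetical helper, not OLM's implementation: print every version that is
# listed as stored on the existing CRD but is absent from the new CRD.
#   $1: space-separated stored versions of the existing CRD
#   $2: space-separated versions defined by the new CRD
removed_stored_versions() {
  for v in $1; do
    case " $2 " in
      *" $v "*) ;;        # version still exists in the new CRD
      *) echo "$v" ;;     # version is removed while still listed as stored
    esac
  done
}

# Values matching this bug: the existing CRD lists v1beta1 and v1 as stored,
# the new CRD only defines v1.
removed_stored_versions "v1beta1 v1" "v1"
```

On a live cluster the first argument could presumably be taken from `oc get crd kubedeschedulers.operator.openshift.io -o jsonpath='{.status.storedVersions[*]}'`. Any output from the helper means the new CRD would trip the "risk of data loss" condition.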

So one can do the following:

1. Create a migration manifest:
```
apiVersion: migration.k8s.io/v1alpha1
kind: StorageVersionMigration
metadata:
  name: operator-kubedescheduler-storage-version-migration
spec:
  resource:
    group: operator.openshift.io
    resource: kubedeschedulers
    version: v1beta1
```

2. Update the CRD and remove v1beta1 from the .status.storedVersions field:
```
oc proxy --port=8080 &
curl -d '[{ "op": "replace", "path":"/status/storedVersions", "value": ["v1"] }]' -H "Content-Type: application/json-patch+json" -X PATCH http://localhost:8080/apis/apiextensions.k8s.io/v1/customresourcedefinitions/kubedeschedulers.operator.openshift.io/status
```

However, at this point the InstallPlan is already in a failed state and the catalog operator no longer retries it, so the upgrade does not finish on its own.
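
For the storedVersions patch in step 2, the JSON Patch body can be generated from the list of versions to keep instead of being typed by hand. This is only a sketch: the helper name `stored_versions_patch` is made up for illustration, and it assumes the version names contain no characters that need JSON escaping:

```shell
# Hypothetical helper: build a JSON Patch body that replaces
# /status/storedVersions with the versions passed as arguments.
stored_versions_patch() {
  values=""
  for v in "$@"; do
    # Append each version as a quoted JSON string, comma-separated.
    values="${values:+$values,}\"$v\""
  done
  printf '[{"op":"replace","path":"/status/storedVersions","value":[%s]}]\n' "$values"
}

# Keep only v1, as in the curl command from this comment.
stored_versions_patch v1
```

The printed body could then be passed to the `-d` option of the curl call, for example `curl -d "$(stored_versions_patch v1)" ...` with the same headers and URL as before.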

The fix for this is in https://github.com/openshift/cluster-kube-descheduler-operator/pull/215.

However, we have two bugs: this one and https://bugzilla.redhat.com/show_bug.cgi?id=1991938 for 4.8, and the fix is only going into the 4.8 branch. Since this is unrelated to 4.9 (we have identified that the fix needs to go into the 4.8 operator), I am going to close this BZ, as it is blocking the 4.8 bug from merging. If there is any objection, feel free to reopen.

*** This bug has been marked as a duplicate of bug 1991938 ***