Description of problem (please be as detailed as possible and provide log snippets):

Version of ocs-storagecluster should be set to 4.4.1 when using OCS 4.4.1, but it is set to 4.4.0.

Version of all relevant components (if applicable):
OCP 4.5.0-rc.7
OCS ocs-operator.v4.4.1-476.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
no

Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Check in terminal:

$ oc get -n openshift-storage storagecluster
NAME                 AGE   PHASE   CREATED AT             VERSION
ocs-storagecluster   86m   Ready   2020-07-09T13:43:09Z   4.4.0

Actual results:
Version is set to 4.4.0

Expected results:

$ oc get -n openshift-storage storagecluster
NAME                 AGE   PHASE   CREATED AT             VERSION
ocs-storagecluster   86m   Ready   2020-07-09T13:43:09Z   4.4.1

Additional info:
Was this installed with 4.4.0 and then upgraded to 4.4.1?

@Filip?

I thought we had a similar BZ recently where it was explained that this version has no relevance and is only set once upon installation, and is not changed in upgrades. But I cannot find the BZ right now.

@Jose?
No, it was without upgrade.
(In reply to Michael Adam from comment #2)
> was this installed with 4.4.0 and then upgraded to 4.4.1?
>
> @Filip?
>
> I thought we had a similar BZ recently where it was explained that this
> version has no relevance and is only set once upon installation and not
> changed in upgrades. But I can not find the BZ right now.
>
> @Jose?

I think this is the BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1839988 but we are talking about a fresh installation here, not an upgrade.
The CSV picks up the proper version, but the storage cluster doesn't.

[muagarwa@dhcp53-217 agarwal-mudit]$ oc get csv -n openshift-storage
NAME                            DISPLAY                       VERSION   REPLACES                        PHASE
lib-bucket-provisioner.v2.0.0   lib-bucket-provisioner        2.0.0     lib-bucket-provisioner.v1.0.0   Succeeded
ocs-operator.v4.4.1             OpenShift Container Storage   4.4.1                                     Installing

[muagarwa@dhcp53-217 agarwal-mudit]$ oc get storagecluster -A
NAMESPACE           NAME                 AGE    PHASE   CREATED AT             VERSION
openshift-storage   ocs-storagecluster   8m5s   Ready   2020-07-17T04:11:59Z   4.4.0
[muagarwa@dhcp53-217 agarwal-mudit]$

I guess the storagecluster CR is created during OCS installation, and at least the GUI doesn't display minor versions as an option, so this may be expected. But I am no expert, moving it to ocs-operator.

See also: https://bugzilla.redhat.com/show_bug.cgi?id=1839988
Looks like the version comes from github.com/openshift/ocs-operator/version/version.go, which is not updated for minor versions.

In pkg/controller/storagecluster/reconcile.go:

-----
// versionCheck populates the `.Spec.Version` field
func versionCheck(sc *ocsv1.StorageCluster, reqLogger logr.Logger) error {
	if sc.Spec.Version == "" {
		sc.Spec.Version = version.Version
-----
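As a sketch of why the field goes stale: the quoted logic only sets the version when it is empty, so anything written at install time is never raised afterwards. A hypothetical variant (not the actual operator code; `StorageCluster` and `compareSemver` are simplified stand-ins for illustration) that also bumps an older stored version could look like this:

```go
// Hypothetical sketch, not the real ocs-operator code: a versionCheck that
// bumps a stale stored version instead of only setting it once.
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// operatorVersion stands in for version.Version from version/version.go.
const operatorVersion = "4.4.1"

// StorageCluster is a simplified stand-in for the real CR type.
type StorageCluster struct {
	SpecVersion string
}

// compareSemver compares dotted numeric versions, returning -1, 0, or 1.
func compareSemver(a, b string) int {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var ai, bi int
		if i < len(as) {
			ai, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			bi, _ = strconv.Atoi(bs[i])
		}
		if ai != bi {
			if ai < bi {
				return -1
			}
			return 1
		}
	}
	return 0
}

// versionCheck sets the version when empty, and raises it when the running
// operator is newer, covering both fresh installs and upgrades.
func versionCheck(sc *StorageCluster) {
	if sc.SpecVersion == "" || compareSemver(sc.SpecVersion, operatorVersion) < 0 {
		sc.SpecVersion = operatorVersion
	}
}

func main() {
	sc := &StorageCluster{SpecVersion: "4.4.0"}
	versionCheck(sc)
	fmt.Println(sc.SpecVersion) // prints 4.4.1
}
```

Of course this only helps if version.go itself carries the z-stream value, which is the root issue reported here.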
I am not sure I can verify this without an additional scratch build for 4.9.0-129.ci.

When I checked the version of storageCluster in one of the 4.9 builds:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/dnd-jopinto-aws-perf-reruns/dnd-jopinto-aws-perf-reruns_20210927T103353/logs/deployment_1632739410/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-763a982dce2ab3c61da7f08e9bb4f02e8bb33f85430a09f6d37484656f608242/namespaces/openshift-storage/oc_output/storagecluster.yaml

it's: version: 4.9.0

After an upgrade from 4.8 to 4.9 I still see it reported as 4.8.0 here:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-013vue1cslv33-uan/j-013vue1cslv33-uan_20210922T210338/logs/deployment_1632345146/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-bfb5c6e78f74c584cf169e1f431d687314ab48472dddc46fe6767a836ea4bb3e/namespaces/openshift-storage/oc_output/storagecluster.yaml

But AFAIU this bug is not about fixing the issue on upgrade, only for fresh deployments.

Boris and Mudit, can we get a scratch build for 4.9.1 to be able to verify this BZ? Thanks
I guess we need a 4.9.1 build for the internal build upgrade issue as well. Boris, please help.
I am not sure if this fix will also fix the upgrade case. I remember there was another bug for upgrades (opened even longer ago), but I cannot find it now :(
@branto any update here, please? I cannot continue with verification, so I am moving it back to Assigned; without a scratch build of 4.9.1 I cannot test it.
We still have the fix in the downstream branch, so MODIFIED is the appropriate state here until it is testable by QA. Again, non-testability doesn't make it FailedQA, it's just not testable at the moment.

Boris, can we provide a build here?
(In reply to Mudit Agarwal from comment #18)
> We still have the fix in the downstream branch, so MODIFIED is the apt state
> here till it is testable by QA.
> Again non-testability doesn't make it FailedQA, its just not testable atm.
>
> Boris, can we provide a build here?

I've started a 4.9.1 build here:
https://ceph-downstream-jenkins-csb-storage.apps.ocp4.prod.psi.redhat.com/view/OCS%204.X%20CI%20CD/job/OCS%20Build%20Pipeline%204.9/184/

Thanks,
Andrew
A testing build for 4.9.1 is now available: 4.9.1-186.ci
Thanks, running verification here: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/6660/ Will check storageCluster CR once it's deployed and paused.
bug-storageCluster $ oc get -n openshift-storage storagecluster
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   32m   Ready              2021-10-13T09:57:07Z   4.9.0

bug-storageCluster $ oc get csv -n openshift-storage
NAME                     DISPLAY                       VERSION   REPLACES              PHASE
noobaa-operator.v4.9.1   NooBaa Operator               4.9.1                           Succeeded
ocs-operator.v4.9.1      OpenShift Container Storage   4.9.1     ocs-operator.v4.9.0   Succeeded
odf-operator.v4.9.1      OpenShift Data Foundation     4.9.1

I see that in this build the version of storageCluster was not updated to 4.9.1. Not sure if it's a problem with how testing build 4.9.1-186.ci was made. Anyway, with what I got, I have to fail QE.
Looks like this is not yet fixed, and it is not important enough to make it a blocker for 4.9. We can give it a try again in 4.10 if it is still relevant.
At present, ocs-operator gets its versioning from a version.go file:

https://github.com/red-hat-storage/ocs-operator/blob/main/version/version.go
https://github.com/red-hat-storage/ocs-operator/blob/release-4.10/version/version.go
https://github.com/red-hat-storage/ocs-operator/blob/release-4.9/version/version.go

So we need to update all branches with the appropriate values. Just changing the string should be sufficient.
version.Version is assigned a value at build time here: [0]

I opened a PR for this because changes to the above env var were only reflected in the status field, not spec.version (as only status was updated: [1]), but it's on hold for now because we need to rework the version logic a bit: [2].

[0]: https://github.com/red-hat-storage/ocs-operator/blob/0580908d99610dfddfc46c09ed9cb67effadb127/hack/go-build.sh#L10
[1]: https://github.com/red-hat-storage/ocs-operator/pull/1447/files#diff-4160c186e4e6b63623e530892b0aa7f6385e7f60f45a5f3da944492ea6211787L163
[2]: https://github.com/red-hat-storage/ocs-operator/pull/1447#pullrequestreview-847397536
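For context, build-time version assignment in Go is commonly done with the linker's -X flag overriding a package-level variable. This is only a generic sketch of that pattern (the exact variable path and default used by ocs-operator's go-build.sh may differ):

```go
// Generic sketch of the build-time version-injection pattern; the actual
// variable path and default in ocs-operator may differ.
package main

import "fmt"

// Version holds a development default. A release build overrides it with:
//   go build -ldflags "-X main.Version=4.9.1"
var Version = "4.9.0-dev"

func main() {
	fmt.Println("operator version:", Version)
}
```

Run without ldflags, this prints the development default; a release pipeline passing -ldflags gets the real version baked into the binary, which is why only the build script (not the source) needs to know the z-stream value.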
Can't be fixed before 4.10 dev freeze
The main thing that needs consideration is upgrade, and specifically that you can't remove fields (only deprecate them). So you would need to retain the Spec field when you add the Status field. Then you would need some logic to update the StorageCluster CR to use the new Status field and clear the Spec field. I suppose we would also need to update our CRD display information to pull the version from the Status as well.

If we have buy-in from the Console devs, we should be able to do it all in the same release; otherwise we would add the new field in one release and then the upgrade logic later. They just have to be able to deal with the new field by the time we officially deprecate the old one.
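The migration step described above could be sketched roughly as follows. This is hypothetical illustration only, with simplified stand-in types rather than the real StorageCluster API; the real change would also need CRD printer-column updates and an actual client update call:

```go
// Hypothetical sketch of moving the deprecated Spec.Version into a new
// Status.Version during upgrade. Types are simplified stand-ins, not the
// real StorageCluster API.
package main

import "fmt"

type StorageClusterSpec struct {
	Version string // deprecated: retained only so old CRs still deserialize
}

type StorageClusterStatus struct {
	Version string // new authoritative location for the version
}

type StorageCluster struct {
	Spec   StorageClusterSpec
	Status StorageClusterStatus
}

// migrateVersion copies a value left in Spec.Version by an older operator
// into Status.Version and clears the Spec field. It reports whether the CR
// changed, so a reconciler would know to issue an update.
func migrateVersion(sc *StorageCluster) bool {
	if sc.Spec.Version == "" {
		return false // nothing to migrate
	}
	if sc.Status.Version == "" {
		sc.Status.Version = sc.Spec.Version
	}
	sc.Spec.Version = ""
	return true
}

func main() {
	sc := &StorageCluster{Spec: StorageClusterSpec{Version: "4.8.0"}}
	changed := migrateVersion(sc)
	fmt.Println(changed, sc.Status.Version, sc.Spec.Version)
}
```

The guard on a non-empty Status.Version keeps the migration idempotent, so running the reconcile loop repeatedly can't clobber a version already recorded in Status.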
Not a 4.12 blocker
The fix should be available to test on all ODF 4.13 builds after 4.13.0-93. As QE acks are already there, moving to ON_QA.
In order to test this we need to have some dummy 4.13.1 build. @branto, can you please provide such a build?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:3742