Description of problem (please be detailed as possible and provide log snippets):
----------------------------------------------------------------------
This BZ is spun off from Bug 1832889, based on comment#8 (Bug 1832889#c8), to track the CSI version upgrade issue.

On a cluster with OCP 4.5 and OCS 4.4, an upgrade to OCS 4.5 was performed via the CLI. The upgrade was reported as successful; however, the csi-* and osd pods were not upgraded and were still on OCS 4.4 builds.

Bug tracking the OSD version issue: Bug 1832889

LOGS AND OUTPUTS FROM THE CLUSTER:
==================================
Logs available at: http://rhsqe-repo.lab.eng.blr.redhat.com/cns/ocs-qe-bugs/1832889/

$ oc get csv
NAME                         DISPLAY                       VERSION        REPLACES                     PHASE
ocs-operator.v4.5.0-419.ci   OpenShift Container Storage   4.5.0-419.ci   ocs-operator.v4.4.0-414.ci   Succeeded

sh-4.4# ceph versions
{
    "mon": {
        "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)": 3
    },
    "mgr": {
        "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)": 3
    },
    "mds": {
        "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)": 2
    },
    "rgw": {
        "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)": 1
    },
    "overall": {
        "ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)": 3,
        "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)": 7
    }
}

Version of all relevant components (if applicable):
4.5.0-0.nightly-2020-05-04-113741
ocs-operator.v4.5.0-419.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, a partial upgrade results in a mismatch of versions across the Ceph components and prevents the user from accessing the features of the latest release.

Is there any workaround available to the best of your knowledge?
Not that I am aware of

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
--------------------------------------------------------------------------------
2

Is this issue reproducible?
--------------------------------------------------------------------------------
I have tried this only once

Can this issue be reproduced from the UI?
--------------------------------------------------------------------------------
The upgrade was done via CLI

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
--------------------------------------------------------------------------------
On an OCP 4.5 + OCS 4.4 cluster, perform an upgrade to OCS 4.5 as follows:

1. oc edit catsrc/ocs-catalogsource -n openshift-marketplace
     image: quay.io/rhceph-dev/ocs-olm-operator:4.5.0-419.ci

2. oc edit subscriptions.operators.coreos.com ocs-subscription
     spec:
       channel: stable-4.5

3. Wait for the upgrade to complete. Check the csv status:
     $ oc get csv
     NAME                         DISPLAY                       VERSION        REPLACES                     PHASE
     ocs-operator.v4.5.0-419.ci   OpenShift Container Storage   4.5.0-419.ci   ocs-operator.v4.4.0-414.ci   Succeeded

4. Check the status of the pods and their versions, esp. the CSI plugin and provisioner pods

Actual results:
--------------------------------------------------------------------------------
csi-* pods were not upgraded and were still running with OCS 4.4 builds

Expected results:
--------------------------------------------------------------------------------
All the pods should be upgraded to the latest OCS 4.5 builds.
There should be no mismatch in the versions across different components.

Additional Info
=======================
None of the CSI pods were upgraded to the required version either:

cephcsi:4.5-2.b38f2c5c.release_4.5
quay.io/rhceph-dev/cephcsi@sha256:86087a7123945ce4f7f720539693395e5a6fc8175318d050d0d983af8ea0e216

>> Builds in the CSV

    - name: ROOK_CSI_CEPH_IMAGE
      value: quay.io/rhceph-dev/cephcsi@sha256:86087a7123945ce4f7f720539693395e5a6fc8175318d050d0d983af8ea0e216
    - name: ROOK_CSI_REGISTRAR_IMAGE
      value: registry.redhat.io/openshift4/ose-csi-driver-registrar@sha256:b17e943c72cfd2696db2388e817739c23c0427dde4737e14cf58a5f5db50ce60
    - name: ROOK_CSI_RESIZER_IMAGE
      value: registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:e7302652fe3f698f8211742d08b2dcea9d77925de458eb30c20789e12ee7ae33
    - name: ROOK_CSI_PROVISIONER_IMAGE
      value: registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:49b470f8f5ce1edb883a03a0b6a726add01fb762cfd42f8941d6841f7d776318
    - name: ROOK_CSI_ATTACHER_IMAGE
      value: registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:fb9f73ed22b4241eba25e71b63aa6729daa2d7e9bce6a13a060fe4c236735140
    image: quay.io/rhceph-dev/rook-ceph@sha256:e4e20a1e8756a8b9847def42a60aa117d8ab5633c6eaec3f8013132c2800c72c

>> Builds in one of the prov pods

csi-cephfsplugin-provisioner-679dd5d8b5-67pwr
====
    Image:     registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:e07525ae9a8a772ac2e7db1b8f8d8df2dcbc79d66792f570577a7904858b6abb
    Image ID:  registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:e07525ae9a8a772ac2e7db1b8f8d8df2dcbc79d66792f570577a7904858b6abb
    Image:     registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:e7302652fe3f698f8211742d08b2dcea9d77925de458eb30c20789e12ee7ae33
    Image ID:  registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:e7302652fe3f698f8211742d08b2dcea9d77925de458eb30c20789e12ee7ae33
    Image:     registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:9fc69e94f111343a6482e94e413e267dfc3ba17973c321da8fe20f1ac4c09155
    Image ID:  registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:9fc69e94f111343a6482e94e413e267dfc3ba17973c321da8fe20f1ac4c09155
    Image:     quay.io/rhceph-dev/cephcsi@sha256:9c55c32aa16e719888c408effe4e800495a70501e82c7a463bc826e3d8b5130f
    Image ID:  quay.io/rhceph-dev/cephcsi@sha256:9c55c32aa16e719888c408effe4e800495a70501e82c7a463bc826e3d8b5130f
    Image:     quay.io/rhceph-dev/cephcsi@sha256:9c55c32aa16e719888c408effe4e800495a70501e82c7a463bc826e3d8b5130f
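The manual comparison of CSV images against pod images can be sketched in Python. This is only an illustration, not part of the original report or of any OCS tooling: the function names (split_image, find_stale_images) are made up for this sketch, and the image references are copied from the CSV and from pod csi-cephfsplugin-provisioner-679dd5d8b5-67pwr in the description. It assumes every reference is pinned by digest in the form repo@sha256:digest.

```python
# Image references the 4.5 CSV asks for (copied from the bug description).
CSV_IMAGES = [
    "quay.io/rhceph-dev/cephcsi@sha256:86087a7123945ce4f7f720539693395e5a6fc8175318d050d0d983af8ea0e216",
    "registry.redhat.io/openshift4/ose-csi-driver-registrar@sha256:b17e943c72cfd2696db2388e817739c23c0427dde4737e14cf58a5f5db50ce60",
    "registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:e7302652fe3f698f8211742d08b2dcea9d77925de458eb30c20789e12ee7ae33",
    "registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:49b470f8f5ce1edb883a03a0b6a726add01fb762cfd42f8941d6841f7d776318",
    "registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:fb9f73ed22b4241eba25e71b63aa6729daa2d7e9bce6a13a060fe4c236735140",
]

# Image references actually running in the provisioner pod (same source).
POD_IMAGES = [
    "registry.redhat.io/openshift4/ose-csi-external-attacher@sha256:e07525ae9a8a772ac2e7db1b8f8d8df2dcbc79d66792f570577a7904858b6abb",
    "registry.redhat.io/openshift4/ose-csi-external-resizer-rhel7@sha256:e7302652fe3f698f8211742d08b2dcea9d77925de458eb30c20789e12ee7ae33",
    "registry.redhat.io/openshift4/ose-csi-external-provisioner-rhel7@sha256:9fc69e94f111343a6482e94e413e267dfc3ba17973c321da8fe20f1ac4c09155",
    "quay.io/rhceph-dev/cephcsi@sha256:9c55c32aa16e719888c408effe4e800495a70501e82c7a463bc826e3d8b5130f",
]


def split_image(ref):
    """Split 'repo@sha256:digest' into (repo, digest)."""
    repo, _, digest = ref.partition("@")
    return repo, digest


def find_stale_images(expected_refs, running_refs):
    """Return {repo: (expected_digest, running_digest)} for every running
    image whose repository is listed in the CSV but whose digest differs."""
    expected = dict(split_image(ref) for ref in expected_refs)
    stale = {}
    for ref in running_refs:
        repo, digest = split_image(ref)
        if repo in expected and expected[repo] != digest:
            stale[repo] = (expected[repo], digest)
    return stale


if __name__ == "__main__":
    for repo, (want, have) in find_stale_images(CSV_IMAGES, POD_IMAGES).items():
        print(f"{repo}\n  expected: {want}\n  running:  {have}")
```

With the data above, the attacher, provisioner and cephcsi images are reported as stale, while the resizer digest already matches the CSV.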
This is a bug against 4.5, and if I understand it correctly, Madhu's patch is also against the target version of the upgrade, not the starting version.
Looks OK in build ocs-operator.v4.5.0-508.ci

oc exec rook-ceph-tools-66b74bdf95-qft52 -- ceph versions
{
    "mon": {
        "ceph version 14.2.8-81.el8cp (0336e23b7404496341b988c8057538b8185ca5ec) nautilus (stable)": 4
    },
    "mgr": {
        "ceph version 14.2.8-81.el8cp (0336e23b7404496341b988c8057538b8185ca5ec) nautilus (stable)": 1
    },
    "osd": {
        "ceph version 14.2.8-81.el8cp (0336e23b7404496341b988c8057538b8185ca5ec) nautilus (stable)": 3
    },
    "mds": {},
    "overall": {
        "ceph version 14.2.8-81.el8cp (0336e23b7404496341b988c8057538b8185ca5ec) nautilus (stable)": 8
    }
}
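A check like the one above can also be scripted. The following Python sketch groups daemon types by the Ceph version they report and flags a mixed-version cluster. It is an illustration only, not the verification procedure used for this bug: the function names (daemons_by_version, is_consistent) are hypothetical, and the sample input is the pre-fix `ceph versions` output from the bug description. It assumes the command output is the JSON document shown in this report.

```python
import json

# Version strings and counts copied from the pre-fix output in the description.
OLD = "ceph version 14.2.4-125.el8cp (db63624068590e593c47150c7574d08c1ec0d3e4) nautilus (stable)"
NEW = "ceph version 14.2.8-35.el8cp (b32eac9fd60c00c62a2d3c85d88b483be7b55ba1) nautilus (stable)"
PRE_FIX_OUTPUT = json.dumps({
    "mon": {NEW: 3},
    "mgr": {NEW: 1},
    "osd": {OLD: 3},
    "mds": {NEW: 2},
    "rgw": {NEW: 1},
    "overall": {OLD: 3, NEW: 7},
})


def daemons_by_version(ceph_versions_json):
    """Map each reported version string to the daemon types running it.

    The "overall" summary is skipped; daemon types with no running
    instances (an empty object) contribute nothing.
    """
    report = json.loads(ceph_versions_json)
    groups = {}
    for daemon, counts in report.items():
        if daemon == "overall":
            continue
        for version in counts:
            groups.setdefault(version, []).append(daemon)
    return groups


def is_consistent(ceph_versions_json):
    """True when all daemons report at most one distinct version."""
    return len(daemons_by_version(ceph_versions_json)) <= 1


if __name__ == "__main__":
    for version, daemons in daemons_by_version(PRE_FIX_OUTPUT).items():
        print(f"{version}: {', '.join(daemons)}")
    print("consistent" if is_consistent(PRE_FIX_OUTPUT) else "version mismatch")
```

On the pre-fix sample this reports osd alone on 14.2.4-125 and mon, mgr, mds and rgw on 14.2.8-35, which is the mismatch this bug tracks. Output like that of the 508.ci build, where every daemon reports 14.2.8-81, passes the check.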
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.5.0 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3754