Description of problem (please be as detailed as possible and provide log snippets):
[DR] When the Relocate action is performed, the rbd image is not deleted on the secondary site.

Version of all relevant components (if applicable):
odf-operator.v4.9.0-138.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy 2 DR clusters
2. Deploy workload
3. Perform Failover
4. After some time perform Relocate
5. Delete the Application completely
6. Check for pv, pvc, vrc
7. Check for rbd image on secondary site

Actual results:
rbd image still present on the secondary cluster

rbd info ocs-storagecluster-cephblockpool/csi-vol-b0211025-1611-11ec-962a-0a580a8301dd
rbd image 'csi-vol-b0211025-1611-11ec-962a-0a580a8301dd':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 2
        id: 62b137f3f8bc
        block_name_prefix: rbd_data.62b137f3f8bc
        format: 2
        features: layering, non-primary
        op_features:
        flags:
        create_timestamp: Wed Sep 15 10:42:59 2021
        access_timestamp: Wed Sep 15 14:43:54 2021
        modify_timestamp: Wed Sep 15 10:42:59 2021
        mirroring state: enabled
        mirroring mode: snapshot
        mirroring global id: e8c61a2f-ddaf-4df9-ba93-d352483efe44
        mirroring primary: false

bash-4.4$ rbd mirror image status ocs-storagecluster-cephblockpool/csi-vol-b0211025-1611-11ec-962a-0a580a8301dd
csi-vol-b0211025-1611-11ec-962a-0a580a8301dd:
  global_id:   e8c61a2f-ddaf-4df9-ba93-d352483efe44
  state:       up+error
  description: split-brain
  last_update: 2021-09-16 11:08:14
  peer_sites:
    name: 3401ff21-accc-4fbe-9cd7-34c9e729aa0d
    state: up+unknown
    description: remote image is non-primary
    last_update: 2021-09-20 13:35:29

The same rbd image is not present on secondary site

Expected results:
rbd image should be deleted

Additional info:
After a relocate or failover, the RBD image on the secondary site would still exist. It is garbage collected only when the application and its PVCs are deleted on the primary site, so I assume this is not an issue. That said, in our testing we have noticed that the remote site image is not always garbage collected when the primary site application (PVC and resources) is deleted. @pratik, please clarify: was the application deleted and the remote image still present, or is the expectation that the remote image would not be present after relocation? If the former, we need to track this and get it fixed; if the latter, there is an expectation mismatch on the feature.
This will be targeted for RHCS 5.0 z2 and therefore be available for the ODF 4.9 z stream.
This BZ was discussed in the last DR sync up, and it was agreed to move it out of 4.9 because the Ceph fix is out of scope for 5.0z1. It is an image garbage collection issue. It does not prevent a user from trying out the feature and should be OK to be part of a subsequent z release. Users who have tried this out in their clusters may need to use the toolbox (with the help of support) to garbage collect the image on the secondary cluster. Moving this out and marking it as a known issue.
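For anyone doing the manual check from the toolbox, below is a minimal Python sketch of how the plain-text `rbd mirror image status` output quoted in the description could be screened for the split-brain state. The function names (`parse_mirror_status`, `needs_manual_cleanup`) are hypothetical, the parser assumes a single peer entry and the layout shown in this report, and the "up+error with split-brain" criterion is an assumption taken from this one report, not a documented rule. It only reads text; it does not run rbd or delete anything.

```python
# Sample copied from the status output in this bug's description.
SAMPLE = """\
csi-vol-b0211025-1611-11ec-962a-0a580a8301dd:
  global_id:   e8c61a2f-ddaf-4df9-ba93-d352483efe44
  state:       up+error
  description: split-brain
  last_update: 2021-09-16 11:08:14
  peer_sites:
    name: 3401ff21-accc-4fbe-9cd7-34c9e729aa0d
    state: up+unknown
    description: remote image is non-primary
    last_update: 2021-09-20 13:35:29
"""


def parse_mirror_status(text):
    """Split the status text into the image name, local fields, and peer fields.

    Assumption: one image and at most one peer entry, laid out as in SAMPLE.
    """
    image = None
    local, peer = {}, {}
    target = local
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so timestamps keep their own colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if image is None:
            image = key          # first non-empty line is "<image name>:"
        elif key == "peer_sites":
            target = peer        # every later field belongs to the peer
        else:
            target[key] = value
    return {"image": image, "local": local, "peer": peer}


def needs_manual_cleanup(status):
    """Flag the state seen in this report (assumed criterion, not a Ceph rule)."""
    local = status["local"]
    return local.get("state") == "up+error" and local.get("description") == "split-brain"


if __name__ == "__main__":
    status = parse_mirror_status(SAMPLE)
    print(status["image"], needs_manual_cleanup(status))
```

A flagged image is only a candidate for review; whether it can actually be removed on the secondary cluster still has to be confirmed with support, as noted above.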
Shyam, please add doc text
*** Bug 2037650 has been marked as a duplicate of this bug. ***
The intention was to close the 4.9.z bug but I did not notice that it is also targeted for 4.10. So removing the 4.9 flag and moving the bug back to on_qa -> verified for correctness.
Please add doc text
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372