Description of problem (please be as detailed as possible and provide log snippets):
[DR] rbd mirror scheduling is getting stopped for some images

Version of all relevant components (if applicable):
OCP version: 4.10.0-0.nightly-2022-03-17-204457
ODF version: 4.10.0-199
Ceph version:
{
    "mon": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rbd-mirror": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 12
    }
}

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, there is a possibility of data loss.

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy RDR cluster
2. Run IO for 2-3 days
3. Check rbd snap ls for all the images on both sites

Actual results:
$rbd snap ls output from the secondary site
http://pastebin.test.redhat.com/1039155
$rbd mirror image status from the primary site
http://pastebin.test.redhat.com/1039160
$rbd snap ls output from the primary site
http://pastebin.test.redhat.com/1039156
$rbd mirror image status from the primary site
http://pastebin.test.redhat.com/1039157

Expected results:

Additional info:
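The check in step 3 can be scripted instead of reading the rbd snap ls output by eye. The following is only a minimal sketch, not part of the original report. It assumes that `rbd snap ls --all --format json` returns a list of objects with `name` and `timestamp` fields, that timestamps use the ctime-style format shown in the code, and that mirror snapshots are named with a `.mirror.primary.` or `.mirror.non_primary.` prefix. The helper names, the pool argument, and the staleness threshold are hypothetical.

```python
import json
import subprocess
from datetime import datetime, timedelta

# Assumption: mirror snapshots are identifiable by these name prefixes.
MIRROR_PREFIXES = (".mirror.primary.", ".mirror.non_primary.")
# Assumption: timestamps look like "Tue Mar 22 11:55:00 2022".
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def latest_mirror_snapshot(snaps):
    """Return the newest mirror-snapshot time for one image, or None."""
    times = [
        datetime.strptime(s["timestamp"], TIME_FORMAT)
        for s in snaps
        if s.get("name", "").startswith(MIRROR_PREFIXES)
    ]
    return max(times) if times else None


def stale_images(snaps_by_image, now, max_age):
    """List images whose newest mirror snapshot is missing or older than max_age."""
    stale = []
    for image, snaps in sorted(snaps_by_image.items()):
        latest = latest_mirror_snapshot(snaps)
        if latest is None or now - latest > max_age:
            stale.append(image)
    return stale


def collect_snapshots(pool):
    """Hypothetical helper: gather snapshot listings for every image in a pool.

    Must run where the rbd CLI can reach the cluster (for example, the
    toolbox pod); it is not called automatically.
    """
    images = json.loads(
        subprocess.check_output(["rbd", "ls", pool, "--format", "json"])
    )
    return {
        image: json.loads(
            subprocess.check_output(
                ["rbd", "snap", "ls", "--all", "--format", "json", f"{pool}/{image}"]
            )
        )
        for image in images
    }
```

Running `stale_images(collect_snapshots(pool), datetime.now(), timedelta(minutes=15))` on each site would list the images whose mirror snapshots have stopped advancing. The threshold should be chosen to match the configured schedule interval, and the timestamp comparison assumes the same time zone as the rbd output.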
Matching the assignment of the RHCS bz
Moving DR BZs to 4.10.z/4.11
Please provide doc text
*** Bug 2155753 has been marked as a duplicate of this bug. ***
Based on 17.2.6-47
Please add doc text.
Moving to 4.13.z for verification purposes
*** Bug 2215982 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:6832
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days