Bug 2067095 - [RDR] [tracker for BZ 2111364 and BZ 2211290] rbd mirror scheduling is getting stopped for some images
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Ram Raja
QA Contact: Pratik Surve
URL:
Whiteboard:
Duplicates: 2155753 2215982
Depends On: 1882534 2069720 2111364 2111375 2120624 2121514 2211290
Blocks: 2094357
Reported: 2022-03-23 09:57 UTC by Pratik Surve
Modified: 2024-03-08 04:25 UTC

Fixed In Version: 4.14.0-130
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2069720 2102221 2111364 2111375 2116900 2120624 2229303
Environment:
Last Closed: 2023-11-08 18:49:50 UTC
Embargoed:



Links
System: Red Hat Product Errata
ID: RHSA-2023:6832
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2023-11-08 18:50:34 UTC

Description Pratik Surve 2022-03-23 09:57:15 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

[DR] rbd mirror scheduling stops for some images

Version of all relevant components (if applicable):

OCP version:- 4.10.0-0.nightly-2022-03-17-204457
ODF version:- 4.10.0-199
CEPH version:- {
    "mon": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rbd-mirror": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 12
    }
}

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes, there is a possibility of data loss.

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy RDR cluster
2. Run I/O for 2-3 days
3. Check the rbd snap ls output for all the images on both sites
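
The check in step 3 could be scripted rather than done by eye. The following is a minimal sketch in Python, not the procedure used in this report. It assumes the newest mirror snapshot timestamp of each image has already been collected (for example from the output of rbd snap ls --all). The helper names, the pool name ocs-storagecluster-cephblockpool, the image names, and the 10-minute threshold are all hypothetical and should be adjusted to the cluster under test.

```python
from datetime import datetime, timedelta


def snap_ls_command(pool, image):
    """Build the argv for listing all snapshots of one image.

    --all includes snapshots in non-user namespaces, such as mirror snapshots.
    """
    return ["rbd", "snap", "ls", "--all", f"{pool}/{image}"]


def find_stale_images(latest_snapshot_by_image, now, max_age):
    """Return the names of images whose newest mirror snapshot is too old.

    latest_snapshot_by_image maps an image name to the datetime of its newest
    mirror snapshot, or to None if the image has no mirror snapshot at all.
    An image is reported when the snapshot is missing or older than max_age.
    """
    stale = []
    for image, latest in sorted(latest_snapshot_by_image.items()):
        if latest is None or now - latest > max_age:
            stale.append(image)
    return stale


if __name__ == "__main__":
    # Hypothetical data; in practice these timestamps would come from the
    # rbd snap ls output gathered on each site.
    now = datetime(2022, 3, 23, 10, 0)
    latest = {
        "csi-vol-a": datetime(2022, 3, 23, 9, 58),
        "csi-vol-b": datetime(2022, 3, 22, 10, 0),
        "csi-vol-c": None,
    }
    for image in find_stale_images(latest, now, timedelta(minutes=10)):
        # Print the command to run for a closer look at each suspect image.
        print(" ".join(snap_ls_command("ocs-storagecluster-cephblockpool", image)))
```

Running the same comparison on both sites gives a quick list of images for which snapshot scheduling appears to have stopped; the threshold should be somewhat larger than the configured schedule interval to avoid false positives.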


Actual results:
$rbd snap ls output from the secondary site 

http://pastebin.test.redhat.com/1039155

$rbd mirror image status from the primary site

http://pastebin.test.redhat.com/1039160

$rbd snap ls output from the primary site

http://pastebin.test.redhat.com/1039156

$rbd mirror image status from the primary site

http://pastebin.test.redhat.com/1039157


Expected results:


Additional info:

Comment 5 Josh Durgin 2022-03-29 15:26:13 UTC
Matching the assignment of the RHCS bz

Comment 11 Mudit Agarwal 2022-04-05 13:45:16 UTC
Moving DR BZs to 4.10.z/4.11

Comment 53 Mudit Agarwal 2022-08-11 05:03:00 UTC
Please provide doc text

Comment 109 Ilya Dryomov 2023-04-11 12:38:55 UTC
*** Bug 2155753 has been marked as a duplicate of this bug. ***

Comment 110 Mudit Agarwal 2023-05-15 17:48:45 UTC
Based on 17.2.6-47

Comment 114 Mudit Agarwal 2023-06-09 02:49:01 UTC
Please add doc text.

Comment 120 Elad 2023-06-19 06:01:23 UTC
Moving to 4.13.z for verification purposes

Comment 123 Ilya Dryomov 2023-07-11 12:55:43 UTC
*** Bug 2215982 has been marked as a duplicate of this bug. ***

Comment 137 errata-xmlrpc 2023-11-08 18:49:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

Comment 138 Red Hat Bugzilla 2024-03-08 04:25:05 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
