Bug 2066259

Summary: [DR] rbd mirrored images are reporting failed to copy remote image
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Pratik Surve <prsurve>
Component: ceph Assignee: Sunny Kumar <sunkumar>
ceph sub component: RBD-Mirror QA Contact: Elad <ebenahar>
Status: CLOSED INSUFFICIENT_DATA Docs Contact:
Severity: high    
Priority: unspecified CC: bniver, ekuric, idryomov, kramdoss, kseeger, madam, mmuench, muagarwa, ocs-bugs, odf-bz-bot, srangana, sunkumar, vashastr
Version: 4.10   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-11-02 13:02:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pratik Surve 2022-03-21 11:16:15 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

[DR] Mirrored RBD images are reporting "failed to copy remote image"


A version of all relevant components (if applicable):



Does this issue impact your ability to continue to work with the product?
(please explain in detail what is the user impact)?

ODF version: 4.10.0-199
OCP version: 4.10.0-0.nightly-2022-03-17-204457

$ ceph versions
{
    "mon": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rbd-mirror": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 11
    }
}
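
The output above shows every daemon on the same build. A minimal sketch of how that can be checked programmatically, assuming the JSON from `ceph versions` has been captured as a string (the helper names `version_summary` and `is_uniform` are hypothetical, and the sample is trimmed from the output above):

```python
import json

# Trimmed sample of the `ceph versions` output from this report.
SAMPLE_VERSIONS = """
{
    "rbd-mirror": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 11
    }
}
"""


def version_summary(raw: str) -> dict[str, int]:
    """Return {version string: daemon count} from the "overall" section."""
    return dict(json.loads(raw).get("overall", {}))


def is_uniform(raw: str) -> bool:
    """True when all daemons report exactly one Ceph version."""
    return len(version_summary(raw)) == 1


if __name__ == "__main__":
    for version, count in version_summary(SAMPLE_VERSIONS).items():
        print(f"{count} daemon(s): {version}")
    print("uniform:", is_uniform(SAMPLE_VERSIONS))
```

A mixed-version cluster would show more than one key under "overall", which is worth ruling out before investigating rbd-mirror itself.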



Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Can this issue be reproduced?
yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy an RDR cluster.
2. Run I/O in the background for 2 to 3 days.
3. Check the RBD mirror image status.
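
Step 3 can be sketched as a small wrapper around the `rbd mirror image status` command. This is only an illustration: the report does not name the pool, so the pool value is a placeholder, the helper names are hypothetical, and the command is assumed to run somewhere the `rbd` CLI can reach the cluster (for example, a toolbox pod).

```python
import subprocess


def mirror_status_cmd(pool: str, image: str) -> list[str]:
    """Build the argv for `rbd mirror image status <pool>/<image>`."""
    return ["rbd", "mirror", "image", "status", f"{pool}/{image}"]


def mirror_status(pool: str, image: str) -> str:
    """Run the status command and return its plain-text output."""
    result = subprocess.run(
        mirror_status_cmd(pool, image),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if rbd exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # "my-block-pool" is a placeholder; the image name is taken from this report.
    print(mirror_status("my-block-pool", "csi-vol-2456491d-a6ad-11ec-a26c-0a580a83002e"))
```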


Actual results:
##########################################
csi-vol-2456491d-a6ad-11ec-a26c-0a580a83002e:
  global_id:   7aff2041-fb7f-47be-a3db-5c7128f6e4c7
  state:       up+error
  description: failed to copy remote image
  service:     a on prsurve-vm-dev-5ffqx-worker-dwmlm
  last_update: 2022-03-19 07:02:24
  peer_sites:
    name: f9c4bbbf-4acf-41cd-8f78-5c7afbad18ba
    state: up+stopped
    description: local image is primary
    last_update: 2022-03-21 11:10:59
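
To spot affected images across many volumes, the plain-text status above can be parsed and filtered on the `+error` state. The following is a rough sketch that assumes the layout shown in this report (one image per block, peer entries introduced by a `name:` line); `parse_mirror_status` and `has_error` are hypothetical helpers, not part of any Ceph tooling.

```python
# Status block copied from the "Actual results" in this report.
SAMPLE_STATUS = """\
csi-vol-2456491d-a6ad-11ec-a26c-0a580a83002e:
  global_id:   7aff2041-fb7f-47be-a3db-5c7128f6e4c7
  state:       up+error
  description: failed to copy remote image
  service:     a on prsurve-vm-dev-5ffqx-worker-dwmlm
  last_update: 2022-03-19 07:02:24
  peer_sites:
    name: f9c4bbbf-4acf-41cd-8f78-5c7afbad18ba
    state: up+stopped
    description: local image is primary
    last_update: 2022-03-21 11:10:59
"""


def parse_mirror_status(text: str) -> dict:
    """Parse one image block of plain-text mirror status into a dict."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    image = {"name": lines[0].strip().rstrip(":"), "peer_sites": []}
    target = image  # fields go to the image until a peer entry starts
    for ln in lines[1:]:
        # Split on the first colon only, so timestamps stay intact.
        key, _, value = ln.strip().partition(":")
        if key == "peer_sites":
            continue
        if key == "name":  # a "name:" line starts a new peer entry
            target = {}
            image["peer_sites"].append(target)
        target[key] = value.strip()
    return image


def has_error(image: dict) -> bool:
    """True when the local mirror state ends in "+error"."""
    return image.get("state", "").endswith("+error")


if __name__ == "__main__":
    img = parse_mirror_status(SAMPLE_STATUS)
    if has_error(img):
        print(f"{img['name']}: {img['state']} ({img['description']})")
```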


Expected results:

No mirrored image should report an error.

Additional info:
`failed to copy remote image` is observed only on the secondary cluster.

Comment 9 Mudit Agarwal 2022-05-31 09:19:51 UTC
Not reproducible, reducing the severity.

Comment 10 Mudit Agarwal 2022-07-05 13:13:12 UTC
Not a 4.11 blocker.

Comment 16 Mudit Agarwal 2022-10-26 10:07:47 UTC
Hi Ilya,
The needinfo was only for https://bugzilla.redhat.com/show_bug.cgi?id=2066259#c14, in case you missed it.