Description of problem (please be as detailed as possible and provide log
snippets):
[DR] RBD mirrored images are reporting `failed to copy remote image`
Version of all relevant components (if applicable):
ODF version: 4.10.0-199
OCP version: 4.10.0-0.nightly-2022-03-17-204457
$ ceph versions
{
    "mon": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 3
    },
    "mds": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 2
    },
    "rbd-mirror": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "rgw": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.7-76.el8cp (f4d6ada772570ae8b05c62ad79e222fbd3f04188) pacific (stable)": 11
    }
}
Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Is there any workaround available to the best of your knowledge?
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
Can this issue be reproducible?
yes
Can this issue reproduce from the UI?
If this is a regression, please provide more details to justify this:
Steps to Reproduce:
1. Deploy an RDR cluster
2. Run I/O in the background for 2-3 days
3. Check the RBD mirror image status
Actual results:
##########################################
csi-vol-2456491d-a6ad-11ec-a26c-0a580a83002e:
  global_id:   7aff2041-fb7f-47be-a3db-5c7128f6e4c7
  state:       up+error
  description: failed to copy remote image
  service:     a on prsurve-vm-dev-5ffqx-worker-dwmlm
  last_update: 2022-03-19 07:02:24
  peer_sites:
    name: f9c4bbbf-4acf-41cd-8f78-5c7afbad18ba
    state: up+stopped
    description: local image is primary
    last_update: 2022-03-21 11:10:59
Expected results:
The mirrored images should not report any error.
Additional info:
`failed to copy remote image` is observed only on the secondary cluster.
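For anyone triaging many images, the check in step 3 could be scripted. The following is only a minimal sketch and not something that was run as part of this report: it assumes the status text has the `key: value` layout shown under Actual results, and the function names `parse_mirror_status` and `is_failed` are hypothetical helpers, not part of any Ceph tooling.

```python
# Sketch: flag images whose local mirror state is "+error".
# Assumes the plain-text layout pasted under "Actual results" in this report.

SAMPLE = """\
csi-vol-2456491d-a6ad-11ec-a26c-0a580a83002e:
global_id: 7aff2041-fb7f-47be-a3db-5c7128f6e4c7
state: up+error
description: failed to copy remote image
service: a on prsurve-vm-dev-5ffqx-worker-dwmlm
last_update: 2022-03-19 07:02:24
peer_sites:
name: f9c4bbbf-4acf-41cd-8f78-5c7afbad18ba
state: up+stopped
description: local image is primary
last_update: 2022-03-21 11:10:59
"""


def parse_mirror_status(text):
    """Split status text into (local, peer) dicts of key/value fields.

    Fields before "peer_sites:" go to the local dict, fields after it to the
    peer dict. If several peers are listed, later values overwrite earlier
    ones, so this sketch only suits the single-peer case shown above.
    """
    local, peer = {}, {}
    current = local
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "####" separators
        if line == "peer_sites:":
            current = peer
            continue
        key, sep, value = line.partition(": ")
        if not sep:
            # A bare "name:" line is the image name header.
            local.setdefault("image", line.rstrip(":"))
            continue
        current[key] = value.strip()
    return local, peer


def is_failed(fields):
    """Return True when the reported state ends in "+error"."""
    return fields.get("state", "").endswith("+error")


if __name__ == "__main__":
    local, peer = parse_mirror_status(SAMPLE)
    if is_failed(local):
        print(local["image"], local["state"], local["description"])
```

Because leading whitespace is stripped from every line, the same sketch should accept both indented and flattened status text.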