Description of problem (please be as detailed as possible and provide log snippets):
Steps to reproduce:
-------------------
1) Keep a workload running in an RDR (Regional DR) setup for more than a week.
Additional Info:
----------------
Some rbd commands can no longer be executed. The following message is returned when running the command below:
rbd mirror snapshot schedule list --recursive
rbd: rbd mirror snapshot schedule list failed: (11) Resource temporarily unavailable
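For reference, a minimal sketch of one way to invoke the failing command against the Ceph cluster backing ODF, assuming the rook-ceph-tools pod is enabled in the openshift-storage namespace (the invocation is illustrative, not taken from this report):

# Locate the toolbox pod and run the failing command inside it
TOOLS_POD=$(oc -n openshift-storage get pod -l app=rook-ceph-tools -o name)
oc -n openshift-storage rsh "$TOOLS_POD" rbd mirror snapshot schedule list --recursive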
Actual results:
---------------
The error message 'rados: ret=-11, Resource temporarily unavailable' is observed in the VolumeReplication status. Because of this, snapshot scheduling stops.
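For reference, return code -11 maps to EAGAIN on Linux, which matches the "Resource temporarily unavailable" text. This can be confirmed with a quick check (any Python 3 interpreter):

python3 -c 'import errno, os; print(errno.errorcode[11], os.strerror(11))'
# prints: EAGAIN Resource temporarily unavailable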
Expected results:
------------------
Snapshot scheduling should not stop.
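A hypothetical way to verify that scheduling is still active would be to query the schedule status for the mirrored pool from the toolbox pod (the pool name below is the default ODF RBD pool and is an assumption, not taken from this report):

# Adjust the pool name to the actual mirrored pool
rbd mirror snapshot schedule status --pool ocs-storagecluster-cephblockpool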
vr yaml:
-------
oc get vr busybox-pvc-61 -o yaml
apiVersion: replication.storage.openshift.io/v1alpha1
kind: VolumeReplication
metadata:
  creationTimestamp: "2023-07-10T08:04:25Z"
  finalizers:
  - replication.storage.openshift.io
  generation: 1
  name: busybox-pvc-61
  namespace: appset-busybox-4
  ownerReferences:
  - apiVersion: ramendr.openshift.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: VolumeReplicationGroup
    name: busybox-4-placement-drpc
    uid: 6f21ad83-16e0-4eb9-98bf-e43b9fb9bdf0
  resourceVersion: "36486402"
  uid: a85e701c-4109-49a5-9dd6-fcb682a818bf
spec:
  autoResync: false
  dataSource:
    apiGroup: ""
    kind: PersistentVolumeClaim
    name: busybox-pvc-61
  replicationHandle: ""
  replicationState: primary
  volumeReplicationClass: rbd-volumereplicationclass-2263283542
status:
  conditions:
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: FailedToPromote
    status: "False"
    type: Completed
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: Error
    status: "True"
    type: Degraded
  - lastTransitionTime: "2023-07-10T08:04:26Z"
    message: ""
    observedGeneration: 1
    reason: NotResyncing
    status: "False"
    type: Resyncing
  message: 'rados: ret=-11, Resource temporarily unavailable'
  observedGeneration: 1
  state: Unknown
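The Degraded condition can be pulled out directly with a jsonpath query (a minimal sketch; resource name and namespace are taken from the yaml above):

oc get vr busybox-pvc-61 -n appset-busybox-4 \
  -o jsonpath='{.status.conditions[?(@.type=="Degraded")].reason}{"\n"}'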
Must gather logs
----------------
c1 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/c1/
c2 - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/c2/
hub - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-2219628/july10/hub/
Version of all relevant components (if applicable):
OCP version: 4.13.0-0.nightly-2023-06-05-164816
ODF version: 4.13.0-219.snaptrim
Submariner version: v0.15.1
VolSync version: volsync-product.v0.7.1
Ceph version: 17.2.6-70.0.TEST.bz2119217.el9cp (6d74fefa15d1216867d1d112b47bb83c4913d28f) quincy (stable)
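For reference, component versions of this kind can be collected with standard commands (a sketch, assuming the default openshift-storage namespace):

oc get clusterversion
oc get csv -n openshift-storage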
Does this issue impact your ability to continue to work with the product (please explain in detail what the user impact is)?
Is there any workaround available to the best of your knowledge?
Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
Is this issue reproducible?
Can this issue be reproduced from the UI?
If this is a regression, please provide more details to justify this: