Bug 2162257

Summary: [RDR][CEPHFS] sync/replication is getting stopped for some pvc
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: odf-dr
Sub component: ramen
Assignee: Benamar Mekhissi <bmekhiss>
Reporter: Pratik Surve <prsurve>
QA Contact: Pratik Surve <prsurve>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: bmekhiss, kramdoss, muagarwa, ocs-bugs, odf-bz-bot
Version: 4.12
Target Release: ODF 4.13.0
Hardware: Unspecified
OS: Unspecified
Doc Type: No Doc Update
Last Closed: 2023-06-21 15:23:05 UTC
Type: Bug

Description Pratik Surve 2023-01-19 08:30:39 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

[RDR][CEPHFS] sync/replication is getting stopped for some pvc 

Version of all relevant components (if applicable):

OCP version:- 4.12.0-0.nightly-2023-01-10-062211
ODF version:- 4.12.0-162
CEPH version:- ceph version 16.2.10-94.el8cp (48ce8ed67474ea50f10c019b9445be7f49749d23) pacific (stable)
ACM version:- v2.7.0
SUBMARINER version:- v0.14.1
VOLSYNC version:- volsync-product.v0.6.0

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy an RDR cluster.
2. Create the same workload in different namespaces.
3. Check the ReplicationSource status.
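Step 3 can be done with standard `oc` commands; this is a sketch, with the namespace, ReplicationSource, and pod names taken from the output below:

```
# List ReplicationSource objects across all namespaces; a healthy source
# shows a recent LAST SYNC and an upcoming NEXT SYNC timestamp.
oc get replicationsource -A

# Inspect one source in detail (status conditions, last sync time/duration):
oc describe replicationsource dd-io-pvc-1 -n busybox-workloads-8

# Check the VolSync rsync source pod logs for a PVC that appears stuck:
oc logs -n busybox-workloads-8 volsync-rsync-src-dd-io-pvc-1-5jnq2
```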


Actual results:

NAMESPACE               NAME             SOURCE           LAST SYNC              DURATION          NEXT SYNC
busybox-workloads-8     dd-io-pvc-1      dd-io-pvc-1      2023-01-18T06:07:49Z   1m41.91049445s    2023-01-18T06:14:00Z
busybox-workloads-8     dd-io-pvc-2      dd-io-pvc-2      2023-01-18T21:37:29Z   2m29.297571888s   2023-01-18T21:42:00Z
busybox-workloads-8     dd-io-pvc-3      dd-io-pvc-3      2023-01-18T21:37:27Z   2m27.436788893s   2023-01-18T21:42:00Z
busybox-workloads-8     dd-io-pvc-4      dd-io-pvc-4      2023-01-18T21:37:08Z   2m8.632871414s    2023-01-18T21:42:00Z
busybox-workloads-8     dd-io-pvc-5      dd-io-pvc-5      2023-01-18T21:36:36Z   1m36.320840289s   2023-01-18T21:42:00Z
busybox-workloads-8     dd-io-pvc-6      dd-io-pvc-6      2023-01-18T21:36:51Z   1m51.222437946s   2023-01-18T21:42:00Z
busybox-workloads-8     dd-io-pvc-7      dd-io-pvc-7      2023-01-18T21:37:23Z   2m23.237898439s   2023-01-18T21:42:00Z

Note that dd-io-pvc-1 last synced at 06:07 UTC while every other PVC synced at ~21:37 UTC, i.e. its replication has been stalled for over 15 hours.


$ oc get pods
NAME                                  READY   STATUS    RESTARTS   AGE
dd-io-1-5857bfdcd9-pfcxq              1/1     Running   0          26h
dd-io-2-bcd6d9f65-m2dfb               1/1     Running   0          26h
dd-io-3-5d6b4b84df-48lm2              1/1     Running   0          26h
dd-io-4-6f6db89fbf-rrsf9              1/1     Running   0          26h
dd-io-5-7868bc6b5c-k75hr              1/1     Running   0          26h
dd-io-6-58c98598d5-zk75j              1/1     Running   0          26h
dd-io-7-694958ff97-szbch              1/1     Running   0          26h
volsync-rsync-src-dd-io-pvc-1-5jnq2   1/1     Running   0          11h

$ oc logs volsync-rsync-src-dd-io-pvc-1-5jnq2     
VolSync rsync container version: ACM-0.6.0-ce9a280
Syncing data to volsync-rsync-dst-dd-io-pvc-1.busybox-workloads-8.svc.clusterset.local:22 ...
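The source pod is stuck on "Syncing data to …svc.clusterset.local:22", which suggests the rsync-over-ssh connection to the destination service (exported across clusters via Submariner) is hung. A possible next diagnostic step, sketched here with the pod and service names from the logs above, is to confirm the clusterset DNS name still resolves from inside the stuck pod:

```
# From the stuck VolSync source pod, check that the Submariner clusterset
# service name for the destination resolves (requires nslookup in the image):
oc exec -n busybox-workloads-8 volsync-rsync-src-dd-io-pvc-1-5jnq2 -- \
  nslookup volsync-rsync-dst-dd-io-pvc-1.busybox-workloads-8.svc.clusterset.local
```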


Expected results:
Sync/replication should not stop or hang.

Additional info:

Comment 7 Benamar Mekhissi 2023-03-14 12:16:12 UTC
Needs the test to be retried.

Comment 8 krishnaram Karthick 2023-03-30 12:49:48 UTC
@Benamar - Could you please add details on what was fixed, and links to the PR, if any?

Comment 15 errata-xmlrpc 2023-06-21 15:23:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742