Bug 2239587 - [RDR][CEPHFS][Tracker] sync for some pvc hangs
Summary: [RDR][CEPHFS][Tracker] sync for some pvc hangs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.16.0
Assignee: Benamar Mekhissi
QA Contact: Aman Agrawal
URL:
Whiteboard:
Depends On: 2246185
Blocks:
Reported: 2023-09-19 07:23 UTC by Pratik Surve
Modified: 2024-11-15 04:25 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-07-17 13:09:47 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2024:4591 0 None None None 2024-07-17 13:09:48 UTC

Description Pratik Surve 2023-09-19 07:23:25 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
[RDR][CEPHFS] Sync hangs for some PVCs.

Version of all relevant components (if applicable):
OCP version:- 4.14.0-0.nightly-2023-09-15-055234
ODF version:- 4.14.0-135
CEPH version:- ceph version 17.2.6-138.el9cp (b488c8dad42b2ecffcd96f3d76eeeecce48b8590) quincy (stable)
ACM version:- 2.9.0-109
SUBMARINER version:- devel
VOLSYNC version:- volsync-product.v0.7.4
VOLSYNC method:- destinationCopyMethod: LocalDirect

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?
yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy RDR cluster
2. Deploy cephfs workload
3. Keep workload running for some days


Actual results:
### From the primary cluster

volsync-rsync-tls-src-dd-io-pvc-1-7n8wt   1/1     Running   0          4h12m
volsync-rsync-tls-src-dd-io-pvc-3-7jl95   1/1     Running   0          4h12m
volsync-rsync-tls-src-dd-io-pvc-6-gtbvm   1/1     Running   0          4h12m


### From the secondary cluster (pods)
NAME                                            READY   STATUS              RESTARTS   AGE
volsync-rsync-tls-dst-dd-io-pvc-1-local-bmqjc   1/1     Running             0          4h13m
volsync-rsync-tls-dst-dd-io-pvc-1-sbfwc         1/1     Running             0          4h14m
volsync-rsync-tls-dst-dd-io-pvc-2-cqhh8         1/1     Running             0          51s
volsync-rsync-tls-dst-dd-io-pvc-2-local-lg9b7   1/1     Running             0          22s
volsync-rsync-tls-dst-dd-io-pvc-3-local-85vxf   1/1     Running             0          4h13m
volsync-rsync-tls-dst-dd-io-pvc-3-tpbjt         1/1     Running             0          4h14m
volsync-rsync-tls-dst-dd-io-pvc-4-local-nftrq   1/1     Running             0          13s
volsync-rsync-tls-dst-dd-io-pvc-4-z26x2         1/1     Running             0          39s
volsync-rsync-tls-dst-dd-io-pvc-5-cxzbl         1/1     Running             0          33s
volsync-rsync-tls-dst-dd-io-pvc-5-local-fm8lv   1/1     Running             0          9m10s
volsync-rsync-tls-dst-dd-io-pvc-6-bhxdj         1/1     Running             0          4h14m
volsync-rsync-tls-dst-dd-io-pvc-6-local-p5gp4   1/1     Running             0          4h11m
volsync-rsync-tls-dst-dd-io-pvc-7-local-r7spb   1/1     Running             0          8m46s
volsync-rsync-tls-dst-dd-io-pvc-7-qsd4g         0/1     ContainerCreating   0          14s
volsync-rsync-tls-src-dd-io-pvc-5-local-hp676   1/1     Running             0          32s
volsync-rsync-tls-src-dd-io-pvc-7-local-l8vdz   0/1     ContainerCreating   0          13s
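
In the listings above, the mover pods for pvc-1, pvc-3 and pvc-6 have been running for over four hours, while the others are seconds or minutes old. A minimal sketch of how such long-running movers could be picked out of saved `oc get pods` output is given here. The function names and the one-hour threshold are hypothetical and not taken from this report; the parser assumes the plain five-column layout shown above (it would not handle a RESTARTS value such as `1 (5m ago)`).

```python
import re

# Hypothetical threshold (an assumption, not from the bug report):
# a mover pod older than this is treated as a possibly hung sync.
STUCK_AFTER_SECONDS = 60 * 60

_AGE_PART = re.compile(r"(\d+)([dhms])")
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age: str) -> int:
    """Convert a pod AGE string such as '4h12m' or '51s' to seconds."""
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in _AGE_PART.findall(age))


def long_running_movers(listing: str, threshold: int = STUCK_AFTER_SECONDS):
    """Return (pod_name, age) pairs for rsync-tls mover pods older than threshold."""
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        # Expected columns: NAME READY STATUS RESTARTS AGE
        if len(fields) != 5 or not fields[0].startswith("volsync-rsync-tls-"):
            continue  # skips the header row, blank lines and unrelated pods
        name, _ready, _status, _restarts, age = fields
        if age_to_seconds(age) > threshold:
            flagged.append((name, age))
    return flagged
```

Feeding the pod listing from either cluster to `long_running_movers` returns only the pods whose age exceeds the threshold. A long-running mover is only a hint that a sync may be hung, not proof, so the mover logs still need to be checked.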



Expected results:
Sync should not hang.

Additional info:

Comment 5 Karolin Seeger 2023-09-21 12:12:38 UTC
Seems to be related to Submariner issues; the Submariner team is investigating.

Comment 14 Mudit Agarwal 2023-10-16 05:43:54 UTC
Talur, PTAL

Comment 23 Aman Agrawal 2023-10-31 13:23:21 UTC
BZ 2246185 is a temporary fix for the 4.14 release; we still need to root-cause this bug and understand why it is happening. It can be targeted for 4.15 and then backported to 4.14.z if the fix is in ODF (or tracked with the Submariner team if needed).

Comment 27 Mudit Agarwal 2024-01-23 10:11:33 UTC
No update since October. Is this still a blocker?

Comment 43 errata-xmlrpc 2024-07-17 13:09:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4591

Comment 44 Red Hat Bugzilla 2024-11-15 04:25:08 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days