Bug 2290691

Summary: [Tracker ACM-12021] [RDR] VolSync - rsync-tls fails to sync when there are too many files in the root of the source PVC
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Aman Agrawal <amagrawa>
Component: odf-dr Assignee: Karolin Seeger <kseeger>
odf-dr sub component: ramen QA Contact: Aman Agrawal <amagrawa>
Status: ON_QA --- Docs Contact:
Severity: urgent    
Priority: unspecified CC: ebenahar, kramdoss, kseeger, muagarwa, rtalur, sheggodu
Version: 4.15 Keywords: Tracking
Target Milestone: ---   
Target Release: ODF 4.15.8   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.15.8-1 Doc Type: Bug Fix
Doc Text:
Cause: The default limit on the number of open files was too low for a PVC with many files in its root directory. Consequence: Not all files were copied to the destination cluster. Fix: The updated version of VolSync allows more files to be open. Result: All files in the PVC are copied to the destination cluster.
Story Points: ---
Clone Of: 2290526 Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2290526    
Bug Blocks:    

Description Aman Agrawal 2024-06-06 09:46:06 UTC
+++ This bug was initially created as a clone of Bug #2290526 +++

Description of problem (please be as detailed as possible and provide log
snippets):

ARG_MAX can have different values on different Linux versions, but we have been able to reproduce the issue with ~60k files in the root of the source PVC.
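
To illustrate why the file count and name length both matter, here is a rough Python sketch that compares the approximate size of an argument list against the system's ARG_MAX. It assumes the failure comes from the root-level file names being expanded into a single command line, which this report implies but does not confirm. The helper names (argv_bytes, exceeds_arg_max) and the 8-byte pointer size are illustrative assumptions, and the estimate ignores the environment, which also counts against the limit.

```python
import os


def argv_bytes(names, pointer_size=8):
    """Approximate the bytes an argument list occupies at exec time.

    Assumption: each argument costs its encoded bytes, one NUL
    terminator, and one pointer slot in argv.
    """
    return sum(len(os.fsencode(n)) + 1 + pointer_size for n in names)


def exceeds_arg_max(names, arg_max=None):
    """Return True if the estimated argument size is above ARG_MAX."""
    if arg_max is None:
        # Query the running system's limit; it varies between kernels.
        arg_max = os.sysconf("SC_ARG_MAX")
    return argv_bytes(names) > arg_max


if __name__ == "__main__":
    # Hypothetical workload: 60k root-level entries with 100-character names.
    names = ["f" * 95 + f"{i:05d}" for i in range(60000)]
    print("ARG_MAX on this host:", os.sysconf("SC_ARG_MAX"))
    print("estimated argv bytes:", argv_bytes(names))
    print("exceeds limit:", exceeds_arg_max(names))
```

Because the limit is measured in bytes rather than in entries, longer file names reach it with fewer files.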


Version of all relevant components (if applicable):
ceph version 18.2.1-188.el9cp (b1ae9c989e2f41dcfec0e680c11d1d9465b1db0e) reef (stable)
OCP 4.16.0-0.nightly-2024-05-23-173505
ACM 2.11.0-DOWNSTREAM-2024-05-23-15-16-26
MCE 2.6.0-104 
ODF 4.16.0-108.stable
Gitops v1.12.3 
Submariner 0.18.0 (image: brew.registry.redhat.io/rh-osbs/iib:722673)
VolSync 0.8.1

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a source PVC with 60k files or directories (easier to create if using long file names) in the root of the PVC.
2. Create a ReplicationSource/ReplicationDestination pair using rsync-tls to sync this PVC to the destination.
3. Check the destination after replication is complete and compare to ensure all files/directories were copied correctly.
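
The first and last steps above could be scripted roughly as follows. This is a minimal Python sketch, assuming the source and destination PVCs are each mounted at a local path. The function names (populate_root, missing_entries) and the 200-character name length are hypothetical choices for illustration, not part of VolSync or of the original reproduction.

```python
import os
import tempfile


def populate_root(root, count, name_len=200):
    """Create `count` empty files with long names directly in `root`."""
    os.makedirs(root, exist_ok=True)
    for i in range(count):
        # Zero-padded index padded out with 'x' to reach name_len characters.
        name = f"{i:08d}".ljust(name_len, "x")
        with open(os.path.join(root, name), "w"):
            pass


def missing_entries(src_root, dst_root):
    """Return top-level names present in src_root but absent from dst_root."""
    return sorted(set(os.listdir(src_root)) - set(os.listdir(dst_root)))


if __name__ == "__main__":
    # Stand-in directories; in a real check these would be the PVC mount points.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "dst")
        populate_root(src, 1000)
        os.makedirs(dst)
        print("entries missing on destination:", len(missing_entries(src, dst)))
```

An empty result from missing_entries after replication completes would indicate that every root-level entry reached the destination. It compares names only, not file contents.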


Actual results: Files from the source may not be copied to the destination

Expected results: Files should be copied to the destination without any loss/hindrance


Additional info:

--- Additional comment from Aman Agrawal on 2024-06-05 14:34:25 IST ---

We don't need logs for this BZ because Benamar and @tflower debugged and root-caused the issue together on a live setup, and the fix will land in VolSync.

Comment 3 Karolin Seeger 2024-06-06 09:50:34 UTC
Current plan is to ship the fix with VolSync v0.9.2 on June 16th.

Comment 7 Sunil Kumar Acharya 2024-06-25 07:57:14 UTC
As ACM-12021 is ON_QA, moving the BZ to ON_QA.

Comment 9 Sunil Kumar Acharya 2024-06-25 12:09:21 UTC
Please update the RDT flag/text appropriately.