Bug 2136864 - [ACM Tracker] [DR][CEPHFS] volsync-rsync-src pods are in Error state as they are unable to connect to volsync-rsync-dst
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-dr
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.12.0
Assignee: Benamar Mekhissi
QA Contact: Pratik Surve
URL:
Whiteboard:
Duplicates: 2132566
Depends On:
Blocks:
Reported: 2022-10-21 16:09 UTC by Pratik Surve
Modified: 2023-12-08 04:31 UTC (History)
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, DR-protected CephFS volumes failed to sync their data across clusters because of an incorrect MTU size configuration in intra-cluster Submariner-based setups. With this update, a `VolSync` job is created at every schedule interval to sync the delta changes between the source and the destination.
Clone Of:
Environment:
Last Closed: 2023-01-31 00:19:51 UTC
Embargoed:




Links
System: Red Hat Product Errata
ID: RHBA-2023:0551
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2023-01-31 00:20:12 UTC

Description Pratik Surve 2022-10-21 16:09:39 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

[DR][CEPHFS] volsync-rsync-src pods are in Error state as they are unable to connect to volsync-rsync-dst 

Version of all relevant components (if applicable):
OCP version:- 4.12.0-0.nightly-2022-10-18-192348
ODF version:- 4.12.0-79
CEPH version:- ceph version 16.2.10-50.el8cp (f311fa3856a155d4cd9b658e25a78def0ae7a7c3) pacific (stable)
ACM version:- 2.6.1
SUBMARINER version:- v0.13.0
VOLSYNC version:- volsync-product.v0.5.0

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy an RDR cluster.
2. Run a CephFS DR workload.
3. Check the volsync-rsync-src pod logs.
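
Step 3 can be scripted. The following is a minimal sketch, not part of the original report: it assumes the `oc` CLI is installed and logged in to the source cluster, that the workload namespace is `busybox-workloads-1` (taken from the destination address in the log output), and that the source pods are named with the `volsync-rsync-src` prefix. The function names are hypothetical.

```python
import subprocess


def list_pods_cmd(namespace):
    """Build the `oc` command that lists pod names (as pod/<name>) in a namespace."""
    return ["oc", "get", "pods", "-n", namespace, "-o", "name"]


def pod_logs_cmd(pod, namespace):
    """Build the `oc` command that prints the logs of one pod."""
    return ["oc", "logs", pod, "-n", namespace]


def src_pod_logs(namespace, prefix="pod/volsync-rsync-src"):
    """Return {pod name: log text} for every pod whose name starts with prefix.

    The prefix is an assumption based on the pod names quoted in this bug.
    """
    listing = subprocess.run(
        list_pods_cmd(namespace), capture_output=True, text=True, check=True
    ).stdout
    logs = {}
    for pod in listing.split():
        if pod.startswith(prefix):
            # check=False: a pod in Error state may still have useful logs.
            result = subprocess.run(
                pod_logs_cmd(pod, namespace),
                capture_output=True, text=True, check=False,
            )
            logs[pod] = result.stdout + result.stderr
    return logs


if __name__ == "__main__":
    for name, text in src_pod_logs("busybox-workloads-1").items():
        print(f"==== {name} ====")
        print(text)
```

Running the script prints the log of each matching pod, which can then be compared with the output under "Actual results".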


Actual results:
VolSync rsync container version: ACM-0.5.0-df22d29
Syncing data to volsync-rsync-dst-busybox-pvc-1.busybox-workloads-1.svc.clusterset.local:22 ...
Connection closed by 172.31.211.240 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]
Syncronization failed. Retrying in 2 seconds. Retry 1/5.
Connection closed by 172.31.211.240 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]
Syncronization failed. Retrying in 4 seconds. Retry 2/5.
Connection closed by 172.31.211.240 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]
Syncronization failed. Retrying in 8 seconds. Retry 3/5.
Connection closed by 172.31.211.240 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]
Syncronization failed. Retrying in 16 seconds. Retry 4/5.
Connection closed by 172.31.211.240 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.3]
Syncronization failed. Retrying in 32 seconds. Retry 5/5.
Rsync completed in 665s
Synchronization failed. rsync returned: 255
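
The log above shows five retries with a delay that doubles each time (2, 4, 8, 16, 32 seconds) before rsync gives up with return code 255. The sketch below is a hypothetical helper, not VolSync code, that extracts those facts from such a log; the function name and the returned keys are assumptions made for illustration.

```python
import re


def summarize_sync_log(text):
    """Summarize a volsync-rsync-src log: destination, retry delays, return code."""
    dest = re.search(r"Syncing data to (\S+):(\d+)", text)
    # Matches lines such as "Retrying in 2 seconds. Retry 1/5."
    retries = re.findall(r"Retrying in (\d+) seconds\. Retry (\d+)/(\d+)", text)
    rc = re.search(r"rsync returned: (\d+)", text)
    delays = [int(delay) for delay, _, _ in retries]
    return {
        "destination": dest.group(1) if dest else None,
        "port": int(dest.group(2)) if dest else None,
        "delays": delays,
        "max_retries": int(retries[-1][2]) if retries else None,
        # True when each delay is twice the previous one.
        "doubling": all(b == 2 * a for a, b in zip(delays, delays[1:])),
        "return_code": int(rc.group(1)) if rc else None,
    }
```

Applied to the log in this report, the summary makes it easy to see that every attempt failed at connection time and that the retry budget was exhausted, which points at connectivity to the destination service rather than at the data being synced.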


Expected results:


Additional info:

Comment 11 Benamar Mekhissi 2022-10-27 18:31:37 UTC
@prsurve, at this point, this looks like a Submariner issue, and we believe it is fixed by this PR: https://github.com/submariner-io/submariner/pull/2087

Comment 12 Mudit Agarwal 2022-10-29 03:56:35 UTC
Can this be tested with the latest builds with the fix?

Comment 17 Shyamsundar 2022-11-08 12:14:09 UTC
*** Bug 2132566 has been marked as a duplicate of this bug. ***

Comment 33 errata-xmlrpc 2023-01-31 00:19:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.12.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:0551

Comment 34 Red Hat Bugzilla 2023-12-08 04:31:01 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.

