Bug 2154351

Summary: [RDR] [Ceph Fix Tracker BZ #2119217] lastSyncTime for all VR's is several hours behind lastUpdateTime
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Aman Agrawal <amagrawa>
Component: ceph Assignee: Adam Kupczyk <akupczyk>
ceph sub component: RBD-Mirror QA Contact: Sidhant Agrawal <sagrawal>
Status: VERIFIED --- Docs Contact:
Severity: high    
Priority: unspecified CC: aclewett, akupczyk, asuryana, bniver, ebenahar, ekuric, idryomov, kramdoss, kseeger, muagarwa, nojha, odf-bz-bot, prsurve, sgaddam, sheggodu, skitt, sostapov, srangana
Version: 4.12 Flags: kramdoss: needinfo? (ekuric)
amagrawa: needinfo? (aclewett)
Target Milestone: ---   
Target Release: ODF 4.14.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.13.0-202 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2218013, 2119217    
Bug Blocks:    

Comment 59 Aswin Suryanarayanan 2023-06-29 16:22:14 UTC
@idryomov The workaround is to restart the route agent pods running on all nodes of the cluster that was brought down. The connectivity issue will be present between pods that are on nodes other than the gateway node. This should not lead to high latency. If there is connectivity but latency is high, it should not be related to this issue and will not be addressed by https://issues.redhat.com/browse/ACM-5640

Comment 61 Aswin Suryanarayanan 2023-06-29 20:27:55 UTC
@idryomov Submariner does not have any built-in tools for that; you will have to do it by manually deploying the pods.

@skitt or @sgaddam may have some thoughts on this.

Comment 68 krishnaram Karthick 2023-07-04 11:41:06 UTC
@Ilya - Thank you for pointing out the cause of the issue.

In a way, I am glad we arrived here. Would it make sense to have a mapping of "data to be synced" vs "available bandwidth" and what one can expect on a given configuration?
1) We can request Elvir (or someone from perf engineering) to come up with this information. This is good information to publish to customers to set expectations on the RPO. Adding needinfo on Elvir to share his thoughts.
2) For now, would it be possible for you to help QE with theoretical numbers on this mapping?
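
As a rough illustration of the kind of theoretical numbers requested above, the sketch below estimates the ideal time to transfer a given amount of changed data over a given link, and checks whether that fits within a sync interval. It is only a back-of-the-envelope calculation, not a measured result: the function names, the `efficiency` factor, and the sample figures are all hypothetical, and it ignores latency, protocol overhead, and the time RBD mirroring spends computing snapshot diffs.

```python
def transfer_seconds(data_gib: float, bandwidth_mbps: float, efficiency: float = 1.0) -> float:
    """Ideal time, in seconds, to move data_gib GiB over a bandwidth_mbps (megabits/s) link.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    bandwidth that is actually usable; 1.0 means a perfect link.
    """
    if data_gib < 0:
        raise ValueError("data_gib must be non-negative")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits_to_send = data_gib * 1024**3 * 8            # GiB -> bits
    usable_bits_per_second = bandwidth_mbps * 1_000_000 * efficiency
    return bits_to_send / usable_bits_per_second


def fits_in_interval(data_gib: float, bandwidth_mbps: float,
                     interval_minutes: float, efficiency: float = 1.0) -> bool:
    """True if the changed data can, in theory, be sent within one sync interval."""
    return transfer_seconds(data_gib, bandwidth_mbps, efficiency) <= interval_minutes * 60


if __name__ == "__main__":
    # Hypothetical sample inputs: changed data per interval (GiB) on a 1000 Mbit/s link,
    # with a 5-minute sync interval.
    for changed_gib in (1, 10, 100):
        seconds = transfer_seconds(changed_gib, 1000)
        ok = fits_in_interval(changed_gib, 1000, interval_minutes=5)
        print(f"{changed_gib} GiB: {seconds:.1f} s, fits in 5 min: {ok}")
```

If the estimated transfer time exceeds the sync interval, lastSyncTime would be expected to fall progressively behind, so a table of such estimates could serve as an upper bound on what a given configuration can sustain until measured numbers are available.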