@idryomov The workaround is to restart the route agent pods running on all nodes of the cluster that was brought down. The connectivity issue will be present between pods that are on nodes other than the gateway node. This should not lead to high latency. If there is connectivity but latency is high, it is not related to this issue and will not be addressed by https://issues.redhat.com/browse/ACM-5640
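For reference, a minimal sketch of that restart using the Kubernetes Python client. The namespace and label selector here (submariner-operator, app=submariner-routeagent) are the usual Submariner defaults, but verify them against your deployment:

```python
# Restart the Submariner route agents by deleting their pods; since they
# are managed by a DaemonSet, the DaemonSet recreates one pod per node.
from kubernetes import client, config

config.load_kube_config()  # uses the current kubeconfig context
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(
    "submariner-operator", label_selector="app=submariner-routeagent"
)
for pod in pods.items:
    v1.delete_namespaced_pod(pod.metadata.name, "submariner-operator")
```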
@idryomov Submariner does not have any built-in tools for that; you will have to do it by manually deploying the pods. @skitt or @sgaddam may have some thoughts on this.
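As an illustration only (none of this ships with Submariner; the pod name, image, and namespace below are assumptions), one way to deploy such a pod by hand is via the Kubernetes Python client, e.g. an iperf3 server that a client pod on the other cluster could target over the tunnel to measure bandwidth:

```python
# Sketch: create an iperf3 server pod for manual inter-cluster
# bandwidth testing. Image and names are illustrative assumptions.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

server_pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="iperf3-server", labels={"app": "iperf3"}),
    spec=client.V1PodSpec(containers=[
        client.V1Container(
            name="iperf3",
            image="networkstatic/iperf3",  # assumed public iperf3 image
            args=["-s"],                   # run iperf3 in server mode
            ports=[client.V1ContainerPort(container_port=5201)],
        )
    ]),
)
v1.create_namespaced_pod("default", server_pod)
```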
@Ilya - Thank you for pointing out the cause of the issue. In a way, I am glad we arrived here. Would it make sense to have a mapping of "data to be synced" vs. "available bandwidth", and what one can expect on a given configuration?
1) We can ask Elvir (or someone from performance engineering) to come up with this information. This would be good information to publish to customers to set expectations on the RPO. Adding needinfo on Elvir to share his thoughts.
2) For now, would it be possible for you to help QE with theoretical numbers for this mapping?
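Until we have measured numbers, a back-of-the-envelope sketch of that mapping (all figures are illustrative assumptions, including the 70% derate for tunnel/encryption overhead, not measured values):

```python
# Rough model: time to replicate a given amount of changed data over a
# link of a given speed, which lower-bounds the achievable RPO.
def sync_time_seconds(delta_gib: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Time to ship delta_gib of changed data over a link_gbps link,
    derated by an assumed protocol/encryption efficiency factor."""
    delta_bits = delta_gib * 8 * 1024**3
    effective_bps = link_gbps * 1e9 * efficiency
    return delta_bits / effective_bps

# e.g. 50 GiB of changes over a 1 Gbps tunnel at 70% efficiency:
print(f"{sync_time_seconds(50, 1.0) / 60:.1f} minutes")  # ~10.2 minutes
```

In other words, an RPO shorter than the computed sync time is not achievable for that change rate and link speed.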
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:6832
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.