Bug 1349571

Summary: Improve MMR replication convergence
Product: Red Hat Enterprise Linux 7 Reporter: Noriko Hosoi <nhosoi>
Component: 389-ds-base Assignee: mreynolds
Status: CLOSED ERRATA QA Contact: Viktor Ashirov <vashirov>
Severity: urgent Docs Contact: Petr Bokoc <pbokoc>
Priority: urgent    
Version: 7.3 CC: arubin, ekeck, mkolaja, mreynolds, nkinder, pbokoc, rmeggins, salmy, snagar
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 389-ds-base-1.3.5.6-1.el7 Doc Type: Enhancement
Doc Text:
New attribute for configuring replica release timeout

In a multi-master replication environment where multiple masters receive updates at the same time, it was previously possible for a single master to obtain exclusive access to a replica and hold it for a very long time due to problems such as a slow network connection. During this time, other masters were blocked from accessing the same replica, which considerably slowed down the replication process. This update adds a new configuration attribute, "nsds5ReplicaReleaseTimeout", which can be used to specify a timeout in seconds. After the specified timeout period passes, the master releases the replica, allowing other masters to access it and send their updates.
Story Points: ---
Clone Of:
: 1351323 1358392 Environment:
Last Closed: 2016-11-03 20:43:11 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1351323, 1356898    

Description Noriko Hosoi 2016-06-23 17:12:47 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/48636

Replication latency, especially over a WAN, can get worse when several masters receive updates at the same time.  One master takes exclusive access to a replica and does not release it for a very long time.  This blocks the other masters from sending their updates to that consumer and adds to the replication latency, because those updates then have to travel back and forth among all the other masters and consumers.  See the bugzilla for more detailed info.

We need a way to notify a master that it has been holding exclusive access to a replica for too long, and that it needs to yield so other masters can start sending some of their updates to that replica.
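
The yielding behavior requested above can be illustrated with a small model. This is only a sketch of the idea, not the Directory Server implementation: the `ReplicaSession` class, its `send_updates` method, and the convention that a timeout of 0 disables the check are all hypothetical names and assumptions introduced for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReplicaSession:
    """Hypothetical model of one master's exclusive session on a replica."""

    release_timeout: float  # seconds; 0 disables the check (assumption of this sketch)
    clock: Callable[[], float] = time.monotonic
    sent: List[str] = field(default_factory=list)

    def send_updates(self, updates: List[str]) -> List[str]:
        """Send updates until finished or until the release timeout expires.

        Returns the updates that were not sent, so the master can retry
        them in a later session after other masters have had a turn.
        """
        started = self.clock()
        for index, update in enumerate(updates):
            elapsed = self.clock() - started if self.release_timeout > 0 else 0
            if self.release_timeout > 0 and elapsed >= self.release_timeout:
                # Held the replica too long: yield and keep the rest for later.
                return updates[index:]
            self.sent.append(update)
        return []
```

A master that gets back a non-empty list would release the replica and queue the remaining updates for its next session, which is what lets the other masters make progress.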

Comment 1 mreynolds 2016-06-23 18:26:06 UTC
Fixed upstream.

Design doc for new feature:

http://www.port389.org/docs/389ds/design/repl-conv-design.html
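
For anyone who wants to try the new "nsds5ReplicaReleaseTimeout" attribute, the sketch below builds an LDIF modify request that could be passed to a tool such as ldapmodify. The helper name `build_release_timeout_ldif` is hypothetical, and the entry DN layout (`cn=replica,cn="<suffix>",cn=mapping tree,cn=config`) is an assumption based on the usual replica configuration entry; check the DN and the permitted values against your own deployment and the design doc before using it.

```python
def build_release_timeout_ldif(suffix: str, timeout_seconds: int) -> str:
    """Return LDIF text that sets nsds5ReplicaReleaseTimeout for one suffix.

    The DN layout below is an assumption, not taken from this bug report.
    """
    if timeout_seconds < 0:
        raise ValueError("timeout_seconds must not be negative")
    # Assumed location of the replica configuration entry for the suffix.
    replica_dn = f'cn=replica,cn="{suffix}",cn=mapping tree,cn=config'
    lines = [
        f"dn: {replica_dn}",
        "changetype: modify",
        "replace: nsds5ReplicaReleaseTimeout",
        f"nsds5ReplicaReleaseTimeout: {timeout_seconds}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example suffix and value chosen for illustration only.
    print(build_release_timeout_ldif("dc=example,dc=com", 30), end="")
```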

Comment 2 Noriko Hosoi 2016-06-24 18:22:40 UTC
Justification: An important customer reported the problem.
(See also https://bugzilla.redhat.com/show_bug.cgi?id=1157799)

This improvement benefits all customers who deploy Directory Server or IPA with replication.

Comment 17 errata-xmlrpc 2016-11-03 20:43:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html