Bug 1009122 - replication stops with excessive clock skew
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assigned To: Rich Megginson
QA Contact: Sankar Ramalingam
Depends On:
Blocks: 1009679 1061410
Reported: 2013-09-17 14:04 EDT by Rich Megginson
Modified: 2016-03-11 10:44 EST (History)
Fixed In Version: 389-ds-base-1.2.11.15-34.el6
Doc Type: Bug Fix
Doc Text:
Cause: The multi-master replication protocol keeps a cumulative counter of the relative time offsets between servers. If the system time is adjusted by more than one day (for example, because of NTP or VM issues), the counter will be off by more than one day.
Consequence: A replication consumer refuses to accept changes from a master whose time offset is more than one day, so replication from that supplier to the consumer breaks.
Fix: A new configuration attribute, nsslapd-ignore-time-skew, was added to cn=config. The default is "off". If this attribute is set to "on", a replication consumer allows replication to proceed despite excessive time skew. An error message is still logged, warning the administrator about the time skew.
Result: When nsslapd-ignore-time-skew is set to "on", replication proceeds despite excessive time skew.
Clones: 1009679
Last Closed: 2014-10-14 03:50:19 EDT

External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2014:1385
Priority: normal
Status: SHIPPED_LIVE
Summary: 389-ds-base bug fix and enhancement update
Last Updated: 2014-10-13 21:27:42 EDT
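
The Doc Text above names the new attribute (nsslapd-ignore-time-skew on cn=config, default "off"). As a minimal sketch, the following Python helper builds the LDIF modify record that would switch it on or off. The function name is hypothetical and the script only produces text; how the LDIF is applied (for example with an LDAP client such as ldapmodify, bound as an administrator) is an assumption, not something this bug report specifies.

```python
def ignore_time_skew_ldif(enabled: bool) -> str:
    """Return an LDIF modify record for nsslapd-ignore-time-skew on cn=config.

    Hypothetical helper for illustration; only the attribute name, the entry
    (cn=config), and the values "on"/"off" come from the bug report.
    """
    value = "on" if enabled else "off"
    return "\n".join([
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-ignore-time-skew",
        f"nsslapd-ignore-time-skew: {value}",
        "",  # trailing newline terminates the record
    ])


if __name__ == "__main__":
    # Print the record that enables the setting.
    print(ignore_time_skew_ldif(True), end="")
```

The test journal later in this report restarts the servers after turning the attribute on; whether a restart is strictly required is not stated here.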

Description Rich Megginson 2013-09-17 14:04:55 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47516

If the CSN generator clock skew is over 1 day, replication stops. Users need to be able to continue replicating despite the high clock skew. There should be a configuration attribute that allows replication to continue despite excessive clock skew.

This is becoming a much bigger problem now that many users are using VMs, which are notorious for having system clock/time/ntp issues.
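
The requested behavior can be sketched as a small decision function. This is a simplified model, not the 389-ds-base implementation: the class and method names are hypothetical, the skew is treated as a non-negative number of seconds, and the only facts taken from this report are the one-day limit, the on/off switch that defaults to off, and the warning that is still logged when the limit is exceeded.

```python
from dataclasses import dataclass
from typing import Tuple

ONE_DAY_SECONDS = 24 * 60 * 60  # the "over 1 day" limit named in this report


@dataclass
class Consumer:
    """Toy replication consumer (hypothetical model for illustration)."""

    # Mirrors the requested configuration attribute; off by default.
    ignore_time_skew: bool = False

    def accepts(self, skew_seconds: int) -> Tuple[bool, str]:
        """Return (accepted, log_message) for a supplier with the given skew."""
        if skew_seconds <= ONE_DAY_SECONDS:
            return True, ""
        warning = f"excessive clock skew: {skew_seconds}s"
        if self.ignore_time_skew:
            # Replication proceeds, but the admin is still warned.
            return True, warning
        # Default behavior: replication from this supplier stops.
        return False, warning


if __name__ == "__main__":
    print(Consumer().accepts(2 * ONE_DAY_SECONDS))
    print(Consumer(ignore_time_skew=True).accepts(2 * ONE_DAY_SECONDS))
```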
Comment 1 Rich Megginson 2013-09-17 14:06:25 EDT
Red Hat IT is requesting a hotfix, which means this bug will need to be officially fixed and supported in RHEL 6.6.
Comment 2 Rich Megginson 2013-09-18 17:29:59 EDT
external 389-ds-base-1.2.11 commit

commit 9dc7a4630cb13f1da074183208b1b34962fe8101
Author: Rich Megginson <rmeggins@redhat.com>
Date:   Wed Sep 18 12:32:23 2013 -0600

internal
To ssh://git.app.eng.bos.redhat.com/srv/git/389-ds-base.git
 * [new branch]      rhel-6.4-bug1009122 -> rhel-6.4-bug1009122
commit 9c657d5d72569af8c650170913328d3fc5f9b3d9
Author: Rich Megginson <rmeggins@redhat.com>
Date:   Wed Sep 18 12:32:23 2013 -0600
To ssh://git.app.eng.bos.redhat.com/srv/git/389-ds-base.git
 * [new tag]         389-ds-base-1.2.11.15-22.1-bug1009122 -> 389-ds-base-1.2.11.15-22.1-bug1009122
Comment 3 Rich Megginson 2013-09-18 17:59:05 EDT
Added test to TET trunk.
Will need to cherry-pick (merge and ci) this change to the rhel7 and rhel6 branches when the fix is added to RHEL 7.0 and RHEL 6.6.

r8122 | rmeggins@REDHAT.COM | 2013-09-18 15:57:47 -0600 (Wed, 18 Sep 2013) | 5 lines

Bug 1009122 - replication stops with excessive clock skew
https://bugzilla.redhat.com/show_bug.cgi?id=1009122

added test bug1009122 to test the new nsslapd-ignore-time-skew attribute
Comment 4 Rich Megginson 2014-01-16 13:48:28 EST
The previous fix makes replication ignore time skew errors, but does not ensure that the CSN generator will continue to issue CSNs that exceed its built-in time skew limit. We need to make sure that the CSN generator will never issue duplicate CSNs or regress CSNs.
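
The requirement stated above (never issue duplicate CSNs, never regress) can be illustrated with a toy generator. This is a sketch under stated assumptions, not the actual fix: the class is hypothetical, and the string layout used here (8 hex digits of time, 4 of sequence number, 4 of replica ID, 4 of sub-sequence) is an assumption about the CSN format that this report does not confirm. The idea is that when the clock stalls or jumps backwards, the generator keeps its last timestamp and bumps the sequence number instead.

```python
class CsnGenerator:
    """Toy CSN generator: values never repeat and never go backwards."""

    MAX_SEQ = 0xFFFF  # sequence field is assumed to be 4 hex digits

    def __init__(self, replica_id: int) -> None:
        self.replica_id = replica_id
        self.last_time = 0
        self.seq = 0

    def next_csn(self, now: int) -> str:
        """Return the next CSN given the current time in seconds."""
        if now > self.last_time:
            # Clock moved forward: adopt it and restart the sequence.
            self.last_time = now
            self.seq = 0
        else:
            # Clock stalled or went backwards: keep the old timestamp.
            self.seq += 1
            if self.seq > self.MAX_SEQ:
                # Sequence exhausted: advance the timestamp artificially.
                self.last_time += 1
                self.seq = 0
        return f"{self.last_time:08x}{self.seq:04x}{self.replica_id:04x}{0:04x}"


if __name__ == "__main__":
    gen = CsnGenerator(replica_id=1)
    # The third and fifth timestamps simulate the clock jumping backwards.
    for t in (1000, 1000, 900, 1001, 5):
        print(gen.next_csn(t))
```

Because every field has a fixed width, comparing the strings lexicographically gives the same order as comparing (time, sequence) pairs, which is what makes the monotonicity easy to check.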
Comment 5 Rich Megginson 2014-01-20 13:28:56 EST
New builds available:
http://download.devel.redhat.com/brewroot/packages/389-ds-base/1.2.11.15/31.2.el6_5.bug1009122/

Please upgrade to these new builds ASAP
Comment 6 Rich Megginson 2014-01-22 11:20:24 EST
testcases/DS/6.0/mmrepl/accept/accept.sh
------------------------------------------------------------------------
r8283 | rmeggins@REDHAT.COM | 2014-01-22 09:16:11 -0700 (Wed, 22 Jan 2014) | 3 lines

Bug 1009122
Additional debugging
Comment 7 Rich Megginson 2014-01-24 18:45:21 EST
Customer in case 01023323 has been given a hotfix:
http://download.devel.redhat.com/brewroot/packages/389-ds-base/1.2.11.15/31.3.el6_5.citrix/x86_64
Customer reports that the hotfix packages are working fine.
Comment 11 Viktor Ashirov 2014-07-17 10:03:27 EDT
520|0 51 10030 1 2|----------------- Starting Test bug1009122 -------------------------
520|0 51 10030 1 3|Replication breaks when there is excessive clock skew.
520|0 51 10030 1 4|first, shutdown the masters
520|0 51 10030 1 5|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 14|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 15|                                                      
520|0 51 10030 1 16|stopped slapd-s1
520|0 51 10030 1 17|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 71|stopped slapd-s2
520|0 51 10030 1 72|next, grab the nsState value on S1 to save for later
520|0 51 10030 1 73|change the nsState value on S1 to be bogus
520|0 51 10030 1 74|changed nsstate
520|0 51 10030 1 75|start the servers
520|0 51 10030 1 76|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 81|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 82|                                                      
520|0 51 10030 1 83|stopped slapd-s1
520|0 51 10030 1 84|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 89|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 90|                                                      
520|0 51 10030 1 91|stopped slapd-s2
520|0 51 10030 1 92|do a change on S1
520|0 51 10030 1 93|verify that the change does not replicate to S2
520|0 51 10030 1 94|good S2 does not contain the change
520|0 51 10030 1 95|turn nsslapd-ignore-time-skew: on
520|0 51 10030 1 96|do a change on S2
520|0 51 10030 1 97|restart the servers
520|0 51 10030 1 98|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 107|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 108|                                                      
520|0 51 10030 1 109|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 114|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 115|                                                      
520|0 51 10030 1 116|stopped slapd-s1
520|0 51 10030 1 117|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 171|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 176|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 177|                                                      
520|0 51 10030 1 178|stopped slapd-s2
520|0 51 10030 1 179|do 3 changes on S1
520|0 51 10030 1 180|wait for changes to replicate to S2
520|0 51 10030 1 181|do 3 changes on S2
520|0 51 10030 1 182|verify that the changes replicate to S2
520|0 51 10030 1 183|good S2 contains change from S1
520|0 51 10030 1 184|reset and cleanup
520|0 51 10030 1 185|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 194|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 195|                                                      
520|0 51 10030 1 196|stopped slapd-s1
520|0 51 10030 1 197|changed nsstate
520|0 51 10030 1 198|start the servers
520|0 51 10030 1 199|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 204|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 205|                                                      
520|0 51 10030 1 206|stopped slapd-s1
520|0 51 10030 1 207|TestCase [bug1009122] result-> [PASS]

Testcase passes, hence marking as verified.
Comment 12 errata-xmlrpc 2014-10-14 03:50:19 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1385.html
