Bug 1009122 - replication stops with excessive clock skew
Summary: replication stops with excessive clock skew
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Rich Megginson
QA Contact: Sankar Ramalingam
URL:
Whiteboard:
Depends On:
Blocks: 1009679 1061410
Reported: 2013-09-17 18:04 UTC by Rich Megginson
Modified: 2018-12-09 17:11 UTC (History)

Fixed In Version: 389-ds-base-1.2.11.15-34.el6
Doc Type: Bug Fix
Doc Text:
Cause: The multi-master replication protocol keeps a cumulative counter of the relative time offsets between servers. If the system time is adjusted by more than one day (ntp issues, vm issues), the counter will be off by more than one day.
Consequence: A replication consumer will refuse to accept changes from a master that has a time offset more than 1 day. Replication will be broken from that supplier to the consumer.
Fix: A new configuration attribute was added to cn=config - nsslapd-ignore-time-skew. The default is "off". If this attribute is set to "on", a replication consumer will allow replication to proceed despite excessive time skew. An error message will still be logged, warning the admin about the time skew issue.
Result: When nsslapd-ignore-time-skew is set to "on", replication will proceed despite excessive time skew.
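As an illustration of the fix described in the Doc Text, the following minimal sketch builds an LDIF modify record that sets nsslapd-ignore-time-skew on cn=config. Only the entry DN, the attribute name, and the "on"/"off" values come from this report; the helper name ignore_time_skew_ldif is hypothetical, and how the record is applied (client tool, bind DN, host, port) depends on the deployment.

```python
def ignore_time_skew_ldif(enabled: bool) -> str:
    """Build an LDIF modify record for nsslapd-ignore-time-skew on cn=config.

    Hypothetical helper for illustration; it only produces text and does
    not contact a directory server.
    """
    value = "on" if enabled else "off"  # the report gives "off" as the default
    lines = [
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-ignore-time-skew",
        f"nsslapd-ignore-time-skew: {value}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the record so it can be passed to an LDAP modify client.
    print(ignore_time_skew_ldif(True), end="")
```

The generated text could be passed to an LDAP client such as ldapmodify. The test log in this report restarts the servers after turning the attribute on; the report does not state whether a restart is strictly required.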
Clone Of:
Clones: 1009679
Environment:
Last Closed: 2014-10-14 07:50:19 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1385 normal SHIPPED_LIVE 389-ds-base bug fix and enhancement update 2014-10-14 01:27:42 UTC

Description Rich Megginson 2013-09-17 18:04:55 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47516

If the CSN generator clock skew is over 1 day, replication stops.  Users need to be able to continue to replicate with the high clock skew.  There should be a configuration attr that allows replication to continue despite excessive clock skew.
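The behaviour requested above can be sketched roughly as follows. This is not the 389-ds-base implementation: the function name, its arguments, and the choice to consider only a remote clock that is ahead of the local one are assumptions made for illustration. Only the one-day limit, the opt-in override, and the fact that a warning is still logged come from this report.

```python
MAX_SKEW_SECONDS = 24 * 60 * 60  # one day, the limit named in this report


def accept_remote_csn_time(local_time, remote_time, ignore_time_skew=False):
    """Decide whether a consumer accepts a supplier's CSN timestamp.

    Returns (accepted, warning). Hypothetical sketch: times are plain
    seconds since the epoch, and only a remote clock running ahead of
    the local clock is treated as skew.
    """
    skew = remote_time - local_time
    if skew <= MAX_SKEW_SECONDS:
        return True, None
    # Skew is excessive: always produce a warning, and accept the change
    # only when the administrator has opted in.
    warning = f"excessive clock skew: remote is {skew} seconds ahead"
    return ignore_time_skew, warning
```

With the override off, a supplier more than a day ahead is rejected; with it on, the same change is accepted but the warning is still returned for logging.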

This is becoming a much bigger problem now that many users are using VMs, which are notorious for having system clock/time/ntp issues.

Comment 1 Rich Megginson 2013-09-17 18:06:25 UTC
Red Hat IT is requesting a hot fix, which means this bug will need to be officially fixed and supported in rhel 6.6.

Comment 2 Rich Megginson 2013-09-18 21:29:59 UTC
external 389-ds-base-1.2.11 commit

commit 9dc7a4630cb13f1da074183208b1b34962fe8101
Author: Rich Megginson <rmeggins@redhat.com>
Date:   Wed Sep 18 12:32:23 2013 -0600

internal
To ssh://git.app.eng.bos.redhat.com/srv/git/389-ds-base.git
 * [new branch]      rhel-6.4-bug1009122 -> rhel-6.4-bug1009122
commit 9c657d5d72569af8c650170913328d3fc5f9b3d9
Author: Rich Megginson <rmeggins@redhat.com>
Date:   Wed Sep 18 12:32:23 2013 -0600
To ssh://git.app.eng.bos.redhat.com/srv/git/389-ds-base.git
 * [new tag]         389-ds-base-1.2.11.15-22.1-bug1009122 -> 389-ds-base-1.2.11.15-22.1-bug1009122

Comment 3 Rich Megginson 2013-09-18 21:59:05 UTC
added test to TET trunk
will need to cherry-pick (merge and ci) this change to the rhel7 and rhel6 branch when the fix is added to rhel7.0 and rhel6.6

r8122 | rmeggins@REDHAT.COM | 2013-09-18 15:57:47 -0600 (Wed, 18 Sep 2013) | 5 lines

Bug 1009122 - replication stops with excessive clock skew
https://bugzilla.redhat.com/show_bug.cgi?id=1009122

added test bug1009122 to test the new nsslapd-ignore-time-skew attribute

Comment 4 Rich Megginson 2014-01-16 18:48:28 UTC
The previous fix makes replication ignore time skew errors, but does not ensure that the CSN generator will continue to issue CSNs that exceed its built-in time skew limit. We need to make sure that the CSN generator will never issue duplicate CSNs or regress CSNs.
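The requirement that the generator never issue duplicate or regressing CSNs can be illustrated with a small monotonic generator. This is a simplified sketch, not the server's code: the class name, the (time, sequence, replica id) tuple layout, and the unbounded sequence counter are assumptions chosen to keep the example short.

```python
class CsnGenerator:
    """Toy CSN generator whose output never repeats or moves backwards.

    Hypothetical sketch: a CSN is modelled as the tuple
    (timestamp, sequence number, replica id), compared in that order.
    """

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.last_time = 0  # highest timestamp issued so far
        self.seq = 0        # sequence number within last_time

    def next_csn(self, now):
        if now > self.last_time:
            # The clock moved forward: start a new timestamp.
            self.last_time = now
            self.seq = 0
        else:
            # The clock stood still or jumped back: keep the old
            # timestamp and bump the sequence number instead.
            self.seq += 1
        return (self.last_time, self.seq, self.replica_id)
```

Because the timestamp is never lowered and the sequence number increases whenever the timestamp is reused, each CSN compares strictly greater than the previous one, even if the system clock is set back. A real generator would also have to handle sequence number overflow, which this sketch ignores.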

Comment 5 Rich Megginson 2014-01-20 18:28:56 UTC
New builds available:
http://download.devel.redhat.com/brewroot/packages/389-ds-base/1.2.11.15/31.2.el6_5.bug1009122/

Please upgrade to these new builds ASAP

Comment 6 Rich Megginson 2014-01-22 16:20:24 UTC
testcases/DS/6.0/mmrepl/accept/accept.sh
------------------------------------------------------------------------
r8283 | rmeggins@REDHAT.COM | 2014-01-22 09:16:11 -0700 (Wed, 22 Jan 2014) | 3 lines

Bug 1009122
Additional debugging

Comment 7 Rich Megginson 2014-01-24 23:45:21 UTC
Customer in case 01023323 has been given a hotfix:
http://download.devel.redhat.com/brewroot/packages/389-ds-base/1.2.11.15/31.3.el6_5.citrix/x86_64
customer reports hotfix packages are working fine

Comment 11 Viktor Ashirov 2014-07-17 14:03:27 UTC
520|0 51 10030 1 2|----------------- Starting Test bug1009122 -------------------------
520|0 51 10030 1 3|Replication breaks when there is excessive clock skew.
520|0 51 10030 1 4|first, shutdown the masters
520|0 51 10030 1 5|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 14|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 15|                                                      
520|0 51 10030 1 16|stopped slapd-s1
520|0 51 10030 1 17|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 71|stopped slapd-s2
520|0 51 10030 1 72|next, grab the nsState value on S1 to save for later
520|0 51 10030 1 73|change the nsState value on S1 to be bogus
520|0 51 10030 1 74|changed nsstate
520|0 51 10030 1 75|start the servers
520|0 51 10030 1 76|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 81|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 82|                                                      
520|0 51 10030 1 83|stopped slapd-s1
520|0 51 10030 1 84|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 89|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 90|                                                      
520|0 51 10030 1 91|stopped slapd-s2
520|0 51 10030 1 92|do a change on S1
520|0 51 10030 1 93|verify that the change does not replicate to S2
520|0 51 10030 1 94|good S2 does not contain the change
520|0 51 10030 1 95|turn nsslapd-ignore-time-skew: on
520|0 51 10030 1 96|do a change on S2
520|0 51 10030 1 97|restart the servers
520|0 51 10030 1 98|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 107|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 108|                                                      
520|0 51 10030 1 109|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 114|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 115|                                                      
520|0 51 10030 1 116|stopped slapd-s1
520|0 51 10030 1 117|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 171|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 176|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 177|                                                      
520|0 51 10030 1 178|stopped slapd-s2
520|0 51 10030 1 179|do 3 changes on S1
520|0 51 10030 1 180|wait for changes to replicate to S2
520|0 51 10030 1 181|do 3 changes on S2
520|0 51 10030 1 182|verify that the changes replicate to S2
520|0 51 10030 1 183|good S2 contains change from S1
520|0 51 10030 1 184|reset and cleanup
520|0 51 10030 1 185|-----------------StopSlapd: Called -----------------
520|0 51 10030 1 194|-----------------StopSlapd: Completed-----------------
520|0 51 10030 1 195|                                                      
520|0 51 10030 1 196|stopped slapd-s1
520|0 51 10030 1 197|changed nsstate
520|0 51 10030 1 198|start the servers
520|0 51 10030 1 199|-----------------StartSlapd: Called -----------------
520|0 51 10030 1 204|-----------------StartSlapd: Completed-----------------
520|0 51 10030 1 205|                                                      
520|0 51 10030 1 206|stopped slapd-s1
520|0 51 10030 1 207|TestCase [bug1009122] result-> [PASS]

Testcase passes, hence marking as verified.

Comment 12 errata-xmlrpc 2014-10-14 07:50:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1385.html

