Bug 233643 - MMR breaks with time skew errors
Summary: MMR breaks with time skew errors
Keywords:
Status: CLOSED DUPLICATE of bug 233642
Alias: None
Product: 389
Classification: Retired
Component: Replication - General
Version: 1.0.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Rich Megginson
QA Contact: Orla Hegarty
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-03-23 15:41 UTC by Chris St. Pierre
Modified: 2007-03-23 15:54 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-03-23 15:54:37 UTC
Embargoed:


Description Chris St. Pierre 2007-03-23 15:41:53 UTC
Tuesday night (20 March), replication suddenly ceased between our four nodes, 
with errors similar to these on all nodes:

[20/Mar/2007:17:13:26 -0500] NSMMReplicationPlugin - agmt="cn="Replication to groucho (o=isp)"" (groucho:389): Unable to acquire replica: Excessive clock skew between the supplier and the consumer. Replication is aborting.
[20/Mar/2007:17:13:26 -0500] NSMMReplicationPlugin - agmt="cn="Replication to groucho (o=isp)"" (groucho:389): Incremental update failed and requires administrator action
[20/Mar/2007:17:13:26 -0500] NSMMReplicationPlugin - agmt="cn="Replication to zeppo.nebrwesleyan.edu (o=isp)"" (zeppo:389): Unable to acquire replica: Excessive clock skew between the supplier and the consumer. Replication is aborting.
[20/Mar/2007:17:13:26 -0500] NSMMReplicationPlugin - agmt="cn="Replication to zeppo.nebrwesleyan.edu (o=isp)"" (zeppo:389): Incremental update failed and requires administrator action
[20/Mar/2007:17:13:27 -0500] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[20/Mar/2007:17:13:27 -0500] NSMMReplicationPlugin - conn=1600790 op=4983 replica="o=isp": Unable to acquire replica: error: excessive clock skew
[20/Mar/2007:17:23:56 -0500] NSMMReplicationPlugin - agmt="cn="Replication to harpo.nebrwesleyan.edu (o=isp)"" (harpo:389): Unable to acquire replica: Excessive clock skew between the supplier and the consumer. Replication is aborting.
[20/Mar/2007:17:23:56 -0500] NSMMReplicationPlugin - agmt="cn="Replication to harpo.nebrwesleyan.edu (o=isp)"" (harpo:389): Incremental update failed and requires administrator action
[20/Mar/2007:17:58:27 -0500] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[20/Mar/2007:17:58:27 -0500] NSMMReplicationPlugin - conn=1615833 op=3276 replica="o=isp": Unable to acquire replica: error: excessive clock skew

Subsequent efforts to resume replication have come to naught.  (See this thread: 
https://www.redhat.com/archives/fedora-directory-users/2007-March/msg00100.html 
for more details on getting replication working again at all.)  Each time I get 
replication working, the same error messages return a few minutes after the 
cluster goes back into production, and replication stops.  The messages do not 
appear while the machines are not actively serving queries.

In all cases, the 'value' on the csngen_adjust_time line is 86401 and the limit 
is 86400.  All four nodes have the same clock times, and are running NTP against 
a local NTP server.
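
For anyone checking their own error logs for the same pattern, here is a minimal sketch that pulls the value and limit out of a csngen_adjust_time line. It assumes each log entry sits on a single line and that both numbers are in seconds (86400 is the number of seconds in one day); the function name parse_adjustment is made up for this example and is not part of the server.

```python
import re

# Matches the csngen_adjust_time message as it appears in the log excerpt above.
ADJUST_RE = re.compile(
    r"csngen_adjust_time: adjustment limit exceeded; "
    r"value - (?P<value>\d+), limit - (?P<limit>\d+)"
)

SECONDS_PER_DAY = 24 * 60 * 60  # 86400, assuming the limit is in seconds


def parse_adjustment(line):
    """Return (value, limit, overshoot) for a matching line, else None.

    Hypothetical helper for illustration only.
    """
    match = ADJUST_RE.search(line)
    if match is None:
        return None
    value = int(match.group("value"))
    limit = int(match.group("limit"))
    return value, limit, value - limit


sample = (
    "[20/Mar/2007:17:13:27 -0500] - csngen_adjust_time: "
    "adjustment limit exceeded; value - 86401, limit - 86400"
)
print(parse_adjustment(sample))
```

On the sample line the overshoot is one unit, i.e. the requested adjustment is just past the one-day limit.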

We are using four-way MMR with no read-only replicas.  We used the mmr.pl script 
to set up replication.  Each node replicates to all three other nodes, and all 
four nodes receive updates.
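
As a rough illustration of that topology, the sketch below enumerates the supplier-to-consumer pairs in a fully meshed four-node setup. Three host names come from the log excerpt; "node4" is a placeholder for the fourth host, which is not named here. This does not reproduce what mmr.pl does, it only counts the agreements the layout implies.

```python
from itertools import permutations

# groucho, zeppo and harpo appear in the logs; "node4" is a placeholder name.
nodes = ["groucho", "zeppo", "harpo", "node4"]

# Every ordered (supplier, consumer) pair of distinct nodes is one agreement.
agreements = list(permutations(nodes, 2))

for supplier, consumer in agreements:
    print(f"{supplier} -> {consumer}")

print(len(agreements), "agreements in total")
```

With four masters this gives 4 x 3 = 12 agreements, three outgoing per node.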

We have another database on all nodes that has continued to replicate without a 
problem.  It's only our "o=isp" base that has troubles.

This guy seems to have encountered a similar issue: 
http://www.mail-archive.com/fedora-directory-users/msg03614.html

Comment 1 Rich Megginson 2007-03-23 15:54:37 UTC

*** This bug has been marked as a duplicate of 233642 ***
