Bug 643937 - replication broken between 1.2.2 and
Summary: replication broken between 1.2.2 and
Alias: None
Product: 389
Classification: Retired
Component: Replication - General
Version: 1.2.6
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Nathan Kinder
QA Contact: Ben Levenson
Depends On:
Blocks: 389_1.2.7 639035
Reported: 2010-10-18 15:20 UTC by Robert Viduya
Modified: 2015-12-10 18:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2015-12-10 18:38:58 UTC

Attachments
Patch (1.70 KB, patch)
2010-10-18 16:36 UTC, Nathan Kinder
nkinder: review?
rmeggins: review+

Description Robert Viduya 2010-10-18 15:20:27 UTC
Description of problem:
Replication breaks between 2 masters, one running 1.2.2 and one running  The master fails trying to replicate to the 1.2.2 master. server's relevant log messages:

[14/Oct/2010:16:26:19 -0400] NSMMReplicationPlugin - agmt="cn=people moulin stefan" (stefan:636): Unable to parse the response to the startReplication extended operation. Replication is aborting.
[14/Oct/2010:16:26:19 -0400] NSMMReplicationPlugin - agmt="cn=people moulin stefan" (stefan:636): Incremental update failed and requires administrator action

1.2.2 server's relevant log messages:

[14/Oct/2010:16:26:19 -0400] conn=139 fd=65 slot=65 SSL connection from to
[14/Oct/2010:16:26:19 -0400] conn=139 op=0 BIND dn="cn=Replication Manager,cn=config" method=128 version=3
[14/Oct/2010:16:26:19 -0400] conn=139 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=replication manager,cn=config"
[14/Oct/2010:16:26:19 -0400] conn=139 op=1 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=2 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=3 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=3 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=4 EXT oid="2.16.840.1.113730.3.5.12"
[14/Oct/2010:16:26:19 -0400] conn=139 op=4 RESULT err=2 tag=120 nentries=0 etime=0
[14/Oct/2010:16:27:19 -0400] conn=139 op=5 UNBIND
[14/Oct/2010:16:27:19 -0400] conn=139 op=5 fd=65 closed - U1

It looks like the master is trying to make the 1.2.2 master use the 2.16.840.1.113730.3.5.12 extended operation and isn't happy when it gets refused.

The rootdse of the 1.2.2 server is:

objectClass: top
namingContexts: dc=gted,dc=gatech,dc=edu
namingContexts: o=netscaperoot
supportedExtension: 2.16.840.1.113730.3.5.7
supportedExtension: 2.16.840.1.113730.3.5.8
supportedExtension: 2.16.840.1.113730.3.5.10
supportedExtension: 2.16.840.1.113730.3.5.3
supportedExtension: 2.16.840.1.113730.3.5.5
supportedExtension: 2.16.840.1.113730.3.5.6
supportedExtension: 2.16.840.1.113730.3.5.9
supportedExtension: 2.16.840.1.113730.3.5.4
supportedControl: 2.16.840.1.113730.3.4.2
supportedControl: 2.16.840.1.113730.3.4.3
supportedControl: 2.16.840.1.113730.3.4.4
supportedControl: 2.16.840.1.113730.3.4.5
supportedControl: 1.2.840.113556.1.4.473
supportedControl: 2.16.840.1.113730.3.4.9
supportedControl: 2.16.840.1.113730.3.4.16
supportedControl: 2.16.840.1.113730.3.4.15
supportedControl: 2.16.840.1.113730.3.4.17
supportedControl: 2.16.840.1.113730.3.4.19
supportedControl: 1.2.840.113556.1.4.319
supportedControl: 2.16.840.1.113730.3.4.14
supportedControl: 2.16.840.1.113730.3.4.20
supportedControl: 2.16.840.1.113730.3.4.12
supportedControl: 2.16.840.1.113730.3.4.18
supportedControl: 2.16.840.1.113730.3.4.13
supportedSASLMechanisms: EXTERNAL
supportedSASLMechanisms: LOGIN
supportedSASLMechanisms: DIGEST-MD5
supportedSASLMechanisms: PLAIN
supportedSASLMechanisms: ANONYMOUS
supportedSASLMechanisms: CRAM-MD5
supportedSASLMechanisms: GSSAPI
supportedLDAPVersion: 2
supportedLDAPVersion: 3
vendorName: 389 Project
vendorVersion: 389-Directory/1.2.2 B2009.237.2054
dataversion: 020101015151757020101015151757020101015151757020101015151757
netscapemdsuffix: cn=ldap://dc=stefan,dc=iam,dc=gatech,dc=edu:389
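
The root DSE above lists supportedExtension values ending in 3.5.3 through 3.5.10 but not 2.16.840.1.113730.3.5.12, and the 1.2.2 access log shows the EXT request for that OID answered with err=2 (LDAP protocolError). A minimal sketch of checking a root DSE dump for an OID before relying on it follows. The function name rootdse_supports_extension is hypothetical and not part of 389 Directory Server, and the sketch assumes the dump is newline-separated "attr: value" text with the attribute name spelled exactly as printed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a 389 DS function): returns true if `oid`
 * appears as a supportedExtension value in `rootdse`, a newline-separated
 * dump of "attr: value" lines such as the one in this report.
 */
bool rootdse_supports_extension(const char *rootdse, const char *oid)
{
    static const char prefix[] = "supportedExtension: ";
    const size_t prefix_len = sizeof(prefix) - 1;

    assert(rootdse != NULL && oid != NULL);

    const size_t oid_len = strlen(oid);
    const char *line = rootdse;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Compare the whole value so that ...3.5.1 does not match ...3.5.12. */
        if (len == prefix_len + oid_len &&
            strncmp(line, prefix, prefix_len) == 0 &&
            strncmp(line + prefix_len, oid, oid_len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;          /* last line had no trailing newline */
        }
        line = end + 1;     /* advance to the next line */
    }
    return false;
}
```

Called with the dump above and "2.16.840.1.113730.3.5.12", this returns false, which matches the refusal seen in the 1.2.2 access log.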

Version-Release number of selected component (if applicable):
1.2.2 and

How reproducible:
Random; sometimes it works, sometimes it doesn't.

Steps to Reproduce:
Actual results:

Expected results:

Additional info:
A quick grep through the code shows that repl90consumer (a field of the Private_Repl_Protocol struct) is never initialized to zero. Neither is the field repl71consumer. However, the field repl50consumer is initialized to zero in Repl_5_Inc_Protocol_new.
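
A field that is never initialized holds whatever was already in memory if the struct comes from a non-zeroing allocator, which would be consistent with the intermittent behavior reported above. The following is a minimal sketch of the kind of initialization the observation points to, not the attached patch. The struct is a simplified stand-in for Private_Repl_Protocol: only the three field names come from this report, while the int field types and the names sketch_repl_protocol, sketch_protocol_new and sketch_consumer_is_90 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Simplified stand-in for Private_Repl_Protocol. Only the three field
 * names are taken from the bug report; their types are assumed.
 */
typedef struct sketch_repl_protocol {
    int repl50consumer;
    int repl71consumer;
    int repl90consumer;
} sketch_repl_protocol;

/*
 * Hypothetical constructor. calloc zero-fills the allocation, and the
 * explicit assignments document that all three flags start at zero
 * instead of only repl50consumer.
 */
sketch_repl_protocol *sketch_protocol_new(void)
{
    sketch_repl_protocol *prp = calloc(1, sizeof(*prp));
    if (prp == NULL) {
        return NULL;
    }
    prp->repl50consumer = 0;
    prp->repl71consumer = 0;
    prp->repl90consumer = 0;
    return prp;
}

/* Hypothetical accessor: nonzero once the flag has been set by the caller. */
int sketch_consumer_is_90(const sketch_repl_protocol *prp)
{
    assert(prp != NULL);
    return prp->repl90consumer != 0;
}
```

With every flag starting at zero, any later decision based on these fields depends only on what the code sets explicitly, not on leftover memory contents.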

Comment 1 Nathan Kinder 2010-10-18 16:36:02 UTC
Created attachment 454163

Comment 2 Nathan Kinder 2010-10-18 16:52:23 UTC
Pushed to master.  Thanks to Rich for his review!

Counting objects: 15, done.
Delta compression using 2 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 880 bytes, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   f39aab7..52632d7  master -> master

Comment 3 Noriko Hosoi 2011-07-26 22:17:38 UTC
Steps to verify
1. Install DS9.0 on hostA and DS8.2 on hostB.
2. Set up MMR (multi-master replication) between the 2 servers.
3. Run consumer initialization on DS9.0.
   Run any mod op on DS9.0 and check that the mod is replicated to 8.2.
   Run any mod op on DS8.2 and check that the mod is replicated to 9.0.
4. Run consumer initialization on DS8.2.
   Run any mod op on DS9.0 and check that the mod is replicated to 8.2.
   Run any mod op on DS8.2 and check that the mod is replicated to 9.0.
