Bug 643937

Summary: replication broken between 1.2.2 and 1.2.6.1
Product: [Retired] 389
Component: Replication - General
Version: 1.2.6
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Reporter: Robert Viduya <robert>
Assignee: Nathan Kinder <nkinder>
QA Contact: Ben Levenson <benl>
CC: nhosoi, nkinder, rmeggins
Doc Type: Bug Fix
Last Closed: 2015-12-10 18:38:58 UTC
Bug Blocks: 576869, 639035
Attachments: Patch (flags: nkinder: review?, rmeggins: review+)

Description Robert Viduya 2010-10-18 15:20:27 UTC
Description of problem:
Replication breaks between 2 masters, one running 1.2.2 and one running 1.2.6.1.  The 1.2.6.1 master fails trying to replicate to the 1.2.2 master.

1.2.6.1 server's relevant log messages:

[14/Oct/2010:16:26:19 -0400] NSMMReplicationPlugin - agmt="cn=people moulin stefan" (stefan:636): Unable to parse the response to the startReplication extended operation. Replication is aborting.
[14/Oct/2010:16:26:19 -0400] NSMMReplicationPlugin - agmt="cn=people moulin stefan" (stefan:636): Incremental update failed and requires administrator action

1.2.2 server's relevant log messages:

[14/Oct/2010:16:26:19 -0400] conn=139 fd=65 slot=65 SSL connection from 130.207.183.16 to 130.207.183.14
[14/Oct/2010:16:26:19 -0400] conn=139 op=0 BIND dn="cn=Replication Manager,cn=config" method=128 version=3
[14/Oct/2010:16:26:19 -0400] conn=139 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=replication manager,cn=config"
[14/Oct/2010:16:26:19 -0400] conn=139 op=1 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=2 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=2 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=3 SRCH base="" scope=0 filter="(objectClass=*)" attrs="supportedControl supportedExtension"
[14/Oct/2010:16:26:19 -0400] conn=139 op=3 RESULT err=0 tag=101 nentries=1 etime=0
[14/Oct/2010:16:26:19 -0400] conn=139 op=4 EXT oid="2.16.840.1.113730.3.5.12"
[14/Oct/2010:16:26:19 -0400] conn=139 op=4 RESULT err=2 tag=120 nentries=0 etime=0
[14/Oct/2010:16:27:19 -0400] conn=139 op=5 UNBIND
[14/Oct/2010:16:27:19 -0400] conn=139 op=5 fd=65 closed - U1

It looks like the 1.2.6.1 master sends the 2.16.840.1.113730.3.5.12 extended operation to the 1.2.2 master, which rejects it with err=2 (protocol error). The 1.2.6.1 master cannot parse that response and aborts replication.

The rootdse of the 1.2.2 server is:

dn:
objectClass: top
namingContexts: dc=gted,dc=gatech,dc=edu
namingContexts: o=netscaperoot
supportedExtension: 2.16.840.1.113730.3.5.7
supportedExtension: 2.16.840.1.113730.3.5.8
supportedExtension: 2.16.840.1.113730.3.5.10
supportedExtension: 2.16.840.1.113730.3.5.3
supportedExtension: 2.16.840.1.113730.3.5.5
supportedExtension: 2.16.840.1.113730.3.5.6
supportedExtension: 2.16.840.1.113730.3.5.9
supportedExtension: 2.16.840.1.113730.3.5.4
supportedExtension: 1.3.6.1.4.1.1466.20037
supportedExtension: 1.3.6.1.4.1.4203.1.11.1
supportedControl: 2.16.840.1.113730.3.4.2
supportedControl: 2.16.840.1.113730.3.4.3
supportedControl: 2.16.840.1.113730.3.4.4
supportedControl: 2.16.840.1.113730.3.4.5
supportedControl: 1.2.840.113556.1.4.473
supportedControl: 2.16.840.1.113730.3.4.9
supportedControl: 2.16.840.1.113730.3.4.16
supportedControl: 2.16.840.1.113730.3.4.15
supportedControl: 2.16.840.1.113730.3.4.17
supportedControl: 2.16.840.1.113730.3.4.19
supportedControl: 1.3.6.1.4.1.42.2.27.8.5.1
supportedControl: 1.3.6.1.4.1.42.2.27.9.5.2
supportedControl: 1.2.840.113556.1.4.319
supportedControl: 1.3.6.1.4.1.4203.666.5.16
supportedControl: 2.16.840.1.113730.3.4.14
supportedControl: 2.16.840.1.113730.3.4.20
supportedControl: 1.3.6.1.4.1.1466.29539.12
supportedControl: 2.16.840.1.113730.3.4.12
supportedControl: 2.16.840.1.113730.3.4.18
supportedControl: 2.16.840.1.113730.3.4.13
supportedSASLMechanisms: EXTERNAL
supportedSASLMechanisms: LOGIN
supportedSASLMechanisms: DIGEST-MD5
supportedSASLMechanisms: PLAIN
supportedSASLMechanisms: ANONYMOUS
supportedSASLMechanisms: CRAM-MD5
supportedSASLMechanisms: GSSAPI
supportedLDAPVersion: 2
supportedLDAPVersion: 3
vendorName: 389 Project
vendorVersion: 389-Directory/1.2.2 B2009.237.2054
dataversion: 020101015151757020101015151757020101015151757020101015151757
netscapemdsuffix: cn=ldap://dc=stefan,dc=iam,dc=gatech,dc=edu:389
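
The rootDSE above can be checked mechanically: the OID from the failing EXT operation, 2.16.840.1.113730.3.5.12, is not among the supportedExtension values. The following is a minimal sketch in C of that check. The helper name oid_is_advertised and the hard-coded array are illustrative assumptions, not code from the server.

```c
#include <stddef.h>
#include <string.h>

/* supportedExtension values copied from the 1.2.2 rootDSE above. */
static const char *const supported_extensions[] = {
    "2.16.840.1.113730.3.5.7",
    "2.16.840.1.113730.3.5.8",
    "2.16.840.1.113730.3.5.10",
    "2.16.840.1.113730.3.5.3",
    "2.16.840.1.113730.3.5.5",
    "2.16.840.1.113730.3.5.6",
    "2.16.840.1.113730.3.5.9",
    "2.16.840.1.113730.3.5.4",
    "1.3.6.1.4.1.1466.20037",
    "1.3.6.1.4.1.4203.1.11.1",
};

#define SUPPORTED_EXTENSIONS_COUNT \
    (sizeof(supported_extensions) / sizeof(supported_extensions[0]))

/* Hypothetical helper: returns 1 if oid matches an advertised value
 * exactly, 0 otherwise. Exact comparison matters here, because
 * "...3.5.1" must not match "...3.5.10". */
int oid_is_advertised(const char *const *supported, size_t count,
                      const char *oid)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(supported[i], oid) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Called with the list above, the helper reports that 2.16.840.1.113730.3.5.12 is not advertised, while 2.16.840.1.113730.3.5.3 is. A supplier that consulted such a check before choosing which extended operation to send would not issue the .12 request to this server.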

Version-Release number of selected component (if applicable):
1.2.2 and 1.2.6.1

How reproducible:
Random: sometimes it works, sometimes it doesn't.

Additional info:
A quick grep through the code shows that repl90consumer (a field of the Private_Repl_Protocol struct) is never initialized to zero. Neither is the field repl71consumer. However, the field repl50consumer is initialized to zero in Repl_5_Inc_Protocol_new.
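
Why a flag that is never initialized can produce random failures can be sketched in C as follows. The struct below is a hypothetical stand-in: only the three field names come from this report, and the constructors are illustrative, not the server's code or the attached patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Private_Repl_Protocol. Only the three flag
 * names come from the report; the real struct has other members. */
typedef struct demo_repl_protocol {
    int repl50consumer;
    int repl71consumer;
    int repl90consumer;
} demo_repl_protocol;

/* Constructor that clears only one flag, as the report describes.
 * The 0xAB fill simulates stale heap contents so the effect is
 * repeatable; real malloc() memory is simply indeterminate. */
demo_repl_protocol *demo_new_partial(void)
{
    demo_repl_protocol *p = malloc(sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    memset(p, 0xAB, sizeof(*p));
    p->repl50consumer = 0;
    return p;
}

/* Constructor that zeroes every member up front with calloc(). */
demo_repl_protocol *demo_new_zeroed(void)
{
    return calloc(1, sizeof(demo_repl_protocol));
}
```

With demo_new_partial(), repl71consumer and repl90consumer hold whatever was already in memory, so a test such as `if (p->repl90consumer)` can go either way from one run to the next. That would be consistent with the intermittent behavior reported here. Zeroing the whole allocation, as in demo_new_zeroed(), removes the dependence on prior heap contents.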

Comment 1 Nathan Kinder 2010-10-18 16:36:02 UTC
Created attachment 454163
Patch

Comment 2 Nathan Kinder 2010-10-18 16:52:23 UTC
Pushed to master.  Thanks to Rich for his review!

Counting objects: 15, done.
Delta compression using 2 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 880 bytes, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   f39aab7..52632d7  master -> master

Comment 3 Noriko Hosoi 2011-07-26 22:17:38 UTC
Steps to verify:
1. Install DS 9.0 on hostA and DS 8.2 on hostB.
2. Set up MMR between the two servers.
3. Run consumer initialization on DS 9.0.
   Run any mod op on DS 9.0 and check that the mod is replicated to 8.2.
   Run any mod op on DS 8.2 and check that the mod is replicated to 9.0.
4. Run consumer initialization on DS 8.2.
   Run any mod op on DS 9.0 and check that the mod is replicated to 8.2.
   Run any mod op on DS 8.2 and check that the mod is replicated to 9.0.