Bug 1006846 - 2Master replication with SASL/GSSAPI auth broken
2Master replication with SASL/GSSAPI auth broken
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base (Show other bugs)
6.5
Unspecified Unspecified
high Severity high
: rc
: ---
Assigned To: Rich Megginson
Sankar Ramalingam
: TestBlocker
Depends On:
Blocks: 1012125
  Show dependency treegraph
 
Reported: 2013-09-11 07:32 EDT by Ján Rusnačko
Modified: 2013-11-21 16:11 EST (History)
4 users (show)

See Also:
Fixed In Version: 389-ds-base-1.2.11.15-25.el6
Doc Type: Bug Fix
Doc Text:
Cause: If a replication is configured against a sub backend instance before initializing the backend with a sub suffix entry, an RUV entry is inserted with a temporary sub suffix entryid to the entryrdn index. The temporary entryid was not updated when the real sub suffix entry was added. Consequence: Search failed to return entries under the sub suffix if a replication is configured before initializing the sub backend. Fix: When a real sub suffix is added, the temporary entry id in the entryrdn index is replaced with the real entry id. Result: Search successfully returns entries under the sub suffix if a replication is configured before initializing the sub backend.
Story Points: ---
Clone Of:
: 1012125 (view as bug list)
Environment:
Last Closed: 2013-11-21 16:11:51 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Ján Rusnačko 2013-09-11 07:32:21 EDT
Description of problem:
When 2Master replication is set up with SASL/GSSAPI authentication, I can reproduce these 3 problems:
1) subtree search on replicated suffix fails to return child entries
2) after DS restart entries under replicated suffix are missing, though db2ldif shows they are still in database
3) it is possible to add the same entries under replicated suffix, creating conflict for replication and duplicating content of database

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-23.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Set up 2 instances of directory server (I used dstet.example.com and dstet2.example.com) and set them up for Kerberos auth and replication (SASL mappings, create replication manager entries ..)
2. On both servers: create subsuffix ou=foo,dc=example,dc=com with new backend foo.
3. Enable replica foo: set both servers to multiple master, set Replica IDs and replication managers.
4. Create new replication agreements for replica foo: use LDAP (no encryption), SASL/GSSAPI for authentication
5. initialize consumers on both servers
6. On server1 add top level entry ou=foo,dc=example,dc=com. Search on server2 to verify replication:
[jrusnack@dstet ~]$ ldapsearch -h dstet2.example.com -p 22222 -D "cn=directory manager" -w Secret123 -LLL -b "ou=foo,dc=example,dc=com"
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit

Entry successfully replicated from server1 to server2.

7. On server2 add user uid=testuser1,ou=foo,dc=example,dc=com. Search on server1 to verify replication:
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet.example.com -p 22221 -D "cn=directory manager" -w Secret123 -LLL -b "uid=testuser,ou=foo,dc=example,dc=com"
dn: uid=testuser,ou=foo,dc=example,dc=com
uid: testuser
givenName: testuser
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
sn: testuser
cn: testuser

Entry successfully replicated from server2 to server1.

8. Issue subtree search on ou=foo,dc=example,dc=com:
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet2.example.com -p 22222 -D "cn=directory manager" -w Secret123 -b "ou=foo,dc=example,dc=com" -s sub
# extended LDIF
#
# LDAPv3
# base <ou=foo,dc=example,dc=com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# foo, example.com
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet.example.com -p 22221 -D "cn=directory manager" -w Secret123 -b "ou=foo,dc=example,dc=com" -s sub
# extended LDIF
#
# LDAPv3
# base <ou=foo,dc=example,dc=com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# foo, example.com
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

Search does not show entry uid=testuser1,ou=foo,dc=example,dc=com, and returns only ou=foo,dc=example,dc=com. With verbose logging this search generates this output in logs (error and access log):
[11/Sep/2013:13:12:21 +0200] - => get_filter_internal
[11/Sep/2013:13:12:21 +0200] - PRESENT
[11/Sep/2013:13:12:21 +0200] - <= get_filter_internal 0
[11/Sep/2013:13:12:22 +0200] get_filter - before optimize: (objectClass=*)
[11/Sep/2013:13:12:22 +0200] get_filter -  after optimize: (objectClass=*)
[11/Sep/2013:13:12:22 +0200] index_subsys_assign_filter_decoders - before: (objectClass=*)
[11/Sep/2013:13:12:22 +0200] index_subsys_assign_filter_decoders -  after: (objectClass=*)
[11/Sep/2013:13:12:22 +0200] - slapi_str2filter "objectclass=referral"
[11/Sep/2013:13:12:22 +0200] - slapi_str2filter: default
[11/Sep/2013:13:12:22 +0200] - str2simple "objectclass=referral"
[11/Sep/2013:13:12:22 +0200] -  OR
[11/Sep/2013:13:12:22 +0200] -  PRESENT
[11/Sep/2013:13:12:22 +0200] -  EQUALITY
[11/Sep/2013:13:12:22 +0200] - => slapi_attr_assertion2keys_ava_sv
[11/Sep/2013:13:12:22 +0200] - <= slapi_attr_assertion2keys_ava_sv 0
[11/Sep/2013:13:12:22 +0200] - slapi_filter_free type 0xA1
[11/Sep/2013:13:12:22 +0200] - slapi_filter_free type 0xA3
[11/Sep/2013:13:12:22 +0200] - slapi_filter_dup type 0x87
[11/Sep/2013:13:12:22 +0200] - => slapi_vattr_filter_test_ext
[11/Sep/2013:13:12:22 +0200] - => test_substring_filter
[11/Sep/2013:13:12:22 +0200] -     PRESENT
[11/Sep/2013:13:12:22 +0200] - <= slapi_vattr_filter_test 0
[11/Sep/2013:13:12:22 +0200] - slapi_filter_free type 0x87
[11/Sep/2013:13:12:22 +0200] - slapi_filter_free type 0x87

[11/Sep/2013:13:16:08 +0200] conn=21 fd=69 slot=69 connection from ::1 to ::1
[11/Sep/2013:13:16:08 +0200] conn=21 op=0 BIND dn="cn=directory manager" method=128 version=3
[11/Sep/2013:13:16:08 +0200] conn=21 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="cn=directory manager"
[11/Sep/2013:13:16:08 +0200] conn=21 op=1 SRCH base="ou=foo,dc=example,dc=com" scope=2 filter="(objectClass=*)" attrs=ALL
[11/Sep/2013:13:16:08 +0200] conn=21 op=1 RESULT err=0 tag=101 nentries=1 etime=0 notes=U
[11/Sep/2013:13:16:08 +0200] conn=21 op=2 UNBIND


9. Export db to ldif:
[jrusnack@dstet 389-scripts]$ /usr/lib64/dirsrv/slapd-dstet/db2ldif -n foo
Exported ldif file: /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_131412.ldif
ldiffile: /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_131412.ldif
[11/Sep/2013:13:14:12 +0200] - export foo: Processed 2 entries (100%).
[11/Sep/2013:13:14:13 +0200] - All database threads now stopped
[jrusnack@dstet 389-scripts]$ cat /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_131412.ldif
version: 1

# entry-id: 2
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130911110308Z
modifyTimestamp: 20130911110308Z
nsUniqueId: b5b0d981-1ad111e3-9b54eb82-8658ba38

# entry-id: 3
dn: uid=testuser,ou=foo,dc=example,dc=com
uid: testuser
givenName: testuser
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
sn: testuser
cn: testuser
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130911110450Z
modifyTimestamp: 20130911110450Z
nsUniqueId: db3de301-1ad111e3-9ef18a1c-6d145b1d

10. Restart server1 and search for ou=foo.. :
[jrusnack@dstet 389-scripts]$ sudo service dirsrv stop
Shutting down dirsrv: 
    dstet...                                               [  OK  ]
[jrusnack@dstet 389-scripts]$ sudo service dirsrv start
Starting dirsrv: 
    dstet...                                               [  OK  ]
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet.example.com -p 22221 -D "cn=directory manager" -w Secret123 -b "ou=foo,dc=example,dc=com" -s sub
# extended LDIF
#
# LDAPv3
# base <ou=foo,dc=example,dc=com> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# search result
search: 2
result: 32 No such object

# numResponses: 1
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet2.example.com -p 22222 -D "cn=directory manager" -w Secret123 -b "ou=foo,dc=example,dc=com" -s sub -LLL
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit

Entry is "missing" on server1 but is present on server2.

11. Add entry ou=foo,dc=example,dc=com on server1 and search:
[jrusnack@dstet 389-scripts]$ ldapsearch -h dstet.example.com -p 22221 -D "cn=directory manager" -w Secret123 -b "ou=foo,dc=example,dc=com" -s sub -LLL
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit

[jrusnack@dstet 389-scripts]$ tail /var/log/dirsrv/slapd-dstet/errors
...
[11/Sep/2013:13:14:53 +0200] - slapd stopped.
[11/Sep/2013:13:14:58 +0200] - 389-Directory/1.2.11.15 B2013.240.1920 starting up
[11/Sep/2013:13:14:58 +0200] - I'm resizing my cache now...cache was 20000000 and is now 8000000
[11/Sep/2013:13:15:00 +0200] - slapd started.  Listening on All Interfaces port 22221 for LDAP requests
[11/Sep/2013:13:18:24 +0200] NSMMReplicationPlugin - agmt="cn=foo" (dstet2:22222): Consumer failed to replay change (uniqueid cfebb701-1ad311e3-9b54eb82-8658ba38, CSN 5230517f000000010000): Operations error (1). Will retry later.

12. Export db to ldif:
[jrusnack@dstet 389-scripts]$ /usr/lib64/dirsrv/slapd-dstet/db2ldif -n foo
Exported ldif file: /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_132022.ldif
ldiffile: /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_132022.ldif
[11/Sep/2013:13:20:22 +0200] - export foo: Processed 3 entries (100%).
[11/Sep/2013:13:20:22 +0200] - All database threads now stopped
[jrusnack@dstet 389-scripts]$ cat /var/lib/dirsrv/slapd-dstet/ldif/dstet-foo-2013_09_11_132022.ldif
version: 1

# entry-id: 2
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130911110308Z
modifyTimestamp: 20130911110308Z
nsUniqueId: b5b0d981-1ad111e3-9b54eb82-8658ba38

# entry-id: 3
dn: uid=testuser,ou=foo,dc=example,dc=com
uid: testuser
givenName: testuser
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
sn: testuser
cn: testuser
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130911110450Z
modifyTimestamp: 20130911110450Z
nsUniqueId: db3de301-1ad111e3-9ef18a1c-6d145b1d

# entry-id: 4
dn: ou=foo,dc=example,dc=com
ou: foo
objectClass: top
objectClass: organizationalunit
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130911111822Z
modifyTimestamp: 20130911111822Z
nsUniqueId: cfebb701-1ad311e3-9b54eb82-8658ba38

There are 2 entries ou=foo,dc=example,dc=com that differ only in nsUniqueId.
Comment 5 Nathan Kinder 2013-09-11 11:48:24 EDT
Is GSSAPI really needed to reproduce this, or does it happen when you do the same thing using a simple bind for replication?

From your steps, it sounds to like you have a nested backend:

    userroot = "dc=example,dc=com"
    foo =  "ou=foo,dc=example,dc=com"

The problem sounds like it's somehow related to the mapping tree, though that doesn't explain why you are able to get two "ou=foo,dc=example,dc=com" entries in the foo backend.
Comment 6 Ján Rusnačko 2013-09-11 13:00:28 EDT
(In reply to Nathan Kinder from comment #5)
> Is GSSAPI really needed to reproduce this, or does it happen when you do the
Yes, you are right, GSSAPI is NOT needed to reproduce this problem.
> same thing using a simple bind for replication?
> 
> From your steps, it sounds to like you have a nested backend:
Correct.
> 
>     userroot = "dc=example,dc=com"
>     foo =  "ou=foo,dc=example,dc=com"
> 
> The problem sounds like it's somehow related to the mapping tree, though
> that doesn't explain why you are able to get two "ou=foo,dc=example,dc=com"
> entries in the foo backend.
Comment 8 Noriko Hosoi 2013-09-16 15:30:48 EDT
I'm having a problem to reproduce the problem...

A question on the steps:
> 5. initialize consumers on both servers

What does this "on both servers" mean?  If it is 2 way MMR, you only need to initialize a consumer on one master, e.g., on server1.  Did you initialize the other on each master at the same time using a script?
Comment 9 Ján Rusnačko 2013-09-17 03:53:53 EDT
(In reply to Noriko Hosoi from comment #8)
> I'm having a problem to reproduce the problem...
> 
> A question on the steps:
> > 5. initialize consumers on both servers
> 
> What does this "on both servers" mean?  If it is 2 way MMR, you only need to
> initialize a consumer on one master, e.g., on server1.  Did you initialize
> the other on each master at the same time using a script?
Yes, I have first set up replication agreements, then manually initialized s1 from s2 and s2 from s1.
Comment 10 Rich Megginson 2013-09-17 09:56:23 EDT
(In reply to Ján Rusnačko from comment #9)
> (In reply to Noriko Hosoi from comment #8)
> > I'm having a problem to reproduce the problem...
> > 
> > A question on the steps:
> > > 5. initialize consumers on both servers
> > 
> > What does this "on both servers" mean?  If it is 2 way MMR, you only need to
> > initialize a consumer on one master, e.g., on server1.  Did you initialize
> > the other on each master at the same time using a script?
> Yes, I have first set up replication agreements, then manually initialized
> s1 from s2 and s2 from s1.

Do not do that.  Once you have initialized s1 from s2, do not initialize s2 from s1.
Comment 11 Ján Rusnačko 2013-09-24 08:34:46 EDT
Ok, with the setup I have I tried again:

1) removed all subsuffixes from both servers and restarted
2) created subsuffix ou=test,dc=example,dc=com with db "test" on both servers
3) enabled replicas for multi-master replication (replica IDs and replication managers) on both servers
4) created replication agreement on first server, authentication set to GSSAPI
5) initialized consumer from first server
6) created replication agreement on second server, selected Do not initialize consumer.

At this point, I add a few entries and verify the replication is working. When I restart server, I see the very same behaviour I reported above:

[jrusnack@dstet ~]$ /usr/lib64/dirsrv/slapd-dstet/db2ldif -n test
Exported ldif file: /var/lib/dirsrv/slapd-dstet/ldif/dstet-test-2013_09_24_141311.ldif
ldiffile: /var/lib/dirsrv/slapd-dstet/ldif/dstet-test-2013_09_24_141311.ldif
[24/Sep/2013:14:13:11 +0200] - export test: Processed 2 entries (100%).
[24/Sep/2013:14:13:11 +0200] - All database threads now stopped
[jrusnack@dstet ~]$ cat /var/lib/dirsrv/slapd-dstet/ldif/dstet-test-2013_09_24_141311.ldif
version: 1

# entry-id: 2
dn: ou=test,dc=example,dc=com
ou: test
objectClass: top
objectClass: organizationalunit
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130924121218Z
modifyTimestamp: 20130924121218Z
nsUniqueId: 6d061102-251211e3-b7668a1c-6d145b1d

# entry-id: 3
dn: uid=tuser1,ou=test,dc=example,dc=com
uid: tuser1
givenName: tuser1
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetorgperson
sn: tuser1
cn: tuser1
creatorsName: cn=directory manager
modifiersName: cn=directory manager
createTimestamp: 20130924121238Z
modifyTimestamp: 20130924121238Z
nsUniqueId: 932bb101-251211e3-9b54eb82-8658ba38

[jrusnack@dstet ~]$ ldapsearch -h localhost -p 22221 -D "cn=directory manager" -w Secret123 -b "ou=test,dc=example,dc=com" -LLL
No such object (32)

Tried this also with simple bing instead of GSSAPI and the behaviour is the same.
Comment 13 Noriko Hosoi 2013-09-25 14:31:46 EDT
Upstream ticket:
https://fedorahosted.org/389/ticket/47523
Comment 15 Ján Rusnačko 2013-09-30 09:39:47 EDT
Verified on 389-ds-base-1.2.11.15-26.el6.x86_64.
Comment 16 Ján Rusnačko 2013-11-11 06:59:49 EST
Automated in mmrepl/accept/accept.sh as bug1006846
Comment 17 errata-xmlrpc 2013-11-21 16:11:51 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html

Note You need to log in before you can comment on or make changes to this bug.