{{{
Description of problem:
Running db2index with no command-line arguments breaks replication by changing the slave's RUV tombstone.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-32.el6_5.x86_64 and
389-ds-base-1.2.11.15-32.1.el6_5.bug1009122.x86_64

How reproducible:
This happened on 10+ slaves in our production environment; it did not occur in pre-prod environments. In production, I noticed some DB errors, which may be related:

[08/Oct/2014:14:47:55 -0400] upgrade DB - userRoot: Start upgradedb.
[08/Oct/2014:14:47:55 -0400] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[08/Oct/2014:14:47:55 -0400] - reindex userRoot: Index buffering enabled with bucket size 100
[08/Oct/2014:14:47:56 -0400] entryrdn-index - entryrdn_lookup_dn: Failed to position cursor at the key: P2992: DB_PAGE_NOTFOUND: Requested page not found(-30986)
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: Skipping entry "nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" which has no parent, ending at line 119705 of file "id2entry.db4"
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: bad entry: ID 119705
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers finished; cleaning up...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers cleaned up.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Cleaning up producer thread...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Indexing complete. Post-processing...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Generating numSubordinates complete.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Flushing caches...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Closing files...
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped
[08/Oct/2014:14:48:16 -0400] - reindex userRoot: Reindexing complete. Processed 126506 entries (1 were skipped) in 21 seconds. (6024.10 entries/sec)
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped

I also noticed the tool stating 'upgrade DB - userRoot: Start upgradedb.'; that text should probably be changed to 'Start indexing'.

Steps to Reproduce:
1. service dirsrv stop
2. /usr/lib64/dirsrv/slapd-*/db2index
3. service dirsrv start

Actual results:
Replication fails with the following error on the master:
[08/Oct/2014:16:09:09 -0400] NSMMReplicationPlugin - agmt="cn=FQDM" (ldap02:636): Replica has a different generation ID than the local data.

Expected results:
Replication resumes once dirsrv is started.

Additional info:
Master:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com '(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 51154ed3003d00010000
nsds50ruv: {replica 1 ldap://FQDN:389} 51154ed3003e00010000 56edd86c000000010000
nsds50ruv: {replica 2 ldap://FQDN:389} 51159de3000000020000 56ed65d1000200020000
dc: test
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 5435b09b
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000

Slave:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com '(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 56ed21150001ffff0000
nsds50ruv: {replica 2 ldap://FQDN:389}
nsds50ruv: {replica 1 ldap://FQDN:389}
dc: test
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 00000000

Reinitializing the consumer restored replication.
}}}
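The mismatch reported above shows up as a different {replicageneration} value in the two RUV tombstone dumps. A minimal sketch of how that comparison could be scripted, assuming the ldapsearch output for each server has been saved to a file; the function names and the file names master.ldif and slave.ldif are hypothetical and not part of 389-ds-base:

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not shipped with 389-ds-base.

# Print the replica generation ID found in LDIF text on stdin
# (the RUV tombstone entry as printed by "ldapsearch -LLL").
ruv_generation() {
    awk 'tolower($1) == "nsds50ruv:" && $2 == "{replicageneration}" { print $3; exit }'
}

# Succeed only if both saved dumps carry the same, non-empty generation ID.
# $1 and $2 are paths to files holding the ldapsearch output.
same_generation() {
    gen_a=$(ruv_generation < "$1")
    gen_b=$(ruv_generation < "$2")
    [ -n "$gen_a" ] && [ "$gen_a" = "$gen_b" ]
}

# Example use (file names are assumptions):
#   same_generation master.ldif slave.ldif || echo "generation IDs differ"
```

With the two dumps from this report, the check would fail, since the master carries 51154ed3003d00010000 and the slave 56ed21150001ffff0000.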
1). [root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-1.3.5.10-7.el7.x86_64
389-ds-base-devel-1.3.5.10-7.el7.x86_64
389-ds-base-libs-1.3.5.10-7.el7.x86_64

2). [root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
67
67
67
67

3). [root@ratangad MMR_WINSYNC]# /bin/systemctl status dirsrv.target |grep -i active
Active: active since Thu 2016-08-11 12:08:33 IST; 16s ago
[root@ratangad MMR_WINSYNC]# /bin/systemctl stop dirsrv.target

4). [root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M2/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C2/db2index

5). [root@ratangad MMR_WINSYNC]# /bin/systemctl start dirsrv.target
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1289 "ou=people,dc=passsync,dc=com" m2usr 9 localhost
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1189 "ou=people,dc=passsync,dc=com" m1usr 9 localhost

6). [root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
85
85
85
85

7). Replication works fine and no errors were observed in the error logs. Hence, marking the bug as Verified.
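The verification above rests on every instance reporting the same entry count before and after the reindex (67, then 85). A small sketch of automating that comparison, assuming the per-port counts are piped in as whitespace-separated numbers; counts_agree is a hypothetical helper name, and PASS in the usage comment is a placeholder variable:

```shell
#!/bin/sh
# Sketch only: counts_agree is an illustrative helper, not an existing tool.

# Succeed only if stdin holds at least one count and all counts are identical.
counts_agree() {
    awk '{
             for (i = 1; i <= NF; i++) {
                 if (n++ == 0) first = $i
                 else if ($i != first) bad = 1
             }
         }
         END { exit (n == 0 || bad) ? 1 : 0 }'
}

# Possible use with the ports and suffix from the steps above:
#   for PORT in 1189 1289 1389 14089; do
#       ldapsearch -x -p "$PORT" -h localhost -D "cn=Directory Manager" \
#           -w "$PASS" -b "dc=passsync,dc=com" | grep -ci '^dn:'
#   done | counts_agree && echo "all replicas hold the same number of entries"
```

Matching counts only suggest that the replicas converged; the error logs still need to be checked, as in step 7.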
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2594.html