Running *db2index* with no options no longer causes replication failures
When running the *db2index* script with no options, the script failed to handle on-disk Replica Update Vector (RUV) entries because these entries have no parent entries. The existing RUV was skipped and a new one was generated instead, which caused the next replication session to fail with a replica generation ID mismatch. This update fixes the handling of RUV entries in *db2index*, and running the script without any options no longer causes replication failures.
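As a quick sanity check after reindexing, the replication generation ID can be compared on the supplier and the consumer by reading the RUV tombstone on both sides, similar to the searches quoted later in this report. Below is a minimal sketch; the hostnames, bind DN, and suffix are placeholders, not values from this bug.
{{{
# Compare the {replicageneration} value on supplier and consumer; the two must
# match for replication to work. Hostnames, bind DN, and suffix are placeholders.
for HOST in supplier.example.com consumer.example.com; do
    echo "== $HOST =="
    ldapsearch -xLLL -h "$HOST" -D "cn=Directory Manager" -W -b "dc=test,dc=com" \
        '(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))' \
        nsds50ruv | grep -i replicageneration
done
}}}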
{{{
Description of problem:
It was found that running db2index with no command line arguments results in
breaking replication by changing the slave's RUV tombstone.
Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-32.el6_5.x86_64 and
389-ds-base-1.2.11.15-32.1.el6_5.bug1009122.x86_64
How reproducible:
This happened on 10+ slaves in our production environment; however, it did not
occur in pre-prod environments. In production, I noticed some DB errors, which
may be related:
[08/Oct/2014:14:47:55 -0400] upgrade DB - userRoot: Start upgradedb.
[08/Oct/2014:14:47:55 -0400] - WARNING: Import is running with
nsslapd-db-private-import-mem on; No other process is allowed to access the
database
[08/Oct/2014:14:47:55 -0400] - reindex userRoot: Index buffering enabled with
bucket size 100
[08/Oct/2014:14:47:56 -0400] entryrdn-index - entryrdn_lookup_dn: Failed to
position cursor at the key: P2992: DB_PAGE_NOTFOUND: Requested page not
found(-30986)
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: Skipping entry
"nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" which has no parent, ending at
line 119705 of file "id2entry.db4"
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: bad entry: ID 119705
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers finished; cleaning
up...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers cleaned up.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Cleaning up producer thread...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Indexing complete.
Post-processing...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Generating numSubordinates
complete.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Flushing caches...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Closing files...
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped
[08/Oct/2014:14:48:16 -0400] - reindex userRoot: Reindexing complete.
Processed 126506 entries (1 were skipped) in 21 seconds. (6024.10 entries/sec)
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped
I also noticed the tool logging 'upgrade DB - userRoot: Start upgradedb.'; that
text should probably be changed to 'Start indexing'.
Steps to Reproduce:
1. service dirsrv stop
2. /usr/lib64/dirsrv/slapd-*/db2index
3. service dirsrv start
Actual results:
Replication fails with:
[08/Oct/2014:16:09:09 -0400] NSMMReplicationPlugin -
agmt="cn=FQDM" (ldap02:636): Replica has a different
generation ID than the local data. on the master
Expected results:
Replication to resume once dirsrv is started.
Additional info:
Master:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 51154ed3003d00010000
nsds50ruv: {replica 1 ldap://FQDN:389} 51154ed3003e00010000 56edd86c000000010000
nsds50ruv: {replica 2 ldap://FQDN:389} 51159de3000000020000 56ed65d1000200020000
dc: test
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 5435b09b
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000
Slave:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 56ed21150001ffff0000
nsds50ruv: {replica 2 ldap://FQDN:389}
nsds50ruv: {replica 1 ldap://FQDN:389}
dc: test
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 00000000
Reinitializing the consumer restored replication.
}}}
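For reference, the consumer reinitialization mentioned at the end of the description can be triggered online from the supplier by writing nsds5BeginReplicaRefresh to the replication agreement entry. The sketch below is a generic example rather than the exact commands used here; the agreement DN, host name, bind DN, and suffix are placeholders.
{{{
# Online total re-initialization of a consumer, started from the supplier.
# The agreement DN, host name, bind DN, and suffix are placeholders.
ldapmodify -x -h supplier.example.com -D "cn=Directory Manager" -W <<EOF
dn: cn=ExampleAgreement,cn=replica,cn="dc=test,dc=com",cn=mapping tree,cn=config
changetype: modify
replace: nsds5BeginReplicaRefresh
nsds5BeginReplicaRefresh: start
EOF
}}}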
Comment 3 by Sankar Ramalingam
2016-08-11 07:54:16 UTC
1).
[root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-1.3.5.10-7.el7.x86_64
389-ds-base-devel-1.3.5.10-7.el7.x86_64
389-ds-base-libs-1.3.5.10-7.el7.x86_64
2).
[root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
67
67
67
67
3).
[root@ratangad MMR_WINSYNC]# /bin/systemctl status dirsrv.target |grep -i active
Active: active since Thu 2016-08-11 12:08:33 IST; 16s ago
[root@ratangad MMR_WINSYNC]# /bin/systemctl stop dirsrv.target
4).
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M2/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C2/db2index
5).
[root@ratangad MMR_WINSYNC]# /bin/systemctl start dirsrv.target
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1289 "ou=people,dc=passsync,dc=com" m2usr 9 localhost
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1189 "ou=people,dc=passsync,dc=com" m1usr 9 localhost
6).
[root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
85
85
85
85
7). Replication works fine and no errors were observed in the error logs.
Hence, marking the bug as Verified.
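One way to back up the "no errors observed" statement is to scan the instance error logs for the generation ID mismatch message quoted in the original description. This is an assumed additional check, not part of the verifier's steps; the log path assumes the default /var/log/dirsrv layout.
{{{
# Search all instance error logs for the replication failure signature.
# Path assumes the default 389-ds log location.
grep -i "different generation ID" /var/log/dirsrv/slapd-*/errors \
    || echo "No generation ID mismatch found"
}}}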
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHSA-2016-2594.html