1340307 – Running db2index with no options breaks replication


Bug 1340307 - Running db2index with no options breaks replication

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	389-ds-base
Sub Component:
Version:	7.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Noriko Hosoi
QA Contact:	Viktor Ashirov
Docs Contact:	Petr Bokoc
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-05-27 01:21 UTC by Noriko Hosoi
Modified:	2020-09-13 21:44 UTC
CC List:	4 users
Fixed In Version:	389-ds-base-1.3.5.5-1.el7
Doc Type:	Bug Fix
Doc Text:	Running db2index with no options no longer causes replication failures. When running the db2index script with no options, the script failed to handle on-disk Replica Update Vector (RUV) entries because these entries have no parent entries. The existing RUV was skipped and a new one was generated instead, which subsequently caused the next replication to fail due to an ID mismatch. This update fixes handling of RUV entries in db2index, and running this script without specifying any options no longer causes replication failures.
Clone Of:
Environment:
Last Closed:	2016-11-03 20:42:14 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	389ds 389-ds-base issues 1914	0	None	None	None	2020-09-13 21:44:41 UTC
Red Hat Product Errata	RHSA-2016:2594	0	normal	SHIPPED_LIVE	Moderate: 389-ds-base security, bug fix, and enhancement update	2016-11-03 12:11:08 UTC

Description Noriko Hosoi 2016-05-27 01:21:09 UTC

{{{
Description of problem:

It was found that running db2index with no command line arguments breaks
replication by changing the slave's RUV tombstone.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-32.el6_5.x86_64 and
389-ds-base-1.2.11.15-32.1.el6_5.bug1009122.x86_64

How reproducible:

This happened on 10+ slaves in our production environment; however, it did not
occur in pre-prod environments.  In production, I noticed some DB errors, which
may be related:

[08/Oct/2014:14:47:55 -0400] upgrade DB - userRoot: Start upgradedb.
[08/Oct/2014:14:47:55 -0400] - WARNING: Import is running with
nsslapd-db-private-import-mem on; No other process is allowed to access the
database
[08/Oct/2014:14:47:55 -0400] - reindex userRoot: Index buffering enabled with
bucket size 100
[08/Oct/2014:14:47:56 -0400] entryrdn-index - entryrdn_lookup_dn: Failed to
position cursor at the key: P2992: DB_PAGE_NOTFOUND: Requested page not
found(-30986)
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: Skipping entry
"nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" which has no parent, ending at
line 119705 of file "id2entry.db4"
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: bad entry: ID 119705
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers finished; cleaning
up...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers cleaned up.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Cleaning up producer thread...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Indexing complete.
Post-processing...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Generating numSubordinates
complete.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Flushing caches...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Closing files...
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped
[08/Oct/2014:14:48:16 -0400] - reindex userRoot: Reindexing complete.
Processed 126506 entries (1 were skipped) in 21 seconds. (6024.10 entries/sec)
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped

I also noticed that the tool logs 'upgrade DB - userRoot: Start upgradedb.'; that
text should probably be changed to 'Start indexing'.


Steps to Reproduce:
1. service dirsrv stop
2. /usr/lib64/dirsrv/slapd-*/db2index
3. service dirsrv start

Actual results:

Replication fails with:

[08/Oct/2014:16:09:09 -0400] NSMMReplicationPlugin -
agmt="cn=FQDM" (ldap02:636): Replica has a different
generation ID than the local data. on the master


Expected results:

Replication to resume once dirsrv is started.


Additional info:
Master:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 51154ed3003d00010000
nsds50ruv: {replica 1 ldap://FQDN:389} 51154ed3003e00010000 56edd86c000000010000
nsds50ruv: {replica 2 ldap://FQDN:389} 51159de3000000020000 56ed65d1000200020000
dc: test
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 5435b09b
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000

Slave:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 56ed21150001ffff0000
nsds50ruv: {replica 2 ldap://FQDN:389}
nsds50ruv: {replica 1 ldap://FQDN:389}
dc: test
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 00000000

Reinitializing the consumer restored replication.
}}}
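
The mismatch in the master and slave output above is visible in the `{replicageneration}` value of the RUV tombstone (51154ed3003d00010000 on the master, 56ed21150001ffff0000 on the slave). The following is a minimal sketch of how that comparison could be scripted, assuming each ldapsearch output has been saved to a file. The function names `ruv_generation` and `same_generation` are hypothetical helpers introduced for illustration; they are not part of 389-ds-base.

```shell
#!/bin/sh
# Hypothetical helper: print the replica generation ID found in LDIF on stdin.
# It looks for a line such as:
#   nsds50ruv: {replicageneration} 51154ed3003d00010000
ruv_generation() {
    awk 'tolower($1) == "nsds50ruv:" && $2 == "{replicageneration}" { print $3; exit }'
}

# Hypothetical helper: succeed only if both LDIF files contain a generation ID
# and the two IDs are identical.
same_generation() {
    gen_a=$(ruv_generation < "$1")
    gen_b=$(ruv_generation < "$2")
    [ -n "$gen_a" ] && [ "$gen_a" = "$gen_b" ]
}
```

With the two outputs above saved as, say, `master.ldif` and `slave.ldif` (file names are assumptions), `same_generation master.ldif slave.ldif` would return a non-zero status, which matches the "different generation ID" error reported by the replication plugin.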

Comment 3 Sankar Ramalingam 2016-08-11 07:54:16 UTC

1).
[root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-1.3.5.10-7.el7.x86_64
389-ds-base-devel-1.3.5.10-7.el7.x86_64
389-ds-base-libs-1.3.5.10-7.el7.x86_64

2).
[root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
67
67
67
67

3).
[root@ratangad MMR_WINSYNC]# /bin/systemctl status dirsrv.target |grep -i active
   Active: active since Thu 2016-08-11 12:08:33 IST; 16s ago
[root@ratangad MMR_WINSYNC]#  /bin/systemctl stop  dirsrv.target

4).
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-M2/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C1/db2index
[root@ratangad MMR_WINSYNC]# /usr/lib64/dirsrv/slapd-C2/db2index

5).
[root@ratangad MMR_WINSYNC]#  /bin/systemctl start  dirsrv.target
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1289 "ou=people,dc=passsync,dc=com" m2usr 9 localhost
[root@ratangad MMR_WINSYNC]# ./AddEntry.sh Users 1189 "ou=people,dc=passsync,dc=com" m1usr 9 localhost

6).
[root@ratangad MMR_WINSYNC]# PORT=1389; SUFF="dc=passsync,dc=com"; for PORT in `echo "1189 1289 1389 14089"`; do /usr/bin/ldapsearch -x -p $PORT -h localhost -D "cn=Directory Manager" -w Secret123 -b $SUFF |grep -i dn: | wc -l ; done
85
85
85
85

7). Replication works fine, and no errors were observed in the error logs.

Hence, marking the bug as Verified.
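
The entry-count check in steps 2 and 6 can be sketched as two small shell helpers, assuming the LDIF output of each instance is piped in. `count_entries` and `counts_match` are hypothetical names used only for illustration. Unlike the `grep -i dn:` used above, this sketch anchors the match at the start of the line so that attribute values containing "dn:" are not counted.

```shell
#!/bin/sh
# Hypothetical helper: count entries in LDIF read from stdin.
# Every entry starts with exactly one "dn:" (or base64 "dn::") line.
count_entries() {
    grep -ci '^dn:'
}

# Hypothetical helper: succeed only if all counts passed as arguments are equal.
counts_match() {
    first=$1
    for c in "$@"; do
        [ "$c" = "$first" ] || return 1
    done
    return 0
}

# Possible usage against running instances (ports and suffix taken from the
# steps above; the bind password is deliberately left as a placeholder):
#   for PORT in 1189 1289 1389 14089; do
#       ldapsearch -x -p "$PORT" -h localhost -D "cn=Directory Manager" \
#           -w "$PASSWORD" -b "dc=passsync,dc=com" | count_entries
#   done
```

If every instance reports the same count after new entries are added on the masters, as with the four values of 85 in step 6, `counts_match 85 85 85 85` succeeds, which is consistent with replication having resumed.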

Comment 5 errata-xmlrpc 2016-11-03 20:42:14 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html
