730396 – DEADLOCK in entryrdn.db may add "nsuniqueid=...+" to the DN

Summary: DEADLOCK in entryrdn.db may add "nsuniqueid=...+" to the DN

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	389
Classification:	Retired
Component:	Database - Indexes/Searches
Sub Component:
Version:	1.2.9
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Noriko Hosoi
QA Contact:	Chandrasekar Kannan
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-08-12 18:55 UTC by Noriko Hosoi
Modified:	2015-01-04 23:50 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2011-08-12 20:15:57 UTC
Embargoed:

Attachments
test program to reproduce the problem (28.53 KB, text/plain) 2011-08-12 19:10 UTC, Noriko Hosoi

Description Noriko Hosoi 2011-08-12 18:55:45 UTC

Description of problem:
The test program (to be attached later) adds entries with DNs of the form:
    uid=user<master_port>.<thread_num>.<seq_num>,dc=example,dc=com
where <master_port> is 389 or 390,
<thread_num> is 0, 1, 2, or 3,
<seq_num> is 0 .. 24999.
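
For illustration, the DN pattern above can be sketched in Python. This is not taken from the attached test program; the function names are hypothetical, and it assumes each of the 4 threads walks the full 0 .. 24999 sequence range.

```python
BASE_DN = "dc=example,dc=com"


def make_dn(master_port: int, thread_num: int, seq_num: int) -> str:
    """Build a test DN of the form uid=user<port>.<thread>.<seq>,<base>."""
    return f"uid=user{master_port}.{thread_num}.{seq_num},{BASE_DN}"


def all_dns(master_port: int, threads: int = 4, per_thread: int = 25000):
    """Yield every DN that one client adds through the given master."""
    for thread_num in range(threads):
        for seq_num in range(per_thread):
            yield make_dn(master_port, thread_num, seq_num)


if __name__ == "__main__":
    print(make_dn(390, 1, 40))           # uid=user390.1.40,dc=example,dc=com
    print(sum(1 for _ in all_dns(389)))  # 100000
```

Because every DN shares the same parent, all threads insert under the same node of the entryrdn index.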

Since 4 threads stress the same location of the entryrdn index, this causes a large number of DEADLOCKs.  I see 1131 DEADLOCKs out of 100,000 adds.

[11/Aug/2011:18:24:52 -0700] entryrdn-index - _entryrdn_put_data: Adding the parent link (P727:nsuniqueid=ae95e8a1-c48111e0-b5efb617-5d664145+uid=user390.1.40) failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30994)

On a DEADLOCK, the transaction should be rolled back and the operation retried.  But there seems to be a bug in the retry code: for 282 of the 1131 DEADLOCK errors, the entry ends up with "nsuniqueid=...+" prepended to the DN, as follows:

    $ ldapsearch -LLLx -h nereid -p 391 -b "dc=example,dc=com" -D 'cn=directory manager' -w Secret123 "uid=user390.1.40"
    dn: nsuniqueid=ae95e8a1-c48111e0-b5efb617-5d664145+uid=user390.1.40,dc=example,dc=com
    objectClass: top
    objectClass: person
    objectClass: organizationalperson
    objectClass: inetorgperson
    cn: cn value
    sn: sn value
    givenName: givenname value
    mail: mail value
    uid: user390.1.40
    userPassword:: e1NTSEF9NC92THNxUm5mYThiZTBJOTVzRXRaYmtQWXpnaGJtWC9vcFMxYXc9PQ==

The entry does not have an nsTombstone objectclass value, and it is still indexed (e.g., in uid.db4).  So, since ldapsearch returns the entry, it is not "deleted".  But the DN is not correct.
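
The roll-back-and-retry behaviour expected on DB_LOCK_DEADLOCK can be sketched as follows. This is a generic illustration, not the 389 Directory Server code: DeadlockError, FakeTxn, run_with_retry and the key names are all hypothetical stand-ins, and the toy transaction simply buffers writes until commit.

```python
class DeadlockError(Exception):
    """Hypothetical stand-in for DB_LOCK_DEADLOCK (-30994)."""


class FakeTxn:
    """Toy transaction: writes are buffered until commit()."""

    def __init__(self, store):
        self.store = store
        self.pending = {}

    def put(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.store.update(self.pending)

    def abort(self):
        self.pending.clear()


def run_with_retry(store, operation, max_retries=50):
    """Run operation(txn) until it commits; abort and retry on deadlock.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_retries + 1):
        txn = FakeTxn(store)
        try:
            operation(txn)
        except DeadlockError:
            txn.abort()  # discard everything written in this attempt
            continue
        txn.commit()
        return attempt
    raise DeadlockError(f"gave up after {max_retries} attempts")


if __name__ == "__main__":
    store = {}
    calls = {"n": 0}

    def add_entry(txn):
        calls["n"] += 1
        txn.put("self:727", "uid=user390.1.40")  # illustrative key names
        if calls["n"] < 3:  # simulate two deadlocks before success
            raise DeadlockError("Locker killed to resolve a deadlock")
        txn.put("parent:727", "dc=example,dc=com")

    print(run_with_retry(store, add_entry))  # 3
    print(sorted(store))                     # ['parent:727', 'self:727']
```

The point of the sketch is that each retry starts from a fresh transaction, so nothing written by an aborted attempt can leak into the committed result.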

Comment 1 Noriko Hosoi 2011-08-12 19:10:14 UTC

Created attachment 518095
test program to reproduce the problem

How to reproduce the problem.
Build the test program:
$ gcc -Wall replperf.c -o replperf -lpthread -lldap -lm

Set up 2-way MMR plus one read-only replica.
Assume all 3 servers are on hostX.
M1 (port 389) <--> M2 (port 390)
 \                /
  v              v
read only replica (port 391)

Launch the test program from 2 terminals.
terminal1> ./replperf -h hostX -p 389 -i hostX -q 391 -D 'cn=directory manager' -w password -d 'cn=directory manager' -W password -n 25000 -I 50 -t 4 -a -e user389
terminal2> ./replperf -h hostX -p 390 -i hostX -q 391 -D 'cn=directory manager' -w password -d 'cn=directory manager' -W password -n 25000 -I 50 -t 4 -a -e user390

When both runs are done, run the following command against the replica (port 391) to list the DNs to which nsuniqueid has been added.  If there are none, this bug is fixed.
$ ldapsearch -LLLx -h hostX -p 391 -b "dc=example,dc=com" -D 'cn=directory manager' -w password "nsuniqueid=*" dn | egrep nsuniqueid
dn: nsuniqueid=958d3759-c48111e0-9386a9ab-cfa2440d+uid=user389.3.23,dc=example
dn: nsuniqueid=958d375f-c48111e0-9386a9ab-cfa2440d+uid=user389.0.23,dc=example

Also check the error log of the replica.  DEADLOCKs will still have occurred even if this bug is fixed:
[11/Aug/2011:15:07:34 -0700] entryrdn-index - _entryrdn_put_data: Adding the parent link (P9918:uid=user389.1.497) failed: DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30994)
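
To compare the number of DEADLOCKs with the number of affected DNs, the error log can be scanned with a small script. The sketch below is only an assumption about how one might do this: it expects each message on a single line, matches only the message format quoted in this report, and treats the number after "P" as the entry ID (727 appears in both the log line and the entry dump in this report). The function name is hypothetical.

```python
import re

# Matches the _entryrdn_put_data deadlock message quoted in this report.
DEADLOCK_RE = re.compile(
    r"_entryrdn_put_data: Adding the parent link "
    r"\(P(?P<id>\d+):(?P<rdn>[^)]*)\) failed: DB_LOCK_DEADLOCK"
)


def find_deadlocks(log_text):
    """Return (entry_id, rdn) for every DB_LOCK_DEADLOCK parent-link failure."""
    return [
        (int(m.group("id")), m.group("rdn"))
        for m in DEADLOCK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "[11/Aug/2011:18:24:52 -0700] entryrdn-index - _entryrdn_put_data: "
        "Adding the parent link (P727:nsuniqueid=ae95e8a1-c48111e0-b5efb617-"
        "5d664145+uid=user390.1.40) failed: DB_LOCK_DEADLOCK: Locker killed "
        "to resolve a deadlock (-30994)\n"
        "[11/Aug/2011:15:07:34 -0700] entryrdn-index - _entryrdn_put_data: "
        "Adding the parent link (P9918:uid=user389.1.497) failed: "
        "DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock (-30994)\n"
    )
    hits = find_deadlocks(sample)
    renamed = [rdn for _, rdn in hits if rdn.startswith("nsuniqueid=")]
    print(len(hits), len(renamed))  # 2 1
```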

Comment 2 Noriko Hosoi 2011-08-12 20:15:57 UTC

It's NOT a bug.  It's "just" a replication naming conflict, as the nsds5ReplConflict value in the entry below shows.  I am closing this bug.

# entry-id: 727
dn: nsuniqueid=ae95e8a1-c48111e0-b5efb617-5d664145+uid=user390.1.40,dc=example
 ,dc=com
nsUniqueId: ae95e8a1-c48111e0-b5efb617-5d664145
objectClass;vucsn-4e4480a9000500020000: top
objectClass;vucsn-4e4480a9000500020000: person
objectClass;vucsn-4e4480a9000500020000: organizationalperson
objectClass;vucsn-4e4480a9000500020000: inetorgperson
cn;vucsn-4e4480a9000500020000: cn value
sn;vucsn-4e4480a9000500020000: sn value
givenName;vucsn-4e4480a9000500020000: givenname value
mail;vucsn-4e4480a9000500020000: mail value
uid;vucsn-4e4480a9000500020000;mdcsn-4e4480a9000500020000: user390.1.40
userPassword;vucsn-4e4480a9000500020000: {SSHA}4/vLsqRnfa8be0I95sEtZbkPYzghbmX
 /opS1aw==
creatorsName;vucsn-4e4480a9000500020000: cn=directory manager
modifiersName;vucsn-4e4480a9000500020000: cn=directory manager
createTimestamp;vucsn-4e4480a9000500020000: 20110812012352Z
modifyTimestamp;vucsn-4e4480a9000500020000: 20110812012352Z
nsds5ReplConflict;vucsn-4e4480a9000500020000: namingConflict uid=user390.1.40,dc=example,dc=com
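
A dump like the one above can be checked for the conflict marker programmatically. The following is a minimal sketch, not a full LDIF parser: it assumes a single entry, joins continuation lines, drops attribute subtypes such as ";vucsn-...", and the function names are hypothetical.

```python
import base64


def unfold(ldif_text):
    """Join LDIF continuation lines (lines starting with one space)."""
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_entry(ldif_text):
    """Map lower-cased base attribute names to value lists, dropping ;subtypes."""
    entry = {}
    for line in unfold(ldif_text):
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue
        if value.startswith(":"):  # "attr:: <base64>"
            value = base64.b64decode(value[1:].strip()).decode("utf-8", "replace")
        else:
            value = value.strip()
        base = name.split(";", 1)[0].lower()
        entry.setdefault(base, []).append(value)
    return entry


def naming_conflict(entry):
    """Return the originally intended DN for a naming conflict, else None."""
    for value in entry.get("nsds5replconflict", []):
        kind, _, dn = value.partition(" ")
        if kind == "namingConflict":
            return dn
    return None


if __name__ == "__main__":
    sample = "\n".join([
        "# entry-id: 727",
        "dn: nsuniqueid=ae95e8a1-c48111e0-b5efb617-5d664145+uid=user390.1.40,dc=example",
        " ,dc=com",
        "nsUniqueId: ae95e8a1-c48111e0-b5efb617-5d664145",
        "nsds5ReplConflict;vucsn-4e4480a9000500020000: namingConflict "
        "uid=user390.1.40,dc=example,dc=com",
    ])
    entry = parse_entry(sample)
    print(entry["dn"][0])
    print(naming_conflict(entry))  # uid=user390.1.40,dc=example,dc=com
```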
