Bug 988559 - deadlock after adding and deleting entries
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Rich Megginson
QA Contact: Sankar Ramalingam
Depends On:
Blocks: 988562
Reported: 2013-07-25 16:11 EDT by Nathan Kinder
Modified: 2014-06-17 22:59 EDT

See Also:
Fixed In Version: 389-ds-base-1.3.1.5-1.el7
Doc Type: Bug Fix
Doc Text:
Cause: Many add and delete operations were performed at the same time.
Consequence: The server could hang or deadlock.
Fix: The entry cache locking was refined.
Result: The server no longer hangs when performing concurrent add and delete operations.
Story Points: ---
Clone Of:
Clones: 988562
Environment:
Last Closed: 2014-06-13 07:43:51 EDT

Attachments
LDIF file that adds and deletes 200 entries (1) (31.87 KB, text/x-ldif)
2013-07-25 16:28 EDT, mreynolds
LDIF file that adds and deletes 200 entries (2) (32.46 KB, text/x-ldif)
2013-07-25 16:28 EDT, mreynolds
LDIF file that adds and deletes 200 entries (3) (31.87 KB, text/x-ldif)
2013-07-25 16:29 EDT, mreynolds
LDIF file that adds and deletes 200 entries (4) (32.46 KB, text/x-ldif)
2013-07-25 16:29 EDT, mreynolds
LDIF file that adds and deletes 200 entries (5) (32.46 KB, text/x-ldif)
2013-07-25 16:30 EDT, mreynolds

Description Nathan Kinder 2013-07-25 16:11:50 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47449

If multiple clients each add and delete users at the same time, the server will deadlock.  I created 5 LDIF files, each of which adds and then deletes 200 entries.  When these files are run through 5 separate ldapmodify processes, the server deadlocks within a minute or so.
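
The attached LDIF files are not reproduced in this report. As a rough sketch of how comparable input could be generated, the Python below builds LDIF text that adds a batch of entries and then deletes them. The entry naming (uid=<prefix><n>), the inetOrgPerson object classes, placing entries directly under the suffix, and the add-all-then-delete-all ordering are all assumptions for illustration; the real attachments may differ.

```python
import os


def build_ldif(prefix, count=200, suffix="dc=example,dc=com"):
    """Return LDIF text that adds `count` entries and then deletes them.

    The DN layout and attribute set are hypothetical, not taken from the
    attachments on this bug.
    """
    names = [f"{prefix}{i}" for i in range(1, count + 1)]
    records = []
    for name in names:
        records.append("\n".join([
            f"dn: uid={name},{suffix}",
            "changetype: add",
            "objectClass: top",
            "objectClass: person",
            "objectClass: organizationalPerson",
            "objectClass: inetOrgPerson",
            f"uid: {name}",
            f"cn: {name}",
            f"sn: {name}",
        ]))
    for name in names:
        records.append(f"dn: uid={name},{suffix}\nchangetype: delete")
    # LDIF records are separated by a blank line.
    return "\n\n".join(records) + "\n"


def write_ldif_files(directory, clients=5):
    """Write add1.ldif .. add<clients>.ldif, one file per client."""
    paths = []
    for n in range(1, clients + 1):
        path = os.path.join(directory, f"add{n}.ldif")
        with open(path, "w") as handle:
            # A distinct prefix per file keeps clients from colliding on DNs.
            handle.write(build_ldif(f"client{n}user"))
        paths.append(path)
    return paths
```

Calling write_ldif_files(".") would produce five files in the current directory, each with 200 add records followed by 200 delete records.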

This appears to be an issue with an entry cache lock not being released:

Thread 29 (Thread 0x7f7d16bfd700 (LWP 8337)):
#3  0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4  0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5  0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6  0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d16bfa860) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7  0x00007f7d29eeb814 in ldbm_back_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_delete.c:257
#8  0x00007f7d2dc8def4 in op_shared_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/delete.c:364
#9  0x00007f7d2dc8d6dd in do_delete (pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/delete.c:128
#10 0x000000000041578e in connection_dispatch_operation (conn=0x7f7d2464f730, op=0x231ee80, pb=0x7f7d16bfcaa0) at ../ds/ldap/servers/slapd/connection.c:643
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482

Thread 27 (Thread 0x7f7d157fb700 (LWP 8339)):
#3  0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4  0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5  0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6  0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d157f6790) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7  0x00007f7d29ede3f1 in ldbm_back_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_add.c:261
#8  0x00007f7d2dc7dc4b in op_shared_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/add.c:735
#9  0x00007f7d2dc7cb96 in do_add (pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/add.c:258
#10 0x000000000041576c in connection_dispatch_operation (conn=0x7f7d2464f878, op=0x22ec000, pb=0x7f7d157faaa0) at ../ds/ldap/servers/slapd/connection.c:638
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482

Thread 15 (Thread 0x7f7d09bf5700 (LWP 8351)):
#0  0x000000377560e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000037756093be in _L_lock_995 () from /lib64/libpthread.so.0
#2  0x0000003775609326 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4  0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5  0x00007f7d29eb19bb in dblayer_lock_backend (be=0x2094160) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3942
#6  0x00007f7d29eb102f in dblayer_txn_begin (be=0x2094160, parent_txn=0x0, txn=0x7f7d09bf0790) at ../ds/ldap/servers/slapd/back-ldbm/dblayer.c:3664
#7  0x00007f7d29ede3f1 in ldbm_back_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_add.c:261
#8  0x00007f7d2dc7dc4b in op_shared_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/add.c:735
#9  0x00007f7d2dc7cb96 in do_add (pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/add.c:258
#10 0x000000000041576c in connection_dispatch_operation (conn=0x7f7d2464f4a0, op=0x232c850, pb=0x7f7d09bf4aa0) at ../ds/ldap/servers/slapd/connection.c:638
#11 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482


---> The following thread is causing the deadlock:

#3  0x0000003b56223fe9 in PR_Lock () from /lib64/libnspr4.so
#4  0x0000003b5622410b in PR_EnterMonitor () from /lib64/libnspr4.so
#5  0x00007f7d29ea7169 in cache_lock_entry (cache=0x21130b8, e=0x22f67a0) at ../ds/ldap/servers/slapd/back-ldbm/cache.c:1502
#6  0x00007f7d29ebee77 in find_entry_internal_dn (pb=0x7f7d073f0aa0, be=0x2094160, sdn=0x7f7ca400dec0, lock=1, txn=0x7f7d073ee860, flags=0)
    at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:155
#7  0x00007f7d29ebf446 in find_entry_internal (pb=0x7f7d073f0aa0, be=0x2094160, addr=0x22f4b68, lock=1, txn=0x7f7d073ee860, flags=0)
    at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:293
#8  0x00007f7d29ebf530 in find_entry2modify (pb=0x7f7d073f0aa0, be=0x2094160, addr=0x22f4b68, txn=0x7f7d073ee860)
    at ../ds/ldap/servers/slapd/back-ldbm/findentry.c:324
#9  0x00007f7d29eeb8b4 in ldbm_back_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/back-ldbm/ldbm_delete.c:273
#10 0x00007f7d2dc8def4 in op_shared_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/delete.c:364
#11 0x00007f7d2dc8d6dd in do_delete (pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/delete.c:128
#12 0x000000000041578e in connection_dispatch_operation (conn=0x7f7d2464f358, op=0x22f4a90, pb=0x7f7d073f0aa0) at ../ds/ldap/servers/slapd/connection.c:643
#13 0x0000000000417524 in connection_threadmain () at ../ds/ldap/servers/slapd/connection.c:2482
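
When a backtrace like the one above has many threads, it can help to group them by the first frame that lies in server code, which shows at a glance which lock each thread is waiting on. The following is a minimal sketch, assuming gdb-style output in the format shown in this report; the function name blocked_in and the "unlabeled" tag are hypothetical. Frames resolved only to a shared library ("from /lib64/...") are skipped, and a stack that appears without a "Thread N" header is detected by its frame numbers restarting.

```python
import re
from collections import defaultdict

THREAD_RE = re.compile(r"^Thread (\d+) \(")
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+ in (\w+) \(")


def blocked_in(backtrace):
    """Map the innermost server function of each stack to its thread ids."""
    waiting = defaultdict(list)
    thread, last_frame, recorded = "unlabeled", -1, False
    for line in backtrace.splitlines():
        header = THREAD_RE.match(line)
        if header:
            thread, last_frame, recorded = header.group(1), -1, False
            continue
        frame = FRAME_RE.match(line)
        if not frame:
            continue  # blank lines, notes, and wrapped "at file:line" lines
        number = int(frame.group(1))
        if number <= last_frame:
            # Frame numbers restarted, so a new stack began without a header.
            thread, recorded = "unlabeled", False
        last_frame = number
        # Skip frames that gdb resolved only to a shared library.
        if not recorded and " from " not in line:
            waiting[frame.group(2)].append(thread)
            recorded = True
    return dict(waiting)
```

Applied to the traces in this description, such a grouping would list threads 29, 27, and 15 under dblayer_lock_backend and the unlabeled stack under cache_lock_entry.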
Comment 1 mreynolds 2013-07-25 16:28:15 EDT
Created attachment 778511
LDIF file that adds and deletes 200 entries (1)
Comment 2 mreynolds 2013-07-25 16:28:48 EDT
Created attachment 778512
LDIF file that adds and deletes 200 entries (2)
Comment 3 mreynolds 2013-07-25 16:29:14 EDT
Created attachment 778513
LDIF file that adds and deletes 200 entries (3)
Comment 4 mreynolds 2013-07-25 16:29:43 EDT
Created attachment 778514
LDIF file that adds and deletes 200 entries (4)
Comment 5 mreynolds 2013-07-25 16:30:04 EDT
Created attachment 778515
LDIF file that adds and deletes 200 entries (5)
Comment 6 mreynolds 2013-07-25 16:30:55 EDT
Steps to reproduce:

[1]  Create an instance of 389 using "dc=example,dc=com" as the suffix.
[2]  Open 5 terminals and run 5 ldapmodify commands at the same time, one for each of the supplied LDIF files. Example:

    ldapmodify -D "cn=directory manager" -w password -c -f add1.ldif

[3]  Keep running these ldapmodify commands until the server hangs.  This usually happens in less than a minute.
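
The steps above can also be scripted rather than run by hand in separate terminals. The Python below is a minimal sketch that starts one ldapmodify per LDIF file with the same options as the example command and reports whether all of them finished within a timeout. The function names, the default timeout, and the file names other than add1.ldif are assumptions; a timeout only suggests a hang and should be confirmed with a backtrace of the server process.

```python
import subprocess
import time


def build_commands(ldif_files, bind_dn="cn=directory manager", password="password"):
    """Return one ldapmodify argument list per LDIF file."""
    return [
        ["ldapmodify", "-D", bind_dn, "-w", password, "-c", "-f", path]
        for path in ldif_files
    ]


def run_clients(ldif_files, timeout=120.0):
    """Run all clients concurrently; return True if every one exits in time."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in build_commands(ldif_files)
    ]
    deadline = time.monotonic() + timeout
    finished = True
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # The client is still blocked; the server may be hung.
            finished = False
            proc.kill()
            proc.wait()
    return finished
```

For example, calling run_clients([f"add{n}.ldif" for n in range(1, 6)]) in a loop until it returns False would approximate step [3].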
Comment 7 Rich Megginson 2013-10-01 19:27:01 EDT
Moving all ON_QA bugs to MODIFIED so that they can be added to the errata (bugs in the ON_QA state cannot be added to an errata).  When the errata is created, the bugs should be moved back to ON_QA automatically.
Comment 9 Sankar Ramalingam 2014-01-30 07:25:46 EST
Followed the steps given in comment #6 and waited for 2 to 3 minutes, but did not observe any crash. Marking the bug as verified.
Comment 10 Ludek Smid 2014-06-13 07:43:51 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.