Fixed a deadlock during backend deletion in Directory Server
Previously, transaction information was not passed to one of the database helper functions during backend deletion. This could result in a deadlock if a plug-in attempted to access data in the area locked by the transaction. With this update, transaction info is passed to all necessary database helper functions, and a deadlock no longer occurs in the described situation.
A deadlock can occur in a single thread processing a DEL operation. The deadlock is dynamic and there is no reliable test case.
The deadlock occurs on a database lock: the DEL operation, running under a transaction, holds a db_lock in write, then issues an internal DEL without the parent txn. The internal DEL needs the db_lock in read, but because it does not use the same transaction, the lock is not granted.
{{{
Thread 6 (Thread 0x7f52c1feb700 (LWP 17418)):
#0 0x00007f52f2c0c705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f52ed55d2f3 in __db_hybrid_mutex_suspend () from /lib64/libdb-5.3.so
#2 0x00007f52ed55c640 in __db_tas_mutex_lock () from /lib64/libdb-5.3.so
#3 0x00007f52ed606cea in __lock_get_internal () from /lib64/libdb-5.3.so
#4 0x00007f52ed6077d0 in __lock_get () from /lib64/libdb-5.3.so
#5 0x00007f52ed633112 in __db_lget () from /lib64/libdb-5.3.so
#6 0x00007f52ed57a5f5 in __bam_search () from /lib64/libdb-5.3.so
#7 0x00007f52ed565256 in __bamc_search () from /lib64/libdb-5.3.so
#8 0x00007f52ed566d0f in __bamc_get () from /lib64/libdb-5.3.so
#9 0x00007f52ed61fc56 in __dbc_iget () from /lib64/libdb-5.3.so
#10 0x00007f52ed62c843 in __db_get () from /lib64/libdb-5.3.so
#11 0x00007f52ed630123 in __db_get_pp () from /lib64/libdb-5.3.so
#12 0x00007f52e97535fb in id2entry (be=0x1cc48b0, id=id@entry=4, txn=txn@entry=0x0, err=err@entry=0x7f52c1fe6638) at ldap/servers/slapd/back-ldbm/id2entry.c:323
#13 0x00007f52e9777c20 in ldbm_back_delete (pb=0x7f5284061cf0) at ldap/servers/slapd/back-ldbm/ldbm_delete.c:463
#14 0x00007f52f4c1d380 in op_shared_delete (pb=pb@entry=0x7f5284061cf0) at ldap/servers/slapd/delete.c:364
#15 0x00007f52f4c1d532 in delete_internal_pb (pb=pb@entry=0x7f5284061cf0) at ldap/servers/slapd/delete.c:242
#16 0x00007f52f4c1d7e3 in slapi_delete_internal_pb (pb=pb@entry=0x7f5284061cf0) at ldap/servers/slapd/delete.c:185
#17 0x00007f52e907c3cd in mep_del_post_op (pb=<optimized out>) at ldap/servers/plugins/mep/mep.c:2556
#18 0x00007f52f4c638c5 in plugin_call_func (list=0x1bcae30, operation=operation@entry=563, pb=pb@entry=0x3184f90, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1453
#19 0x00007f52f4c63a78 in plugin_call_list (pb=0x3184f90, operation=563, list=<optimized out>) at ldap/servers/slapd/plugin.c:1415
#20 plugin_call_plugins (pb=pb@entry=0x3184f90, whichfunction=whichfunction@entry=563) at ldap/servers/slapd/plugin.c:398
#21 0x00007f52e9776f07 in ldbm_back_delete (pb=0x3184f90) at ldap/servers/slapd/back-ldbm/ldbm_delete.c:1091
#22 0x00007f52f4c1d380 in op_shared_delete (pb=pb@entry=0x3184f90) at ldap/servers/slapd/delete.c:364
#23 0x00007f52f4c1d643 in do_delete (pb=0x3184f90) at ldap/servers/slapd/delete.c:128
#24 0x0000000000417688 in connection_dispatch_operation (pb=<optimized out>, op=0x2dbc8e0, conn=0x7f52e02b5190) at ldap/servers/slapd/connection.c:587
#25 connection_threadmain () at ldap/servers/slapd/connection.c:2372
#26 0x00007f52f32679eb in _pt_root () from /lib64/libnspr4.so
#27 0x00007f52f2c08df3 in start_thread () from /lib64/libpthread.so.0
#28 0x00007f52f293601d in clone () from /lib64/libc.so.6
}}}
The delete is done under a backend transaction and holds a write lock on an id2entry page (page 2):
{{{
800006b9 dd= 8 locks held 185 write locks 85 pid/thread 17379/139993418741504 flags 0 priority 100
800006b9 READ 30 HELD ipaca/ancestorid.db page 1
...
800006b9 WRITE 1 HELD userRoot/id2entry.db page 2
...
}}}
Then the internal delete hangs, waiting for a read lock on this same page:
{{{
2 dd=103 locks held 0 write locks 0 pid/thread 17379/139993418741504 flags 0 priority 100
2 READ 1 WAIT userRoot/id2entry.db page 2
}}}
The test case I used is the following, but I was only able to reproduce once:
1 - install an IPA server + replica
2 - stop the replica (I reproduced without any replication session)
3 - run the script lot_mods.sh
Note: I was testing on 1.3.1, where MOD/REPL kept the deleted values in the entry, so the entry was growing. This explains why the test creates/deletes the test entry; the growth of the entry can help to reproduce. In 1.3.3 and later, MOD/REPL does not keep deleted values. The deadlock occurred during the 'ipa user-del'.
(In reply to Sankar Ramalingam from comment #4)
> Hi Noriko, where can I find lot_mods.sh file?
Hi Sankar, you can use https://bugzilla.redhat.com/show_bug.cgi?id=1280210#c5 to verify this bugzilla.
The test case ticket47976_test.py was developed by Thierry for this fix.
Regards,
Ami
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-0737.html