Bug 1608746 - Nunc stans - deadlock between threads taking NS job lock and connection lock in the opposite order
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: 389-ds-base
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: pre-dev-freeze
Target Release: 8.2
Assignee: mreynolds
QA Contact: RHDS QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-26 09:02 UTC by thierry bordaz
Modified: 2019-11-15 00:32 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-15 00:32:39 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description thierry bordaz 2018-07-26 09:02:08 UTC
Description of problem:
The problem is a deadlock between two threads that take the same two locks in opposite order. Both threads are working on the same connection: one is dispatched to read incoming data, while the other is sending back results and decides to close the connection.

The first thread is a worker that is sending a result back to the client and decides to close the connection. At this point it holds the connection lock. Because the connection is flagged NEED_CLOSING and there is a conn->job (handler) registered to read the connection, it tries to terminate the job (ns_job_done). To do that it has to acquire the job lock.

Thread 42 (Thread 0x7fa5d9ff3700 (LWP 27235)):
#0  0x00007fa611ba985d in __lll_lock_wait () at /lib64/libpthread.so.0
#1  0x00007fa611ba2df4 in pthread_mutex_lock () at /lib64/libpthread.so.0
#2  0x00007fa61450dcc9 in ns_job_done () at /usr/lib64/dirsrv/libnunc-stans.so.0
#3  0x000055f28f01ae2e in ns_connection_post_io_or_closing.part ()
#4  0x000055f28f0186c1 in disconnect_server ()
#5  0x00007fa614281493 in flush_ber () at /usr/lib64/dirsrv/libslapd.so.0
#6  0x00007fa6142835d3 in send_ldap_result_ext () at /usr/lib64/dirsrv/libslapd.so.0
#7  0x00007fa61428376f in send_ldap_result () at /usr/lib64/dirsrv/libslapd.so.0
#8  0x000055f28f02b0cb in ids_sasl_check_bind ()
#9  0x000055f28f012497 in do_bind ()
#10 0x000055f28f0195aa in connection_threadmain ()
#11 0x00007fa611fff3b8 in None () at /lib64/libnspr4.so
#12 0x00007fa611ba05f4 in start_thread () at /lib64/libpthread.so.0
#13 0x00007fa6113e003f in clone () at /lib64/libc.so.6


At the same time, an event on the connection triggers the registered job handler. The thread holds the job lock while it calls the handler callback. The handler callback reads the data on the connection and therefore tries to acquire the connection lock.

Thread 18 (Thread 0x7fa5fa9d5700 (LWP 27211)):
#0  0x00007fa611ba658c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x00007fa611ff98eb in PR_EnterMonitor () at /lib64/libnspr4.so
#2  0x000055f28f01b222 in ns_handle_pr_read_ready ()
#3  0x00007fa61450dda9 in work_job_execute () at /usr/lib64/dirsrv/libnunc-stans.so.0
#4  0x00007fa6116c8395 in None () at /lib64/libevent-2.1.so.6
#5  0x00007fa6116c8d97 in event_base_loop () at /lib64/libevent-2.1.so.6
#6  0x00007fa61450f0b2 in ns_event_fw_loop () at /usr/lib64/dirsrv/libnunc-stans.so.0
#7  0x00007fa61450dbc9 in event_loop_thread_func () at /usr/lib64/dirsrv/libnunc-stans.so.0
#8  0x00007fa611ba05f4 in start_thread () at /lib64/libpthread.so.0
#9  0x00007fa6113e003f in clone () at /lib64/libc.so.6

This bug should not happen frequently.

Version-Release number of selected component (if applicable):
It is not clear whether this was introduced in 7.4 or 7.5.
7.5 introduced the job <--> connection link in the job/conn data structures, but I am not sure whether this possibility of deadlock already existed in 7.4.

How reproducible:
Seen during installation of idm+pki-core.
It is not expected to happen on every install.


Actual results:
It hangs indefinitely

Expected results:
It should not hang

Additional info:

Comment 5 mreynolds 2019-11-15 00:32:39 UTC
nunc-stans has been abandoned, closing as won't fix

