Bug 1597384 - Async operations can hang when the server is running nunc-stans
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: 389-ds-base
Version: 7.4
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: thierry bordaz
QA Contact: RHDS QE
Keywords: ZStream
Duplicates: 1600764
Depends On:
Blocks: 1597530
Reported: 2018-07-02 16:02 EDT by mreynolds
Modified: 2018-10-30 08:41 EDT
See Also:
Fixed In Version: 389-ds-base-1.3.8.4-1.el7
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Clones: 1597530
Environment:
Last Closed: 2018-10-30 06:14:34 EDT


External Trackers

| Tracker | ID | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|
| Red Hat Knowledge Base (Solution) | 3553761 | None | None | None | 2018-08-07 05:57 EDT |
| Red Hat Product Errata | RHSA-2018:3127 | None | None | None | 2018-10-30 06:15 EDT |

Description mreynolds 2018-07-02 16:02:14 EDT
This bug is created as a clone of upstream ticket:
https://pagure.io/389-ds-base/issue/49765

#### Issue Description
This bug is a side effect of https://pagure.io/389-ds-base/issue/48184

With async operations on the same connection, the connection can be set to read_ready several times. This happens when connection_threadmain schedules ns_handle_pr_read_ready (via connection_make_readable_nolock).

When there is no more activity on the connection (no new request, timeout, closure, ...) and the server is listening to it (ns_handle_pr_read_ready), an async operation that tries to schedule ns_handle_pr_read_ready hangs before completing the operation.

The consequence is one or more hanging operations on the connection, with a stack like:

    Thread 63 (Thread 0x7f794a711700 (LWP 28798)):
    #0  0x00007f7995e75c93 in select () from /lib64/libc.so.6
    #1  0x00007f7998e8046b in DS_Sleep (ticks=100) at 389-ds-base/ldap/servers/slapd/util.c:1086
    #2  0x0000000000438a6a in ns_connection_post_io_or_closing (conn=0x7f79550fc800) at 389-ds-base/ldap/servers/slapd/daemon.c:1835
    #3  0x0000000000427adc in connection_make_readable_nolock (conn=0x7f79550fc800) at 389-ds-base/ldap/servers/slapd/connection.c:1361
    #4  0x0000000000429e91 in connection_threadmain () at 389-ds-base/ldap/servers/slapd/connection.c:1724
    #5  0x00007f7996a0907b in _pt_root () from /lib64/libnspr4.so
    #6  0x00007f79965ab36d in start_thread () from /lib64/libpthread.so.0
    #7  0x00007f7995e7fb4f in clone () from /lib64/libc.so.6      
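
The poll-and-sleep pattern visible in frames #1 to #3 can be illustrated with a toy model. This is only a sketch under stated assumptions, not the 389-ds-base code: `Connection`, `read_ready_armed`, `event_loop_tick`, `make_readable` and `max_polls` are hypothetical names, and the model assumes that an armed read-ready job only runs (and clears its flag) when the connection sees a new request or an idle timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    # Hypothetical flag: True while a read-ready job is armed in the event loop.
    read_ready_armed: bool = False
    # Ticks spent waiting since the job was armed (toy clock).
    idle_ticks: int = 0


def event_loop_tick(conn: Connection, new_request: bool,
                    idle_timeout: Optional[int]) -> None:
    """One pass of the toy event loop.

    Assumption: the armed job only runs when something happens on the
    connection (a new request or an idle timeout). Running it disarms it.
    """
    if not conn.read_ready_armed:
        return
    conn.idle_ticks += 1
    timed_out = idle_timeout is not None and conn.idle_ticks >= idle_timeout
    if new_request or timed_out:
        conn.read_ready_armed = False
        conn.idle_ticks = 0


def make_readable(conn: Connection, idle_timeout: Optional[int],
                  max_polls: int = 50) -> Optional[int]:
    """Worker side: wait until the previous job has run, then re-arm.

    Returns the number of polls needed, or None when the worker gives up
    after max_polls. The real server has no such bound, so None stands
    for "hangs". The tick call stands in for a sleep during which the
    event loop runs on an idle connection.
    """
    polls = 0
    while conn.read_ready_armed:
        if polls >= max_polls:
            return None
        event_loop_tick(conn, new_request=False, idle_timeout=idle_timeout)
        polls += 1
    conn.read_ready_armed = True
    return polls
```

In this model the first `make_readable` call on a fresh connection arms the job immediately. A second call on an idle connection with `idle_timeout=None` never sees the flag clear and returns `None`, while the same call with a finite `idle_timeout` returns once the timeout fires.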

A way to mitigate the problem is to set an idletimeout.
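
For reference, the idle timeout is controlled by the `nsslapd-idletimeout` attribute of `cn=config`, where 0 means no timeout. The helper below is a minimal sketch that renders the LDIF for such a change; the function name `idletimeout_ldif` is hypothetical and the value 300 used in the note after it is an arbitrary illustration, not a recommendation from this bug.

```python
def idletimeout_ldif(seconds: int) -> str:
    """Render an LDIF modify record that sets nsslapd-idletimeout.

    A value of 0 disables the idle timeout, so it is rejected here
    because it would not mitigate the hang.
    """
    if seconds <= 0:
        raise ValueError("idle timeout must be a positive number of seconds")
    lines = [
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-idletimeout",
        f"nsslapd-idletimeout: {seconds}",
        "",  # trailing newline terminates the record
    ]
    return "\n".join(lines)
```

Writing the output of `idletimeout_ldif(300)` to a file and applying it with `ldapmodify` as the Directory Manager would close connections that stay idle for five minutes, which gives a blocked operation an event to wake up on.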

#### Package Version and Platform
1.3.7, 1.3.8 and master


#### Steps to reproduce

The test case does not reproduce the hang systematically.
On a fast machine, it hangs one time out of ten.

#### Actual results
Some operations do not complete (they hang).

#### Expected results
Operations should complete without hanging.
Comment 6 German Parente 2018-08-01 12:08:11 EDT
*** Bug 1600764 has been marked as a duplicate of this bug. ***
Comment 7 Viktor Ashirov 2018-08-30 09:22:56 EDT
Build tested: 389-ds-base-1.3.8.4-12.el7.x86_64

Cmocka unit tests pass:

PASS: test_nuncstans
PASS: test_nuncstans_stress_small
PASS: test_nuncstans_stress_large

No hang was observed during the acceptance testing with nunc-stans. NS is also disabled by default on new installs (bz1614501), so marking as VERIFIED, SanityOnly.
Comment 14 errata-xmlrpc 2018-10-30 06:14:34 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3127
