Bug 1415162 - ipa-extdom-extop plugin can exhaust DS worker threads
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ipa
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: IPA Maintainers
QA Contact: ipa-qe
Docs Contact: Aneta Šteflová Petrová
Depends On: 1473571
Blocks: 1420851 1467835 1472344
Reported: 2017-01-20 07:40 EST by Thorsten Scherf
Modified: 2018-04-10 12:41 EDT

See Also:
Fixed In Version: ipa-server-4.5.4-5.el7
Doc Type: Bug Fix
Doc Text:
The IdM LDAP server no longer becomes unresponsive when resolving an AD user takes a long time

When the System Security Services Daemon (SSSD) took a long time to resolve a user from a trusted Active Directory (AD) domain on the Identity Management (IdM) server, the IdM LDAP server sometimes exhausted its own worker threads. Consequently, the IdM LDAP server was unable to respond to further requests from SSSD clients or other LDAP clients. This update adds a new API to SSSD on the IdM server, which enables identity requests to time out. Also, the IdM LDAP extended identity operations plug-in and the Schema Compatibility plug-in now support this API to enable canceling requests that take too long. As a result, the IdM LDAP server can recover from the described situation and keep responding to further requests.
Clones: 1473571 1473577
Last Closed: 2018-04-10 12:40:25 EDT
Attachments
tar ball with test build with a reduced client timeout (7.81 MB, application/x-gzip)
2017-02-13 10:32 EST, Sumit Bose
tar-ball with test build (7.85 MB, application/x-gzip)
2017-07-20 08:46 EDT, Sumit Bose
valgrind output (2.69 MB, text/plain)
2017-10-12 11:43 EDT, German Parente
tar-ball with test build rebased to sssd-1.15.2-50.el7_4.6 (8.96 MB, application/x-gzip)
2017-12-07 14:57 EST, Sumit Bose
tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.6 (8.97 MB, application/x-gzip)
2018-01-12 07:18 EST, Sumit Bose
tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.8 (8.98 MB, application/x-gzip)
2018-01-12 07:20 EST, Sumit Bose


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:0918 | None | None | None | 2018-04-10 12:41 EDT

Description Thorsten Scherf 2017-01-20 07:40:46 EST
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/freeipa/ticket/5464

ipa-extdom-extop is used to resolve AD trust users/groups. It does this using libnss calls like getpwnam, getgrnam, etc.

libnss calls are serialized by a simple lock and each call can last a long time because it has to get info from SSSD/AD.

If a DS server is flooded with "IPA trusted domain ID mapper" extops, many worker threads will be busy for a long time. The worst case is when all the workers are busy with such extops.
DS is then no longer able to process other requests and appears to hang temporarily.

ipa-extdom-extop should handle those extops with its own threads (possibly like persistent searches) so that DS is not impacted.
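
The hang described above comes down to a worker waiting without any bound on a slow backend. The following is a minimal sketch of bounding such a wait with poll(); it is not the code from ipa-extdom-extop or SSSD. The helper name read_reply_with_timeout and the one-byte "reply" are assumptions made purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any IPA or SSSD API):
 * wait at most timeout_ms milliseconds for one reply byte on fd.
 * Returns 0 on success, ETIMEDOUT if nothing arrived in time,
 * or another errno-style code on failure.
 */
int read_reply_with_timeout(int fd, int timeout_ms, char *reply)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready;

    /* Retry if a signal interrupts poll(); for simplicity the full
     * timeout is restarted on each retry. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready == -1 && errno == EINTR);

    if (ready == 0) {
        return ETIMEDOUT;   /* backend too slow: give the worker back */
    }
    if (ready == -1) {
        return errno;
    }
    if (read(fd, reply, 1) != 1) {
        return EIO;         /* peer closed the descriptor or read failed */
    }
    return 0;
}
```

With a helper of this kind, a worker that receives ETIMEDOUT can return an error to the client instead of staying blocked, so the pool is not drained by slow lookups.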
Comment 6 Sumit Bose 2017-02-13 10:32 EST
Created attachment 1249898
tar ball with test build with a reduced client timeout
Comment 27 Sumit Bose 2017-07-20 08:46 EDT
Created attachment 1301685
tar-ball with test build
Comment 29 Martin Kosek 2017-07-21 04:44:43 EDT
Note that this solution is composed of changes in ipa, sssd and slapi-nis:

* ipa: Bug 1415162 - this bug
* sssd: Bug 1473571
* slapi-nis: Bug 1473577
Comment 56 German Parente 2017-10-12 11:43 EDT
Created attachment 1337857
valgrind output
Comment 81 Alexander Bokovoy 2017-11-30 09:35:48 EST
Added a doc. Let me know, Aneta, if this is enough.
Comment 82 Alexander Bokovoy 2017-11-30 09:37:44 EST
Fixed upstream. 

master:
    78ad1cf ipa-extdom-extop: refactor nsswitch operations

ipa-4-6:
    d1dd794 ipa-extdom-extop: refactor nsswitch operations

ipa-4-5:
    a2da9f9 ipa-extdom-extop: refactor nsswitch operations
Comment 98 Sumit Bose 2017-12-07 14:57 EST
Created attachment 1364463
tar-ball with test build rebased to sssd-1.15.2-50.el7_4.6
Comment 100 Mohammad Rizwan 2018-01-03 04:13:37 EST
version:
ipa-server-4.5.4-7.el7.x86_64
sssd-1.16.0-14.el7.x86_64

sss_nss_getpwnam_timeout_test.c 
--------------------------------------------------------------
#include <stdio.h>
#define IPA_389DS_PLUGIN_HELPER_CALLS 1
#include <sss_nss_idmap.h>

int main(int argc, char* argv[])
{
    int ret;
    struct passwd pwd;
    struct passwd *pwd_result;
    char buffer[1024];
    size_t buflen = sizeof(buffer);

    if (argc != 2) {
        fprintf(stderr, "Missing argument.\n");
        return 1;
    }

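    /* Look up the user; flags = 0, timeout = 1000 (milliseconds). */
    ret = sss_nss_getpwnam_timeout(argv[1], &pwd, buffer, buflen, &pwd_result,
                                   0, 1000);

    /* ret is 0 on success or an errno-style error code on failure. */
    fprintf(stderr, "Done [%d].\n", ret);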

    return ret;
}
--------------------------------------------------------------

steps:

Make sure the 'libsss_nss_idmap-devel' is installed (yum install libsss_nss_idmap-devel) and then call:

1. gcc -Wall -Wextra -Werror sss_nss_getpwnam_timeout_test.c -o sss_nss_getpwnam_timeout_test -lsss_nss_idmap

2. set 'timeout = 999999' in the [domain/...] section of sssd.conf

3. restart the sssd service.

4. call 'kill -STOP $(pidof sssd_be)'

5. call './sss_nss_getpwnam_timeout_test non_existing_user_name'
   this call should return after about 1s with "Done [5]."

6. as a reference you can call 'getent passwd non_existing_user_name'
   this call will return only after 5 minutes, or as soon as step 7 is run.

7. call 'kill -CONT $(pidof sssd_be)' to resume sssd_be.
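
Step 5 expects the call to return after about 1s. One way to check that without a stopwatch is to take a monotonic timestamp before and after the call and compare the difference with the expected timeout. The following is a minimal sketch; the helper names elapsed_ms and within_tolerance are hypothetical and not part of the test program above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: milliseconds between two timestamps,
 * e.g. taken with clock_gettime(CLOCK_MONOTONIC, ...). */
int64_t elapsed_ms(struct timespec start, struct timespec end)
{
    int64_t ns = ((int64_t)end.tv_sec - (int64_t)start.tv_sec) * 1000000000LL
               + ((int64_t)end.tv_nsec - (int64_t)start.tv_nsec);
    return ns / 1000000LL;
}

/* Hypothetical helper: returns 1 if actual_ms lies within
 * expected_ms +/- tolerance_ms, otherwise 0. */
int within_tolerance(int64_t actual_ms, int64_t expected_ms,
                     int64_t tolerance_ms)
{
    int64_t diff = actual_ms - expected_ms;
    if (diff < 0) {
        diff = -diff;
    }
    return diff <= tolerance_ms;
}
```

In the test program, the two timestamps would be taken with clock_gettime(CLOCK_MONOTONIC, ...) immediately before and after sss_nss_getpwnam_timeout(), and the result compared against 1000 ms with a tolerance of one's choosing.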

Actual result:
[root@client ~]# vi /etc/sssd/sssd.conf 
[root@client ~]# 
[root@client ~]# systemctl restart sssd
[root@client ~]# kill -STOP $(pidof sssd_be)
[root@client ~]# ./sss_nss_getpwnam_timeout_test test101
Done [5].
[root@client ~]#

Based on the above observations, marking the bug as "VERIFIED".
Comment 101 Sumit Bose 2018-01-12 07:18 EST
Created attachment 1380407
tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.6
Comment 102 Sumit Bose 2018-01-12 07:20 EST
Created attachment 1380408
tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.8
Comment 108 errata-xmlrpc 2018-04-10 12:40:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0918
