This bug is created as a clone of upstream ticket: https://fedorahosted.org/freeipa/ticket/5464

ipa-extdom-extop is used to resolve AD trust users/groups. It does this using libnss calls like getpwnam, getgrnam, etc. libnss calls are serialized by a simple lock, and each call can last a long time because it has to get info from SSSD/AD.

If a DS server is flooded with "IPA trusted domain ID mapper" extops, many worker threads will be busy for a long time. The worst case is when all the workers are busy with such extops. DS can then no longer process other requests and appears to hang transiently.

ipa-extdom-extop should manage those extops with its own threads (possibly like persistent searches) so that it does not impact DS.
Created attachment 1249898 [details] tar ball with test build with a reduced client timeout
Created attachment 1301685 [details] tar-ball with test build
Note that this solution is composed of changes in ipa, sssd and slapi-nis:
* ipa: Bug 1415162 - this bug
* sssd: Bug 1473571
* slapi-nis: Bug 1473577
Created attachment 1337857 [details] valgrind output
Added a doc. Let me know, Aneta, if this is enough.
Fixed upstream.
master:
78ad1cf ipa-extdom-extop: refactor nsswitch operations
ipa-4-6:
d1dd794 ipa-extdom-extop: refactor nsswitch operations
ipa-4-5:
a2da9f9 ipa-extdom-extop: refactor nsswitch operations
Created attachment 1364463 [details] tar-ball with test build rebased to sssd-1.15.2-50.el7_4.6
version:
ipa-server-4.5.4-7.el7.x86_64
sssd-1.16.0-14.el7.x86_64

sss_nss_getpwnam_timeout_test.c
--------------------------------------------------------------
#include <stdio.h>

#define IPA_389DS_PLUGIN_HELPER_CALLS 1
#include <sss_nss_idmap.h>

int main(int argc, char* argv[])
{
    int ret;
    struct passwd pwd;
    struct passwd *pwd_result;
    char buffer[1024];
    size_t buflen = sizeof(buffer);

    if (argc != 2) {
        fprintf(stderr, "Missing argument.\n");
        return 1;
    }

    /* flags = 0, timeout = 1000 (about 1s, see step 5 below) */
    ret = sss_nss_getpwnam_timeout(argv[1], &pwd, buffer, buflen,
                                   &pwd_result, 0, 1000);
    fprintf(stderr, "Done [%d].\n", ret);

    return ret;
}
--------------------------------------------------------------

steps:
Make sure 'libsss_nss_idmap-devel' is installed (yum install libsss_nss_idmap-devel), then:
1. gcc -Wall -Wextra -Werror sss_nss_getpwnam_timeout_test.c -o sss_nss_getpwnam_timeout_test -lsss_nss_idmap
2. Set 'timeout = 999999' in the [domain/...] section of sssd.conf.
3. Restart the sssd service.
4. Call: kill -STOP $(pidof sssd_be)
5. Call './sss_nss_getpwnam_timeout_test non_existing_user_name'. This call should return after about 1s with "Done [5]."
6. As a reference, call 'getent passwd non_existing_user_name'. This call returns only after 5 minutes, or as soon as step 7 is run.
7. Call: kill -CONT $(pidof sssd_be)

Actual result:
[root@client ~]# vi /etc/sssd/sssd.conf
[root@client ~]#
[root@client ~]# systemctl restart sssd
[root@client ~]# kill -STOP $(pidof sssd_be)
[root@client ~]# ./sss_nss_getpwnam_timeout_test test101
Done [5].
[root@client ~]#

Based on the above observations, marking the bug status as "VERIFIED".
Created attachment 1380407 [details] tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.6
Created attachment 1380408 [details] tar-ball with test build 5 rebased to sssd-1.15.2-50.el7_4.8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0918