Bug 1461462

Summary: sssd_client: add mutex protected call to the PAC responder
Product: Red Hat Enterprise Linux 7
Reporter: German Parente <gparente>
Component: sssd
Assignee: SSSD Maintainers <sssd-maint>
Status: CLOSED ERRATA
QA Contact: ipa-qe <ipa-qe>
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: abokovoy, a.v.miroshnichenko, dpal, fidencio, gparente, grajaiya, ifloodmu, jhrozek, joniknsk, lslebodn, mkosek, mzidek, ndehadra, nkinder, nsoman, pbrezina, rmeggins, sbose, sgoveas, tmihinto, tscherf
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: sssd-1.16.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1506682
Environment:
Last Closed: 2018-04-10 17:11:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1506682    
Attachments:
Description Flags
pac logs none

Description German Parente 2017-06-14 13:35:27 UTC
Description of problem:

A customer is hitting the following unresponsiveness:

#0  0x00007f29342dcdfd in poll () from /lib64/libc.so.6
#1  0x00007f2901e722ba in sss_cli_make_request_nochecks () from /usr/lib64/krb5/plugins/authdata/sssd_pac_plugin.so
#2  0x00007f2901e72a75 in sss_cli_check_socket () from /usr/lib64/krb5/plugins/authdata/sssd_pac_plugin.so
#3  0x00007f2901e72e07 in sss_pac_make_request () from /usr/lib64/krb5/plugins/authdata/sssd_pac_plugin.so
#4  0x00007f2901e71feb in sssdpac_verify () from /usr/lib64/krb5/plugins/authdata/sssd_pac_plugin.so
#5  0x00007f29364ea3d3 in krb5int_authdata_verify () from /lib64/libkrb5.so.3
#6  0x00007f293650b621 in rd_req_decoded_opt () from /lib64/libkrb5.so.3
#7  0x00007f293650c03a in krb5_rd_req_decoded () from /lib64/libkrb5.so.3
#8  0x00007f292d592b3f in kg_accept_krb5 () from /lib64/libgssapi_krb5.so.2
#9  0x00007f292d5941fa in krb5_gss_accept_sec_context_ext () from /lib64/libgssapi_krb5.so.2
#10 0x00007f292d594359 in krb5_gss_accept_sec_context () from /lib64/libgssapi_krb5.so.2
#11 0x00007f292d5816d6 in gss_accept_sec_context () from /lib64/libgssapi_krb5.so.2
#12 0x00007f292d7c3edc in gssapi_server_mech_step () from /usr/lib64/sasl2/libgssapiv2.so
#13 0x00007f29349e5b9b in sasl_server_step () from /lib64/libsasl2.so.3
#14 0x00007f29349e6109 in sasl_server_start () from /lib64/libsasl2.so.3
#15 0x00007f293719a0a2 in ids_sasl_check_bind ()
#16 0x00007f2937181c2a in do_bind ()
#17 0x00007f2937188aad in connection_threadmain ()
#18 0x00007f2934c1896b in _pt_root () from /lib64/libnspr4.so
#19 0x00007f29345b8dc5 in start_thread () from /lib64/libpthread.so.0
#20 0x00007f29342e773d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f29371218c0 (LWP 17841)):
#0  0x00007f29345bc6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f2934c13463 in PR_EnterMonitor () from /lib64/libnspr4.so
#2  0x00007f293718dd29 in slapd_daemon ()
#3  0x00007f293717f253 in main ()

A GSSAPI SASL bind triggers an SSSD operation, probably towards AD, and it is taking a long time to complete.

During all that time the scheduler cannot assign any new operation to a worker thread because the connection table is locked by the bind operation.


Version-Release number of selected component (if applicable): 
   389-ds-base-1.3.5.10-15.el7_3.x86_64


How reproducible: IPA with AD trust. SSSD taking a long time to discover the server via DNS could be a hint for reproducing this.

Comment 2 wibrown@redhat.com 2017-07-05 06:37:28 UTC
This is not a directory server problem - it looks like the issue is in krb or sssd. I'm going to assign to krb because they likely know about the sss integration they are trying to call.

Comment 3 wibrown@redhat.com 2017-07-05 06:46:15 UTC
As an aside, this ticket exists to remove the conntable lock: https://pagure.io/389-ds-base/issue/49098 but the core of the issue you have here is a krb one.

Comment 25 German Parente 2017-09-12 15:27:14 UTC
Created attachment 1324944
pac logs

Comment 27 Lukas Slebodnik 2017-09-22 14:28:26 UTC
master:
* 1f331476e7d33bb03cc35a2a9064ee1cc5bed6cf
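
For readers unfamiliar with the pattern named in the bug summary ("mutex protected call to the PAC responder"), here is a minimal sketch of the general idea. It is not the code from the commit above. The names pac_request_unlocked, pac_request_locked, pac_mutex and shared_calls are hypothetical, and the unlocked function is only a stand-in for a client request routine that shares one connection between threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared client state; stands in for a single socket
 * that all threads of the calling process would otherwise share. */
static int shared_calls = 0;

/* Hypothetical non-thread-safe request routine (assumption, not SSSD API).
 * Returns 0 on success and -1 for an empty request. */
static int pac_request_unlocked(const void *buf, size_t len)
{
    (void)buf;
    shared_calls++;            /* unsynchronized access to shared state */
    return len > 0 ? 0 : -1;
}

/* One process-wide mutex that serializes access to the shared state. */
static pthread_mutex_t pac_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Mutex-protected wrapper: only one thread at a time may run the request. */
int pac_request_locked(const void *buf, size_t len)
{
    int ret;

    assert(buf != NULL || len == 0);

    if (pthread_mutex_lock(&pac_mutex) != 0) {
        return -1;             /* could not take the lock */
    }
    ret = pac_request_unlocked(buf, len);
    pthread_mutex_unlock(&pac_mutex);

    return ret;
}
```

A multi-threaded caller, such as the directory server worker threads in the backtrace, would call the locked wrapper rather than the unlocked routine. Note that a mutex only serializes callers; it does not by itself shorten a slow request.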

Comment 31 Jakub Hrozek 2017-10-05 15:18:57 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3518

Comment 38 Fabiano Fidêncio 2017-10-27 08:42:14 UTC
Just for the record:
1f331476e7d33bb03cc35a2a9064ee1cc5bed6cf

Comment 41 Nikhil Dehadrai 2017-12-06 07:10:43 UTC
IPA-server: ipa-server-4.5.4-6.el7.x86_64
sssd version: sssd-1.16.0-9.el7.x86_64


Verified the bug with the following tests/observations:
1. Tested that the sanity test suite for IPA-Trust Functional-ssh runs successfully.
2. Tested that the sanity test suite for IPA-Trust Functional-sudo runs successfully.
3. Tested that the sanity test suite for IPA-Trust Functional-user runs successfully.
4. For the failure noticed inside the test suite, a separate bug, bz1520984, has been logged.

Thus marking the status of the bug as "VERIFIED", based on the above observations.

Comment 45 errata-xmlrpc 2018-04-10 17:11:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0929