Bug 790414

Summary: Segfault in IPA provider on F17
Product: [Fedora] Fedora Reporter: Simo Sorce <ssorce>
Component: sssdAssignee: Stephen Gallagher <sgallagh>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: jhrozek, sbose, sgallagh, ssorce
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-23 01:29:16 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 771484    
Bug Blocks:    

Description Simo Sorce 2012-02-14 13:14:18 UTC
On top of the current issues where we have ldap_results() return -1, a while after that happens I get a segfault.
The bad news is that this leaves the request sssd_nsss made to this sssd_be pening and the client is left waiting (presumably until libnss_sss decides it waited long enough (timeout there is 5 minutes).

Segfault:

Program received signal SIGSEGV, Segmentation fault.
ldap_sasl_interactive_bind (ld=ld@entry=0x0, dn=dn@entry=0x0, mechs=mechs@entry=0x7f92cc3abde0 "GSSAPI", 
    serverControls=serverControls@entry=0x0, clientControls=clientControls@entry=0x0, flags=flags@entry=2, 
    interact=interact@entry=0x7f92bfa13d40 <sdap_sasl_interact>, defaults=defaults@entry=0x7f92cc3cc680, result=0x0, 
    rmech=rmech@entry=0x7fff15cf0940, msgid=msgid@entry=0x7fff15cf093c) at ../../../libraries/libldap/sasl.c:433
433		if( LDAP_IS_UDP(ld) ) {
(gdb) bt
#0  ldap_sasl_interactive_bind (ld=ld@entry=0x0, dn=dn@entry=0x0, mechs=mechs@entry=0x7f92cc3abde0 "GSSAPI", 
    serverControls=serverControls@entry=0x0, clientControls=clientControls@entry=0x0, flags=flags@entry=2, 
    interact=interact@entry=0x7f92bfa13d40 <sdap_sasl_interact>, defaults=defaults@entry=0x7f92cc3cc680, result=0x0, 
    rmech=rmech@entry=0x7fff15cf0940, msgid=msgid@entry=0x7fff15cf093c) at ../../../libraries/libldap/sasl.c:433
#1  0x00007f92ca15ebba in ldap_sasl_interactive_bind_s (ld=0x0, dn=dn@entry=0x0, mechs=mechs@entry=0x7f92cc3abde0 "GSSAPI", 
    serverControls=serverControls@entry=0x0, clientControls=clientControls@entry=0x0, flags=flags@entry=2, 
    interact=interact@entry=0x7f92bfa13d40 <sdap_sasl_interact>, defaults=0x7f92cc3cc680) at ../../../libraries/libldap/sasl.c:511
#2  0x00007f92bfa1af86 in sasl_bind_send (sasl_user=0x7f92cc3ae4c0 "host/dev2.ipa.ssimo.org", sasl_mech=0x7f92cc3abde0 "GSSAPI", 
    sh=0x7f92cc3cd010, ev=0x7f92cc385530, memctx=<optimized out>, sasl_cred=<optimized out>)
    at src/providers/ldap/sdap_async_connection.c:693
#3  sdap_auth_send (memctx=memctx@entry=0x7f92cc3ccf70, ev=0x7f92cc385530, sh=0x7f92cc3cd010, sasl_mech=0x7f92cc3abde0 "GSSAPI", 
    sasl_user=sasl_user@entry=0x7f92cc3ae4c0 "host/dev2.ipa.ssimo.org", user_dn=user_dn@entry=0x0, authtok_type=authtok_type@entry=0x0, 
    authtok=...) at src/providers/ldap/sdap_async_connection.c:1052
#4  0x00007f92bfa1bb99 in sdap_cli_auth_step (req=0x7f92cc3ccdf0) at src/providers/ldap/sdap_async_connection.c:1530
#5  0x00007f92bfa1c515 in sdap_cli_kinit_done (subreq=0x0) at src/providers/ldap/sdap_async_connection.c:1503
#6  0x00007f92bfa15656 in sdap_kinit_done (subreq=0x0) at src/providers/ldap/sdap_async_connection.c:961
#7  0x00007f92cc138a1e in read_pipe_handler (ev=<optimized out>, fde=<optimized out>, flags=<optimized out>, pvt=<optimized out>)
    at src/util/child_common.c:468
#8  0x00007f92cb8aab2a in ?? () from /lib64/libtevent.so.0
#9  0x00007f92cb8a7cb0 in _tevent_loop_once () from /lib64/libtevent.so.0
#10 0x00007f92cb8a7e3b in tevent_common_loop_wait () from /lib64/libtevent.so.0
#11 0x00007f92cc135ba3 in server_loop (main_ctx=0x7f92cc386630) at src/util/server.c:572
#12 0x00007f92cc0fc179 in main (argc=<optimized out>, argv=<optimized out>) at src/providers/data_provider_be.c:2012

Comment 1 Stephen Gallagher 2012-02-14 13:55:36 UTC
Looks like we're passing a NULL ldap descriptor to ldap_sasl_interactive_bind(). That shouldn't happen. We shouldn't even get here without having established the socket.

Could I get your (sanitized) logs at debug_level = 6 or higher?

Comment 2 Simo Sorce 2012-02-14 15:29:21 UTC
This is the relevant part of the log.
As you can see we get a new request in, and after sasl_bind_send() is called, the log abruptly restart as there is where sssd crashed with the backtrace above.
In the middle you can see something strange, it seem we are trying to process results for something but I do not see a call being issued beside the one for rootdse which already processed and marked done.

It looks like something is firing callbacks when nothing has been requested ?

(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [be_get_account_info] (0x0100): Got request for [4097][1][idnumber=492]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_id_op_connect_step] (0x4000): beginning to connect
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'IPA'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_server_status] (0x1000): Status of server 'dev2.ipa.ssimo.org' is 'name not resolved'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_port_status] (0x1000): Port status of port 0 for server 'dev2.ipa.ssimo.org' is 'neutral'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 10 seconds
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_server_status] (0x1000): Status of server 'dev2.ipa.ssimo.org' is 'name not resolved'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [resolv_is_address] (0x4000): [dev2.ipa.ssimo.org] does not look like an IP address
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [resolv_gethostbyname_step] (0x2000): Querying files
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [resolv_gethostbyname_files_send] (0x0100): Trying to resolve A record of 'dev2.ipa.ssimo.org' in files
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [set_server_common_status] (0x0100): Marking server 'dev2.ipa.ssimo.org' as 'resolving name'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [set_server_common_status] (0x0100): Marking server 'dev2.ipa.ssimo.org' as 'name resolved'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [be_resolve_server_done] (0x0100): Found address for server dev2.ipa.ssimo.org: [192.168.122.27] TTL 7200
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [ipa_resolve_callback] (0x0400): Constructed uri 'ldap://dev2.ipa.ssimo.org'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sss_ldap_init_send] (0x4000): Using file descriptor [22] for LDAP connection.
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sss_ldap_init_send] (0x0400): Setting 6 seconds timeout for connecting
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://dev2.ipa.ssimo.org:389/??base] with fd [22].
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_rootdse_send] (0x4000): Getting rootdse
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(objectclass=*)][].
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [*]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [altServer]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [namingContexts]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [supportedControl]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [supportedExtension]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [supportedFeatures]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [supportedLDAPVersion]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [supportedSASLMechanisms]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [defaultNamingContext]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [lastUSN]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [highestCommittedUSN]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 1
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result] (0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[0x7f90ed097dc0], ldap[0x7f90ed08a480]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_ENTRY]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_parse_entry] (0x4000): OriginalDN: [].
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result] (0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[0x7f90ed097dc0], ldap[0x7f90ed08a480]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_generic_ext_done] (0x0400): Search result: Success(0), no errmsg set
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_rootdse_done] (0x4000): Got rootdse
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_server_opts_from_rootdse] (0x4000): USN value: 400 (int: 400)
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_kinit_send] (0x0400): Attempting kinit (default, host/dev2.ipa.ssimo.org, IPA.SSIMO.ORG, 86400)
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_kinit_next_kdc] (0x1000): Resolving next KDC for service IPA
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'IPA'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_server_status] (0x1000): Status of server 'dev2.ipa.ssimo.org' is 'name resolved'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_port_status] (0x1000): Port status of port 0 for server 'dev2.ipa.ssimo.org' is 'neutral'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 10 seconds
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [get_server_status] (0x1000): Status of server 'dev2.ipa.ssimo.org' is 'name resolved'
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [be_resolve_server_done] (0x0100): Found address for server dev2.ipa.ssimo.org: [192.168.122.27] TTL 7200
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_kinit_kdc_resolved] (0x1000): KDC resolved, attempting to get TGT...
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [create_tgt_req_send_buffer] (0x1000): buffer size: 52
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [13953]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [child_handler_setup] (0x2000): Signal handler set up for pid [13953]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [set_tgt_child_timeout] (0x0400): Setting 6 seconds timeout for tgt child
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result] (0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[(nil)], ldap[0x7f90ed08a480]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result] (0x0100): ldap_result gave -1, something bad happend!
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_handle_release] (0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[(nil)], ldap[0x7f90ed08a480], destructor_lock[0], release_memory[0]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [write_pipe_handler] (0x0400): All data has been sent!
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [child_sig_handler] (0x1000): Waiting for child [13953].
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [child_sig_handler] (0x0100): child [13953] finished successfully.
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sss_child_handler] (0x2000): waitpid failed [10]: No child processes
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [read_pipe_handler] (0x0400): EOF received, client finished
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_IPA.SSIMO.ORG], expired on [1329319309]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sasl_bind_send] (0x0100): Executing sasl bind mech: GSSAPI, user: host/dev2.ipa.ssimo.org
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [server_setup] (0x0080): CONFDB: /var/lib/sss/db/config.ldb
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [resolv_get_family_order] (0x1000): Lookup order: ipv4_first

Comment 3 Stephen Gallagher 2012-02-14 16:04:51 UTC
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result]
(0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[(nil)],
ldap[0x7f90ed08a480]
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_process_result]
(0x0100): ldap_result gave -1, something bad happend!
(Tue Feb 14 10:21:49 2012) [sssd[be[ipa.ssimo.org]]] [sdap_handle_release]
(0x2000): Trace: sh[0x7f90ed08daa0], connected[1], ops[(nil)],

This is VERY strange. You're right that something must be triggering callbacks that it should not be. It should be impossible for us to call sdap_process_result with a NULL ops.

I'll look into this, but I strongly suspect this is related to BZ #771484

Comment 4 Stephen Gallagher 2012-02-15 13:18:50 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1196

Comment 5 Stephen Gallagher 2012-02-23 01:29:16 UTC

*** This bug has been marked as a duplicate of bug 771484 ***