Description of problem:
If the first LDAP server in the ldap_uri line is not reachable at all (e.g. firewalled), the login attempt fails and the following messages are printed on the console:

sssd_be[1951] trap stack segment rip:2b7944f4e27e rsp:7fffe7a74800 error:0
sssd_nss[1954]: segfault at 0000000000000048 rip 000000000043cae4 rsp 00007fffceff16c0 error 4

Version-Release number of selected component (if applicable):
Linux lilx001 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
sssd-client-1.2.1-39.el5
sssd-1.2.1-39.el5
nss-3.12.8-1.el5

How reproducible:
* configure nsswitch.conf:
...
passwd: files sss
shadow: files sss
group: files sss
...

* configure /etc/sssd/sssd.conf:
[sssd]
config_file_version = 2
reconnection_retries = 3
sbus_timeout = 30
services = nss, pam
domains = default

[nss]
filter_groups = root
filter_users = root
reconnection_retries = 3

[pam]
reconnection_retries = 3

[domain/default]
id_provider = ldap
ldap_uri = ldaps://ldap1, ldaps://dap2, ldaps://ldap3
ldap_search_base = dc=redhat,dc=com
auth_provider = ldap
ldap_tls_reqcert = allow
cache_credentials = true

* restart sssd
service sssd stop
rm -f /var/lib/sss/db/*
service sssd start

* try to log in as a user that exists in LDAP

Actual results:
segfault

Expected results:
no segfault
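When reproducing this, it may help to confirm which ldap_uri entries resolve in DNS and which accept a TCP connection from the client. The following is only a rough diagnostic sketch, not part of SSSD: parse_ldap_uris and probe are hypothetical helper names, and the default ports (389 for ldap, 636 for ldaps) are the standard ones assumed when the URI gives none.

```python
import socket
from urllib.parse import urlsplit

# Standard default ports, assumed when the URI does not name one.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def parse_ldap_uris(value):
    """Split a comma-separated ldap_uri value into (scheme, host, port) tuples."""
    servers = []
    for item in value.split(","):
        parts = urlsplit(item.strip())
        port = parts.port or DEFAULT_PORTS[parts.scheme]
        servers.append((parts.scheme, parts.hostname, port))
    return servers


def probe(host, port, timeout=3.0):
    """Classify a server as 'unresolvable', 'unreachable' or 'reachable'."""
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "unresolvable"
    try:
        # A plain TCP connect only; no TLS handshake or LDAP bind is attempted.
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except OSError:
        return "unreachable"


# Example use with the ldap_uri value from this report:
# for scheme, host, port in parse_ldap_uris("ldaps://ldap1, ldaps://dap2, ldaps://ldap3"):
#     print(scheme, host, port, probe(host, port))
```

An entry whose first server comes back as "unreachable" rather than "unresolvable" should match the condition this report describes.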
I have been able to reproduce this problem. The first server in the ldap_uri list must be resolvable in DNS but not reachable by the client (firewall, server down, etc.). I strongly suspect a bug in the failover processing.

Related logs:
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sbus_message_handler] (9): Received SBUS method [getAccountInfo]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [be_get_account_info] (4): Got request for [3][1][name=sgallagh]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [fo_resolve_service_send] (4): Trying to resolve service 'LDAP'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'download.bos.redhat.com' is 'name not resolved'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_port_status] (7): Port status of port 389 for server 'download.bos.redhat.com' is 'neutral'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'download.bos.redhat.com' is 'name not resolved'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [resolv_gethostbyname_send] (4): Trying to resolve A record of 'download.bos.redhat.com'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [schedule_timeout_watcher] (9): Scheduling DNS timeout watcher
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [set_server_common_status] (4): Marking server 'download.bos.redhat.com' as 'resolving name'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [unschedule_timeout_watcher] (9): Unscheduling DNS timeout watcher
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [set_server_common_status] (4): Marking server 'download.bos.redhat.com' as 'name resolved'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [be_resolve_server_done] (4): Found address for server download.bos.redhat.com: [10.16.60.17]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_uri_callback] (6): Constructed uri 'ldap://download.bos.redhat.com'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_uri_callback] (6): Constructed uri 'ldap://download.bos.redhat.com'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_uri_callback] (6): Constructed uri 'ldap://download.bos.redhat.com'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [setup_ldap_connection_callbacks] (9): LDAP connection callbacks are not supported.
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_rootdse_send] (9): Getting rootdse
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (6): calling ldap_search_ext with [(objectclass=*)][].
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [*]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [altServer]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [namingContexts]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedControl]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedExtension]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedFeatures]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedLDAPVersion]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedSASLMechanisms]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (3): ldap_search_ext failed: Can't contact LDAP server
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (3): Connection error: (null)
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [sdap_install_ldap_callbacks] (8): Trace: sh[0xc504c60], connected[1], ops[(nil)], fde[0xc50d820], ldap[0xc504e20]
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [fo_set_port_status] (4): Marking port 389 of server 'download.bos.redhat.com' as 'not working'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [fo_resolve_service_send] (4): Trying to resolve service 'LDAP'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'ldap.bos.redhat.com' is 'name not resolved'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_port_status] (7): Port status of port 389 for server 'ldap.bos.redhat.com' is 'neutral'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'ldap.bos.redhat.com' is 'name not resolved'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [resolv_gethostbyname_send] (4): Trying to resolve A record of 'ldap.bos.redhat.com'
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [schedule_timeout_watcher] (9): Scheduling DNS timeout watcher
(Tue Feb 8 16:32:31 2011) [sssd[be[redhat.com]]] [set_server_common_status] (4): Marking server 'ldap.bos.redhat.com' as 'resolving name'
<sssd_be crashes here>

Backtrace:
#0  0x00002ada9807727e in std_event_loop_select (ev=<value optimized out>, location=<value optimized out>) at tevent_standard.c:461
        r_fds = {fds_bits = {16, 0 <repeats 15 times>}}
        w_fds = {fds_bits = {0 <repeats 16 times>}}
        fde = 0xd926480
        selrtn = <value optimized out>
#1  std_event_loop_once (ev=<value optimized out>, location=<value optimized out>) at tevent_standard.c:548
        std_ev = <value optimized out>
        tval = {tv_sec = 1, tv_usec = 918651}
#2  0x00002ada98074690 in _tevent_loop_once (ev=0xd8ef450, location=0x4483f2 "util/server.c:523") at tevent.c:490
        ret = 4490226
        nesting_stack_ptr = 0x0
#3  0x00002ada980746fb in tevent_common_loop_wait (ev=0xd8ef450, location=0x4483f2 "util/server.c:523") at tevent.c:591
        ret = 0
#4  0x0000000000439861 in server_loop (main_ctx=0xd8f05c0) at util/server.c:523
No locals.
#5  0x000000000040e1bf in main (argc=5, argv=0x7fff9d1e1468) at providers/data_provider_be.c:1211
        opt = <value optimized out>
        pc = 0xd8ee010
        be_domain = 0xd8ee450 "redhat.com"
        srv_name = <value optimized out>
        conf_entry = <value optimized out>
        main_ctx = 0xd8f05c0
        ret = 0
        long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg = 0x64e7a0, val = 0, descrip = 0x43e337 "Help options:", argDescrip = 0x0}, {longName = 0x43e345 "debug-level", shortName = 100 'd', argInfo = 2, arg = 0x64e878, val = 0, descrip = 0x43e316 "Debug level", argDescrip = 0x0}, {longName = 0x43e351 "debug-to-files", shortName = 102 'f', argInfo = 0, arg = 0x64e87c, val = 0, descrip = 0x43ee68 "Send the debug output to files instead of stderr", argDescrip = 0x0}, {longName = 0x43e360 "debug-timestamps", shortName = 0 '\000', argInfo = 2, arg = 0x64e780, val = 0, descrip = 0x43e322 "Add debug timestamps", argDescrip = 0x0}, {longName = 0x43e371 "domain", shortName = 0 '\000', argInfo = 1, arg = 0x7fff9d1e1328, val = 0, descrip = 0x43eea0 "Domain of the information provider (mandatory)", argDescrip = 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, descrip = 0x0, argDescrip = 0x0}}
        __FUNCTION__ = "main"
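For readers following the log, the sequence it suggests (resolve a name, try the port, mark it 'not working', move to the next server) can be modelled roughly as below. This is a simplified illustration of the expected behaviour only and is not SSSD's failover code: Server and resolve_service are hypothetical names, the resolve and connect callables are stand-ins supplied by the caller, and the 'working' port status and the 'not working' server status on a failed lookup are assumptions that do not appear in the log above.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    port: int = 389
    server_status: str = "name not resolved"  # initial state seen in the log
    port_status: str = "neutral"              # initial state seen in the log


def resolve_service(servers, resolve, connect):
    """Return the first server that resolves and accepts a connection, else None.

    resolve(name) returns an address or None; connect(address, port) returns
    True on success. Both are caller-supplied stand-ins, not SSSD functions.
    """
    for server in servers:
        if server.port_status == "not working":
            continue  # this server already failed; try the next one
        server.server_status = "resolving name"
        address = resolve(server.name)
        if address is None:
            server.server_status = "not working"  # assumed status name
            continue
        server.server_status = "name resolved"
        if connect(address, server.port):
            server.port_status = "working"  # assumed status name
            return server
        # Resolvable but unreachable: the case described in this report.
        server.port_status = "not working"
    return None
```

In this model an unreachable first server simply causes the second one to be returned. In the log above the backend instead crashes right after marking the second server as 'resolving name'.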
Verified in version:

# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 34.el5                        Build Date: Tue 03 May 2011 10:46:09 PM IST
Install Date: Wed 11 May 2011 02:07:53 PM IST      Build Host: x86-004.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-34.el5.src.rpm
Size        : 3508089                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon
Just hit this bug while deploying SSSD in production. It's kind of a showstopper for me. Verified on the following packages:
sssd-client-1.2.1-39.el5
sssd-1.2.1-39.el5
libtevent-0.9.8-10.el5
Addressed in RHEL 5.7.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: When the LDAP server defined in the first ldap_uri entry was unreachable, the login attempt to the system failed with a segmentation fault due to an issue in the failover processing. With this update, the segmentation fault no longer occurs if the first LDAP server can't be reached.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0975.html