Bug 975001
Summary: | Cannot authenticate using AD credentials | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | David Woodhouse <dwmw2> |
Component: | sssd | Assignee: | Jakub Hrozek <jhrozek> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 19 | CC: | jhrozek, pbrezina, sbose, sgallagh, ssorce |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | sssd-1.10.1-1.fc19 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2013-07-09 15:31:22 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
David Woodhouse
2013-06-17 10:58:14 UTC
/var/log/sssd/sssd_ger.corp.intel.com.log was helpfully completely empty until I restarted sssd manually with -d9, and now it's slightly more enlightening: (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): dbus conn: 7F3FF10CE150 (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): Dispatching. (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sbus_message_handler] (0x4000): Received SBUS method [getAccountInfo] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [be_get_account_info] (0x0100): Got request for [4097][1][name=dwoodhou] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_id_op_connect_step] (0x4000): reusing cached connection (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [DC=ger,DC=corp,DC=intel,DC=com] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(sAMAccountName=dwoodhou)(objectclass=user))][DC=ger,DC=corp,DC=intel,DC=com]. (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectClass] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [sAMAccountName] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [unixUserPassword] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uidNumber] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gidNumber] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gecos] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [unixHomeDirectory] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [loginShell] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userPrincipalName] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [name] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [memberOf] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectGUID] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectSID] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [primaryGroupID] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [whenChanged] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uSNChanged] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [accountExpires] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userAccountControl] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 11 (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_result] (0x2000): Trace: sh[0x7f3ff10dc190], connected[1], ops[0x7f3ff11013d0], ldap[0x7f3ff10cdf30] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_ENTRY] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_entry] (0x4000): OriginalDN: [CN=Woodhouse\, David,OU=Workers,DC=ger,DC=corp,DC=intel,DC=com]. (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [objectClass] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [whenChanged] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [memberOf] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [uSNChanged] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [name] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [objectGUID] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [userAccountControl] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [primaryGroupID] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [objectSid] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [accountExpires] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [sAMAccountName] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [userPrincipalName] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [uidNumber] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [gidNumber] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [gecos] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [loginShell] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_parse_range] (0x2000): No sub-attributes for [unixHomeDirectory] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_result] (0x2000): Trace: sh[0x7f3ff10dc190], connected[1], ops[0x7f3ff11013d0], ldap[0x7f3ff10cdf30] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_result] (0x2000): Trace: sh[0x7f3ff10dc190], connected[1], ops[0x7f3ff11013d0], ldap[0x7f3ff10cdf30] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_generic_ext_done] (0x0400): Search result: Success(0), no errmsg set (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_users_process] (0x0400): Search for users, returned 1 results. (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): start ldb transaction (nesting: 0) (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_user] (0x4000): Save user (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_user] (0x1000): Mapping user [dwoodhou] objectSID [S-1-5-21-2052111302-1275210071-1644491937-279532] to unix ID (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_idmap_sid_to_unix] (0x0080): Could not convert objectSID [S-1-5-21-2052111302-1275210071-1644491937-279532] to a UNIX ID (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_user] (0x0040): Failed to save user [dwoodhou] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_users] (0x0040): Failed to store user 0. Ignoring. (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): commit ldb transaction (nesting: 0) (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_get_users_process] (0x4000): Saving 1 Users - Done (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_id_op_done] (0x4000): releasing operation connection (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [acctinfo_callback] (0x0100): Request processed. Returned 0,0,Success (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_result] (0x2000): Trace: sh[0x7f3ff10dc190], connected[1], ops[(nil)], ldap[0x7f3ff10cdf30] (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_process_result] (0x2000): Trace: ldap_result found nothing! (Mon Jun 17 12:02:20 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): dbus conn: 7F3FF10BAD00 (Mon Jun 17 12:02:20 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): Dispatching. (Mon Jun 17 12:02:20 2013) [sssd[be[ger.corp.intel.com]]] [sbus_message_handler] (0x4000): Received SBUS method [ping] (Mon Jun 17 12:02:30 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): dbus conn: 7F3FF10BAD00 (Mon Jun 17 12:02:30 2013) [sssd[be[ger.corp.intel.com]]] [sbus_dispatch] (0x4000): Dispatching. (Mon Jun 17 12:02:30 2013) [sssd[be[ger.corp.intel.com]]] [sbus_message_handler] (0x4000): Received SBUS method [ping] Thanks for the bug report. It seems that SSSD authenticated against AD correctly, looked up the user and then failed to convert his SID to ID. Sumit, does the SID look like something SSSD should have trouble converting? ------------ (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_user] (0x1000): Mapping user [dwoodhou] objectSID [S-1-5-21-2052111302-1275210071-1644491937-279532] to unix ID ------------ (In reply to David Woodhouse from comment #0) > I made a fresh install of Fedora 19 in a VM, to test Active Directory > integration. > > I was asked by gnome-initial-setup for my domain, username and password, but > received the error 'Not authorized to perform this action.' This appears to > be a typically unhelpful GNOME-style "haha, I'm not going to tell you what > went wrong" error message, bearing little relation to the more useful > messages I get if I do things manually... > > If I run realmd in a VT and try using g-i-s, I see it complain: > (realmd:2170): DEBUG: rejecting access to method 'Join' on interface > 'org.freedesktop.realmd.KerberosMembership' at > /org/freedesktop/realmd/Sssd/get_corp_intel_com_1 > > If I run 'realm -v join ger.corp.intel.com' manually, I get a complaint that > the computer's host name is not set correctly (it's localhost.localdomain). > > If I set the hostname to something arbitrary, I get an expected failure > since I am not authorized to add new machines in AD: > realm: Couldn't join realm: Insufficient permissions to join the domain > ger.corp.intel.com > The above are g-i-s problem. I'd suggest opening a separate bugzilla report for these, I know quite little about g-i-s, sorry. (In reply to Jakub Hrozek from comment #2) > Thanks for the bug report. It seems that SSSD authenticated against AD > correctly, looked up the user and then failed to convert his SID to ID. > > Sumit, does the SID look like something SSSD should have trouble converting? > > ------------ > (Mon Jun 17 12:02:14 2013) [sssd[be[ger.corp.intel.com]]] [sdap_save_user] > (0x1000): Mapping user [dwoodhou] objectSID > [S-1-5-21-2052111302-1275210071-1644491937-279532] to unix ID > ------------ I tried with some other users, most of which failed similarly. But one worked: S-1-5-21-2052111302-1275210071-1644491937-198384 By default SSSD use a range of 200k POSIX ID to map the AD RID into. In your environments RIDs larger than 200k are used, e.g. 279532. Please try to set ldap_idmap_range_size to a value suitable for your environment, e.g. 400000, delete the cache in /var/lib/sssd/db and restart sssd. We have plans to automatically allocate a new range if RIDs become too large, but it is not implemented yet. Thanks. That gets me a little further. It now admits that I exist, but I cannot authenticate. The log suggests that it's trying to do DNS requests which fail. But tcpdump shows me that it isn't sending any DNS requests. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [sbus_message_handler] (0x4000): Received SBUS method [pamHandler] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_pam_handler] (0x0100): Got request with the following data (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): command: PAM_AUTHENTICATE (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): domain: ger.corp.intel.com (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): user: dwoodhou (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): service: login (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): tty: tty4 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): ruser: (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): rhost: (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): authtok type: 1 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): newauthtok type: 0 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): priv: 1 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [pam_print_data] (0x0100): cli_pid: 6792 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_pam_handler] (0x1000): Wait queue of user [dwoodhou] is empty, running request immediately. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): Added timed event "ltdb_callback": 0x7ffbccc05d90 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): Added timed event "ltdb_timeout": 0x7ffbccc41ce0 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): Running timer event 0x7ffbccc05d90 "ltdb_callback" (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): Destroying timer event 0x7ffbccc41ce0 "ltdb_timeout" (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [ldb] (0x4000): Ending timer event 0x7ffbccc05d90 "ltdb_callback" (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_auth_send] (0x0100): No ccache file for user [dwoodhou] found. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_auth_send] (0x4000): Ccache_file is [not set] and is not active and TGT is not valid. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x1000): Trying with the next one! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'AD' (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 5 seconds (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [resolve_srv_send] (0x0200): The status of SRV lookup is not resolved (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned (5) (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_process] (0x0040): Failed to find a server after 10 attempts (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_resolve_server_done] (0x1000): Server resolution failed: 5 (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_mark_offline] (0x2000): Going offline! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_auth_prepare_ccache_file] (0x4000): Recreating ccache file. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_auth_resolve_done] (0x4000): Preparing for offline operation. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [child_handler_setup] (0x2000): (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [become_user] (0x0200): Trying to become user [217979532][217700513]. Setting up signal handler up for pid [7201] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [child_handler_setup] (0x2000): Signal handler set up for pid [7201] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [write_pipe_handler] (0x0400): All data has been sent! (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [child_sig_handler] (0x1000): Waiting for child [7201]. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [child_sig_handler] (0x0100): child [7201] finished successfully. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [sss_child_handler] (0x2000): waitpid failed [10]: No child processes (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [read_pipe_handler] (0x0400): EOF received, client finished (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [parse_krb5_child_response] (0x1000): child response [0][3][42]. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [cc_dir_cache_for_princ] (0x0400): No principal for david.woodhouse in DIR:/run/user/217979532/krb5cc (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [krb5_auth_done] (0x0020): No ccache for david.woodhouse in DIR:/run/user/217979532/krb5cc? (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [check_wait_queue] (0x1000): Wait queue for user [dwoodhou] is empty. (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_pam_handler_callback] (0x0100): Backend returned: (0, 4, <NULL>) [Success] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_pam_handler_callback] (0x0100): Sending result [4][ger.corp.intel.com] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [be_pam_handler_callback] (0x0100): Sent result [4][ger.corp.intel.com] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [remove_krb5_info_files] (0x0200): Could not remove [/var/lib/sss/pubconf/kdcinfo.GER.CORP.INTEL.COM], [2][No such file or directory] (Mon Jun 17 13:28:14 2013) [sssd[be[ger.corp.intel.com]]] [remove_krb5_info_files] (0x0200): Could not remove [/var/lib/sss/pubconf/kpasswdinfo.GER.CORP.INTEL.COM], [2][No such file or directory] The logs says the SSSD is attempting to do SRV discovery, but failing. Is the AD server set as the DNS server in resolv.conf? How does the domain section of the sssd.conf look like? (In reply to Jakub Hrozek from comment #7) > The logs says the SSSD is attempting to do SRV discovery, but failing. Indeed, As I said, it's failing even to send the *request* to the DNS server. > Is the AD server set as the DNS server in resolv.conf? No, this is a VM, and the DNS server is actually dnsmasq on the *host*. But DNS does work correctly as expected: [root@dwoodhou-mobl3 sssd]# host -t srv _kerberos._tcp.ger.corp.intel.com ;; Truncated, retrying in TCP mode. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 drsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 jrsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 ulsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 irsger203.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 dgsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 bgsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 musger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 100 100 88 idpjsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 hasger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 irsger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 105 100 88 f28sger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 10 100 88 fmsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 hasger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 100 100 88 f14sger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 lzsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 pgsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 100 100 88 idpjsger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 10 100 88 rrsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 100 100 88 f28sger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 gksger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 lcsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 musger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 105 100 88 f24sger202.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 hasger203.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 100 100 88 f24sger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 cesger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 20 100 88 irsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 10 100 88 orsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 ncsger201.ger.corp.intel.com. _kerberos._tcp.ger.corp.intel.com has SRV record 15 100 88 dksger201.ger.corp.intel.com. > How does the domain section of the sssd.conf look like? [domain/ger.corp.intel.com] ldap_idmap_range_size=500000 ad_domain = ger.corp.intel.com krb5_realm = GER.CORP.INTEL.COM realmd_tags = manages-system cache_credentials = True id_provider = ad krb5_store_password_if_offline = True ldap_id_mapping = True use_fully_qualified_names = False fallback_homedir = /home/%d/%u access_provider = ad Can you check if ad_enable_dns_sites=False makes any difference in your case? Also you can remove the krb5_realm stanza from the config file, the AD provider automatically uses the uppercase version of "ad_domain" as the "krb5_realm". After I restart sssd, although I hadn't made any changes since adding the ldap_idmap_range_size setting, this seems to work for local 'su dwoodhou', although ssh fails: [dwmw2@i7 ~]$ SSH_AUTH_SOCK= ssh dwoodhou.122.27 -p 2222 whoami dwoodhou.122.27's password: X11 forwarding request failed debug3: PAM session not opened, exiting and on the server side: debug3: ssh_selinux_setup_variables: setting execution context debug1: PAM: establishing credentials debug3: PAM: opening session PAM: pam_open_session(): Cannot make/remove an entry for the specified session User child is on pid 9669 debug1: PAM: establishing credentials debug3: ssh_selinux_setup_variables: setting execution context debug1: permanently_set_uid: 217979532/217700513 ssh_selinux_copy_context: setcon failed with Permission denied There also seems to be an issue with local caching. Running 'id dwoodhou' takes up to 7 minutes the first time it's run after sssd is restarted. The presumably translates to taking 7 minutes to log in to the system after a reboot, which is doing to be somewhat suboptimal. (In reply to David Woodhouse from comment #10) > After I restart sssd, although I hadn't made any changes since adding the > ldap_idmap_range_size setting, this seems to work for local 'su dwoodhou', > although ssh fails: > > [dwmw2@i7 ~]$ SSH_AUTH_SOCK= ssh dwoodhou.122.27 -p 2222 whoami > dwoodhou.122.27's password: > X11 forwarding request failed > debug3: PAM session not opened, exiting > > That's quite strange. Sumit suggested on IRC that c-ares (the library we use to perform DNS resolution) might have problems switching to TCP. I must say we haven't tested replies as large as yours. > and on the server side: > > debug3: ssh_selinux_setup_variables: setting execution context > debug1: PAM: establishing credentials > debug3: PAM: opening session > PAM: pam_open_session(): Cannot make/remove an entry for the specified > session > User child is on pid 9669 > debug1: PAM: establishing credentials > debug3: ssh_selinux_setup_variables: setting execution context > debug1: permanently_set_uid: 217979532/217700513 > ssh_selinux_copy_context: setcon failed with Permission denied I suspect that is not actually related to SSSD and would at least temporarily go away with "setenforce 0" ? (A bug report to make that functionality work with SELinux would certainly still be needed). (In reply to David Woodhouse from comment #11) > There also seems to be an issue with local caching. Running 'id dwoodhou' > takes up to 7 minutes the first time it's run after sssd is restarted. > Is dwoodhou a member of any very large group? Still, 7 minutes is extremely long, there must be some bug. Can you also raise debug_level in the [nss] section and attach both /var/log/sssd/sssd_nss.log and /var/log/sssd/sssd_$domain.log ? (side-note: it appears we might have broken large group resolution with one of the latest patches. We are tracking the problem right now, the logs should have more info.) > The presumably translates to taking 7 minutes to log in to the system after > a reboot, which is doing to be somewhat suboptimal. Not necessarily, "id" also runs "getgrgid $group" for every group the user is a member of. "id -G dwoodhou" would give you a better estimate on how log initgroups would take. Yeah, I'm guessing the 'domain users' group is fairly large. Strangely, removing the leading '-' from the 'session optional pam_systemd.so' line in /etc/pam.d/password-auth and system-auth makes ssh and gdm authentication work. That doesn't make much sense to me. Logs sent by email. (In reply to David Woodhouse from comment #14) > Yeah, I'm guessing the 'domain users' group is fairly large. > > Strangely, removing the leading '-' from the 'session optional > pam_systemd.so' line in /etc/pam.d/password-auth and system-auth makes ssh > and gdm authentication work. That doesn't make much sense to me. > I suspect that this might be caused by pam_systemd.so tinkering with /run/users/$UID where the ccache is stored. But it's just a hunch. > Logs sent by email. Thank you, I am travelling currently so I won't have the time to inspect them until tomorrow. I'll forward the logs to other team members in the meantime. Note that I didn't remove the pam_systemd.so line; I only removed the leading - sign from it. pam.conf(5) says: If the type value from the list above is prepended with a - character the PAM library will not log to the system log if it is not possible to load the module because it is missing in the system. This can be useful especially for modules which are not always installed on the system and are not required for correct authentication and authorization of the login session. I have absolutely no idea why this might change the behaviour when the module *is* present. Perhaps a PAM bug? David, can you try if setting: dyndns_update = False in the [domain] section improves performance for you? I see quite a few attempts to update the DNS server with the client's address in the logs, which fail and cause reconnection attempts. The typical setup for the AD provider would be to point the machine directly to the AD server even for DNS. Maybe this is something we should point out more explicitly in the man page. (In reply to David Woodhouse from comment #16) > Note that I didn't remove the pam_systemd.so line; I only removed the > leading - sign from it. > > pam.conf(5) says: > If the type value from the list above is prepended with a - character > the PAM library will not log to the system log if it is not > possible to load the module because it is missing in the system. This > can be useful especially for modules which are not always > installed on the system and are not required for correct > authentication and authorization of the login session. > > I have absolutely no idea why this might change the behaviour when the > module *is* present. Perhaps a PAM bug? Yes, sounds like a bug to me, at least with my limited PAM expertise. Added dyndns_update=False and today it isn't allowing me to authenticate at all. Logs sent in email again. I cannot reliably reproduce the working state but *sometimes* it does attempt to let me log in. I managed a GNOME login at least as far as its stupid "something has gone wrong!" screen that doesn't tell me *what* went wrong. In fact, I got that same screen yesterday but pressing the 'log out' button resulted in a working GNOME session so I'd forgotten. I don't know if that's related to SSSD authentication or not. Perhaps just because of the time it takes? Watching with tcpdump while it was *trying* to start my GNOME session, I noticed it starting *logs* of separate LDAP sessions, authenticating each one with Kerberos (with a new exchange to a Kerberos server instead of an existing ticket). So I suspect that could be made a lot faster... (In reply to David Woodhouse from comment #20) > Watching with tcpdump while it was *trying* to start my GNOME session, I > noticed it starting *logs* of separate LDAP sessions, authenticating each > one with Kerberos (with a new exchange to a Kerberos server instead of an > existing ticket). So I suspect that could be made a lot faster... SSSD only maintains a single LDAP session with the same ticket. If you're seeing multiple attempts, it most likely means one of two things: 1) The LDAP provider is crashing/restarting and therefore needs to reconnect. This is the most likely suspect to me. 2) Some part of the communication is failing and not properly setting the SSSD into "offline" mode for two minutes, so subsequent requests are attempting to reconnect to the LDAP server rapidly. Can you confirm that you have 'ldap_referrals = False' in your sssd.conf [domain/ger.corp.intel.com] section? I don't have that, no. Should I try adding it and see what happens? Should 'realm join' have included it? [sssd] domains = ger.corp.intel.com config_file_version = 2 services = nss, pam debug_level=0x1310 debug_timestamps=True [nss] default_shell = /bin/bash debug_level=0x1310 debug_timestamps=True [domain/ger.corp.intel.com] dyndns_update = False debug_level=0x1310 debug_timestamps=True ldap_idmap_range_size=500000 ad_domain = ger.corp.intel.com krb5_realm = GER.CORP.INTEL.COM realmd_tags = manages-system cache_credentials = True id_provider = ad krb5_store_password_if_offline = True ldap_id_mapping = True use_fully_qualified_names = False fallback_homedir = /home/%d/%u access_provider = ad (In reply to David Woodhouse from comment #22) > I don't have that, no. Should I try adding it and see what happens? Should > 'realm join' have included it? The AD provider defaults to ignoring referrals already, so you're good here. (In reply to Jakub Hrozek from comment #23) > (In reply to David Woodhouse from comment #22) > > I don't have that, no. Should I try adding it and see what happens? Should > > 'realm join' have included it? > > The AD provider defaults to ignoring referrals already, so you're good here. Sorry, I missed that you were using the AD provider. Can you check /var/log/sssd/sssd.log and see if there are notes in there about SSSD restarting the providers? That would tell us if a crash is happening. (In reply to Stephen Gallagher from comment #21) > 1) The LDAP provider is crashing/restarting and therefore needs to > reconnect. This is the most likely suspect to me. Hm, now that I have a GNOME session and try 'ssh dwoodhou@localhost' from it, I do get an abrt notification. It says that it reported an error in sssd-ipa but it strangely doesn't tell me where to find the information. There are some crashes in /var/tmp/abrt but those are from yesterday. One of them looks like this: Core was generated by `/usr/libexec/sssd/sssd_pac --debug-to-files'. Program terminated with signal 11, Segmentation fault. #0 0x00007fa818be7646 in pac_lookup_sids_done () #1 0x00007fa818be3707 in pac_lookup_sids_next_done () #2 0x00007fa817cf03f4 in tevent_common_loop_immediate (ev=ev@entry=0x7fa819fb16f0) at ../tevent_immediate.c:135 #3 0x00007fa817cf49c7 in epoll_event_loop_once (ev=0x7fa819fb16f0, location=<optimized out>) at ../tevent_epoll.c:912 #4 0x00007fa817cf30b7 in std_event_loop_once (ev=0x7fa819fb16f0, location=0x7fa817f59e77 "src/util/server.c:602") at ../tevent_standard.c:112 #5 0x00007fa817cefbcd in _tevent_loop_once (ev=ev@entry=0x7fa819fb16f0, location=location@entry=0x7fa817f59e77 "src/util/server.c:602") at ../tevent.c:530 #6 0x00007fa817cefd6b in tevent_common_loop_wait (ev=0x7fa819fb16f0, location=0x7fa817f59e77 "src/util/server.c:602") at ../tevent.c:634 #7 0x00007fa817cf3057 in std_event_loop_wait (ev=0x7fa819fb16f0, location=0x7fa817f59e77 "src/util/server.c:602") at ../tevent_standard.c:138 #8 0x00007fa817f42853 in server_loop () from /usr/lib64/sssd/libsss_util.so #9 0x00007fa818be2242 in main () And another like this: Program terminated with signal 11, Segmentation fault. #0 _tevent_req_data (req=0x0) at ../tevent_req.c:274 274 return req->data; Missing separate debuginfos, use: debuginfo-install sssd-common-1.10.0-11.fc19.beta2.x86_64 (gdb) bt #0 _tevent_req_data (req=0x0) at ../tevent_req.c:274 #1 0x00007f3f17cfbe7b in nsupdate_child_handler () #2 0x00007f3f13d2d738 in child_invoke_callback () from /usr/lib64/sssd/libsss_child.so #3 0x00007f3f172263f4 in tevent_common_loop_immediate (ev=ev@entry=0x7f3f18823550) at ../tevent_immediate.c:135 #4 0x00007f3f1722a9c7 in epoll_event_loop_once (ev=0x7f3f18823550, location=<optimized out>) at ../tevent_epoll.c:912 #5 0x00007f3f172290b7 in std_event_loop_once (ev=0x7f3f18823550, location=0x7f3f1748fe77 "src/util/server.c:602") at ../tevent_standard.c:112 #6 0x00007f3f17225bcd in _tevent_loop_once (ev=ev@entry=0x7f3f18823550, location=location@entry=0x7f3f1748fe77 "src/util/server.c:602") at ../tevent.c:530 #7 0x00007f3f17225d6b in tevent_common_loop_wait (ev=0x7f3f18823550, location=0x7f3f1748fe77 "src/util/server.c:602") at ../tevent.c:634 #8 0x00007f3f17229057 in std_event_loop_wait (ev=0x7f3f18823550, location=0x7f3f1748fe77 "src/util/server.c:602") at ../tevent_standard.c:138 #9 0x00007f3f17478853 in server_loop () from /usr/lib64/sssd/libsss_util.so #10 0x00007f3f17ce31cb in main () > Can you confirm that you have 'ldap_referrals = False' in your sssd.conf > [domain/ger.corp.intel.com] section? Further perusing of the logs suggests that this defaults to false even when not explicitly set (as it wasn't). I'm guessing this spurious offline status explains why I don't seem to have a valid Kerberos TGT. However, there is no /usr/bin/ntlm_auth so even automatic *NTLM* authentication isn't working (as it would have been if I'd been using winbind). I was assuming that SSSD is the right way to do things in Fedora 19, and that we shouldn't be using samba-winbind? Am I trying this too soon, before it's really ready for active use? I suspect that the problem you are having logging in would be caused by the large list of servers returned with the SRV discovery and by SSSD not handling them well. Can you (just for the test), add an "ad_server" directive pointing to a host name of one of your AD servers? > I was assuming that SSSD is the right way to do things in Fedora 19, and
> that we shouldn't be using samba-winbind? Am I trying this too soon, before
> it's really ready for active use?
No, I'm sorry about the trouble you're having. Both crashes you're seeing are in new functionality added in Fedora 19. Also typically we tested the AD provider with DNS pointing to the AD server, so we didn't see the dyndns crash.
Can you get us core files so we can fix these?
Is abrt not uploading them somewhere you can see them? It *told* me it had; the lack of a URL at which I could see my own bug report was, I assumed, just the normal GNOME "sod you, we don't want to give you any information at all" mentality. I can send the core dumps manually (they're lying around in /var/tmp/abrt) but only if you promise to promote an attitude of violence towards whoever's responsible for abrt being broken... :) (In reply to David Woodhouse from comment #28) > Is abrt not uploading them somewhere you can see them? It *told* me it had; > the lack of a URL at which I could see my own bug report was, I assumed, > just the normal GNOME "sod you, we don't want to give you any information at > all" mentality. > > I can send the core dumps manually (they're lying around in /var/tmp/abrt) > but only if you promise to promote an attitude of violence towards whoever's > responsible for abrt being broken... :) Yes, please :) (In reply to David Woodhouse from comment #28) > Is abrt not uploading them somewhere you can see them? It *told* me it had; > the lack of a URL at which I could see my own bug report was, I assumed, > just the normal GNOME "sod you, we don't want to give you any information at > all" mentality. > ABRT isn't written by the GNOME folks, and if it told you it submitted a bug without providing you a URL, that's a bug. I've filed literally dozens of bugs through ABRT, and they all have a link viewable in the ABRT GUI (usually listed as just "Bugzilla" in the right pane on Fedora 19). If you don't see "Reported yes" and the Bugzilla link when you click on a bug in the ABRT GUI, then please file a BZ directly against ABRT. They'll want to know about that. According to the logs send via e-mail (obfuscated version) this is the main problem: ------------------ [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(objectSID=S-1-5-21-1111111111-1111111111-1111111111-123456)(objectclass=user))][DC=foo,DC=bar,DC=baz]. [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectClass] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [sAMAccountName] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [unixUserPassword] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uidNumber] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gidNumber] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gecos] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [unixHomeDirectory] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [loginShell] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userPrincipalName] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [name] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [memberOf] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectGUID] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectSID] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [primaryGroupID] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [whenChanged] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uSNChanged] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [accountExpires] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userAccountControl] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 16 [sssd[be[DOMNAME]]] [sdap_process_result] (0x2000): Trace: sh[0x7ffa8fec14e0], connected[1], ops[0x7ffa8fea3ce0], ldap[0x7ffa8fe42480] [sssd[be[DOMNAME]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT] [sssd[be[DOMNAME]]] [sdap_get_generic_ext_done] (0x0400): Search result: Referral(10), 0000202B: RefErr: DSID-03100742, data 0, 1 access points ref 1: 'foo.bar.baz' (Tue Jun 18 12:21:45 2013) [sssd[be[DOMNAME]]] [sdap_get_generic_ext_done] (0x0040): Unexpected result from ldap: Referral(10), 0000202B: RefErr: DSID-03100742, data 0, 1 access points ref 1: 'foo.bar.baz' ------------------ At first glance, this looks to me like we attempted an LDAP search where we should have searched using Global Catalog. I'll try to reproduce locally. Note that the repeated ldap searches aren't purely caused by sssd_pac dying and respawning. There are *lots* of these LDAP reconnections, *then* sssd_pac dies and I am finally logged in. Program received signal SIGSEGV, Segmentation fault. 0x00007f938cc72646 in pac_lookup_sids_done (req=0x0) at src/responder/pac/pacsrv_cmd.c:309 309 id = ldb_msg_find_attr_as_uint64(msg->msgs[0], (gdb) p msg $1 = (struct ldb_result *) 0x7f938e2d1e10 (gdb) p msg->msgs $2 = (struct ldb_message **) 0x0 (gdb) bt #0 0x00007f938cc72646 in pac_lookup_sids_done (req=0x0) at src/responder/pac/pacsrv_cmd.c:309 #1 0x00007f938cc6e707 in pac_lookup_sids_next_done (subreq=0x0) at src/responder/pac/pacsrv_cmd.c:1070 #2 0x00007f938cc827fc in sss_dp_internal_get_done (pending=<optimized out>, ptr=<optimized out>) at src/responder/common/responder_dp.c:779 #3 0x00007f938b4f8512 in complete_pending_call_and_unlock (connection=connection@entry=0x7f938e2b8000, pending=0x7f938e2b6890, message=message@entry=0x7f938e2bada0) at dbus-connection.c:2314 #4 0x00007f938b4fb741 in dbus_connection_dispatch (connection=connection@entry=0x7f938e2b8000) at dbus-connection.c:4580 #5 0x00007f938bfc4811 in sbus_dispatch (ev=0x7f938e2ac7f0, te=<optimized out>, tv=..., data=<optimized out>) at src/sbus/sssd_dbus_connection.c:104 #6 0x00007f938bd7e9df in tevent_common_loop_timer_delay (ev=ev@entry=0x7f938e2ac7f0) at ../tevent_timed.c:341 #7 0x00007f938bd7f9d3 in epoll_event_loop_once (ev=0x7f938e2ac7f0, location=<optimized out>) at ../tevent_epoll.c:916 #8 0x00007f938bd7e0b7 in std_event_loop_once (ev=0x7f938e2ac7f0, location=0x7f938bfe4e77 "src/util/server.c:602") at ../tevent_standard.c:112 #9 0x00007f938bd7abcd in _tevent_loop_once (ev=ev@entry=0x7f938e2ac7f0, location=location@entry=0x7f938bfe4e77 "src/util/server.c:602") at ../tevent.c:530 #10 0x00007f938bd7ad6b in tevent_common_loop_wait (ev=0x7f938e2ac7f0, location=0x7f938bfe4e77 "src/util/server.c:602") at ../tevent.c:634 #11 0x00007f938bd7e057 in std_event_loop_wait (ev=0x7f938e2ac7f0, location=0x7f938bfe4e77 "src/util/server.c:602") at ../tevent_standard.c:138 #12 0x00007f938bfcd853 in server_loop (main_ctx=0x7f938e2adc40) at src/util/server.c:602 #13 0x00007f938cc6d242 in main (argc=4, argv=<optimized out>) at src/responder/pac/pacsrv.c:274 (In reply to David Woodhouse from comment #32) > Note that the repeated ldap searches aren't purely caused by sssd_pac dying > and respawning. There are *lots* of these LDAP reconnections, *then* > sssd_pac dies and I am finally logged in. > > Program received signal SIGSEGV, Segmentation fault. > 0x00007f938cc72646 in pac_lookup_sids_done (req=0x0) at > src/responder/pac/pacsrv_cmd.c:309 > 309 id = ldb_msg_find_attr_as_uint64(msg->msgs[0], > (gdb) p msg > $1 = (struct ldb_result *) 0x7f938e2d1e10 > (gdb) p msg->msgs > $2 = (struct ldb_message **) 0x0 > (gdb) bt Thank you for this information. I'll prepare a fix. FWIW it said 'reported: yes' without a bugzilla link. I could still press the 'report' button though. The retrace server failed, probably for the same reason that 'debuginfo-install sssd-ipa' failed; I manually fetched the debuginfo package from koji for myself, and it managed to create the backtrace on my local system (as had I, above). And then it asked me for my bugzilla login details. Now I suspect it *would* report the bug in bugzilla, but I'm not going to bother since it's already here. Daid, can you also post your pam configuration and nsswitch.conf files ? Those are exactly as a clean install followed by 'realm join' created them, except for the weirdness with having to remove the leading '-' from the systemd session configuration as discussed in comments 14-16. # egrep -v '^#|^$' /etc/nsswitch.conf passwd: files sss shadow: files sss group: files sss hosts: files dns bootparams: nisplus [NOTFOUND=return] files ethers: files netmasks: files networks: files protocols: files rpc: files services: files sss netgroup: files sss publickey: nisplus automount: files sss aliases: files nisplus # cat /etc/pam.d/system-auth #%PAM-1.0 # This file is auto-generated. # User changes will be destroyed the next time authconfig is run. auth required pam_env.so auth sufficient pam_fprintd.so auth sufficient pam_unix.so nullok try_first_pass auth requisite pam_succeed_if.so uid >= 1000 quiet_success auth sufficient pam_sss.so use_first_pass auth required pam_deny.so account required pam_unix.so account sufficient pam_localuser.so account sufficient pam_succeed_if.so uid < 1000 quiet account [default=bad success=ok user_unknown=ignore] pam_sss.so account required pam_permit.so password requisite pam_pwquality.so try_first_pass retry=3 authtok_type= password sufficient pam_unix.so sha512 shadow nullok try_first_pass use_authtok password sufficient pam_sss.so use_authtok password required pam_deny.so session optional pam_keyinit.so revoke session required pam_limits.so session optional pam_systemd.so session optional pam_oddjob_mkhomedir.so session [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid session required pam_unix.so session optional pam_sss.so # cat /etc/pam.d/password-auth #%PAM-1.0 # This file is auto-generated. # User changes will be destroyed the next time authconfig is run. auth required pam_env.so auth sufficient pam_unix.so nullok try_first_pass auth requisite pam_succeed_if.so uid >= 1000 quiet_success auth sufficient pam_sss.so use_first_pass auth required pam_deny.so account required pam_unix.so account sufficient pam_localuser.so account sufficient pam_succeed_if.so uid < 1000 quiet account [default=bad success=ok user_unknown=ignore] pam_sss.so account required pam_permit.so password requisite pam_pwquality.so try_first_pass retry=3 authtok_type= password sufficient pam_unix.so sha512 shadow nullok try_first_pass use_authtok password sufficient pam_sss.so use_authtok password required pam_deny.so session optional pam_keyinit.so revoke session required pam_limits.so session optional pam_systemd.so session optional pam_oddjob_mkhomedir.so session [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid session required pam_unix.so session optional pam_sss.so Upstream ticket: https://fedorahosted.org/sssd/ticket/1989 Upstream ticket: https://fedorahosted.org/sssd/ticket/1992 (In reply to Jakub Hrozek from comment #9) > Can you check if ad_enable_dns_sites=False makes any difference in your case? (Fri Jun 21 11:18:36 2013) [sssd[be[ger.corp.intel.com]]] [dp_copy_options] (0x0020): Failed to retrieve value for option (ad_enable_dns_sites) (Fri Jun 21 11:18:36 2013) [sssd[be[ger.corp.intel.com]]] [sssm_ad_access_init] (0x0010): Could not initialize access provider options: [Invalid argument] If I try 'ad_enable_dns_sites = 0' I get this instead: (Fri Jun 21 11:19:56 2013) [sssd[be[ger.corp.intel.com]]] [confdb_get_bool] (0x0040): Value is not a boolean! (Fri Jun 21 11:19:56 2013) [sssd[be[ger.corp.intel.com]]] [confdb_get_bool] (0x0020): Failed to read [ad_enable_dns_sites] from [config/domain/ger .corp.intel.com], error [22] (Invalid argument) (Fri Jun 21 11:19:56 2013) [sssd[be[ger.corp.intel.com]]] [dp_get_options] (0x0020): Failed to retrieve value for option (ad_enable_dns_sites) (Fri Jun 21 11:19:56 2013) [sssd[be[ger.corp.intel.com]]] [common_ad_init] (0x0010): Could not parse common options: [Invalid argument] (In reply to David Woodhouse from comment #39) > (In reply to Jakub Hrozek from comment #9) > > Can you check if ad_enable_dns_sites=False makes any difference in your case? > > (Fri Jun 21 11:18:36 2013) [sssd[be[ger.corp.intel.com]]] [dp_copy_options] > (0x0020): Failed to retrieve value for option (ad_enable_dns_sites) > (Fri Jun 21 11:18:36 2013) [sssd[be[ger.corp.intel.com]]] > [sssm_ad_access_init] (0x0010): Could not initialize access provider > options: [Invalid argument] > Fixed in master. I found this problem independently when working on another fature. I will be releasing another tarball with a couple of fixes later today. I'm not sure if I can get the fix for the referral issue in, that would probably be a scratch build for you to test first. sssd-1.10.0-13.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/sssd-1.10.0-13.fc19 Package sssd-1.10.0-13.fc19: * should fix your issue, * was pushed to the Fedora 19 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing sssd-1.10.0-13.fc19' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-11899/sssd-1.10.0-13.fc19 then log in and leave karma (feedback). Fixed in sssd-1.10.0-16 which is in stable updates already. sssd-1.10.1-1.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/sssd-1.10.1-1.fc19 sssd-1.10.1-1.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report. |