Bug 726467 - SSSD takes 30+ seconds to login
Summary: SSSD takes 30+ seconds to login
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.2
Hardware: x86_64
OS: Linux
Target Milestone: rc
Assignee: Stephen Gallagher
QA Contact: Chandrasekar Kannan
Depends On:
Blocks: 637248 736857 756082
Reported: 2011-07-28 17:51 UTC by Mason Sanders
Modified: 2020-05-02 16:24 UTC (History)

Fixed In Version: sssd-1.8.0-2.el6.beta2
Doc Type: Bug Fix
Doc Text:
No documentation needed
Clone Of:
Last Closed: 2012-06-20 11:47:38 UTC
Target Upstream Version:

Attachments
tar cfz sssd-msanders.tar.gz /etc/sssd/sssd.conf /var/log/messages* /var/log/sssd (3.24 MB, application/x-gzip)
2011-07-28 17:51 UTC, Mason Sanders

System ID Private Priority Status Summary Last Updated
FedoraHosted SSSD 976 0 None None None Never
Github SSSD sssd issues 2018 0 None None None 2020-05-02 16:24:21 UTC
Red Hat Product Errata RHBA-2012:0747 0 normal SHIPPED_LIVE sssd bug fix and enhancement update 2012-06-19 19:31:43 UTC

Description Mason Sanders 2011-07-28 17:51:09 UTC
Created attachment 515776
tar cfz sssd-msanders.tar.gz /etc/sssd/sssd.conf /var/log/messages* /var/log/sssd

Description of problem:
After my computer has been on for a while, with repeated suspend/resume and dock/undock cycles, sssd goes from letting me log in within a few seconds to taking 30+ seconds to let me log in. If I reboot, the problem is fixed for a while.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Use a laptop for a week, with suspend/resume and dock/undock cycles.
2. sssd will eventually start taking a long time to log in.
Actual results:
sssd takes 30+ seconds to log in

Expected results:
sssd logs in immediately

Additional info:
Logs and config files attached.

Comment 2 Dmitri Pal 2011-07-28 20:04:54 UTC
I have seen this. It usually happens when the VPN drops between the last two times SSSD talks to the server, or when you change networks. For example, you were on a network that had direct access to the server, then you close the lid, suspend, go to a place like Whole Foods or Panera, and try to resume there. The network connection will be established pretty quickly if you go there from time to time and have non-expired certs, but SSSD might wrongly believe that it is online and try each server in the fail over list before it gives up and goes offline.
Anyway, to troubleshoot the issue we would need SSSD logs. I suspect the debug_level should be at least 6 to see what is going on.

Comment 3 Mason Sanders 2011-07-28 20:11:37 UTC

I attached the logs in the tar file I uploaded when I created the bug.  Let me know if you need something additional.


Comment 4 Jenny Severance 2011-07-29 12:45:29 UTC
I have also seen this when the VPN drops while my screen is locked from prolonged inactivity, or when the machine has been suspended for an extended period of time.

Comment 5 Jakub Hrozek 2011-08-04 19:19:16 UTC
I haven't been able to reproduce the issue yet, but the investigation of the logs revealed a possible cause, which is our improper handling of DNS timeouts.

(Thu Jul 28 13:41:20 2011) [sssd[be[redhat.com]]] [set_server_common_status] (4): Marking server 'kerberos.rdu.redhat.com' as 'resolving name
(Thu Jul 28 13:41:21 2011) [sssd[be[redhat.com]]] [check_fd_timeouts] (9): Checking for DNS timeouts
(Thu Jul 28 13:41:25 2011) [sssd[be[redhat.com]]] [check_fd_timeouts] (9): Checking for DNS timeouts
(Thu Jul 28 13:41:30 2011) [sssd[be[redhat.com]]] [check_fd_timeouts] (9): Checking for DNS timeouts
(Thu Jul 28 13:41:31 2011) [sssd[be[redhat.com]]] [check_fd_timeouts] (9): Checking for DNS timeouts
(Thu Jul 28 13:41:36 2011) [sssd[be[redhat.com]]] [check_fd_timeouts] (9): Checking for DNS timeouts

Our internal resolver library treats its timeout parameter as per-server. I suspect that in the above example, /etc/resolv.conf contained multiple nameserver records and the resolver waited 5 seconds for each of them. The same happened for the second server configured in fail over, doubling the total time.
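
To make the arithmetic concrete, here is a rough sketch of the worst-case wait when the timeout is applied per nameserver and every query times out. The function name and the figures of 3 nameservers and 2 fail over servers are assumptions for illustration only; the actual contents of /etc/resolv.conf on the affected machine are not known from the logs.

```python
def worst_case_resolve_seconds(nameservers: int,
                               per_server_timeout: float,
                               failover_servers: int) -> float:
    """Upper bound on DNS wait time when every query times out.

    Each fail over server name is tried against every nameserver in turn,
    and each attempt waits for the full per-server timeout.
    """
    return nameservers * per_server_timeout * failover_servers


# Hypothetical figures: 3 nameserver entries, 5 s timeout, 2 fail over servers.
print(worst_case_resolve_seconds(3, 5.0, 2))
```

With those assumed figures the bound comes out at 30 seconds, which is in the range the reporter describes.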

This does not happen if the DNS server is down or unreachable, because the resolver would immediately detect that it can't connect and fail over.

I would like to try to reproduce the issue to be sure, but I think we need a mechanism that cancels the resolving after an overall timeout rather than relying on the resolver library's timeouts.
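
The idea of an overall deadline can be sketched as follows. This is only an illustration in Python using asyncio, not the SSSD implementation; resolve_with_deadline and slow_resolver are hypothetical names, and the slow resolver merely stands in for a library that walks several nameservers with its own per-server timeouts.

```python
import asyncio


async def resolve_with_deadline(resolve, hostname, deadline_seconds):
    """Run resolve(hostname), but give up once deadline_seconds have passed.

    asyncio.wait_for cancels the pending resolve call on timeout, so the
    caller no longer depends on the resolver's own per-server timeouts.
    """
    try:
        return await asyncio.wait_for(resolve(hostname), timeout=deadline_seconds)
    except asyncio.TimeoutError:
        # Treated as "could not resolve"; the caller can fail over or go offline.
        return None


async def slow_resolver(hostname):
    # Hypothetical stand-in for a resolver stuck waiting on several nameservers.
    await asyncio.sleep(10)
    return "192.0.2.1"


if __name__ == "__main__":
    result = asyncio.run(
        resolve_with_deadline(slow_resolver, "kerberos.example.com", 0.1)
    )
    print(result)
```

The point of the design is that the caller, not the resolver library, owns the total time budget, so adding more nameserver records does not lengthen the wait.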

Comment 12 Jakub Hrozek 2012-04-03 17:21:10 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    New Contents:
No documentation needed

Comment 13 Kaushik Banerjee 2012-04-27 08:46:50 UTC
Verified with sssd-1.8.0-23.el6 that there is an improvement of ~30 seconds with the steps below:

Verification steps:

1. Set up bind DNS on nameserver1 and nameserver2. Write an iptables rule to drop packets to port 53 on nameserver1.
2. Make the hosts ldap.example.com and krb.example.com resolvable on the bind server.
3. On the client machine, in /etc/resolv.conf add:
   nameserver nameserver1
   nameserver nameserver2
4. In sssd.conf, the domain section is:
id_provider = ldap
ldap_uri = ldap://invalid1.example.com,ldap://ldap.example.com
ldap_search_base = dc=example,dc=com
debug_level = 0xFFF0
auth_provider = krb5
krb5_server = invalid2.example.com,krb.example.com
krb5_realm = EXAMPLE.COM
5. Perform an authentication.

Using sssd-1.5.1-66.el6_2.3:

# time ssh -l puser1 localhost
puser1@localhost's password: 
Last login: Fri Apr 27 13:37:05 2012 from localhost
-sh-4.1$ logout
Connection to localhost closed.

real	0m55.702s
user	0m0.007s
sys	0m0.039s

Using sssd-1.8.0-23.el6:

# time ssh -l puser1 localhost
puser1@localhost's password: 
Last login: Wed Apr 25 20:51:01 2012 from localhost
-sh-4.1$ logout
Connection to localhost closed.

real	0m23.047s
user	0m0.012s
sys	0m0.041s
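
For reference, the difference between the two runs can be computed from the "real" lines above. This is just a small helper sketch; parse_real_seconds is a hypothetical name, and it assumes the "real <minutes>m<seconds>s" format printed by the shell's time keyword as shown in the output above.

```python
import re


def parse_real_seconds(time_line: str) -> float:
    """Convert a line such as 'real 0m55.702s' into seconds."""
    match = re.fullmatch(r"real\s+(\d+)m([\d.]+)s", time_line.strip())
    if match is None:
        raise ValueError(f"unrecognized time output: {time_line!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


before = parse_real_seconds("real\t0m55.702s")  # sssd-1.5.1-66.el6_2.3
after = parse_real_seconds("real\t0m23.047s")   # sssd-1.8.0-23.el6
print(f"{before - after:.3f} seconds faster")
```

That works out to roughly 32.7 seconds, in line with the ~30 second improvement noted above.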

Comment 15 errata-xmlrpc 2012-06-20 11:47:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

