Red Hat Bugzilla – Bug 634592
console freezes while LDAP server is unavailable with SSSD
Last modified: 2011-05-23 14:49:29 EDT
Description of problem:
When I am at home and my work LDAP server is unavailable, my system periodically hangs for about 20 seconds. It happens more often when I am using konsole, but it happens in other apps from time to time as well. After every freeze, the following messages show up in /var/log/messages:
Sep 16 08:18:01 barfolomew sssd[be[default]]: LDAP connection error: (null)
Sep 16 08:18:07 barfolomew sssd[be[default]]: LDAP connection error: (null)
Sep 16 08:18:13 barfolomew sssd[be[default]]: LDAP connection error: (null)
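For anyone triaging a similar report, a minimal sketch (Python) of how the spacing between these errors could be measured. The function name error_gaps is made up for illustration, and the year is an assumption because syslog timestamps do not include one.

```python
from datetime import datetime

# Sample lines copied from the report above.
SAMPLE = """\
Sep 16 08:18:01 barfolomew sssd[be[default]]: LDAP connection error: (null)
Sep 16 08:18:07 barfolomew sssd[be[default]]: LDAP connection error: (null)
Sep 16 08:18:13 barfolomew sssd[be[default]]: LDAP connection error: (null)
"""

def error_gaps(log_text, year=2010):
    """Return seconds between consecutive sssd 'LDAP connection error' lines.

    Syslog stamps carry no year, so `year` is an assumption supplied
    by the caller.
    """
    stamps = []
    for line in log_text.splitlines():
        if "sssd[" not in line or "LDAP connection error" not in line:
            continue
        # The first 15 characters are the syslog stamp, e.g. "Sep 16 08:18:01".
        stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

print(error_gaps(SAMPLE))  # -> [6, 6]
```

On the three lines quoted here the errors are 6 seconds apart, which suggests a regular retry rather than random failures.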
Version-Release number of selected component (if applicable):
Have not seen this problem too much when completely offline, seems to happen when online but the LDAP server is unavailable.
Steps to Reproduce:
1. Setup F13 for LDAP authentication with SSSD
2. Login as an LDAP user (into KDE)
3. use konsole, eventually freezes
Expect system not to freeze
Linux barfolomew.hra.local 126.96.36.199-54.fc13.x86_64 #1 SMP Sun Sep 5 17:16:27 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
Name : sssd
Arch : x86_64
Version : 1.2.2
Release : 21.fc13
Can you tell me if the problem persists with sssd-1.3.0-35.fc13 or later?
(In reply to comment #1)
> Can you tell me if the problem persists with sssd-1.3.0-35.fc13 or later?
Yes it does. I have been running 1.3.0-35.fc13 since about Oct 11 (according to my yum.log) and continue to have this problem when outside of my network.
The issue also seems to manifest itself when I try to unlock my screensaver: just getting the password prompt can sometimes take a full minute. Again, unlocking the screensaver when connected to my network is nice and fast.
Some additional info, although I wouldn't think this is a factor, because SSSD uses TLS, and the hostname must match the cert common name, I've put an entry in my /etc/hosts file to point the cert common name (public DNS name where LDAP is not allowed through the firewall) to the local IP address, which obviously will never be available outside my network.
(In reply to comment #2)
> Some additional info, although I wouldn't think this is a factor, because SSSD
> uses TLS, and the hostname must match the cert common name, I've put an entry
> in my /etc/hosts file to point the cert common name (public DNS name where LDAP
> is not allowed through the firewall) to the local IP address, which obviously
> will never be available outside my network.
This might be relevant, actually. SSSD doesn't actually read /etc/hosts for name/IP mapping. It relies only on entries from /etc/resolv.conf and DNS. (This is a shortcoming that we need to fix at some point).
So it's possible that the timeout you're experiencing is a DNS timeout trying to contact your DNS server(s) and then eventually giving up and switching to offline authentication.
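A quick way to see whether a server name is pinned only in /etc/hosts is to parse the file directly. This is a sketch only: the helper name hosts_addresses, the host name, and the address below are placeholders, not values from this report.

```python
def hosts_addresses(hosts_text, name):
    """Return the addresses that hosts-format text maps `name` to."""
    wanted = name.lower()
    found = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        address, *names = line.split()
        if wanted in (n.lower() for n in names):
            found.append(address)
    return found

# Hypothetical hosts content; the name and address are placeholders.
EXAMPLE = """\
127.0.0.1   localhost
192.0.2.10  ldap.example.com ldap   # pinned to match the TLS cert name
"""

print(hosts_addresses(EXAMPLE, "ldap.example.com"))  # -> ['192.0.2.10']
```

On a real system the text would come from open("/etc/hosts").read(). If the name appears there but not in DNS, then per the comment above SSSD's own resolver would not see the mapping.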
Could you add the line
debug_level = 6
to the [domain/<DOMAIN>] section of your /etc/sssd/sssd.conf (replacing <DOMAIN> with the domain name appropriate to your setup, probably "default" if you used authconfig to set it up originally) and restart SSSD?
Next time you experience this issue, look at /var/log/sssd/sssd_<DOMAIN>.log and copy the log information for the relevant time period into this bug (sanitizing server names and IPs if necessary).
I can then use that to track down what's causing the long timeout.
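If the change needs to be scripted, the following is a rough sketch using Python's configparser. The helper name with_debug_level and the sample configuration are made up for illustration, and the section name is passed in by the caller ("domain/default" here is an assumption about a stock setup). configparser discards comments when it writes, so the output is meant to be reviewed rather than copied over the live file.

```python
import configparser
import io

def with_debug_level(conf_text, section, level=6):
    """Return conf_text with debug_level set in `section`.

    configparser drops comments on write, so review the result
    before replacing a real sssd.conf with it.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        raise KeyError(f"no [{section}] section found")
    parser.set(section, "debug_level", str(level))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()

# Hypothetical minimal config; real files carry many more options.
EXAMPLE = """\
[sssd]
domains = default

[domain/default]
id_provider = ldap
"""

print(with_debug_level(EXAMPLE, "domain/default"))
```

SSSD still has to be restarted afterwards for the new debug level to take effect, as requested above.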
(In reply to comment #3)
> This might be relevant, actually. SSSD doesn't actually read /etc/hosts for
> name/IP mapping. It relies only on entries from /etc/resolv.conf and DNS. (This
> is a shortcoming that we need to fix at some point).
FWIW, this would be a nearly trivial patch; c-ares can read /etc/hosts.
Created attachment 459148 [details]
sssd_default.log with debug level set to 6
Although the attachment doesn't show a terrible case, it still took roughly 22 seconds for me to run "sudo ls" in konsole. Sometimes just running a non-sudo command will take up to a minute to process.
Without knowing too much about what is going on, it looks like sssd figured out that my LDAP server was unavailable in the first 4 seconds but decided to do a bunch more stuff before completing the request. I do not think it's an issue with /etc/hosts, since the log does show the internal IP address from /etc/hosts. It's probably resolving that through nsswitch, right? "files" is set before "ldap" there.
Please try out sssd-1.5.0-1.fc14 and let me know if this fixes your issue.
Nope. I upgraded sssd and the sssd client from updates-testing to the version you mentioned and still get the hangs on sudo commands, along with the "could not start TLS" error in /var/log/messages during each hang.
Does this happen right after some kind of network disruption, such as the VPN dropping, the machine being suspended and resumed, or the machine being unplugged from a docking station?
Or does it just happen periodically when you are working online but your VPN is not connected to the corporate network? How many servers do you have configured? It would be great if you could attach your sssd.conf.
Does your experience look similar to this: https://fedorahosted.org/sssd/ticket/709 ?
Aaron, is this problem still persisting with SSSD 1.5.4 or later?
From the log posted above, it looks like an error contacting the LDAP server. This could be a certificate issue or a routing problem. Unfortunately, due to bug http://www.openldap.org/its/index.cgi/Incoming?id=6789 we don't get any information back from the ldap client libraries to explain the problem.
Please let us know whether you are still experiencing this issue.
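Since the client libraries give no detail, one way to separate a routing problem from a certificate problem is a plain TCP connect with a timeout. This is a minimal sketch: probe_tcp is a made-up helper, the host name in the comment is a placeholder, and port 389 is simply the standard LDAP port, which may not match every setup.

```python
import socket
import time

def probe_tcp(host, port=389, timeout=5.0):
    """Try a TCP connect; return (outcome, elapsed_seconds).

    outcome is "connected", "timeout", or "error: <reason>".
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        # Covers refused connections, unreachable networks, and failed lookups.
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start

# Example call (hypothetical host name):
# print(probe_tcp("ldap.example.com"))
```

A "timeout" outcome points toward routing or firewalling, while "connected" followed by a TLS failure in the SSSD log would point toward the certificate side.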
This bug has gone for more than a month without additional data from the reporter.
Please reopen if the requested information is provided.