RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 728343 - SSSD taking 5 minutes to log in
Summary: SSSD taking 5 minutes to log in
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.2
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Stephen Gallagher
QA Contact: Chandrasekar Kannan
URL:
Whiteboard:
Depends On:
Blocks: 743047 746654 748864 748893
TreeView+ depends on / blocked
 
Reported: 2011-08-04 19:30 UTC by Matthew Mosesohn
Modified: 2015-01-04 23:50 UTC (History)
10 users (show)

Fixed In Version: sssd-1.5.1-52.el6
Doc Type: Bug Fix
Doc Text:
Cause: SSSD checked for an incorrect DBus return code Consequence: Instead of detecting timeouts properly, the monitor process disconnected from the back end process resulting in failure to be notified about back end going online and visible lags Fix: SSSD checks for a correct DBus code and handles timeouts on the DBus connection better Result: No more unwanted monitor disconnects resulting in less lags
Clone Of:
: 748864 (view as bug list)
Environment:
Last Closed: 2011-12-06 16:39:20 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
logs with debug packages (673.21 KB, application/x-bzip)
2011-08-31 16:06 UTC, Matthew Mosesohn
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1529 0 normal SHIPPED_LIVE sssd bug fix and enhancement update 2011-12-06 00:50:20 UTC

Description Matthew Mosesohn 2011-08-04 19:30:01 UTC
Description of problem:
This is a duplicate of 726467 with another example.

Version-Release number of selected component (if applicable):
sssd-1.5.1-34.el6_1.2.x86_64

How reproducible:
intermittently

Steps to Reproduce:
1. Lock screen
2. Drop VPN connection through bad connection
3. Attempt to unlock screen by typing password
  
Actual results:
The screen locks for 5 minutes before deferring to local password

Expected results:
The timeout should be a reasonable length of time

Additional info:
See attached logs for more information

Comment 2 Jakub Hrozek 2011-08-04 19:42:50 UTC
(In reply to comment #0)
> Additional info:
> See attached logs for more information

I think you forgot to attach them?

Comment 5 Dmitri Pal 2011-08-04 21:55:56 UTC
(In reply to comment #0)
> Description of problem:
> This is a duplicate of 726467 with another example.
> 

If it is the same you should have 60 DNS servers in the resolv.conf

Comment 6 Jakub Hrozek 2011-08-05 09:14:13 UTC
It is not the same issue (although there are some DNS timeouts in the logs as well). This looks like improper handling of LDAP timeouts:


(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [fo_resolve_service_send] (4): Trying to resolve service 'LDAP'
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'ldap.rdu.redhat.com' is 'working'
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [get_port_status] (7): Port status of port 389 for server 'ldap.rdu.redhat.com' is 'neutral'
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [get_server_status] (7): Status of server 'ldap.rdu.redhat.com' is 'working'
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [be_resolve_server_done] (4): Found address for server ldap.rdu.redhat.com: [10.11.255.5]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_uri_callback] (6): Constructed uri 'ldap://ldap.rdu.redhat.com'
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_rootdse_send] (9): Getting rootdse
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (6): calling ldap_search_ext with [(objectclass=*)][].
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [*]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [altServer]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [namingContexts]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedControl]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedExtension]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedFeatures]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedLDAPVersion]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [supportedSASLMechanisms]
(Thu Aug  4 11:00:07 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (7): Requesting attrs: [defaultNamingContext]
--------------------------------------------------------------------------------
(Thu Aug  4 11:00:58 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (3): ldap_search_ext failed: Can't contact LDAP server
(Thu Aug  4 11:00:58 2011) [sssd[be[redhat.com]]] [sdap_get_generic_send] (3): Connection error: (null)

There's more than 50 seconds waiting for ldap_search_ext to time out.

Comment 7 Jakub Hrozek 2011-08-10 16:14:57 UTC
We suspect this issue is actually a libldap bug. However, we need additional data to debug this issue further.

I have prepared a scratch build that is able to turn libldap debugging. It is the same RPM as you were using, with one additional patch:
https://brewweb.devel.redhat.com/taskinfo?taskID=3546607

Would you mind testing it out? You would need to
1) install the RPMs
2) tell SSSD to raise the debug level by setting the SSSD_DEBUG_LDAP_SEARCH environment variable. I did so by putting:

export SSSD_DEBUG_LDAP_SEARCH="0xffff"

into /etc/init.d/sssd

3) service sssd restart

When the timeout occurs again, please attach your logs.

Thank you!

Comment 8 Matthew Mosesohn 2011-08-16 14:07:27 UTC
Jakub,

the brew scratch build failed or expired.  Can you please build it again?

Comment 9 Jakub Hrozek 2011-08-16 14:46:24 UTC
Of course:

https://brewweb.devel.redhat.com/taskinfo?taskID=3559904

Comment 15 Matthew Mosesohn 2011-08-23 14:00:11 UTC
There was an openldap update available:

Aug 16 09:37:10 Updated: openldap-2.4.23-15.el6_1.1.x86_64
Aug 16 09:37:16 Updated: openldap-clients-2.4.23-15.el6_1.1.x86_64

Maybe this fixed it

Comment 16 Jakub Hrozek 2011-08-23 14:10:34 UTC
(In reply to comment #15)
> There was an openldap update available:
> 
> Aug 16 09:37:10 Updated: openldap-2.4.23-15.el6_1.1.x86_64
> Aug 16 09:37:16 Updated: openldap-clients-2.4.23-15.el6_1.1.x86_64
> 
> Maybe this fixed it

Jan, is it possible that the update made an ldap_search_ext() timeout go away?

The easiest way to be sure is either revert back to packages without the extra debugging or simply comment out the SSSD_DEBUG_LDAP_SEARCH environment variable to turn off the debugging and see if the bug strikes again.

Comment 17 Jan Vcelak 2011-08-24 07:44:28 UTC
I don't think so. That update fixes just one memory leak (#717738), it should not influence behavior of ldap_search_ext().

Comment 18 Jakub Hrozek 2011-08-26 06:42:44 UTC
Matthew, have you tried reverting to the packages without debugging enabled? Would the bug hit again? (That would confirm that it is a race condition in openldap-libs that doesn't occur when the extra debugging slows down the LDAP search operation)

Comment 20 Matthew Mosesohn 2011-08-31 16:06:39 UTC
Created attachment 520856 [details]
logs with debug packages

See attachment.  It has the issue reproducing itself using the debug packages provided

Comment 23 Stephen Gallagher 2011-09-14 17:28:38 UTC
Okay, I think I've identified the problem here. What happened is that an error was received that caused the data provider backend to disconnect from the monitor process. As a result, the monitor was unable to inform the data provider when the network state had changed (which tells the data provider to refresh its DNS information).

So the back-end was stuck in the offline state, with repeated attempts to reconnect being stymied by a combination of repeated DNS timeouts and slow LDAP connection attempts causing the visible lag.

I've scratch-built this fix at http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3635153

Please test and let me know if it resolves your issue.

Comment 24 Stephen Gallagher 2011-09-15 15:42:59 UTC
I found an additional issue with the previous patch. New scratch-build is here: http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3638970

Please verify this one instead.

Comment 25 Stephen Gallagher 2011-09-15 15:59:28 UTC
Sorry, bad merge of the patch from the master branch on that previous scratch build. http://brewweb.devel.redhat.com/brew/taskinfo?taskID=3638997 should work now.

Comment 28 Kaushik Banerjee 2011-10-17 12:29:10 UTC
Steps to verify:

1. # service sssd stop;rm -f /var/lib/sss/db/*;service sssd start
Stopping sssd:                                             [  OK  ]
Starting sssd:                                             [  OK  ]

2. # killall -STOP sssd_be

3. After backend restarts,
   # killall -CONT sssd_be

4. # getent -s sss passwd kau20
kau20:*:202020:202020:kaushik20:/home/kau20:

Verified in version:
# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 58.el6                        Build Date: Sat 15 Oct 2011 12:14:26 AM IST
Install Date: Mon 17 Oct 2011 12:00:06 PM IST      Build Host: x86-001.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-58.el6.src.rpm
Size        : 3599114                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon

Comment 29 Jakub Hrozek 2011-10-27 15:06:53 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: SSSD checked for an incorrect DBus return code
Consequence: Instead of detecting timeouts properly, the monitor process disconnected from the back end process resulting in failure to be notified about back end going online and visible lags
Fix: SSSD checks for a correct DBus code and handles timeouts on the DBus connection better
Result: No more unwanted monitor disconnects resulting in less lags

Comment 30 errata-xmlrpc 2011-12-06 16:39:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1529.html


Note You need to log in before you can comment on or make changes to this bug.