Bug 662075

Summary: Periodic burst of LDAP "Invalid credentials"
Product: Red Hat Enterprise Linux 5 Reporter: Chris Adams <linux>
Component: nss_ldap Assignee: Nalin Dahyabhai <nalin>
Status: CLOSED WONTFIX QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium Docs Contact:
Priority: low    
Version: 5.5 CC: dpal, jplans
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-05-03 18:07:37 UTC Type: ---

Description Chris Adams 2010-12-10 15:18:57 UTC
I have a RHEL 5 (fully up-to-date as of 2010-12-10) server that uses OpenLDAP for the user database.  Authentication/user lookup works fine most of the time, but periodically (maybe once or twice per day), a daemon will log a burst of messages like (server name changed):

Dec  9 22:18:10 fly restorecond: nss_ldap: failed to bind to LDAP server ldapi:///: Invalid credentials
Dec  9 22:18:10 fly restorecond: nss_ldap: failed to bind to LDAP server ldaps://<backupserver>.hiwaay.net/: Invalid credentials

Sometimes the program logging the problem is dovecot-auth (version 1.2.11 compiled locally from Fedora updates), but since restorecond logged it as well, it appears to be an nss_ldap problem.

When this happens, the calling program thinks the users it is looking up don't exist (so for example, dovecot's deliver bounces emails as "unknown user", which is a major problem).

The LDAP server doesn't log any errors when this is happening.  I don't know what triggers the problem or what makes it go away after a few seconds to a few minutes.  It might be happening more when the server is busy (I see more instances during nightly backups, for example).

Comment 1 Chris Adams 2010-12-10 15:23:13 UTC
One other thing: I said multiple programs log this, but only one logs it at a time.  For example, I got a burst of errors from dovecot-auth yesterday at 16:15, 17:31, and 23:41-23:42.  I got errors from restorecond at 22:15-22:18.

Is it possible that nss_ldap has some internal resource leak (but eventually resets itself)?
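
For anyone trying to correlate these bursts with other activity on the host (for example the nightly backups), a minimal sketch of a log scanner follows. It assumes the syslog line format shown in the description. The function names, the 5-minute gap threshold, and the explicit year argument (syslog timestamps carry no year) are illustrative assumptions and not part of nss_ldap or any Red Hat tooling.

```python
import re
from datetime import datetime, timedelta

# Month abbreviations as written by syslog (assumes an English/C locale).
MONTHS = {name: num for num, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines of the form quoted in the description, e.g.
#   Dec  9 22:18:10 fly restorecond: nss_ldap: failed to bind to LDAP server ldapi:///: Invalid credentials
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<prog>[^\s:\[]+)(?:\[\d+\])?:\s+"
    r"nss_ldap: failed to bind to LDAP server (?P<uri>\S+): Invalid credentials"
)

# The two log lines quoted in the description (server name already redacted there).
SAMPLE = [
    "Dec  9 22:18:10 fly restorecond: nss_ldap: failed to bind to LDAP server ldapi:///: Invalid credentials",
    "Dec  9 22:18:10 fly restorecond: nss_ldap: failed to bind to LDAP server ldaps://<backupserver>.hiwaay.net/: Invalid credentials",
]


def parse(line, year):
    """Return (timestamp, program, server URI) for a matching line, else None."""
    m = LINE_RE.match(line)
    if m is None or m["mon"] not in MONTHS:
        return None
    hour, minute, second = map(int, m["time"].split(":"))
    ts = datetime(year, MONTHS[m["mon"]], int(m["day"]), hour, minute, second)
    return ts, m["prog"], m["uri"]


def bursts(lines, year, max_gap=timedelta(minutes=5)):
    """Group matching lines into per-program bursts.

    Two errors from the same program belong to the same burst when they are
    at most max_gap apart. Returns a list of (program, first, last, count).
    """
    events = sorted(e for e in (parse(line, year) for line in lines) if e)
    current = {}   # program -> [first, last, count] of its most recent burst
    ordered = []   # bursts in order of first appearance
    for ts, prog, _uri in events:
        burst = current.get(prog)
        if burst is not None and ts - burst[1] <= max_gap:
            burst[1] = ts
            burst[2] += 1
        else:
            burst = [ts, ts, 1]
            current[prog] = burst
            ordered.append((prog, burst))
    return [(prog, b[0], b[1], b[2]) for prog, b in ordered]


if __name__ == "__main__":
    for prog, first, last, count in bursts(SAMPLE, year=2010):
        print(f"{prog}: {count} errors between {first} and {last}")
```

Running the same function over the full /var/log/messages contents (passed as a list of lines) would show whether bursts from different programs ever overlap, which may help separate a per-process problem from a host-wide one.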

Comment 2 Dmitri Pal 2011-02-01 21:28:05 UTC
It seems that you have an intermittent failure with your LDAP connection. We suggest that you consider taking a look at SSSD. 

Based on the information in the ticket, it is hard to identify what is going wrong. It might be caused by intermittent network outages or by issues in the underlying LDAP library.

Since the problem cannot be reproduced, we will not address this issue. Please reopen the bug if you have additional information that would allow us to reproduce it. However, SSSD is a much better solution for cases where intermittent network failures are frequent; please consider it.

Comment 3 RHEL Program Management 2011-02-01 21:45:05 UTC
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.

Comment 4 Chris Adams 2011-02-02 15:29:05 UTC
The problem is certainly not a network outage, since the primary OpenLDAP server is on the same host and is accessed via the Unix domain socket (ldapi:///).  There is no indication of any network outage (the secondary server is connected to the same switch, both are on the same subnet/VLAN, and there are no errors on any of the interfaces).

SSSD is not a solution, given the performance problems I saw when trying it on RHEL 6 (BZ 664071, "hopefully" fixed in a new version).  SSSD also lacks tools to manage the cache (such as invalidating an entry, like "nscd -i passwd <deleted-user>").

Also, if the problem is in the underlying LDAP library, switching to SSSD wouldn't help (since it still uses the same OpenLDAP client library).

Comment 5 Dmitri Pal 2011-05-03 18:07:37 UTC
Please open the bug against the LDAP library. This does not seem to be a problem in nss_ldap.