Bug 404751 - nss_ldap hangs when LDAP is down, yet again (regression in nss_ldap-226-20)
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: nss_ldap
Version: 4.6
Hardware: i386 Linux
Priority: low, Severity: medium
Assigned To: Nalin Dahyabhai
Reported: 2007-11-29 12:55 EST by Michael Torrie
Modified: 2012-06-20 09:33 EDT
Doc Type: Bug Fix
Last Closed: 2012-06-20 09:33:31 EDT
Description Michael Torrie 2007-11-29 12:55:17 EST
Description of problem:

The problem of nss_ldap hanging when LDAP is down has returned in the latest
nss_ldap-226-20.i386.rpm update.  


Version-Release number of selected component (if applicable):
nss_ldap-226-20

How reproducible:
After installing the latest nss_ldap update, if LDAP is ever unavailable,
nss_ldap hangs for all lookups, even for non-LDAP users.  This is a particular
problem if the LDAP server itself is on the same machine: running
/etc/init.d/ldap start will never succeed.

Steps to Reproduce:
1. Install nss_ldap-226-20 update
2. Shut down the LDAP server
3. Users can no longer be looked up with id, and non-LDAP users can no longer
   log in.
  
Actual results:
No logins (sshd or console) work; they just hang.  id root hangs.


Expected results:
id root should work, and any attempt to look up an LDAP user should return "user
not found."  If I downgrade to nss_ldap-226-18, everything works this way again.

Additional info:
See https://bugzilla.redhat.com/show_bug.cgi?id=176209 (bug 176209).  It's the
same thing, but apparently the problem exists in this latest RHEL4.6 update
package released yesterday.  This bug report could be considered a duplicate of
the bug report I just referenced, but it's for a different version of the
nss_ldap rpm.  Downgrading to nss_ldap-226-18 corrects the problem. So
whatever happened, it's a regression between nss_ldap-226-18 and 226-20.
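
Editorial sketch (not part of the original report): the reproduction steps above can be checked without tying up a shell by running the lookup under a timeout. The helper name probe and the 10-second limit are assumptions made for illustration. The command id root is the one named in the report.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and return 'ok', 'failed', or 'hung' if it exceeds timeout_s."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind.
        return "hung"
    except OSError:
        # The command could not be started at all (for example, not installed).
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # With the LDAP server stopped, the report describes "id root" hanging on
    # nss_ldap-226-20 and working on nss_ldap-226-18.
    print("id root ->", probe(["id", "root"]))
```

A result of "hung" with the LDAP server stopped would match the behavior described for nss_ldap-226-20, and "ok" would match nss_ldap-226-18.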
Comment 1 Nalin Dahyabhai 2007-11-29 13:29:57 EST
Does your /etc/ldap.conf configuration file include either
  bind_policy soft
or
  nss_initgroups_ignoreusers root,ldap
in it?  In particular, does adding the second setting help?
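
Editorial sketch (not part of the original comment): one way to answer the question above is to scan the configuration text for the two settings. The function names and the sample fragment are hypothetical. Lower-casing the keys is an assumption made for convenience, not a statement about how nss_ldap itself parses the file.

```python
def parse_ldap_conf(text):
    """Return a dict of option name -> value from ldap.conf-style text."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0].lower()  # assumption: treat keys case-insensitively
        options[key] = parts[1].strip() if len(parts) > 1 else ""
    return options


def check_settings(text):
    """Report whether the two settings asked about are present."""
    opts = parse_ldap_conf(text)
    ignore = opts.get("nss_initgroups_ignoreusers", "")
    return {
        "bind_policy_soft": opts.get("bind_policy", "").lower() == "soft",
        "ignored_users": [u for u in ignore.split(",") if u],
    }


# Hypothetical fragment; on a real system read /etc/ldap.conf instead.
sample = """\
# example only
host 127.0.0.1
base dc=example,dc=com
bind_policy soft
nss_initgroups_ignoreusers root,ldap
"""

print(check_settings(sample))
```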
Comment 2 Michael Torrie 2007-11-29 16:12:09 EST
Neither option is present, as neither seemed necessary before (/etc/ldap.conf
was originally set by authconfig).  I have added the second option and now root
can log in when ldap is down.  However ldap restart still freezes.  I had to put
in the bind_policy soft to make it all work.  I am a bit worried about using
bind_policy soft, as I already see lots of cases where nss_ldap reports that
its connection to ldap was lost and it had to reconnect. Would this affect its
ability to do so?

logs have always reported:
Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnecting to LDAP server...
Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnected to LDAP server
admin.chem.byu.edu after 1 attempt(s)

This appears to be the normal operation of nss_ldap, probably due to connections
timing out.  Would setting the bind_policy soft option be okay?  I don't want
users to temporarily disappear for very long (except of course if LDAP goes away
entirely).
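
Editorial sketch (not part of the original comment): to see how often these reconnects happen, the log messages quoted above can be tallied per server. The pattern is built only from the two messages shown. It assumes each message sits on a single line in the actual log file, and the function name is hypothetical.

```python
import re

# Matches the "reconnected" message quoted in the comment above.
RECONNECTED = re.compile(
    r"nss_ldap: reconnected to LDAP server (?P<server>\S+) "
    r"after (?P<attempts>\d+) attempt\(s\)"
)


def reconnect_counts(lines):
    """Map server name -> list of attempt counts found in the log lines."""
    counts = {}
    for line in lines:
        match = RECONNECTED.search(line)
        if match:
            attempts = int(match.group("attempts"))
            counts.setdefault(match.group("server"), []).append(attempts)
    return counts


# The two messages from the comment, joined back onto single lines.
sample = [
    "Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnecting to LDAP server...",
    "Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnected to LDAP server "
    "admin.chem.byu.edu after 1 attempt(s)",
]

print(reconnect_counts(sample))
```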
Comment 3 Nalin Dahyabhai 2007-11-30 15:00:40 EST
(In reply to comment #2)
> Neither option is present, as neither seemed necessary before (/etc/ldap.conf
> was originally set by authconfig).  I have added the second option and now root
> can log in when ldap is down.  However ldap restart still freezes.  I had to put
> in the bind_policy soft to make it all work.  I am a bit worried about using
> bind_policy soft, as I already see lots of cases where nss_ldap reports that
> its connection to ldap was lost and it had to reconnect. Would this affect its
> ability to do so?

I think it would.  The logic's pretty tortuous there (one complicated behavior's
been replaced by another, so more possible errors are now retried, subject to
the "hard"/"soft" setting); eventually I'm going to want to replace it.

> logs have always reported:
> Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnecting to LDAP server...
> Nov 28 15:53:55 admin sshd[23716]: nss_ldap: reconnected to LDAP server
> admin.chem.byu.edu after 1 attempt(s)
> 
> This appears to be the normal operation of nss_ldap, probably due to connections
> timing out.  Would setting the bind_policy soft option be okay?  I don't want
> users to temporarily disappear for very long (except of course if LDAP goes away
> entirely).

I can't recommend soft without reservations.  Perhaps figuring out what's
causing the directory server startup problem would be simpler.  Are you running
it as the "ldap" user?  Are you also using nss_ldap for protocol, services, rpc,
or host information as configured by /etc/nsswitch.conf?
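
Editorial sketch (not part of the original comment): the last question can be answered by listing which nsswitch.conf databases name ldap as a source. The function name and the sample contents are hypothetical. On a real system the text would come from /etc/nsswitch.conf.

```python
def ldap_databases(text):
    """Return the nsswitch.conf databases whose source list mentions ldap."""
    found = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        database, sources = line.split(":", 1)
        # Match the source name exactly; action items such as
        # [NOTFOUND=return] never equal "ldap".
        if "ldap" in sources.split():
            found.append(database.strip())
    return found


# Hypothetical contents for illustration only.
sample = """\
passwd:     files ldap
shadow:     files ldap
group:      files ldap
hosts:      files dns
services:   files
protocols:  files
rpc:        files
"""

print(ldap_databases(sample))
```

If the result includes services, protocols, rpc, or hosts, lookups made while the directory server starts could themselves depend on LDAP.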
Comment 4 Garth D. Wiebe 2007-12-12 13:28:23 EST
I updated a machine all the way from 4U5 to 4U6 unsuccessfully, and narrowed 
things down to the update of nss_ldap-226-18 to nss_ldap-226-20 rendering the 
system unusable.  Starting any application (even Konsole) takes exactly 2 
minutes.  Once the application has started, it is fine.  When booting the 
system, the boot process hangs while bringing up eth0.  Downgrading to
nss_ldap-226-18 solves the problem.  There is no /etc/ldap.conf file, but there
is an /etc/ldap.conf.rpmsave file, which has default values.  /etc/nsswitch.conf
does not reference ldap, only "files", "nss", and "dns".

Is this the same problem?
Comment 5 Michael Torrie 2007-12-12 22:01:18 EST
(In reply to comment #3)
> I can't recommend soft without reservations.  Perhaps figuring out what's
> causing the directory server startup problem would be simpler.  Are you running
> it as the "ldap" user?  Are you also using nss_ldap for protocol, services, rpc,
> or host information as configured by /etc/nsswitch.conf?

Correct.  My nsswitch.conf is set to use files and ldap for users and password
information, not protocol, services, rpc, or host information, which are all set
to "file."


Comment 6 Michael Torrie 2007-12-12 22:04:15 EST
(In reply to comment #4)
> I updated a machine all the way from 4U5 to 4U6 unsuccessfully, and narrowed 
> things down to the update of nss_ldap-226-18 to nss_ldap-226-20 rendering the 
> system unusable.  Starting any application (even Konsole) takes exactly 2 
> minutes.  Once the application has started, it is fine.  When booting the 
> system, the boot process hangs while bringing up eth0.  Downgrading to
> nss_ldap-226-18 solves the problem.  There is no /etc/ldap.conf file, but there
> is an /etc/ldap.conf.rpmsave file, which has default values.  /etc/nsswitch.conf
> does not reference ldap, only "files", "nss", and "dns".
> 
> Is this the same problem?

Possibly.  But my nsswitch.conf file does reference ldap (I'm using ldap users
on the machine).  So I can't say.  I have reproduced the problem on two
different RHEL4 (U6) machines, each of which is both an LDAP client and an LDAP
master.  In other words, each machine uses ldap users provided by openldap
running on the same machine.
Comment 7 Garth D. Wiebe 2007-12-13 09:26:45 EST
I do not have openLDAP installed.  A "rpm -q -a | grep LDAP" returns only
perl-LDAP-0.31-5.  As far as I know, I am not using LDAP for anything, unless I
am neglecting to account for something due to my limited experience with Linux.
Just the act of installing nss_ldap-226-20 kills this system, whereas you only 
experience problems with users logging in, it seems.  I am already logged in as 
root and cannot launch any application without the 2 minute wait.
Comment 8 luca villa 2008-01-15 13:04:46 EST
I've the same problem here.
Downgrading to nss_ldap-226-18 worked around the problem.
Comment 9 Jiri Pallich 2012-06-20 09:33:31 EDT
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. 
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
