Bug 713371

Summary: Can't contact LDAP server after update from RHEL6.0 to RHEL6.1
Product: Red Hat Enterprise Linux 6 Reporter: René Hartman <hac.bugzilla>
Component: openldap Assignee: Jan Vcelak <jvcelak>
Status: CLOSED DUPLICATE QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 6.1 CC: ajb, jplans, jvcelak, rmeggins, scottro11, tsmetana
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-16 17:04:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description René Hartman 2011-06-15 07:53:35 UTC
Description of problem:
Running RHEL6.0 with OpenLDAP as a client works fine.
After the update to RHEL6.1, LDAP stops working (LDAP users cannot log in).
/var/log/messages has lots of entries like these:
Jun  6 12:47:26 hostR6 nslcd[28193]: [8b4567] failed to bind to LDAP server ldaps://ldapserver1/: Can't contact LDAP server
Jun  6 12:47:26 hostR6 nslcd[28193]: [7b23c6] failed to bind to LDAP server ldaps://ldapserver1/: Can't contact LDAP server: Operation now in progress
Jun  6 12:47:26 hostR6 nslcd[28193]: [8b4567] failed to bind to LDAP server ldaps://ldapserver2/: Can't contact LDAP server: Operation now in progress
Jun  6 12:47:26 hostR6 nslcd[28193]: [7b23c6] failed to bind to LDAP server ldaps://ldapserver2/: Can't contact LDAP server: Operation now in progress
Jun  6 12:47:26 hostR6 nslcd[28193]: [7b23c6] failed to bind to LDAP server ldaps://ldapserver3/: Can't contact LDAP server: Operation now in progress
Jun  6 12:47:26 hostR6 nslcd[28193]: [7b23c6] no available LDAP server found, sleeping 1 seconds

Version-Release number of selected component (if applicable):
 nss                   x86_64    3.12.9-9.el6
 nss-softokn           x86_64    3.12.9-3.el6
 nss-softokn-freebl    i686      3.12.9-3.el6
 nss-softokn-freebl    x86_64    3.12.9-3.el6
 nss-sysinit           x86_64    3.12.9-9.el6
 openldap              x86_64    2.4.23-15.el6

How reproducible:
Update to RHEL6.1, reboot

Steps to Reproduce:
1. Update to RHEL6.1 and reboot: LDAP fails to contact server
2. To fix:
    - # yum -y downgrade nss nss-softokn* nss-sysinit openldap
    - # service nslcd restart
3. To reproduce again:
    - # yum update
    - # service nslcd restart
  
Actual results:
LDAP users no longer get resolved as the server cannot be contacted; logins fail

Expected results:
LDAP users can log in, as with RHEL6.0

Additional info:
Issue discussed in greater detail here:
https://www.centos.org/modules/newbb/viewtopic.php?topic_id=31755&forum=14

LDAP is configured with self-signed keys (tls_reqcert allow) and implicit ssl (ssl on)
Port 389 on the server is closed, access is through ldaps (port 636) only.
tcpdump shows ldaps requests being made and responses being received from the server.

Comment 2 Rich Megginson 2011-06-15 18:27:57 UTC
Can you attach your pam_ldap and nss_ldap configuration?  This may be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=713525 but I'll need to see your configuration to make sure.

Comment 4 René Hartman 2011-06-16 05:14:29 UTC
It is fixed. For completeness here are my settings:

/etc/openldap/pam_ldap.conf:
base                            o=MYORG
uri                             ldaps://ldapserver1/ ldaps://ldapserver2/ ldaps://ldapserver3/

timelimit                       120
idle_timelimit                  3600
bind_timelimit                  120
bind_policy                     soft

nss_initgroups_ignoreusers      root,ldap,named,avahi,haldaemon,dbus,radvd,tomcat,radiusd,news,mailman
ssl                             on
tls_reqcert                     allow

nss_base_passwd                 ou=People,o=MYORG?one
nss_base_shadow                 ou=People,o=MYORG?one
nss_base_group                  ou=Group,o=MYORG?one

pam_check_host_attr             yes
pam_password                    md5


I have no nss_ldap.conf on the system. Should I?


This is my /etc/nslcd.conf, which proved to be the culprit:
uid nslcd
gid ldap
tls_cacertdir /etc/openldap/cacerts
tls_reqcert allow
# This comment prevents repeated auto-migration of settings.
uri                             ldaps://ldapserver1/ ldaps://ldapserver2/ ldaps://ldapserver3/
base                            o=MYORG
ssl                             on
timelimit                       120
idle_timelimit                  3600
bind_timelimit                  120


Based on the bug you referred to, I took out the lines
tls_cacertdir /etc/openldap/cacerts
tls_reqcert allow

and restarted nslcd. Still no joy.

Then I reinserted "tls_reqcert allow" and restarted nslcd. Bingo.

So indeed the 'tls_cacertdir' line turned out to be the problem, albeit not in /etc/openldap/pam_ldap.conf, but /etc/nslcd.conf.
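
The edit described above can be sketched as a small shell function. This is only a minimal sketch, not the exact procedure used here: the function name disable_cacertdir is hypothetical, and it assumes the directive is written at the start of a line, as in the file shown above. It comments out tls_cacertdir, leaves tls_reqcert untouched, and keeps a .bak copy of the original file.

```shell
# Hypothetical helper: comment out every tls_cacertdir line in the given file.
# A backup of the original is kept next to it with a .bak suffix.
disable_cacertdir() {
    conf="$1"
    cp -p "$conf" "$conf.bak" || return 1
    # Prefix matching lines with '#'; all other lines pass through unchanged.
    sed 's/^[[:space:]]*tls_cacertdir[[:space:]]/#&/' "$conf.bak" > "$conf"
}

# Usage (as root), followed by the restart used in this report:
#   disable_cacertdir /etc/nslcd.conf && service nslcd restart
```

Commenting the line out rather than deleting it makes it easy to restore once the directory actually contains CA certificates.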

During testing I also took out the "tls_reqcert allow" line from /etc/openldap/pam_ldap.conf, and it does not appear to be needed there, so I left it out. For completeness I should add that I have a symbolic link /etc/pam_ldap.conf pointing to /etc/openldap/pam_ldap.conf.

Thanks. I guess this should be documented in the release notes or something.

Comment 5 René Hartman 2011-06-16 09:58:24 UTC
Oops! It's NOT fixed.

The change above just enabled resolution of LDAP users.

As I had other pressing matters, I took the fact that LDAP users could now be resolved as an indication that authentication would work (based on past experience).

Unfortunately, that is not the case. I now get:

Jun 16 11:26:08 hostR6 sshd[16413]: pam_ldap: reconnecting to LDAP server...
Jun 16 11:26:08 hostR6 sshd[16413]: pam_ldap: ldap_simple_bind Can't contact LDAP server
Jun 16 11:26:08 hostR6 sshd[16414]: fatal: Access denied for user ldapuser by PAM account configuration

So despite the fact that user-resolution now works, users still cannot sign on.
This behavior persists after downgrading as I did before, so I also downgraded pam_ldap, which did not make a difference.

As after the presumed fix I also updated the kernel and OpenSSL, I downgraded OpenSSL as well, but again, same difference.

Finally, I rebooted into the previous kernel (the one I had been testing with).
Same difference.

Unfortunately I now have to attend to pressing project matters again, so this will have to wait a bit.

Comment 6 René Hartman 2011-06-16 10:02:03 UTC
In order to avoid confusion with the RHEL6.0 setup:

Looks like the downgrade/upgrade only concerned the LDAP resolution. From root I can do "su - ldapuser" without problems, I just can't establish an ssh session for that user. This definitely worked under RHEL6.0.

Comment 7 Rich Megginson 2011-06-16 14:06:36 UTC
Can we at least say that, for the LDAP part, this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=713525 ?

Comment 8 René Hartman 2011-06-16 15:25:11 UTC
It has similarities, but the other reporter stated his problem was solved after deleting the 'cacertdir' directive, which in my case is clearly not true.

My issue just changed from not being able to resolve users to not being able to perform a simple bind. In fact, the other reporter's confirmation that his issue was solved triggered my premature conclusion.

*Brainwave*: as I reported above, I took out the "tls_reqcert allow" directive from /etc/openldap/pam_ldap.conf.
I just reinserted that, and voila. I can sign on again.

So it appears to be the same issue.

I will update the server to all the latest levels again tomorrow and file a hopefully final report.

Comment 9 René Hartman 2011-06-16 16:59:53 UTC
I just found the time to remotely update the server to the latest levels again, and it still works.

So in summary, both /etc/openldap/pam_ldap.conf and /etc/nslcd.conf cause LDAP to fail if they have the tls_cacertdir directive pointing to an empty directory, and if they are missing the tls_reqcert directive when using self-signed certificates.
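
A quick way to look for these two conditions in a configuration file might be the following. It is a rough sketch under stated assumptions: the function name check_ldap_tls_conf is hypothetical, it reports each condition independently (the report does not establish whether either one alone is enough to break things), and it treats a missing directory the same as an empty one.

```shell
# Hypothetical check: warn about the two conditions described in this report.
# Prints one line per finding and returns non-zero if anything was found.
check_ldap_tls_conf() {
    conf="$1"
    status=0

    # Value of the last uncommented tls_cacertdir directive, if any.
    dir=$(awk '$1 == "tls_cacertdir" { d = $2 } END { print d }' "$conf")
    if [ -n "$dir" ]; then
        # A missing directory is treated like an empty one here.
        if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "$conf: tls_cacertdir $dir is empty or missing"
            status=1
        fi
    fi

    # No uncommented tls_reqcert directive at all.
    if ! grep -Eq '^[[:space:]]*tls_reqcert[[:space:]]' "$conf"; then
        echo "$conf: no tls_reqcert directive"
        status=1
    fi
    return "$status"
}

# Usage:
#   check_ldap_tls_conf /etc/nslcd.conf
#   check_ldap_tls_conf /etc/openldap/pam_ldap.conf
```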

Symptoms are similar, but different, causing my confusion. The other bug addressed /etc/openldap/ldap.conf (which likely should be /etc/openldap/pam_ldap.conf) and this one originally addressed /etc/nslcd.conf.

'Misconfiguration' of /etc/nslcd.conf causes a blunt connection failure as outlined in the initial post, while 'misconfiguration' of /etc/openldap/pam_ldap causes the ldap_simple_bind failure.

Please note that the 'misconfiguration' was not such before updating to RHEL6.1

Thanks.

Comment 10 René Hartman 2011-06-16 17:03:40 UTC
'/etc/openldap/pam_ldap' in the last-but-one paragraph in the previous comment should have been '/etc/openldap/pam_ldap.conf'. Sorry.

Comment 11 Rich Megginson 2011-06-16 17:04:39 UTC
The bug is actually in the openldap library that's used by pam, nss, nslcd,
etc.  The bug is that if an empty cacertdir is specified openldap fails.  It
should allow an empty cacertdir, and tls/ssl should work if reqcert is
specified as never or allow.  I'm not sure what the default is, but from your
comments, since it doesn't work if it is missing, it is probably try, demand,
or hard.
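
One way to see which reqcert levels the library accepts against a given server is to repeat the same search under each level. This is a hedged sketch, not something that was run for this bug: the function name probe_reqcert is hypothetical, it assumes the server permits an anonymous simple bind (-x), and it exercises the ldapsearch tool, which reads /etc/openldap/ldap.conf rather than nslcd.conf or pam_ldap.conf. LDAPTLS_REQCERT is the environment form of the TLS_REQCERT client option; exporting LDAPTLS_CACERTDIR beforehand should allow the empty-directory case to be tried as well.

```shell
# Hypothetical probe: run the same base-scope search under each reqcert level.
# Only the exit status of ldapsearch is used; its output is discarded.
probe_reqcert() {
    uri="$1"
    base="$2"
    for level in never allow try demand; do
        if LDAPTLS_REQCERT="$level" ldapsearch -x -H "$uri" -b "$base" -s base >/dev/null 2>&1; then
            echo "$level: ok"
        else
            echo "$level: failed"
        fi
    done
}

# Usage, with the server and base from this report:
#   probe_reqcert ldaps://ldapserver1/ o=MYORG
```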

I'm going to close this as a dup of
https://bugzilla.redhat.com/show_bug.cgi?id=713525

*** This bug has been marked as a duplicate of bug 713525 ***