Bug 811512

Summary: nss_ldap collaboration between two openldap servers fail, when one is "sleep"
Product: Red Hat Enterprise Linux 5 Reporter: David Spurek <dspurek>
Component: nss_ldap Assignee: Nalin Dahyabhai <nalin>
Status: CLOSED WONTFIX QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 5.8 CC: dpal, dspurek, ebenes, jhrozek, jplans, prc
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-07-16 13:07:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
reproduce test none

Description David Spurek 2012-04-11 10:02:04 UTC
Description of problem:
Set up master/slave openldap servers and, in /etc/ldap.conf, set "uri slave master".
Then put the slave server to "sleep" (kill -s SIGSTOP $SLAPD_SLAVE_PID). Try to ssh as a user stored in ldap (the second server in uri should be used); ssh fails.
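The "sleep" step can be tried in isolation. The following is a minimal sketch, assuming a Linux-like system; it uses a plain `sleep` process as a stand-in for slapd, and the helper `proc_state` is a hypothetical name introduced only for this illustration. It shows that a process sent SIGSTOP stays alive in the stopped state ("T") rather than exiting, so any sockets it holds remain open.

```shell
#!/usr/bin/env bash
# Stand-in for the slave slapd; in the real test SLAPD_SLAVE_PID is the slapd PID.
sleep 30 &
SLAPD_SLAVE_PID=$!

# proc_state PID: print the one-letter process state (hypothetical helper).
proc_state() {
    if [ -r "/proc/$1/status" ]; then
        awk '/^State:/ {print $2}' "/proc/$1/status"
    else
        ps -o stat= -p "$1" | tr -d ' '
    fi
}

kill -s SIGSTOP "$SLAPD_SLAVE_PID"    # freeze: the process stays alive but does not run

# Signal delivery is asynchronous, so poll briefly for the stopped state.
for _ in 1 2 3 4 5 6 7 8 9 10; do
    state=$(proc_state "$SLAPD_SLAVE_PID")
    case $state in T*) break ;; esac
    sleep 0.1
done
echo "state after SIGSTOP: $state"

kill -s SIGCONT "$SLAPD_SLAVE_PID"    # resume the process
kill "$SLAPD_SLAVE_PID"               # clean up the stand-in
wait "$SLAPD_SLAVE_PID" 2>/dev/null || true
```

In the real reproduction the slave stays stopped while the ssh attempt is made, and SIGCONT is only sent afterwards.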

Version-Release number of selected component (if applicable):
nss_ldap-253-49.el5

How reproducible:
always

Steps to Reproduce:
1. unpack attached file nss_ldap_soft_failure_test.tar.gz
2. install needed packages:
PACKAGES=( "nss_ldap"         \
        "expect"           \
        "authconfig"       \
        "openldap"         \
        "openldap-clients" \
        "openldap-servers" \
        "openssh-clients"  \
        "openldap-servers-overlays" )

(If you don't have beakerlib and beakerlib-redhat, install them too)

3. run command: bash runtest.sh
  
Actual results:
ssh as a user in ldap fails

Expected results:
ssh succeeds

Comment 1 David Spurek 2012-04-11 10:02:36 UTC
Created attachment 576737
reproduce test

Comment 2 RHEL Program Management 2012-04-19 11:51:12 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 4 Jakub Hrozek 2012-07-14 11:54:39 UTC
I don't think this is fixable in a reasonable way with the 5.9 codebase without introducing a major change to the pam_ldap codebase. Moreover, I think the bug only affects a corner case; for most realistic cases, the failover mechanism would work just fine.

Let me explain in a little more detail.

Currently pam_ldap uses libldap's internal failover mechanism, which means pam_ldap only passes the list of configured servers to ldap_initialize(). That failover works fine if the primary server is not reachable at all - this can be tested by shutting down the server or setting up DROP or REJECT rules.

The reported problem only occurs when libldap actually *can* make a connection but the bind fails later - which, as you found out, can be reproduced by stopping the daemon.

Fixing this issue would require implementing in pam_ldap a failover mechanism similar to the one nss_ldap has. nss_ldap essentially parses the list of servers and only initializes a connection to one at a time, moving on if the connection fails. But I think this is out of scope for RHEL 5.9.
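The one-server-at-a-time approach described above can be sketched in shell. This is only an illustration of the idea, not code from nss_ldap or pam_ldap: the function name `try_servers`, its arguments, and the probe command are all hypothetical. It also assumes timeout(1) from GNU coreutils, which older releases may not ship. The point is that each server gets its own bounded attempt, so a server that accepts the connection but never answers is skipped instead of blocking the whole lookup.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of sequential failover; not taken from nss_ldap.

# try_servers PROBE_CMD LIMIT_SECONDS URI...
# Runs "PROBE_CMD URI" for each URI in order, giving each attempt at most
# LIMIT_SECONDS. Prints the first URI whose probe succeeds and returns 0;
# returns 1 if every probe fails or times out.
# PROBE_CMD must be a single executable, because timeout(1) cannot run
# shell functions.
try_servers() {
    local probe=$1 limit=$2 uri
    shift 2
    for uri in "$@"; do
        # timeout(1) ends a probe that hangs, e.g. against a frozen server
        if timeout "$limit" "$probe" "$uri" >/dev/null 2>&1; then
            printf '%s\n' "$uri"
            return 0
        fi
    done
    return 1
}
```

A probe for the scenario in this report could be a small wrapper script (an assumption, not part of the attached test) that runs something like `ldapwhoami -x -H "$1"`, called as `try_servers ./probe.sh 5 ldap://slave ldap://master`, where the host names are placeholders.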

Comment 6 RHEL Program Management 2012-07-16 13:07:17 UTC
Development Management has reviewed and declined this request.
You may appeal this decision by reopening this request.