Description of problem:
While installing IBM's WebSphere MQ (http://www-306.ibm.com/software/integration/wmq/) on our servers that use LDAP, we discovered that MQ does not function properly when nsswitch.conf lists files in front of LDAP, so we had to put LDAP first. That is not the right way to do it, and we wondered why MQ was the only application to require this. We told IBM that MQ was broken; they said nss_ldap was broken and gave us code to prove it. The attached file ibm.c shows that, with the nss_ldap-189 RPM on RHEL AS/ES/WS 2.1, getgrent calls go into an infinite loop if /etc/nsswitch.conf contains 'group: files ldap' instead of 'group: ldap files'. Compiling and running the code confirms the problem. IBM's internal tracking # for this issue is PMRs 92341,122 (I think). This behavior does not occur in version 207 of nss_ldap, which ships with RHEL 3.

Version-Release number of selected component (if applicable):
nss_ldap-189

How reproducible:
Always.

Steps to Reproduce:
1. Compile the attached code (gcc -o ibm ibm.c)
2. Run the resulting executable.
3. Watch as the results loop until the iteration limit defined on about line 19 is reached (i.e. if( ++n == 10000 ) break;)

Actual results:
The getgrent call loops continuously.

Expected results:
The getgrent call should list all the GIDs in the LDAP directory once and then stop.

Additional info:
I decided to be proactive and search for a solution myself before posting this bug. I took the 207 source and patches from the nss_ldap SRPM that ships with RHEL 3, and the spec file from the 189 SRPM that ships with RHEL 2.1, and merged them so that version 207 is built from the spec file for 189. The resultant SRPM is attached. When tested against the ibm.c code referenced above, the getgrent call functions as expected with 'group: files ldap' in nsswitch.conf.
Since I know it is RH policy not to bump versions of libraries included with distributions but rather to backport fixes, I will attempt to identify the exact fix and backport it to 189, in the hopes that RH will put out another nss_ldap-189 RPM that fixes this (big) problem.
Created attachment 96568: Code from Kurtis D. Rader @ IBM that exposes flaw in nss_ldap-189
The SRPM mentioned in the bug report can be found here: http://used-blues.com/files/nss_ldap-207-4.src.rpm
I've confirmed that versions 189-203 have this problem. Version 204 is the first version in which this behavior is fixed. I am now trying to identify which code change fixed it (if I can).
Update from the good folks at Red Hat: "This apparently happens with RHEL 2.1's glibc if an enumeration function doesn't set *errnop to ENOENT when it returns NSS_STATUS_NOTFOUND. We are not sure that the version of nss_ldap is a factor, as we can reproduce this problem with version 217 but not on later releases. We are still in process of a patch to correct this issue."
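To illustrate the contract mentioned in that update, here is a rough sketch of an NSS enumeration function that sets *errnop to ENOENT before returning NSS_STATUS_NOTFOUND. It assumes glibc's <nss.h>; the module name "demo" and its two in-memory groups are hypothetical, and this is not code from nss_ldap or from the eventual patch.

```c
#include <errno.h>
#include <grp.h>
#include <nss.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory backend with two groups. */
static const char *demo_names[] = { "alpha", "beta" };
static const gid_t demo_gids[] = { 5001, 5002 };
static char demo_passwd[] = "x";
static char *demo_no_members[] = { NULL };
static size_t demo_pos = 0;

/* Rewind the enumeration cursor. */
enum nss_status _nss_demo_setgrent(int stayopen)
{
    (void)stayopen;
    demo_pos = 0;
    return NSS_STATUS_SUCCESS;
}

/* Return the next group, or signal the end of the enumeration. */
enum nss_status _nss_demo_getgrent_r(struct group *result, char *buffer,
                                     size_t buflen, int *errnop)
{
    size_t need;

    if (demo_pos >= sizeof demo_names / sizeof demo_names[0]) {
        /* End of data: set ENOENT together with NOTFOUND, the
         * combination the update says RHEL 2.1's glibc relies on. */
        *errnop = ENOENT;
        return NSS_STATUS_NOTFOUND;
    }

    need = strlen(demo_names[demo_pos]) + 1;
    if (buflen < need) {
        /* Caller's buffer is too small; ask it to retry with more. */
        *errnop = ERANGE;
        return NSS_STATUS_TRYAGAIN;
    }

    memcpy(buffer, demo_names[demo_pos], need);
    result->gr_name = buffer;
    result->gr_passwd = demo_passwd;
    result->gr_gid = demo_gids[demo_pos];
    result->gr_mem = demo_no_members;
    demo_pos++;
    return NSS_STATUS_SUCCESS;
}
```

The point of the sketch is only the first branch: the status code and the errno value are reported together when the enumeration ends.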
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-109.html