Description of problem:
nss_ldap-253-7.el4 fixed an issue with leaking file descriptors (see bug 491419). However, it still leaks file descriptors even with this patch, although less frequently.

Version-Release number of selected component (if applicable):
nss_ldap-253-7.el4
nscd-2.3.4-2.43

How reproducible:
sometimes

Steps to Reproduce:
1. install RHEL 4.8 and configure as an LDAP client with START_TLS
2. enable nscd
3. ???

Actual results:
lsof shows the leaking file descriptors as "can't identify protocol" and the number of these sockets slowly climbs up to the max (1024).

$ lsof -p `pidof nscd`
...
nscd    10938 nscd   12u  sock  0,4  1751911 can't identify protocol
nscd    10938 nscd   13u  sock  0,4  1760055 can't identify protocol
nscd    10938 nscd   15u  sock  0,4  1765674 can't identify protocol
...

Expected results:
no leaks

Additional info:
The patch used to resolve bug 491419 is from PADL BZ 304:
http://bugzilla.padl.com/show_bug.cgi?id=304

There is an updated version of the patch in PADL BZ 305:
http://bugzilla.padl.com/show_bug.cgi?id=305

The updated patch is not yet included in upstream nss_ldap.
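The growth described under "Actual results" can also be watched without lsof. The following is a minimal sketch, assuming a Linux /proc filesystem and sufficient privileges to read another process's fd directory; count_socket_fds is a hypothetical helper written for this report, not part of nss_ldap or nscd.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the descriptors of process `pid` that are
 * sockets, by reading the symlinks in /proc/<pid>/fd (Linux-specific).
 * Returns -1 if the directory cannot be opened.
 */
static int
count_socket_fds (pid_t pid)
{
  char dirpath[64];
  char linkpath[512];
  char target[512];
  struct dirent *ent;
  DIR *dir;
  int count = 0;

  snprintf (dirpath, sizeof (dirpath), "/proc/%ld/fd", (long) pid);
  dir = opendir (dirpath);
  if (dir == NULL)
    return -1;

  while ((ent = readdir (dir)) != NULL)
    {
      ssize_t len;

      if (ent->d_name[0] == '.')
        continue;               /* skip "." and ".." */
      snprintf (linkpath, sizeof (linkpath), "%s/%s", dirpath, ent->d_name);
      len = readlink (linkpath, target, sizeof (target) - 1);
      if (len < 0)
        continue;               /* descriptor vanished while scanning */
      target[len] = '\0';
      /* Socket descriptors resolve to "socket:[inode]". */
      if (strncmp (target, "socket:", 7) == 0)
        count++;
    }
  closedir (dir);
  return count;
}
```

Calling count_socket_fds with the nscd PID at intervals and logging the result would show whether the count keeps climbing toward the 1024 limit.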
Created attachment 389868
updated patch to fix file descriptor leaks

This patch contains the differences between the original patch (PADL BZ 304) and the updated patch (PADL BZ 305). It can be applied on top of nss_ldap-250-fix-fdleak.patch from nss_ldap-253-7.el4.
Other notes:

1. The Debian bug on this issue includes this comment:

    in the case ssl connections are in use it's just totally broken and
    can't be fixed. yay. (however thanks to fixing the do_get_our_socket
    code the drop code is rarely called in the dangerous manner.)

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=401758

Using 'ssl start_tls' in /etc/ldap.conf may be part of the root cause.

2. From PADL BZ 305:

    so basically testing getsockname/getpeername after having received an
    error from read/write is flawed... the real fix is going to be
    monitoring the ldap library results more closely and closing the
    socket proactively after an error.

http://bugzilla.padl.com/show_bug.cgi?id=305#c2

The patch in bug 491419 also mentions this problem and uses getpeername():

+ /*
+  * XXX: We don't pay any attention to return codes in places such as
+  * do_search_s so we never observe when the other end has disconnected
+  * our socket. In that case we'll get an ENOTCONN error here... and
+  * it's best we ignore the error -- otherwise we'll leak a filedescriptor.
+  * The correct fix would be to test error codes in many places.
+  */
+ else if (getpeername (*sd, (struct sockaddr *) &peername, &peernamelen) != 0)
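The point made in PADL BZ 305 and in the XXX comment can be illustrated in isolation: a socket whose peer is gone still occupies a descriptor until close() is called, so a getpeername() failure with ENOTCONN is a reason to close the descriptor, not to skip it. The following is a minimal sketch of that idea only. It is not the nss_ldap patch, and sd_peer_state and sd_close_if_dead are hypothetical helper names.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nss_ldap): returns 1 if sd has a
 * connected peer, 0 if getpeername() fails with ENOTCONN, and -1 on any
 * other failure such as EBADF or ENOTSOCK.
 */
static int
sd_peer_state (int sd)
{
  struct sockaddr_storage peername;
  socklen_t peernamelen = sizeof (peername);

  if (getpeername (sd, (struct sockaddr *) &peername, &peernamelen) == 0)
    return 1;
  return (errno == ENOTCONN) ? 0 : -1;
}

/*
 * Hypothetical helper: if the socket has no peer, close it and mark the
 * descriptor invalid so that it is neither leaked nor closed twice.
 */
static void
sd_close_if_dead (int *sd)
{
  if (*sd >= 0 && sd_peer_state (*sd) == 0)
    {
      close (*sd);
      *sd = -1;
    }
}
```

As the PADL comment notes, a check like this only detects the disconnect after the fact; testing the return codes of the LDAP library calls and closing the connection at that point would be the more complete fix.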
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Under certain circumstances, the nss_ldap module may have caused the application to leak file descriptors. As a result, the number of open sockets could reach the maximum limit of 1024. With this update, this error has been fixed, and file descriptors are no longer leaking.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Under certain circumstances, the nss_ldap module may have caused the application to leak file descriptors. As a result, the number of open sockets could reach the maximum limit of 1024. With this update, this error has been fixed, and file descriptors are no longer leaking.
+Under certain circumstances, the nss_ldap module may have caused the application to leak file descriptors. As a result, the number of open sockets could have reached the maximum limit of 1024. With this update, this error is fixed, and file descriptors are no longer leaking.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Under certain circumstances, the nss_ldap module may have caused the application to leak file descriptors. As a result, the number of open sockets could have reached the maximum limit of 1024. With this update, this error is fixed, and file descriptors are no longer leaking.
+Under certain circumstances, nss_ldap may have leaked file descriptors. As a result, the number of open sockets could have reached the maximum limit of 1024. With this update, nss_ldap no longer causes applications to leak file descriptors.
An advisory has been issued which should help resolve the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0239.html