Bug 17519
Summary: | nscd deadlocks, halting system activity | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | shuey |
Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED RAWHIDE | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.2 | CC: | drepper, fweimer, shuey |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-10-04 06:50:41 UTC | Type: | --- |
Description
shuey
2000-09-14 22:17:31 UTC
assigned to jakub

I am seeing the same problems on our deployed Red Hat 7.2 servers, again with LDAP as a backend. The related packages (nss_ldap, pam_ldap, glibc) are either from the default Red Hat 7.2 install or, on most of the machines, the latest released Red Hat 7.2 updates. Is any progress being made here? Thanks

We are running Novell eDirectory on a Red Hat 7.3 server. Without nscd the server jams totally. The problem is that nscd is extremely unstable and has to be restarted from crontab about every minute. Here is a snippet of what I see with the "ps fax" command. Not a pretty sight:

 3475 ? S 0:11 /usr/sbin/nscd
 3484 ? Z 0:00 \_ [nscd <defunct>]
 3684 ? S 0:09 /usr/sbin/nscd
 3687 ? Z 0:00 \_ [nscd <defunct>]
 3816 ? S 0:08 /usr/sbin/nscd
 3819 ? Z 0:00 \_ [nscd <defunct>]
 3954 ? S 0:07 /usr/sbin/nscd
 3961 ? Z 0:00 \_ [nscd <defunct>]
 4147 ? S 0:07 /usr/sbin/nscd
 4151 ? Z 0:00 \_ [nscd <defunct>]

We also run nscd with an LDAP backend. We are fortunate in that the nscd daemon frequently dies abnormally but does not deadlock. When it dies it leaves behind /var/run/nscd.pid and /var/run/.nscd_socket, and these need to be removed before nscd can be restarted. I've tried increasing the number of nscd threads and enabling debug logging, but I am still not sure whether these resolve the problem. This happens on both an RH 7.3 box and an RH 7.1 box with the current nscd/glibc errata RPMs.

We've had this problem happen on Red Hat 7.3, Red Hat 8.0, Red Hat ES 2.1, and Red Hat ES 3. We kept our RH 7.3 and 8 systems up to date with patches, and Red Hat Network is keeping our ES 2.1 and ES 3 systems completely up to date, and we're still seeing the problem on multiple different systems. This problem is listed as "ASSIGNED", but that was more than a year ago. What's the holdup? Would nscd debug logs help?

The holdup is that the component is wrong. Someone set this up for some reason, but none of the people responsible for the package even knew it existed. The bug should have been filed against glibc, since this is the package nscd is part of. There is a problem in nscd which is fixed in the current glibc at least. Use FC3t2 or later when it becomes available. Part of the blame is to be laid on the nss_ldap module, which far too often misbehaves. I won't analyze it since I at some point want to eat again.

If you have problems with lockups in FC3, let me know by reopening. But we certainly won't touch any code in RHL9 or earlier, FC1, or FC2.
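
For anyone stuck on an affected release, the cleanup step one commenter describes (nscd dies and leaves /var/run/nscd.pid and /var/run/.nscd_socket behind, which must be removed before a restart) could be sketched roughly as below. This is only an illustration, not something taken from nscd or glibc: the helper names `pid_is_alive` and `remove_stale_nscd_files` are hypothetical, and it assumes the pid file holds a single decimal PID.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False  # never probe process groups or "all processes"
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_nscd_files(pid_file: str, socket_file: str) -> bool:
    """Delete leftover pid/socket files if the recorded daemon is gone.

    Hypothetical helper: returns True when stale files were removed,
    False when the daemon still runs or there is no pid file to check.
    """
    pid_path, sock_path = Path(pid_file), Path(socket_file)
    if not pid_path.exists():
        return False
    try:
        pid = int(pid_path.read_text().strip())
    except ValueError:
        pid = -1  # unreadable contents are treated as stale (assumption)
    if pid_is_alive(pid):
        return False
    pid_path.unlink(missing_ok=True)
    sock_path.unlink(missing_ok=True)
    return True
```

Run as root before starting the daemon again, it would be called as `remove_stale_nscd_files("/var/run/nscd.pid", "/var/run/.nscd_socket")`. It only clears the leftovers; it does nothing about the underlying crash or deadlock.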