78695 – nscd quits servicing requests

Summary: nscd quits servicing requests

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	glibc
Sub Component:
Version:	7.3
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Jakub Jelinek
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-11-27 18:10 UTC by Marc Wallman
Modified:	2016-11-24 15:22 UTC
CC List:	5 users
Fixed In Version:	2.3.3-65
Clone Of:
Environment:
Last Closed:	2004-10-06 06:10:48 UTC
Embargoed:

Description Marc Wallman 2002-11-27 18:10:45 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827

Description of problem:
NSCD quits servicing requests on one of our imap servers. When this happens, the
server becomes totally unusable, the load starts to skyrocket, and any processes
waiting on a request from nscd simply wait and do not exit (e.g. we keep
spawning new imap processes until we hit the maximum number specified in the
xinetd config file). The problem tends to occur on weekday mornings as our
server usage begins to increase. Under normal conditions, the load on this box
is always under 4.

An strace on a command like ps -l shows that the command is waiting to read
from the nscd socket in /var/run. The problem occurred with great frequency,
every morning, until I disabled nscd. All packages on our box are 100% up to
date. An strace on nscd shows that the daemon is waiting on a read (of what I
don't know, since I have only run strace after the nscd process had hung).
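
A hang of this kind can be detected from the outside without attaching strace:
run a name lookup under a timeout and treat a timeout as a sign that the lookup
path (nscd or whatever backs it) has stopped answering. The following is a
minimal sketch, not something taken from this report; the function name
lookup_responds, the default getent command, and the 5-second limit are all
assumptions chosen for illustration.

```python
import subprocess


def lookup_responds(cmd=("getent", "passwd", "root"), timeout_s=5.0):
    """Return True if cmd exits within timeout_s seconds, False if it hangs.

    The exit status is deliberately ignored: a lookup that fails promptly
    is not the hang described in this report.
    """
    try:
        subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return False
    return True
```

Run from cron or a monitoring agent, a False result could raise an alert or
trigger a restart of nscd; which action is appropriate depends on the site.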

The problem occurred with both nscd-2.2.5-42 and nscd-2.2.5-39.

A similar problem appears to have been reported in bugs 17519 and 13308.

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
I can't reproduce the conditions, but when it runs in production it happens
regularly--almost every weekday between 10am and 12:30pm.

Additional info:

Hardware info:
CPU: 2x1.26Ghz PIII
RAM: 1Gb
Disk: 2x36G (Raid 1), 34G of SAN storage accessed via a Qlogic 2300 HBA

Comment 1 zweers 2003-03-26 19:28:04 UTC

This problem should be a higher priority.

I've noticed a similar problem on two of our servers.  We are converting our
authentication over to use LDAP.  If the LDAP servers disappear for too long a
period of time, nscd stops responding at all, which in turn locks up the server.

This is a very serious problem, as it can affect production environments and can
quickly cause what should be a simple problem to bring down every server.

I did not see if the original problem included interactions with LDAP, but I
wonder if it also depends on a remote server.
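
Whether a hang coincides with the directory server being unreachable can be
checked with a plain TCP connection attempt. This is a minimal sketch under
assumptions: the function name backend_reachable and the 3-second timeout are
hypothetical, 389 is the standard LDAP port, and the address passed in would be
that of your own LDAP server. It only tests that the port accepts connections,
not that LDAP queries succeed.

```python
import socket


def backend_reachable(host, port=389, timeout_s=3.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures
        # and connect timeouts.
        return False
```

For example, backend_reachable("192.0.2.10") probes a placeholder address.
Passing an IP address rather than a host name is deliberate: the timeout does
not cover name resolution, which may itself go through the stalled lookup path.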

Comment 2 Eric Doutreleau 2003-06-11 09:32:10 UTC

I have exactly the same problem.
nscd dies, and xemacs processes then eat all the memory on my server.

Comment 3 Adam Lynch 2003-09-10 15:25:03 UTC

We're seeing this issue as well. We're running nscd 2.2.5-43 and OpenLDAP
2.0.27-2.7.3. 

All system processes go to sleep and no new processes are spawned. The system
is, for all intents and purposes, locked up. The only way for us to recover the
system is a reboot.

Comment 4 Ulrich Drepper 2004-10-06 06:10:48 UTC

(The component was wrong; it should have been glibc, not nscd, which is
why this bug went unnoticed.)

There have been countless changes since 7.3 and glibc 2.2.5.  With
glibc-2.3.3-64 and up I don't expect any problems, even with the
nss_ldap module.  Upgrade to this or a later version when
available and retest.  Open new bugs for all negative findings.  This
bug is outdated and therefore I am closing it.
