Description of problem:
After installing Update 4 we encountered several segmentation faults in
sendmail (from Red Hat), nscd (from Red Hat), mimedefang, uxmon (bigsister) and
dsmc (from Tivoli). The main culprit seems to be nscd-2.3.4-2.25, which was
enabled with the default configuration. Disabling the hosts cache in nscd
"cured" the problem (at least I haven't seen a related segfault since).
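For illustration, the workaround could be applied roughly as follows. This is
only a sketch: it assumes the configuration file contains a line of the form
"enable-cache hosts yes" and that GNU sed is available. The function name
disable_hosts_cache is made up for this example and is not part of nscd.

```shell
#!/bin/sh
# Hypothetical helper: switch off the hosts cache in an nscd.conf-style file.
# Only the "enable-cache hosts" line is touched; other caches stay enabled.
disable_hosts_cache() {
    conf="$1"   # path to the config file, e.g. /etc/nscd.conf
    # Replace the trailing "yes" on the "enable-cache hosts" line with "no".
    sed -i "s/^\([[:space:]]*enable-cache[[:space:]]\{1,\}hosts[[:space:]]\{1,\}\)yes/\1no/" "$conf"
}

# Possible usage on a real system (as root):
#   disable_hosts_cache /etc/nscd.conf
#   service nscd restart
```

Working on a copy of the file first makes it easy to diff the result before
restarting nscd.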
Version-Release number of selected component (if applicable):
It happened on three Dell PowerEdge servers (2850 and 1850). One is mainly used
as a web server, the other two as mail servers. Since these are production
servers, testing is a bit difficult. On one machine I specifically installed only
nscd and the kernel update and encountered 2 segfaults without rebooting the
machine and several thousand in sendmail after booting the machine. Three other
i386 machines using the appropriate version have not shown any remotely
comparable problems after installation of Update 4. The segmentation faults
occur mostly after a boot of the machine. The frequency dropped after several
hours of uptime. I am not entirely sure whether this coincided with a crash of
nscd itself, which at least in one case seemed to be the reason.
Steps to Reproduce:
1. Install Red Hat Enterprise Linux 4 AS x86_64 Update 3
2. Enable nscd with default configuration
3. Install Update 4
Actual results:
Segfaults in sendmail, up to sendmail crashing.
Expected results:
No segfaults, as before.
By any chance, could this be related to nscd database growing (i.e. do you have
really many concurrent hosts lookups that the default database size is too
We've just been able to reproduce such an issue today and are still working on
You could try to increase suggested-size hosts to a (much) bigger (prime) value,
say 8191, rm -f /var/db/nscd/hosts and restart nscd to see if that's the case.
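The suggestion above could be scripted along these lines. This is a minimal
sketch, assuming the configuration file already has a "suggested-size hosts N"
line and that GNU sed is available. The function name
set_hosts_suggested_size is hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Hypothetical helper: change the "suggested-size hosts" value in an
# nscd.conf-style file. Other "suggested-size" lines (passwd, group) are
# left alone.
set_hosts_suggested_size() {
    conf="$1"   # path to the config file, e.g. /etc/nscd.conf
    size="$2"   # new hash table size, ideally a prime such as 8191
    # Keep the directive and its whitespace, replace only the number.
    sed -i "s/^\([[:space:]]*suggested-size[[:space:]]\{1,\}hosts[[:space:]]\{1,\}\)[0-9]\{1,\}/\1${size}/" "$conf"
}

# Possible usage on a real system (as root), following the steps above:
#   set_hosts_suggested_size /etc/nscd.conf 8191
#   rm -f /var/db/nscd/hosts
#   service nscd restart
```

Removing the persistent hosts database before the restart matters here, since
otherwise nscd may keep using the existing file with its old size.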
I am currently trying your suggestion on one of our servers.
It is a medium-sized mail server, serving about 30000 mail addresses but not
handling the mailboxes itself. It uses mimedefang and spamassassin for spam
detection, so it sees a fair number of host lookups, but not that many
concurrent ones. Typically there are no more than 30 sendmail processes running
at the same time.
It may take a few hours before I can tell if this makes a difference and will
report back then.
The server has now been running for over 24 hours with the increased hosts cache size
and there has not been a single segfault. So it looks like the database size was
contains a testing glibc that should hopefully fix this problem. Note this
hasn't gone through QA, no guarantees about it.
To test, you'd need to:
a) decrease suggested-size back in nscd.conf
b) rm -f /var/db/nscd/*
c) restart nscd
so that the database keeps growing again.
So I downloaded only nscd-2.3.4-2.27.x86_64.rpm and installed it on one of our
servers with the default configuration.
It took about 30 minutes until the server started logging segfaults again, so it
did not help.
I am not entirely sure whether downloading the entire glibc would make a difference.
Yes, you need not only the new nscd but also the new glibc. Most of the changes
were on the glibc side (in libc.so.6) and affect the applications that connect
to nscd; only one fix was in nscd itself.
Sorry for the misunderstanding on my part.
I have now replaced the entire glibc with the patched version and booted the
machine. I will report back later about success or failure.
The server has now been running 30 hours without a single segfault, so it is
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.