186267 – nscd's suggested-size never gets cleaned up

Summary: nscd's suggested-size never gets cleaned up

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	glibc
Sub Component:
Version:	4.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Jakub Jelinek
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	181409

Reported:	2006-03-22 16:50 UTC by Bastien Nocera
Modified:	2007-11-30 22:07 UTC
CC List:	2 users
Fixed In Version:	RHBA-2006-0510
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2006-08-10 21:34:36 UTC
Target Upstream Version:
Embargoed:

Attachments
nscd-dont-use-mremap.2.patch (5.39 KB, patch) 2006-03-22 16:50 UTC, Bastien Nocera

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2006:0510	0	normal	SHIPPED_LIVE	glibc bug fix update	2006-08-09 04:00:00 UTC
Sourceware	1204	0	None	None	None	Never

Description Bastien Nocera 2006-03-22 16:50:29 UTC

When using RHEL4's nscd, the cache isn't pruned and we get a lot of errors such as:
Jan 30 15:04:51 xxxxx nscd: 10846 no more memory for database 'passwd'

The output of nscd -g for the passwd section (which shows that this is not bug
#178940):
---8<---
passwd cache:

           yes  cache is enabled
            no  cache is persistent
           yes  cache is shared
           211  suggested size
        216064  total data pool size
        216040  used data pool size
           600  seconds time to live for positive entries
            20  seconds time to live for negative entries
             0  cache hits on positive entries
             0  cache hits on negative entries
          1286  cache misses on positive entries
             0  cache misses on negative entries
             0% cache hit rate
          2572  current number of cached values
          2572  maximum number of cached values
            23  maximum chain length searched
             0  number of delays on rdlock
             0  number of delays on wrlock
         32464  memory allocations failed
           yes  check /etc/passwd for changes
---8<---
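
For anyone wanting to check their own system for the same symptom, here is a minimal sketch that parses the numeric rows of `nscd -g` output and reports how full the data pool is. The function names `parse_nscd_stats` and `pool_usage` are hypothetical, and the line format is assumed from the excerpt above (a right-aligned number followed by a label); other nscd versions may print things differently.

```python
import re

# A few rows copied from the passwd section quoted in this report.
SAMPLE = """\
           211  suggested size
        216064  total data pool size
        216040  used data pool size
          2572  current number of cached values
         32464  memory allocations failed
"""


def parse_nscd_stats(text):
    """Map each 'NUMBER  label' row to label -> int.

    Rows that start with yes/no (or anything non-numeric) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)%?\s+(.+?)\s*$", line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats


def pool_usage(stats):
    """Fraction of the data pool that is in use (0.0 to 1.0)."""
    return stats["used data pool size"] / stats["total data pool size"]


stats = parse_nscd_stats(SAMPLE)
free_bytes = stats["total data pool size"] - stats["used data pool size"]
per_slot = stats["current number of cached values"] / stats["suggested size"]

print(f"pool usage: {pool_usage(stats):.2%}, free bytes: {free_bytes}")
print(f"cached values per suggested slot: {per_slot:.1f}")
print(f"failed allocations: {stats['memory allocations failed']}")
```

With the figures above, only 24 bytes of the pool are free and there are roughly 12 cached values for every suggested slot, which matches the "no more memory for database 'passwd'" messages.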

The attached patch corresponds to the upstream bug:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=1204
and to (parts of) these commits:
http://sourceware.org/ml/glibc-cvs/2005-q3/msg00482.html
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/nscd/mem.c.diff?cvsroot=glibc&r1=1.6&r2=1.7

The patch triggers another bug, though (not reported upstream, but it also
affects the latest Fedora Core glibc). The client program segfaults when the
cache first reaches the suggested-size; additional runs are fine, and nscd
itself stays alive throughout. Here is the backtrace of the SEGV, with "ls" as
the client program:
#0  __nscd_cache_search (type=GETPWBYUID, key=0xbfe3e4c3 "7606", keylen=5,
   mapped=0x9ebc7e8) at nscd_helper.c:370
370           if (type == here->type && keylen == here->len
(gdb) bt
#0  __nscd_cache_search (type=GETPWBYUID, key=0xbfe3e4c3 "7606", keylen=5,
   mapped=0x9ebc7e8) at nscd_helper.c:370
#1  0x00b488c4 in nscd_getpw_r (key=0xbfe3e4c3 "7606", keylen=5,
   type=GETPWBYUID, resultbuf=0xb7a09c, buffer=0x9ebfb78 "userfoo9819",
   buflen=1024, result=0xbfe3e528) at nscd_getpw_r.c:104
#2  0x00b48bc7 in __nscd_getpwuid_r (uid=2856352412, resultbuf=0x9ebc7e8,
   buffer=0x9ebc7e8 "", buflen=166447080, result=0x9ebc7e8)
   at nscd_getpw_r.c:64
#3  0x00adb8ea in __getpwuid_r (uid=7606, resbuf=0xb7a09c,
   buffer=0x9ebfb78 "userfoo9819", buflen=1024, result=0xbfe3e528)
   at ../nss/getXXbyYY_r.c:162
#4  0x00adb391 in getpwuid (uid=7606) at ../nss/getXXbyYY.c:135
#5  0x08054722 in getuser (uid=7606) at idcache.c:74
#6  0x0804b574 in format_user_width (u=7606) at ls.c:3141
#7  0x0804bd3f in gobble_file (name=0x9ec156f "foofile7606", type=normal,
   explicit_arg=0, dirname=0x9ec04f8 "/root/foo") at ls.c:2609
#8  0x0804e3e9 in print_dir (name=0x9ec04f8 "/root/foo", realname=0x0)
   at ls.c:2272
#9  0x0804faba in main (argc=4, argv=0xbfe3f2a4) at ls.c:1230
#10 0x00a66e23 in __libc_start_main (main=0x804e9d1 <main>, argc=4,
   ubp_av=0xbfe3f2a4, init=0x8056ee8 <__libc_csu_init>,
   fini=0x8056f3c <__libc_csu_fini>, rtld_fini=0xedc690 <_dl_fini>,
   stack_end=0xbfe3f29c) at ../sysdeps/generic/libc-start.c:209
#11 0x08049d11 in _start ()

Comment 1 Bastien Nocera 2006-03-22 16:50:29 UTC

Created attachment 126487 [details]
nscd-dont-use-mremap.2.patch

Comment 3 Bastien Nocera 2006-03-22 16:53:44 UTC

I forgot to mention that Frank Hirtz <fhirtz> provided the fixes to
my original (and broken) backport.

Comment 10 Bob Johnson 2006-04-11 16:48:02 UTC

This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.

Comment 15 Red Hat Bugzilla 2006-08-10 21:34:44 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0510.html
