Bug 165256 - nscd segfaults on large UID lookup
Summary: nscd segfaults on large UID lookup
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-08-05 22:12 UTC by Rudi Chiarito
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version: 2.3.5-10.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-08-08 13:59:21 UTC
Type: ---
Embargoed:



Description Rudi Chiarito 2005-08-05 22:12:06 UTC
Description of problem:
As mentioned in #163538, I am seeing this even in the latest FC4 testing
glibc/nscd (glibc-2.3.5-10.2):

30677: Reloading "14339447" in password cache!
*** Segmentation fault
Register dump:

 EAX: 00000001   EBX: 00e7aca0   ECX: 0000008c   EDX: 00000005
 ESI: b726b3b8   EDI: 6e54a504   EBP: b7067db4   ESP: b7067bac

 EIP: 00e72836   EFLAGS: 00010a13

 CS: 0073   DS: 007b   ES: 007b   FS: 0000   GS: 0033   SS: 007b

 Trap: 0000000e   Error: 00000004   OldMask: 00000000
 ESP/signal: b7067bac   CR2: 6e54a518

Backtrace:
/lib/libSegFault.so[0x483115]
[0x1ad420]
nscd[0xe6d6a0]
/lib/libpthread.so.0[0xdb7b80]
/lib/libc.so.6(__clone+0x5e)[0x6d99ae]

The UID is legit... just quite a bit higher than ordinary UIDs. It smells like
#163538 to me, hence my questions there.

Version-Release number of selected component (if applicable):
$ rpm -q glibc
glibc-2.3.5-10.2
$ rpm -q nscd
nscd-2.3.5-10.2
$ rpm -q glibc-debuginfo
glibc-debuginfo-2.3.5-10.2

How reproducible:
Always

Steps to Reproduce:
1. Start nscd
2. Run any program that involves a UID lookup
  
Actual results:
nscd crashes

Expected results:
No crash

Additional info:
I had glibc-debuginfo already installed at the time of my post in #163538, as
in the rpm output above, but still got no meaningful stack trace. I don't see
the same crash on FC3 systems with the same nscd.conf and nsswitch.conf (passwd:
files ldap and shadow: files ldap).

Comment 1 Jakub Jelinek 2005-08-06 07:54:35 UTC
Have you wiped the old /var/db/nscd/* cache after you upgraded from 2.3.5-10?
nscd-2.3.5-10 (the original FC4 nscd) was miscompiled, so it is possible that it
created broken persistent cache files.  When this happens, even a
fixed nscd crashes on them (a database checker for nscd databases is still a work
in progress).
The above libSegFault.so output is not very useful, as nscd is a
PIE and the addresses in the dump depend on where the binary was loaded.  I can
only guess at the instruction where it crashed; the backtrace most
probably has the 3rd frame in nscd_run right after the call to prune_cache,
but it might very well be in gc (which would support the theory of broken
database files from 2.3.5-10).
So, can you please remove /var/db/nscd/* (after making a backup copy) and
if you can reproduce the problem with 2.3.5-10.2 even after that, try to
reproduce it with
gdb --args nscd -d
and find more details?

Comment 2 Rudi Chiarito 2005-08-08 13:59:21 UTC
I tried running nscd under gdb as you suggested and I got these two dumps:

19575: Reloading "99" in password cache!
19575: remove GETPWBYNAME entry "pcap"
19575: remove GETPWBYUID entry "77"

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1221579856 (LWP 19601)]
0x0036332f in gc (db=0x36b040) at mem.c:354
354                          && (*next_data)->packet == off_alloc);


19628: remove GETGRBYGID entry "47"

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1223939152 (LWP 19656)]
0x0025c836 in prune_cache (table=0x265140, now=1123507878) at cache.c:242
242               struct datahead *dh = (struct datahead *) (data + runp->packet);

I then removed the cache files and, voilà, it works again. Thanks for the help,
I should have thought of that.

Marking as fixed in 2.3.5-10.2; I guess you might want to change it to
FUTURERELEASE if you think the problem is more satisfactorily fixed by the
checker, whenever that is going to be merged in.


Comment 3 Jakub Jelinek 2005-08-09 07:54:46 UTC
rawhide glibc (2.3.90-8) includes an nscd persistent database verifier, which is
run on nscd startup.  If the database is corrupted, nscd will remove it and
recreate it from scratch.  If this works well in rawhide, it will eventually be
backported to FC4 and maybe FC3 as well.
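
For illustration only, here is a minimal sketch of the kind of check such a verifier could make. It is not the actual glibc code. The struct layouts and the names hashentry_sketch, datahead_sketch, range_ok and chain_ok are hypothetical simplifications. The idea is to bounds-check each stored offset before it is used in an expression like data + runp->packet, which is where the crash in comment 2 occurred.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for nscd's on-disk structures.
   The real definitions in the glibc nscd sources are different. */
typedef uint32_t ref_t;
#define ENDREF UINT32_MAX          /* assumed end-of-chain marker */

struct hashentry_sketch {
    ref_t next;                    /* offset of next chain entry, or ENDREF */
    ref_t packet;                  /* offset of the associated data record */
};

struct datahead_sketch {
    uint32_t allocsize;            /* total size of the record in bytes */
};

/* True if the byte range [off, off + len) lies inside the data area. */
static bool range_ok(size_t data_size, ref_t off, size_t len)
{
    return off <= data_size && len <= data_size - off;
}

/* Walk one hash chain and validate every offset before dereferencing it.
   Returns false for out-of-range offsets, bogus record sizes, or cycles. */
static bool chain_ok(const unsigned char *data, size_t data_size, ref_t head)
{
    size_t steps = 0;
    size_t max_steps = data_size / sizeof(struct hashentry_sketch) + 1;
    ref_t off = head;

    while (off != ENDREF) {
        struct hashentry_sketch he;
        struct datahead_sketch dh;

        if (++steps > max_steps)
            return false;          /* more entries than can fit: a cycle */
        if (!range_ok(data_size, off, sizeof he))
            return false;
        memcpy(&he, data + off, sizeof he);

        if (!range_ok(data_size, he.packet, sizeof dh))
            return false;
        memcpy(&dh, data + he.packet, sizeof dh);

        if (dh.allocsize < sizeof dh
            || !range_ok(data_size, he.packet, dh.allocsize))
            return false;

        off = he.next;
    }
    return true;
}
```

A caller following the behaviour described above would run a check like chain_ok over every hash chain at startup and, on the first failure, discard the database file and recreate it instead of trusting its contents.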


