Description of problem:
nscd crashes after some amount of uptime with a SIGABRT. It's quite repeatable - it *always* dies after 10-30 minutes.

Version-Release number of selected component (if applicable):
nscd-2.7.90-4

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I attached gdb to it and waited for a crash, and it said:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x4082c950 (LWP 6498)]
0x00007fc883769f95 in raise () from /lib64/libc.so.6
(gdb) where
#0 0x00007fc883769f95 in raise () from /lib64/libc.so.6
#1 0x00007fc88376ba40 in abort () from /lib64/libc.so.6
#2 0x00007fc88376310f in __assert_fail () from /lib64/libc.so.6
#3 0x00007fc884934a75 in gc () from /usr/sbin/nscd
#4 0x00007fc884933507 in prune_cache () from /usr/sbin/nscd
#5 0x00007fc884929430 in nscd_run_prune () from /usr/sbin/nscd
#6 0x00007fc8842e7447 in start_thread () from /lib64/libpthread.so.0
#7 0x00007fc88380fbdd in clone () from /lib64/libc.so.6

Running 'nscd --debug' ends with:

7409: GETFDPW
7409: provide access to FD 6, for passwd
7409: handle_request: request received (Version = 2) from PID 7789
7409: GETFDPW
7409: provide access to FD 6, for passwd
7409: Reloading "valdis" in password cache!
7409: remove GETPWBYUID entry "967"
7409: remove GETPWBYNAME entry "valdis"
nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.

Another run:

7822: handle_request: request received (Version = 2) from PID 8253
7822: GETFDPW
7822: provide access to FD 6, for passwd
7822: Reloading "0" in password cache!
7822: Reloading "valdis" in password cache!
7822: remove GETPWBYUID entry "967"
7822: remove GETPWBYNAME entry "valdis"
nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.
Aborted
Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
*** Bug 443388 has been marked as a duplicate of this bug. ***
*** Bug 445656 has been marked as a duplicate of this bug. ***
I get the same results using nscd under F9/i386:

May 19 12:35:34 mgmt1 kernel: nscd[16709]: segfault at fffc1ea0 ip b80a9025 sp ad4c1d88 error 6 in nscd[b8098000+1e000]
May 19 14:51:23 mgmt1 kernel: nscd[11695]: segfault at ec80d426 ip b80d2b82 sp ad0e6c14 error 5 in nscd[b80c2000+1e000]
May 19 16:00:38 mgmt1 kernel: nscd[3555]: segfault at b81ff0fc ip b8011178 sp ad428e08 error 4 in nscd[b8000000+1e000]
May 19 21:49:37 mgmt1 kernel: nscd[6838]: segfault at fffe7e90 ip 0021bf46 sp ad187e68 error 6 in libc-2.8.so[1a5000+163000]
May 20 01:43:15 mgmt1 kernel: nscd[3736]: segfault at 18 ip b80272fd sp ad43eef4 error 4 in nscd[b8016000+1e000]
May 20 12:11:38 mgmt1 kernel: nscd[15682]: segfault at b7f1fec2 ip b7f20a4f sp ad135ea0 error 7 in nscd[b7f0f000+1e000]
May 20 13:17:58 mgmt1 kernel: nscd[9917]: segfault at 15a0b22c ip b7f89b82 sp aaccec14 error 4 in nscd[b7f79000+1e000]
May 20 23:05:36 mgmt1 kernel: nscd[31898]: segfault at 12acf62d ip b805e236 sp ad275078 error 4 in nscd[b804e000+1e000]
May 21 01:01:01 mgmt1 kernel: nscd[25308]: segfault at 1773c42c ip b8067b82 sp ad07ac14 error 4 in nscd[b8057000+1e000]
May 21 11:39:20 mgmt1 kernel: nscd[20512]: segfault at 18 ip b7f632fd sp ab3afec4 error 4 in nscd[b7f52000+1e000]
May 21 14:24:25 mgmt1 kernel: nscd[2447]: segfault at fffe50f8 ip b80172de sp ad42ee18 error 4 in nscd[b8006000+1e000]
May 21 15:23:36 mgmt1 kernel: nscd[6218]: segfault at ed0ffedc ip b7f0a2c6 sp ad11fe2c error 5 in nscd[b7ef9000+1e000]
May 22 14:40:12 mgmt1 kernel: nscd[26487]: segfault at ffffe0f8 ip b7f102de sp ad327df4 error 4 in nscd[b7eff000+1e000]
May 22 15:49:21 mgmt1 kernel: nscd[3267]: segfault at ca8 ip b7ff5185 sp ab23fe38 error 4 in nscd[b7fe4000+1e000]
May 23 17:08:37 mgmt1 kernel: nscd[3308]: segfault at f777124c ip b7fe121e sp ad1f8078 error 5 in nscd[b7fd1000+1e000]
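Because the nscd mapping sits at a different base address in each crash, the raw ip values above are hard to compare by eye. A rough Python sketch like the one below subtracts the mapping base from the ip to get an offset, which should show whether the crashes cluster at a few places in the binary. The function names (parse_segfault, count_crash_sites) are made up for illustration. The error field is decoded using the x86 page-fault bit layout (bit 0: page was present, bit 1: write access, bit 2: user mode). The offset is relative to the start of the mapping the kernel printed; treating it as an offset into the binary on disk is an assumption.

```python
import re
from collections import Counter

# Matches the x86 kernel segfault message format seen in the log above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        "offset": ip - base,          # ip relative to the start of the mapping
        "present": bool(err & 1),     # bit 0: fault on a present page (protection)
        "write": bool(err & 2),       # bit 1: faulting access was a write
        "user": bool(err & 4),        # bit 2: fault happened in user mode
    }


def count_crash_sites(lines):
    """Count crashes per (object, offset) pair across many log lines."""
    sites = Counter()
    for line in lines:
        info = parse_segfault(line)
        if info is not None:
            sites[(info["object"], hex(info["offset"]))] += 1
    return sites


if __name__ == "__main__":
    # Three lines copied from the log above.
    sample = [
        "May 19 12:35:34 mgmt1 kernel: nscd[16709]: segfault at fffc1ea0 ip b80a9025 sp ad4c1d88 error 6 in nscd[b8098000+1e000]",
        "May 19 14:51:23 mgmt1 kernel: nscd[11695]: segfault at ec80d426 ip b80d2b82 sp ad0e6c14 error 5 in nscd[b80c2000+1e000]",
        "May 20 13:17:58 mgmt1 kernel: nscd[9917]: segfault at 15a0b22c ip b7f89b82 sp aaccec14 error 4 in nscd[b7f79000+1e000]",
    ]
    for (obj, offset), n in count_crash_sites(sample).most_common():
        print(f"{obj}+{offset}: {n} crash(es)")
```

Feeding the whole log through count_crash_sites would show which offsets repeat. With a matching debuginfo package installed, those offsets could then be looked up in gdb.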
You should change the Hardware tag to "all" instead of x86_64: I am seeing the same problem on i386, as are Andrew (comment #4) and the reporter of one of the bugs marked as a duplicate of this one (Bug 443388). Otherwise we should unmark the duplicates and continue with two bug reports (one per architecture).
This is the info I get from gdb:

29408: handle_request: request received (Version = 2) from PID 29633
29408: GETFDHST
29408: provide access to FD 11, for hosts
29408: handle_request: request received (Version = 2) from PID 29633
29408: GETFDPW
29408: provide access to FD 7, for passwd
29408: handle_request: request received (Version = 2) from PID 29633
29408: GETFDGR
29408: provide access to FD 9, for group
29408: handle_request: request received (Version = 2) from PID 29633
29408: GETFDHST
29408: provide access to FD 11, for hosts
29408: handle_request: request received (Version = 2) from PID 29634
29408: GETFDPW
29408: provide access to FD 7, for passwd
29408: handle_request: request received (Version = 2) from PID 29634
29408: GETFDGR
29408: provide access to FD 9, for group
29408: handle_request: request received (Version = 2) from PID 29634
29408: GETFDHST
29408: provide access to FD 11, for hosts
29408: handle_request: request received (Version = 2) from PID 29625
29408: GETPWBYUID (0)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xaf86fb90 (LWP 29417)]
0xb80adb82 in cache_search () from /usr/sbin/nscd

Is there a way to build an nscd-debuginfo package? With rpm --rebuild glib.... I only get the glib-debuginfo packages.
(In reply to comment #6)
<snip>
> Is there a way to build an nscd-debuginfo package? With rpm --rebuild glib....
> I only get the glib-debuinfo packages.

That's because the packages you're looking for are called "glibc", not "glib". Debuginfo packages are most likely already available. See: http://fedoraproject.org/wiki/StackTraces
A little more info: I have several systems using LDAP + nscd on 32-bit x86 hardware. The most heavily loaded system seems to crash most often. nscd pegs the CPU at 100% just before crashing. Bug 448916 may be a duplicate of this bug.
I'm seeing this too on x86_64. Not sure if this is related, but I also started seeing a bunch of SELinux denials (I'm running in permissive mode) shortly thereafter, which I think was after the next reboot. The denial always pops up when NetworkManager connects, and sometimes nscd crashes around this time. I'm using local authentication only.
Should work fine in the current release.