From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0

Description of problem:
It seems a process where the main thread exits is marked as a zombie, and one is unable to e.g. debug it with gdb -p <pid>

Version-Release number of selected component (if applicable):
glibc-2.3.4-2.fc3, kernel-2.6.9-1.724_FC3

How reproducible:
Always

Steps to Reproduce:
1. Little testcase, compile with gcc -g pt.c -lpthread

    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>

    void* sleeper(void *a)
    {
        for(;;)
            sleep(1000);
        return NULL;
    }

    int main(void)
    {
        pthread_t id;
        pthread_create(&id, 0, sleeper, 0);
        pthread_exit(0);
    }

Run it, and ps -ef shows

    noselasd  8383  5387  0 09:03 pts/2    00:00:00 [a.out] <defunct>

(Doesn't happen on e.g. Solaris, where one now can attach a debugger, all looks fine.)

Additional info:
Well, if you know the tid, you can still attach the debugger to it.
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.
This problem still exists upstream and I'm sure that the update kernel didn't change it.
This is a mass-update to all currently open Fedora Core 3 kernel bugs. Fedora Core 3 support has transitioned to the Fedora Legacy project. Due to the limited resources of this project, typically only updates for new security issues are released. As this bug isn't security related, it has been migrated to a Fedora Core 4 bug. Please upgrade to this newer release, and test if this bug is still present there. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. Thank you.
This issue still exists with current rawhide.
This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.
I'm going to move this to a 'devel' bug, as fixing it may need considerable work, and that is unlikely to happen before FC4 reaches end-of-life.
The kernel side of this was fixed upstream some time ago. The ps display is a procps issue.
Is this really fixed?

    $ ps -ef | grep a.out
    davej    31613 31487  0 16:29 pts/3    00:00:00 [a.out] <defunct>

gdb shows..

    (gdb) attach 31613
    Attaching to process 31613
    ptrace: Operation not permitted.
    (gdb)

Doing it as root seems to make gdb explode...

    (gdb) attach 31613
    Attaching to process 31613
    ../../gdb/linux-nat.c:1057: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.
    A problem internal to GDB has been detected,
    further debugging may prove unreliable.
I think the kernel part is probably fixed. AFAIK, the underlying kernel problem here was that /proc/PID/task could not be read when the initial thread had died. I think all these other problems are ps and gdb being confused. gdb can't attach to the initial thread because it's a zombie, but it should be able to attach to the remaining threads, which it should now be able to find due to /proc/PID/task working. ps is just being confusing; with -m it's less confusing.
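For illustration, here is a minimal sketch of how a tool could find the remaining threads by reading /proc/PID/task, assuming a Linux procfs mounted at /proc. The helper name list_tids is hypothetical, and treating pid 0 as "the calling process" (via /proc/self) is only a convenience of this sketch, not something ps or gdb is known to do.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: collect the thread IDs listed under /proc/<pid>/task.
 * pid == 0 is treated as the calling process (assumption of this sketch).
 * At most max IDs are stored in tids; the return value is the total number
 * of threads found (which may exceed max), or -1 if the directory cannot
 * be opened.
 */
static int list_tids(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    if (pid == 0)
        snprintf(path, sizeof path, "/proc/self/task");
    else
        snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(ent->d_name, &end, 10);
        if (end == ent->d_name || *end != '\0')
            continue;               /* not a number: skips "." and ".." */
        if (count < max)
            tids[count] = (pid_t)tid;
        count++;
    }
    closedir(dir);
    return count;
}
```

With the testcase from the original report, the expectation would be that the zombie initial thread and the sleeper thread both show up in this listing, and that the sleeper's tid is the one to hand to the debugger.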
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it. If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.) Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
This bug has been in NEEDINFO for more than 30 days since feedback was first requested. As a result we are closing it. If you can reproduce this bug in the future against a maintained Fedora version please feel free to reopen it against that version. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp