glibc-2.3.2-95.37

The attached test case segfaults when pthread_kill() is called on a nonexistent thread.

#0 __pthread_kill (threadid=3076430768, signo=0) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:36
36 if (INVALID_TD_P (pd))
(gdb) bt
#0 __pthread_kill (threadid=3076430768, signo=0) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:36
#1 0x0804853e in is_dead (tid=3076430768) at test.c:16
#2 0x0804858e in all_dead (num_threads=10, thread_list=0xbfffbd70) at test.c:31
#3 0x08048656 in main () at test.c:62
Created attachment 119662 [details] test.c
That's expected behaviour with NPTL. pthread_t is just a cookie. In LinuxThreads it used to be a small integer from 0 to the maximum number of threads, so it was fairly cheap to test whether the cookie was valid and belonged to a running thread. In NPTL pthread_t is a pointer to the thread's control structure, which is freed either after the thread has been pthread_join'ed or, for a detached thread, after it has exited. Accessing it afterwards is similar to passing an fclosed FILE * to fwrite, etc. POSIX usually says that function XXX may fail if: [ESRCH] No thread could be found corresponding to that specified by the given thread ID. This means an implementation is allowed to report that as an error, but is not required to do so, so both the LinuxThreads and the NPTL behaviour are correct. There are a few places where POSIX used to say "shall fail" rather than "may fail" when talking about pthread_t (or other cookie) validity checking, but those places have either already been corrected in the standard, or the fixes for them are pending.
Jakub, the opengroup spec states "shall fail" instead of "may fail". Is there a place I can check to verify the new wording is in the queue to be corrected? Reference: http://www.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html J
The standard in no place requires that invalid thread descriptors are recognized. pthread_kill is a weird function in that it is modelled after kill. But the assumptions which can be made about the way the target is specified cannot be carried forward to pthread_kill. There has been a lot of discussion around this over the years; you can find it if you look for it. The result was (so far) that the text stays as is, although it is recognized to be problematic. If you look at implementations, the thread IDs (just like PIDs) can be reused, and this reuse is completely beyond the control of the program or any user. This means there *never* is a guarantee that any ID passed to kill or pthread_kill is invalid, since some process/thread might have been created (maybe without direct control of the program) and then the kill/pthread_kill would find a correct target (and do something bad). I.e., pthread_kill and kill can really only ever be used correctly if the target is known to exist. There will be no change.
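For illustration only, here is a minimal sketch of how a program could track thread liveness itself instead of probing with pthread_kill(tid, 0). It is not the attached test.c; struct worker, worker_main, all_finished and run_workers are hypothetical names, and the sketch assumes C11 atomics are available. Each thread sets its own flag just before returning, and the pthread_t is only ever used while the thread is known to be unjoined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WORKERS 16

/* Hypothetical per-thread record: liveness is tracked by the program,
 * never inferred from the pthread_t value. */
struct worker {
    pthread_t tid;
    atomic_bool finished;   /* set by the thread just before it returns */
    bool joinable;          /* true between pthread_create and pthread_join */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    /* ... real work would go here ... */
    atomic_store(&w->finished, true);
    return NULL;
}

/* Replacement for a pthread_kill(tid, 0) probe: read the flags only. */
static bool all_finished(struct worker *ws, int n)
{
    for (int i = 0; i < n; i++)
        if (!atomic_load(&ws[i].finished))
            return false;
    return true;
}

/* Starts n workers, joins them, and returns the number joined,
 * or -1 if some worker never reported completion. */
int run_workers(int n)
{
    struct worker ws[MAX_WORKERS];
    int joined = 0;

    assert(n >= 0 && n <= MAX_WORKERS);

    for (int i = 0; i < n; i++) {
        atomic_init(&ws[i].finished, false);
        ws[i].joinable =
            pthread_create(&ws[i].tid, NULL, worker_main, &ws[i]) == 0;
    }

    for (int i = 0; i < n; i++) {
        /* tid is still valid here: the thread has not been joined yet. */
        if (ws[i].joinable && pthread_join(ws[i].tid, NULL) == 0) {
            ws[i].joinable = false;   /* tid must not be used after this */
            joined++;
        }
    }

    return all_finished(ws, n) ? joined : -1;
}
```

A caller that needs to poll while the threads are still running can call all_finished() at any time, since it touches only the flags and never the pthread_t values.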
I think the fact that "shall" is used here is causing some confusion and disagreement: http://www.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html If "shall" should be "may", why is it not so on the opengroup site? It seems the expectation of ESRCH is set there, instead of what we get, a SIGSEGV. J