Bug 169995

Summary: Crash running pthread_kill on dead thread
Product: Red Hat Enterprise Linux 3 Reporter: Bastien Nocera <bnocera>
Component: glibc Assignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: benl, drepper.fsp
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-10-06 18:54:05 EDT Type: ---
Bug Depends On:    
Bug Blocks: 170445    
Attachments:
Description Flags
test.c none

Description Bastien Nocera 2005-10-06 06:36:06 EDT
glibc-2.3.2-95.37

The attached test case segfaults when pthread_kill() is run on a nonexistent thread.

#0  __pthread_kill (threadid=3076430768, signo=0)
   at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:36
36        if (INVALID_TD_P (pd))
(gdb) bt
#0  __pthread_kill (threadid=3076430768, signo=0)
   at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:36
#1  0x0804853e in is_dead (tid=3076430768) at test.c:16
#2  0x0804858e in all_dead (num_threads=10, thread_list=0xbfffbd70)
   at test.c:31
#3  0x08048656 in main () at test.c:62
Comment 1 Bastien Nocera 2005-10-06 06:36:07 EDT
Created attachment 119662
test.c
Comment 2 Jakub Jelinek 2005-10-06 18:54:05 EDT
That's expected behaviour with NPTL.
pthread_t is just a cookie, in LinuxThreads it used to be a small integer from
0 to maximum number of threads, where it was fairly cheap to test whether
the cookie is valid and belongs to a running thread or not.
In NPTL pthread_t is a pointer to the thread's control structure, which is
freed either after the thread has been pthread_join'ed, or when it exits
in detached state.  Accessing it afterwards is similar to passing an fclose'd
FILE * to fwrite, etc.
POSIX usually says that function XXX may fail if:
[ESRCH] No thread could be found corresponding to that specified by the given
thread ID.
This means an implementation is allowed to report that as an error, but is not
required to do so, so both the LinuxThreads and NPTL behaviours are correct.
There are a few places where POSIX used to say "shall fail" rather than
"may fail" when talking about pthread_t (or other cookie) validity checking,
but those places have either already been corrected in the standard, or fixes
for them are pending.
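For readers landing here with the same crash: one way to avoid probing a possibly freed pthread_t is to have the application track liveness itself. The following is only a minimal sketch under that assumption. It is not the attached test.c, and the names struct worker, worker_start, worker_is_done, worker_reap and run_workers are hypothetical. The flag is set by the thread just before it returns, so "done" means "finished its work", and the pthread_t is only ever passed to pthread_join, exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Hypothetical per-thread record: the application, not the pthread_t,
   remembers whether the thread has finished. */
struct worker {
    pthread_t   tid;
    atomic_bool done;    /* set by the thread itself just before it returns */
    bool        joined;  /* set by the owner once pthread_join succeeded */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    /* ... real work would go here ... */
    atomic_store(&w->done, true);
    return NULL;
}

static int worker_start(struct worker *w)
{
    atomic_init(&w->done, false);
    w->joined = false;
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Stand-in for a pthread_kill(tid, 0) probe: reads only our own flag
   and never touches w->tid. */
static bool worker_is_done(struct worker *w)
{
    return atomic_load(&w->done);
}

/* Join at most once; after a successful join w->tid is stale and must
   not be passed to any pthread function again. */
static int worker_reap(struct worker *w)
{
    if (w->joined)
        return 0;
    int rc = pthread_join(w->tid, NULL);
    if (rc == 0)
        w->joined = true;
    return rc;
}

/* Start n workers (n <= MAX_WORKERS), join them all, and report whether
   every worker's flag is set afterwards. */
static bool run_workers(size_t n)
{
    struct worker pool[MAX_WORKERS];
    bool all_done = true;

    assert(n <= MAX_WORKERS);
    for (size_t i = 0; i < n; i++) {
        int rc = worker_start(&pool[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (size_t i = 0; i < n; i++) {
        int rc = worker_reap(&pool[i]);
        assert(rc == 0);
        (void)rc;
        all_done = all_done && worker_is_done(&pool[i]);
    }
    return all_done;
}
```

Calling run_workers(10) mirrors the ten threads visible in the backtrace. The flag can be polled at any time without risk, because it lives in memory the application owns rather than in the thread's control structure.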
Comment 3 Johnray Fuller 2005-10-07 13:25:19 EDT
Jakub, the opengroup spec states "shall fail" instead of "may fail". Is there a
place I can check to verify the new wording is in the queue to be corrected?

Reference:

http://www.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html

J
Comment 4 Ulrich Drepper 2005-10-08 12:08:33 EDT
The standard nowhere requires that invalid thread descriptors be
recognized.  pthread_kill is a weird function in that it is modelled after kill.
But the assumptions which can be made about the way the target is specified cannot
be carried forward to pthread_kill.  There has been a lot of discussion around
this over the years, you can find it if you look for it.  The result was (so
far) that the text stays as is although it is recognized to be problematic.

If you look at implementations, the thread IDs (just like PIDs) can be reused
and this reuse is completely beyond the control of the program or any user.
This means there is *never* a guarantee that a stale ID passed to kill or
pthread_kill is still invalid, since some process/thread might have been created
(maybe without direct control of the program), and then the kill/pthread_kill
would find a valid target (and do something bad).

I.e., pthread_kill and kill can really only ever be used correctly if the target
is known to exist.

There will be no change.
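To illustrate "the target is known to exist", here is a rough sketch in which pthread_kill is only called while the thread is joinable, not yet joined, and deliberately kept blocked. The names struct gate, wait_for_gate and probe_live_thread are made up for this example, and it assumes an ordinary POSIX threads implementation such as NPTL.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical gate that keeps the thread alive until the owner opens it. */
struct gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            open;
};

static void *wait_for_gate(void *arg)
{
    struct gate *g = arg;
    pthread_mutex_lock(&g->lock);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
    return NULL;
}

/* Returns the result of pthread_kill(tid, 0) for a thread that is known
   to be alive, or -1 if the thread could not be created. */
static int probe_live_thread(void)
{
    struct gate g;
    pthread_t tid;

    pthread_mutex_init(&g.lock, NULL);
    pthread_cond_init(&g.cond, NULL);
    g.open = false;

    if (pthread_create(&tid, NULL, wait_for_gate, &g) != 0) {
        pthread_cond_destroy(&g.cond);
        pthread_mutex_destroy(&g.lock);
        return -1;
    }

    /* The thread cannot finish before the gate opens and has not been
       joined, so tid is valid here. Signal 0 sends nothing and only
       performs error checking. */
    int rc = pthread_kill(tid, 0);

    pthread_mutex_lock(&g.lock);
    g.open = true;
    pthread_cond_signal(&g.cond);
    pthread_mutex_unlock(&g.lock);

    int jrc = pthread_join(tid, NULL);
    assert(jrc == 0);
    (void)jrc;

    /* From here on tid is stale: passing it to pthread_kill again would
       be the situation reported in this bug. */
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.lock);
    return rc;
}
```

The point of the gate is that the caller, not the thread library, guarantees the target's existence at the moment of the call. Once pthread_join has returned, the same call is no longer safe.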
Comment 8 Johnray Fuller 2005-11-07 18:09:34 EST
I think the fact that "shall" is used here is causing some confusion and
disagreement:

http://www.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html

If "shall" should be "may", why is it not so on the opengroup site?

It seems the expectation of ESRCH is set there, instead of what we get, a SIGSEGV.

J