Bug 2273757 - pthread_kill(t, 0) returns 0 even though the thread t has exited.
Summary: pthread_kill(t, 0) returns 0 even though the thread t has exited.
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 39
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2024-04-06 08:36 UTC by Jean-frederic Clere
Modified: 2024-04-12 18:32 UTC (History)

Last Closed: 2024-04-06 20:13:13 UTC

Description Jean-frederic Clere 2024-04-06 08:36:45 UTC
I am calling pthread_kill(t, 0) to check whether a thread has exited.
On older Fedora releases pthread_kill() returns 3 (ESRCH) for an exited thread; on Fedora 39 it returns 0.

Reproducible: Always

Steps to Reproduce:
1. build the following example:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#define NUM_THREADS     5

void *PrintHello(void *threadid)
{
   long tid;
   tid = (long)threadid;
   printf("Hello World! It's me, thread #%ld!\n", tid);
   pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t threads[NUM_THREADS];
   int rc;
   long t;
   for(t=0; t<NUM_THREADS; t++){
      printf("In main: creating thread %ld\n", t);
      rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
      if (rc){
         printf("ERROR; return code from pthread_create() is %d\n", rc);
         exit(-1);
      }
   }
   sleep(10);
   for(t=0; t<NUM_THREADS; t++){
      rc = pthread_kill(threads[t], 0);
      if (rc){
         printf("ERROR; return code from pthread_kill() is %d\n", rc);
         exit(-1);
      } else {
         /* pthread_t is an unsigned long on glibc, so print it as one */
         printf("thread: %lu still running\n", (unsigned long)threads[t]);
      }
   }

   /* Last thing that main() should do */
   pthread_exit(NULL);
}

2. use make to build
3. run it.
Actual Results:  
jfclere@fedora:~/test$ ./threads 
In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
Hello World! It's me, thread #0!
In main: creating thread 3
Hello World! It's me, thread #1!
Hello World! It's me, thread #2!
In main: creating thread 4
Hello World! It's me, thread #3!
Hello World! It's me, thread #4!
thread: 139819774641856 still running
thread: 139819764156096 still running
thread: 139819753670336 still running
thread: 139819743184576 still running
thread: 139819732698816 still running


Expected Results:  
[root@neo3 test]# ./threads 
In main: creating thread 0
In main: creating thread 1
Hello World! It's me, thread #0!
In main: creating thread 2
Hello World! It's me, thread #1!
In main: creating thread 3
Hello World! It's me, thread #2!
In main: creating thread 4
Hello World! It's me, thread #3!
Hello World! It's me, thread #4!
ERROR; return code from pthread_kill() is 3

Comment 1 Florian Weimer 2024-04-06 20:13:13 UTC
This is a deliberate change. POSIX strongly suggests the current behavior:

“Existing implementations vary on the result of a pthread_kill() with a thread ID indicating an inactive thread (a terminated thread that has not been detached or joined). Some indicate success on such a call, while others give an error of [ESRCH]. Since the definition of thread lifetime in this volume of POSIX.1-2017 covers inactive threads, the [ESRCH] error as described is inappropriate in this case. In particular, this means that an application cannot have one thread check for termination of another with pthread_kill().”

<https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_kill.html>

The application needs to be fixed.

Comment 2 Yann Ylavic 2024-04-12 13:46:02 UTC
The only usefulness of ESRCH was for "inactive" threads because any usage of the pthread API on a thread after its lifetime (i.e. after pthread_join/detach) is undefined behaviour already.
So unless I'm missing something, conforming to POSIX here seems to not help anyone, it just makes pthread_kill(,0) completely useless and breaks existing applications.

Comment 3 Florian Weimer 2024-04-12 18:32:37 UTC
(In reply to Yann Ylavic from comment #2)
> The only usefulness of ESRCH was for "inactive" threads because any usage of
> the pthread API on a thread after its lifetime (i.e. after
> pthread_join/detach) is undefined behaviour already.
> So unless I'm missing something, conforming to POSIX here seems to not help
> anyone, it just makes pthread_kill(,0) completely useless and breaks
> existing applications.

That only affects applications which are relinked; we have a compatibility symbol for the old behavior.

If you rebuild the application anyway, you might as well fix it. The old behavior was not reliable due to TID reuse. The application could keep getting 0 even though the original thread had exited.