Description of problem:
pthread_cond_timedwait()/pthread_cond_wait()/pthread_cond_signal() use a mutex
(cond->__data.__lock) to protect internal data structures. This internal mutex
does not use priority inheritance (PI), making the condition variable
implementation subject to priority inversion.
Version-Release number of selected component (if applicable):
Attached test reproduces the problem.
gcc -o ./pitest -D_GNU_SOURCE ./pitest.c -lpthread -lrt
Run on a single processor:
taskset -c 0 ./pitest
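For reference, the same single-CPU restriction can be applied from inside a program instead of through taskset. The following is only a sketch and is not taken from the attached pitest.c; the helper name pin_to_cpu is hypothetical. It relies on the Linux-specific sched_setaffinity() call, which needs _GNU_SOURCE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t macros and sched_setaffinity() */
#endif
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 with errno set on failure. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_to_cpu(0) at the top of main(), before any pthread_create(), should have much the same effect as "taskset -c 0", since new threads inherit the affinity mask of their creator.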
It hangs because of the priority inversion...
- Thread 1 runs at priority 50
- Thread 1 grabs mutex mx
- Thread 1 starts Thread 2 at priority 70
- Thread 2 blocks waiting for mutex mx, Thread 1 is boosted to priority 70
- Thread 1 starts Thread 3 at priority 60. Thread 3 will loop, taking all
the CPU if it can
- Thread 1 calls pthread_cond_wait() on condition variable cv with mutex
mx. Internally, it grabs the non-PI lock lx, then releases mutex mx. As
soon as it drops mx, its priority drops back to 50.
- Thread 2 resumes execution, releases mx, and calls pthread_cond_signal()
on cv. Internally it tries to grab the non-PI lock lx. It is owned by
Thread 1, so Thread 2 blocks and gives up the CPU.
- Thread 3 starts running and takes all the CPU. Thread 1 can't run, so it
never releases lx and Thread 2 stays blocked.
Steps to Reproduce:
Created attachment 298787
This bug report seems to have been ignored for some time now, yet the problem is
obviously known as it is referred to in bug 447871. Has this bug been
mis-classified? If so please move it into the right place.
This bug appears to have been completely ignored so I've moved it to a category where perhaps someone will at least take a look at it, and move it to the appropriate place.
The effort to make condition variables PI-aware is ongoing, but happening upstream (on the lkml and linux-rt-users mailing lists).
There is currently code being prototyped in the kernel to handle this, but the problem is coming up with a modification to glibc that is acceptable to all the maintainers (both kernel and libc). The sticking point is that all solutions in glibc so far impose a performance penalty on every condvar, even one that doesn't need PI. This is deemed unacceptable, so work continues.
Thanks for the update Clark! I knew moving this to kernel would get a prompt response :)
As far as I can tell, there is no movement on this in the upstream libc community. Closing with WONTFIX.