Bug 605506

Summary: glibc: [RFE] pthread condvars are not priority inheritance aware
Product: Red Hat Enterprise Linux 8 Reporter: IBM Bug Proxy <bugproxy>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED UPSTREAM QA Contact: qe-baseos-tools-bugs
Severity: high Docs Contact:
Priority: low    
Version: 8.2 CC: ashankar, codonell, dj, fweimer, kuo-lang.tseng, law, mnewsome, pfrankli, triegel
Target Milestone: rc Keywords: FutureFeature, Triaged
Target Release: 8.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-06 14:06:56 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description IBM Bug Proxy 2010-06-18 06:49:47 UTC
=Comment: #0=================================================
Darren V. Hart <dvhltc.com> - 
When using a PTHREAD_PRIO_INHERIT mutex with a condvar, the pthread_cond* calls can still cause an
unbounded priority inversion via the internal condvar lock. The POSIX specification doesn't provide
a mechanism to specify the protocol of the condvar.
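
For illustration, a minimal sketch of the setup described above, using only standard POSIX calls. The helper names init_pi_mutex() and init_pair() are hypothetical and are not part of glibc or of the proposed patch. The mutex gets the PTHREAD_PRIO_INHERIT protocol, but the condvar can only be created with default attributes, because no condvar attribute for the protocol exists.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>

/* Hypothetical helper: initialize *m as a priority-inheritance mutex.
   Returns 0 on success or the error code of the failing pthread call. */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: pair a PI mutex with an ordinary condvar.
   The condvar is created with default attributes; POSIX defines no
   condvar attribute that selects a protocol, so whatever lock the
   implementation uses internally is outside the caller's control. */
int init_pair(pthread_mutex_t *m, pthread_cond_t *c)
{
    int rc = init_pi_mutex(m);
    if (rc != 0)
        return rc;

    rc = pthread_cond_init(c, NULL);
    if (rc != 0)
        pthread_mutex_destroy(m);
    return rc;
}
```

A caller would invoke init_pair(&m, &c) once and then use pthread_cond_wait(&c, &m) as usual; the priority inversion described above would arise inside the pthread_cond* calls, not in this setup code.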

We would like to determine the condvar's protocol at runtime from the mutex it is used with, but
unfortunately it is legal to call pthread_cond_signal() or pthread_cond_broadcast() before any thread
has waited on the condvar, so the mutex type may not be known the first time the condvar is used.
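
A small sketch of the case that defeats runtime detection, assuming nothing beyond standard POSIX calls (the function name signal_before_any_wait() is hypothetical). The condvar is signalled and broadcast before any thread has waited on it, so no mutex has ever been associated with it at that point.

```c
#include <pthread.h>

/* Hypothetical demonstration: signal and broadcast on a condvar that
   has never had a waiter. Both calls are legal and simply have no
   effect, yet the implementation has not seen any mutex, so it cannot
   tell whether a PI mutex will be used with this condvar later.
   Returns 0 on success or the error code of the failing call. */
int signal_before_any_wait(void)
{
    pthread_cond_t c;
    int rc = pthread_cond_init(&c, NULL);
    if (rc != 0)
        return rc;

    rc = pthread_cond_signal(&c);        /* no waiter, no mutex known */
    if (rc == 0)
        rc = pthread_cond_broadcast(&c); /* same situation */

    pthread_cond_destroy(&c);
    return rc;
}
```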

A new API, pthread_condattr_setprotocol_np() and pthread_condattr_getprotocol_np(), would allow the
user to create a PTHREAD_PRIO_INHERIT condvar. This would use a PTHREAD_PRIO_INHERIT mutex for the
internal condvar lock, eliminating the potential for hitting an unbounded priority inversion on that
lock.
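
A sketch of how the proposed API might be used. This is an assumption, not a released interface: pthread_condattr_setprotocol_np() is the function proposed in this report, its signature here (attribute pointer plus protocol constant) is assumed by analogy with pthread_mutexattr_setprotocol(), and HAVE_PTHREAD_CONDATTR_SETPROTOCOL_NP is a hypothetical build-time macro used only so the example compiles against a C library that lacks the function. The helper name init_pi_condvar() is likewise hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>

/* Hypothetical helper: create a condvar that requests priority
   inheritance when the proposed attribute call is available, and a
   default condvar otherwise.
   Returns 0 on success or the error code of the failing call. */
int init_pi_condvar(pthread_cond_t *c)
{
    pthread_condattr_t attr;
    int rc = pthread_condattr_init(&attr);
    if (rc != 0)
        return rc;

#ifdef HAVE_PTHREAD_CONDATTR_SETPROTOCOL_NP
    /* Proposed, non-portable call; signature assumed. It would make the
       internal condvar lock a PTHREAD_PRIO_INHERIT mutex. */
    rc = pthread_condattr_setprotocol_np(&attr, PTHREAD_PRIO_INHERIT);
#endif

    if (rc == 0)
        rc = pthread_cond_init(c, &attr);

    pthread_condattr_destroy(&attr);
    return rc;
}
```

Because the protocol is fixed when the condvar is created, this approach would not depend on seeing a mutex before the first pthread_cond_signal() or pthread_cond_broadcast() call.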

This topic was initially discussed here:
http://sources.redhat.com/ml/libc-alpha/2010-01/msg00011.html

And again here:
http://sources.redhat.com/ml/libc-alpha/2010-02/msg00089.html

More details about this bug, including a prototype C implementation with glibc "make check"
testcases, are available on libc-alpha. The bug is also being tracked on the glibc bugzilla here:

http://sourceware.org/bugzilla/show_bug.cgi?id=11588

Before investing more time in testcases or optimizations, we need to reach consensus on the
proposed API.

Comment 5 Carlos O'Donell 2016-11-25 14:47:49 UTC
We have only just put into Fedora Rawhide a new condvar algorithm written by Torvald Riegel. I'll leave it to Torvald to comment here about priority inheritance and the new algorithm.

One of the real blockers here is the addition of two new symbols to control per-condvar PI properties. We can't add new symbols to RHEL7, so this would have to be a RHEL8 feature, since there we can expand the API during the major version number transition.

Despite that, I'm going to keep this on the RHEL7 track, since it's an interesting feature that we should review during each update to make sure it carries over to RHEL8.

Comment 6 Torvald Riegel 2016-11-29 14:12:12 UTC
In a nutshell, we do not know how to implement a POSIX-conforming condition variable that is also PI aware using the kernel's futex operations as of today. I have discussed this also with Darren Hart and others at the recent Linux Real-Time Summit, and nobody else had ideas or solutions so far.
Solving this may require new futex operations, or perhaps a completely different condition variable algorithm that trades off other properties important for the general case (e.g., good performance) against being able to support PI.

Comment 8 Carlos O'Donell 2019-06-07 02:56:54 UTC
Given that RHEL 8.0 is out, I'm moving this RFE there so that we can track it as potential upstream work.

Comment 9 Carlos O'Donell 2020-07-06 14:06:56 UTC
We are going to track this bug upstream here:
https://sourceware.org/bugzilla/show_bug.cgi?id=11588

This needs significant discussion upstream with IBM, Intel, and the real-time kernel community.

We need to find a solution that balances the requirements of the general users of a condition variable and those that need priority inheritance.

I'm marking this CLOSED/UPSTREAM, and we can revisit when we have an upstream solution in place.