Bug 865716 - Fix race in takeover of a dead futex in futex_lock_pi
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Keywords: Patch
Reported: 2012-10-12 04:47 EDT by Siddhesh Poyarekar
Modified: 2015-09-13 20:23 EDT

Doc Type: Bug Fix
Last Closed: 2013-04-08 00:32:19 EDT
Type: Bug

Description Siddhesh Poyarekar 2012-10-12 04:47:31 EDT
Description of problem:
The futex_lock_pi operation, which pthread_mutex_lock uses for robust PI mutexes, has a race when trying to take over a dead futex (one whose owner has exited).  The higher-level problem was reported in the upstream glibc Bugzilla:

http://sourceware.org/bugzilla/show_bug.cgi?id=14076
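
For context, the user-space side of this path can be sketched roughly as below. This is a minimal illustration and not code from the reproducer: the helper names init_robust_pi_mutex and lock_with_recovery are hypothetical, and the sketch uses the POSIX names pthread_mutexattr_setrobust and pthread_mutex_consistent rather than the _np variants that appear in the test output.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise a mutex that is both robust and
 * priority-inheriting (PI). Returns 0 on success or an errno value. */
int init_robust_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: lock the mutex and, if the previous owner died
 * while holding it, mark the mutex consistent again. Returns 0 when the
 * caller ends up holding a usable mutex, otherwise an errno value. */
int lock_with_recovery(pthread_mutex_t *m)
{
    int rc;

    assert(m != NULL);

    rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* The caller owns the mutex here; any repair of the protected
         * data would happen before declaring the mutex consistent. */
        rc = pthread_mutex_consistent(m);
    }
    return rc;
}
```

In a sketch like this, an EINVAL from the consistency call (as in the "Actual results" below) would mean the mutex was not in the inconsistent, caller-owned state that the EOWNERDEAD return promised.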

The race is between futex_lock_pi_atomic and handle_futex_death.  futex_lock_pi_atomic assumes that a robust futex with TID == 0 can be taken over directly.  That assumption is wrong when there are waiters on the futex, because handle_futex_death wakes a blocked task with futex_wake.
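
The window can be illustrated with a small user-space model of the futex word. This is only a sketch under stated assumptions: the MODEL_* constants copy the bit values from <linux/futex.h>, all function names are hypothetical, and waiter_aware_can_take_over is an illustration of why TID == 0 alone is not enough, not a description of what the posted patch or the eventual upstream fix does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of the 32-bit futex word; values mirror <linux/futex.h>. */
#define MODEL_FUTEX_WAITERS    0x80000000u
#define MODEL_FUTEX_OWNER_DIED 0x40000000u
#define MODEL_FUTEX_TID_MASK   0x3fffffffu

/* Model of the word left behind when the owner dies: the TID is
 * cleared, the waiters bit is preserved, OWNER_DIED is set. */
uint32_t word_after_owner_death(uint32_t old_word)
{
    return (old_word & MODEL_FUTEX_WAITERS) | MODEL_FUTEX_OWNER_DIED;
}

/* The assumption described in this report: TID == 0 means the futex
 * is free for a direct takeover. */
bool naive_can_take_over(uint32_t futex_word)
{
    return (futex_word & MODEL_FUTEX_TID_MASK) == 0;
}

/* Illustrative alternative (an assumption, not the real fix): refuse a
 * direct takeover while the waiters bit is still set. */
bool waiter_aware_can_take_over(uint32_t futex_word)
{
    return (futex_word & MODEL_FUTEX_TID_MASK) == 0 &&
           (futex_word & MODEL_FUTEX_WAITERS) == 0;
}

/* Walk through the scenario: owner 1234 dies while a waiter is blocked. */
void demo_race_window(void)
{
    uint32_t held = 1234u | MODEL_FUTEX_WAITERS;
    uint32_t dead = word_after_owner_death(held);

    /* A newly arriving locker sees TID == 0 and would take over,
     * even though a blocked waiter is about to be woken. */
    assert(naive_can_take_over(dead));
    assert(!waiter_aware_can_take_over(dead));
}
```

Calling demo_race_window() from a test program exercises the case where the naive check and the waiter-aware check disagree, which is the state in which the race described above can occur.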

How reproducible:
Consistently.

Steps to Reproduce:
Download the test case below, then compile and run it:

http://sourceware.org/bugzilla/attachment.cgi?id=6442

$ gcc -D_GNU_SOURCE futexCase1_r1.c -o futexCase1_r1 -lpthread
$ ./futexCase1_r1
  
Actual results:

8279: created mutex: 0xf7f1a000
8419: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
8438: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
8439: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
…
8279: Done! lock concurrency: 0, max: 7
$

Expected results:

8279: created mutex: 0xf7f1a000
8279: Done! lock concurrency: 0, max: 7

Additional info:

I have posted a fix on LKML that works for me:

http://lkml.indiana.edu/hypermail/linux/kernel/1210.1/02508.html
Comment 1 Fedora End Of Life 2013-04-03 11:41:29 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(Because we did not run this process for some time, it may also affect bugs from pre-Fedora 19 development
cycles. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information about this action and the reason for it is available here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Comment 2 Justin M. Forbes 2013-04-05 15:14:53 EDT
Is this still an issue with the 3.9 kernels in F19?
Comment 3 Siddhesh Poyarekar 2013-04-08 00:32:19 EDT
This was fixed upstream by commit 59fa6245192159ab5e1e17b8e31f15afa9cff4bf, which has been included since 3.7.  It works fine on my F17 (3.8.4) now, so I am closing this as fixed.
