Bug 865716 - Fix race in takeover of a dead futex in futex_lock_pi
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-10-12 08:47 UTC by Siddhesh Poyarekar
Modified: 2015-09-14 00:23 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-08 04:32:19 UTC
Type: Bug
Embargoed:



Description Siddhesh Poyarekar 2012-10-12 08:47:31 UTC
Description of problem:
The futex_lock_pi operation, which pthread_mutex_lock uses for robust PI mutexes, has a race when trying to take over a dead futex.  The higher-level problem was reported in the upstream glibc Bugzilla:

http://sourceware.org/bugzilla/show_bug.cgi?id=14076
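
For readers less familiar with the userspace side, the sketch below shows roughly what a robust PI mutex and its owner-death recovery path look like. It is not the attached reproducer: the function names (init_robust_pi_mutex, lock_and_die, owner_death_demo) are made up for illustration, and it uses the POSIX spellings pthread_mutexattr_setrobust and pthread_mutex_consistent rather than the _np variants seen in the output further down.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise a robust, priority-inheriting mutex. */
int init_robust_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Thread body: take the lock and exit without unlocking (a "dead" owner). */
void *lock_and_die(void *arg)
{
    pthread_mutex_lock((pthread_mutex_t *)arg);
    return NULL;
}

/*
 * Hypothetical demo. Returns 0 if setup succeeded, otherwise an errno
 * value. On success, *lock_rc holds what the next locker saw and
 * *consistent_rc holds the result of the recovery call (-1 if it was
 * never attempted).
 */
int owner_death_demo(int *lock_rc, int *consistent_rc)
{
    pthread_mutex_t m;
    pthread_t t;

    int rc = init_robust_pi_mutex(&m);
    if (rc != 0)
        return rc;

    rc = pthread_create(&t, NULL, lock_and_die, &m);
    if (rc != 0) {
        pthread_mutex_destroy(&m);
        return rc;
    }
    pthread_join(t, NULL);          /* the owner is gone, the lock is still held */

    *lock_rc = pthread_mutex_lock(&m);
    *consistent_rc = -1;
    if (*lock_rc == EOWNERDEAD)     /* we now own the lock and must repair state */
        *consistent_rc = pthread_mutex_consistent(&m);
    if (*lock_rc == 0 || *lock_rc == EOWNERDEAD)
        pthread_mutex_unlock(&m);

    pthread_mutex_destroy(&m);
    return 0;
}
```

With a single dead owner and a single new locker, as here, the lock call is expected to report EOWNERDEAD and the recovery call to succeed. POSIX specifies EINVAL for pthread_mutex_consistent when the mutex is not robust or does not protect an inconsistent state.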

The race is between futex_lock_pi_atomic and handle_futex_death.  futex_lock_pi_atomic assumes that a robust futex with TID==0 is safe for a direct takeover.  That assumption is wrong when there are waiters on the futex, since handle_futex_death wakes a blocked task with futex_wake, and that task then competes for the same futex.
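
To make the TID==0 assumption concrete, here is a small userspace sketch of how the 32-bit futex word is laid out. The SK_ constants are local, hypothetical names whose values mirror FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK from <linux/futex.h>. The two predicates are illustrations only: naive_takeover_ok models the assumption described above, and cautious_takeover_ok models a check that also looks at the waiters bit. Neither is the kernel code or the posted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the futex word layout (values as in <linux/futex.h>). */
#define SK_FUTEX_WAITERS    0x80000000u  /* someone is blocked on the futex */
#define SK_FUTEX_OWNER_DIED 0x40000000u  /* previous owner exited holding it */
#define SK_FUTEX_TID_MASK   0x3fffffffu  /* thread ID of the current owner */

bool tid_is_zero(uint32_t word)
{
    return (word & SK_FUTEX_TID_MASK) == 0;
}

bool has_waiters(uint32_t word)
{
    return (word & SK_FUTEX_WAITERS) != 0;
}

bool owner_died(uint32_t word)
{
    return (word & SK_FUTEX_OWNER_DIED) != 0;
}

/* Models the assumption in the report: TID == 0 alone permits takeover. */
bool naive_takeover_ok(uint32_t word)
{
    return tid_is_zero(word);
}

/* Illustrative stricter check: additionally require that nobody is queued. */
bool cautious_takeover_ok(uint32_t word)
{
    return tid_is_zero(word) && !has_waiters(word);
}
```

For a word of 0xC0000000 (owner died, waiters present, TID cleared), naive_takeover_ok returns true while cautious_takeover_ok returns false, which is the situation where a direct taker and a woken waiter can collide.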

How reproducible:
Consistently.

Steps to Reproduce:
Compile and run:

http://sourceware.org/bugzilla/attachment.cgi?id=6442

$ gcc -D_GNU_SOURCE futexCase1_r1.c -o futexCase1_r1 -lpthread
$ ./futexCase1_r1
  
Actual results:

8279: created mutex: 0xf7f1a000
8419: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
8438: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
8439: pthread_mutex_consistent_np failed: 0xf7f1a000 22 Invalid argument
…
8279: Done! lock concurrency: 0, max: 7
$

Expected results:

8279: created mutex: 0xf7f1a000
8279: Done! lock concurrency: 0, max: 7

Additional info:

I have posted a fix that works for me on lkml:

http://lkml.indiana.edu/hypermail/linux/kernel/1210.1/02508.html

Comment 1 Fedora End Of Life 2013-04-03 15:41:29 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could also affect bugs from development
cycles before Fedora 19. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 2 Justin M. Forbes 2013-04-05 19:14:53 UTC
Is this still an issue with the 3.9 kernels in F19?

Comment 3 Siddhesh Poyarekar 2013-04-08 04:32:19 UTC
This was fixed upstream by commit 59fa6245192159ab5e1e17b8e31f15afa9cff4bf, which has been in the kernel since 3.7.  It works fine on my F17 system (3.8.4) now, so I am closing this as fixed.

