Bug 799385 - Thread can dead lock in migrate timers
Summary: Thread can dead lock in migrate timers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel
Version: 2.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 2.2
Target Release: ---
Assignee: John Kacur
QA Contact: David Sommerseth
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-03-02 15:58 UTC by Steven Rostedt
Modified: 2016-05-22 23:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: spin_trylock in migrate_timers disables preemption. Consequence: Deadlock. Fix: Allow the lock to block (sleep), and protect data by disabling CPU migration. Result: Works as expected - no deadlock.
Clone Of:
Environment:
Last Closed: 2012-09-19 18:03:25 UTC
Target Upstream Version:
Embargoed:


Attachments
Use cpu_local_var() and let the spin lock block (3.2-rt) (1.77 KB, patch)
2012-03-02 15:59 UTC, Steven Rostedt
Use cpu_local_var() and let the spin lock block (3.0-rt) (1.77 KB, patch)
2012-03-02 16:00 UTC, Steven Rostedt


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2012:1282 | 0 | normal | SHIPPED_LIVE | Moderate: kernel-rt security, bug fix, and enhancement update | 2012-09-19 22:02:30 UTC

Description Steven Rostedt 2012-03-02 15:58:02 UTC
The migrate timers code has a while (!spin_trylock()); loop so that the spin lock, which is converted to a mutex on RT, won't schedule out, because preemption is disabled at this point. This makes the mutex act more like a spinlock.

But! If the task preempts the holder of this lock, and the holder of this lock has migration disabled (because that's what the RT kernel does to spin locks converted to mutexes; it disables migration when the lock is taken), this task will spin forever.

The task has preemption disabled, it preempted the holder of the lock, which is pinned to the current CPU, and now this task will spin waiting for the one it preempted to finish. But this task will never give up the CPU to let the other task finish. Deadlock!
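
The starvation described above can be sketched with a small user-space model. This is only an illustration under stated assumptions, not kernel code: spin_on_trylock(), holder_work and holder_can_run are hypothetical names, one loop iteration stands for one scheduler tick, and the lock is a plain flag. Setting holder_can_run to false models a holder pinned to the CPU on which the waiter spins with preemption disabled.

```c
#include <stdbool.h>

/*
 * Hypothetical model of a trylock spin loop; none of these names are
 * kernel APIs. Returns the tick at which the waiter finally sees the lock
 * free, or -1 if that never happens within max_ticks.
 *
 * holder_work:    ticks of CPU time the holder still needs before unlocking
 * holder_can_run: true if the holder can make progress while the waiter
 *                 spins (for example on another CPU); false if it is pinned
 *                 to the CPU that the spinning waiter never gives up
 */
int spin_on_trylock(int holder_work, int max_ticks, bool holder_can_run)
{
    bool lock_held = holder_work > 0;

    for (int tick = 0; tick < max_ticks; tick++) {
        /* Waiter: one trylock attempt per tick. */
        if (!lock_held)
            return tick;

        /* Holder: progresses only if something lets it run. */
        if (holder_can_run && --holder_work == 0)
            lock_held = false;
    }
    return -1; /* the waiter spun for the whole budget */
}
```

In this model, spin_on_trylock(3, 1000, true) finishes once the holder has had its three ticks, while spin_on_trylock(3, 1000, false) exhausts any tick budget, which is the deadlock reported here.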

Comment 1 Steven Rostedt 2012-03-02 15:59:05 UTC
Created attachment 567093
Use cpu_local_var() and let the spin lock block (3.2-rt)

Patch to fix 3.2-rt

Comment 2 Steven Rostedt 2012-03-02 16:00:01 UTC
Created attachment 567094
Use cpu_local_var() and let the spin lock block (3.0-rt)

Patch to fix 3.0-rt

Comment 3 John Kacur 2012-05-23 13:24:05 UTC
The fix for this is equivalent to 7864ac1
git describe --contains 7864ac1
v3.2.14-rt24~11

Modifying the changelog in kernel-rt.spec to document this.

Comment 6 John Kacur 2012-06-20 20:12:09 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: spin_trylock in migrate_timers disables preemption
Consequence: Deadlock
Fix: Allow the lock to block (sleep), and protect data by disabling cpu migration.
Result: Works as expected - no deadlock.
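
Why letting the lock block resolves the problem can be shown with a similar user-space model. Again this is a sketch under assumptions, not the attached patch: block_on_lock() and its parameters are hypothetical names, and one loop iteration stands for one scheduler tick on a single CPU. The model covers only the scheduling side; the migration disabling that keeps the per-CPU data valid while the waiter sleeps is not modelled.

```c
#include <stdbool.h>

/*
 * Hypothetical single-CPU model of a waiter that sleeps on a contended
 * lock instead of spinning. Returns the tick at which the waiter acquires
 * the lock, or -1 if it does not within max_ticks.
 *
 * holder_work: ticks of CPU time the holder needs before it unlocks
 */
int block_on_lock(int holder_work, int max_ticks)
{
    bool lock_held = holder_work > 0;
    bool waiter_blocked = false;

    for (int tick = 0; tick < max_ticks; tick++) {
        if (!waiter_blocked) {
            /* The waiter has the CPU and attempts the lock. */
            if (!lock_held)
                return tick;        /* acquired */
            waiter_blocked = true;  /* sleep and give up the CPU */
        } else {
            /* The waiter sleeps, so the pinned holder gets to run. */
            if (--holder_work == 0) {
                lock_held = false;      /* holder unlocks ... */
                waiter_blocked = false; /* ... and wakes the waiter */
            }
        }
    }
    return -1;
}
```

With a holder that needs three ticks, the waiter blocks on its first attempt, the holder runs to completion, and the waiter takes the lock on the following tick, so the wait is bounded by the holder's remaining work.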

Comment 8 Steven Rostedt 2012-07-03 14:18:49 UTC
The 3.2 version of this patch was picked up and added to 3.2.14-rt24 (upstream stable).

Comment 11 errata-xmlrpc 2012-09-19 18:03:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1282.html

