Bug 784733
| Summary: | BUG at kernel/rtmutex.c:472 MRG 2.1 3.0.9-rt26.45.el6rt.x86_64 kernel | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | IBM Bug Proxy <bugproxy> |
| Component: | realtime-kernel | Assignee: | Steven Rostedt <srostedt> |
| Status: | CLOSED ERRATA | QA Contact: | David Sommerseth <davids> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 2.1 | CC: | bhu, jkachuck, jkacur, jkastner, lgoncalv, ovasik, tglx, wgomerin, williams |
| Target Milestone: | 2.1.4 | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | When a sleeping task, waiting on a futex (fast userspace mutex), tried to get the spin_lock(hb->lock) RT-mutex, if the owner of the futex released the lock, the sleeping task was put on a futex proxy lock. Consequently, the sleeping task was blocked on two locks and eventually terminated in the BUG_ON() function. With this update, the WAKEUP_INPROGRESS pseudo-lock has been added to be used as a proxy lock. This pseudo-lock tells the sleeping task that it is being woken up so that the task no longer tries to get the second lock. Now, the futex code works as expected and sleeping tasks no longer crash in the described scenario. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-02-23 20:24:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
IBM Bug Proxy
2012-01-25 23:40:34 UTC
Looks like this is due to the futex_requeue work done by Darren Hart. It sets up a proxy lock when requeueing to a PI futex. My hunch is that the waiter being requeued is somehow already blocked on an rt spin lock (which on RT is turned into a mutex). Trying to add it to a PI lock while it is already blocked on another PI lock will cause this bug. This is just a hunch; more analysis would be needed.
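The hunch above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct task`, `struct lock`, `block_on()` and `replay_double_block()` are hypothetical names, and the single `blocked_on` slot stands in for the rule that a task may wait on at most one rtmutex at a time. Where the kernel would hit the BUG, the model returns -1.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct lock {
    int id;
};

struct task {
    const struct lock *blocked_on;  /* the one lock this task waits on, or NULL */
};

/*
 * Record that task `t` is now waiting on lock `l`.
 * Returns 0 on success, or -1 if the task is already waiting on a lock.
 * The -1 case models the condition that the kernel treats as fatal.
 */
int block_on(struct task *t, const struct lock *l)
{
    if (t->blocked_on != NULL)
        return -1;
    t->blocked_on = l;
    return 0;
}

/*
 * Replay the suspected sequence and return the result of the second
 * enqueue attempt.
 */
int replay_double_block(void)
{
    struct lock hb_lock = { 1 };  /* stands in for spin_lock(hb->lock) */
    struct lock pi_lock = { 2 };  /* stands in for the PI futex proxy lock */
    struct task waiter = { NULL };

    /* The waiter wakes up and blocks on the hash-bucket lock. */
    if (block_on(&waiter, &hb_lock) != 0)
        return 1;  /* not expected in this model */

    /* The requeue path then tries to queue the same waiter on the PI lock. */
    return block_on(&waiter, &pi_lock);
}
```

In this model, `replay_double_block()` reports the second enqueue as rejected, which corresponds to the waiter being "already attached to a PI lock" in the description.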
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: A sleeping task waiting on a futex wakes up and tries to grab spin_lock(hb->lock), which on MRG is an rtmutex, while the owner of the futex releases it, putting the sleeping task on the futex proxy lock and causing the sleeping task to be blocked on two locks.
Consequence: The sleeping task hits a BUG_ON() because it cannot sleep on two locks at the same time.
Fix: Add another pseudo lock, WAKEUP_INPROGRESS, to set the proxy lock to. It tells the sleeping task that it is being woken up, so that it won't try to grab the second lock.
Result: Task no longer crashes and the futex code works as designed.
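The pseudo-lock idea in the Fix note can be sketched as a userspace model. The patch itself is not shown in this report, so everything here is an assumption made for illustration: the helper names, the use of a sentinel address for WAKEUP_INPROGRESS, and the choice of -EAGAIN and -EDEADLK as return codes are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct lock {
    int id;
};

struct task {
    const struct lock *blocked_on;  /* NULL, a real lock, or the marker */
};

/* Assumed representation: an address that is never used as a real lock. */
static const struct lock wakeup_inprogress_marker = { 0 };
#define WAKEUP_INPROGRESS (&wakeup_inprogress_marker)

/* Waiter side: announce the wakeup before contending for hb->lock. */
void waiter_begin_wakeup(struct task *self)
{
    self->blocked_on = WAKEUP_INPROGRESS;
}

/* Waiter side: clear the marker once the wakeup path is finished. */
void waiter_end_wakeup(struct task *self)
{
    self->blocked_on = NULL;
}

/*
 * Requeue side: try to queue task `t` on the proxy lock `l` on its behalf.
 * Returns 0 on success, -EAGAIN if the task is in the middle of waking up,
 * or -EDEADLK for the double-block case that used to be fatal.
 */
int proxy_block_on(struct task *t, const struct lock *l)
{
    if (t->blocked_on == WAKEUP_INPROGRESS)
        return -EAGAIN;   /* the waiter is waking up: leave it alone */
    if (t->blocked_on != NULL)
        return -EDEADLK;  /* already blocked on a real lock */
    t->blocked_on = l;
    return 0;
}

/* Replay the racing sequence with the marker in place. */
int replay_with_marker(void)
{
    struct lock pi_lock = { 2 };  /* stands in for the PI futex proxy lock */
    struct task waiter = { NULL };
    int ret;

    waiter_begin_wakeup(&waiter);
    ret = proxy_block_on(&waiter, &pi_lock);  /* the requeue races with the wakeup */
    waiter_end_wakeup(&waiter);
    return ret;
}
```

The point of the sentinel in this sketch is that the requeue side can tell "waking up" apart from "blocked on a real lock" and back off instead of queueing the task a second time.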
------- Comment From niv.com 2012-02-15 18:15 EDT -------
Closing bug on IBM side. Steven's patch fixed this issue, and the updated MRG 2.1 early errata kernel rpms (.50) also passed testing with the JTC reproduction scenario (see bug #78082).
Technical note updated. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
Diffed Contents:
@@ -1,7 +1 @@
-Cause: A sleeping task waiting on a futex wakes up and tries to grab spin_lock(hb->lock), which on MRG is an rtmutex, while the owner of the futex releases it, putting the sleeping task on the futex proxy lock and causing the sleeping task to be blocked on two locks.
+When a sleeping task, waiting on a futex (fast userspace mutex), tried to get the spin_lock(hb->lock) RT-mutex, if the owner of the futex released the lock, the sleeping task was put on a futex proxy lock. Consequently, the sleeping task was blocked on two locks and eventually terminated in the BUG_ON() function. With this update, the WAKEUP_INPROGRESS pseudo-lock has been added to be used as a proxy lock. This pseudo-lock tells the sleeping task that it is being woken up so that the task no longer tries to get the second lock. Now, the futex code works as expected and sleeping tasks no longer crash in the described scenario.
-
-Consequence: The sleeping task hits a BUG_ON() because it cannot sleep on two locks at the same time.
-
-Fix: Add another pseudo lock, WAKEUP_INPROGRESS, to set the proxy lock to. It tells the sleeping task that it is being woken up, so that it won't try to grab the second lock.
-
-Result: Task no longer crashes and the futex code works as designed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0333.html