Bug 799389
| Field | Value |
|---|---|
| Summary | lglocks can be taken and never released on cpu offline and onlining |
| Product | Red Hat Enterprise MRG |
| Component | realtime-kernel |
| Version | 2.2 |
| Target Milestone | 2.2 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Steven Rostedt <srostedt> |
| Assignee | John Kacur <jkacur> |
| QA Contact | David Sommerseth <davids> |
| CC | bhu, jkacur, jkastner, lgoncalv, ovasik |
| Doc Type | Bug Fix |
| Last Closed | 2012-09-19 18:03:33 UTC |

Doc Text:
Cause: The RT versions of lglock use for_each_online_cpu().
Consequence: Locks can be reactivated after a CPU comes online again, but with an owner that hasn't released it. This causes various problems, such as blocking the original owner of the lock.
Fix: Convert the RT versions to use the lglock-specific cpumasks.
Result: Locks will be taken for CPUs that are offline, but they are also released while the CPU is offline, so a lock can no longer be left with an owner that abandoned it.
Description
Steven Rostedt
2012-03-02 16:13:09 UTC
Created attachment 567097
Use a separate cpumask for taking lglocks, not the online mask
The non-RT code for taking lglocks uses its own cpumask to take the locks. A CPU's bit is set in the mask when the CPU comes online and is never cleared. That means the locks will be taken for CPUs that are offline, but they are also released while the CPU is offline, so a lock is never left with an owner that abandoned it.
This patch converts the RT side to simulate the non-RT and fixes the deadlocks.
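The difference between the two masks can be illustrated with a small user-space model. This is only a sketch, not the kernel code or the attached patch: the `lg_model` structure and all `model_*` functions are hypothetical names invented for the illustration, the per-CPU locks are plain booleans, and CPU 2 is arbitrarily chosen to go offline between the global lock and the global unlock.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct lg_model {
    bool held[NR_CPUS];    /* per-CPU lock state: true = still owned */
    bool online[NR_CPUS];  /* stands in for the CPU online mask */
    bool lg_mask[NR_CPUS]; /* lglock-specific mask: set on online, never cleared */
};

static void model_init(struct lg_model *m)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        m->held[cpu] = false;
        m->online[cpu] = false;
        m->lg_mask[cpu] = false;
    }
}

static void model_cpu_up(struct lg_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    m->online[cpu] = true;
    m->lg_mask[cpu] = true;
}

static void model_cpu_down(struct lg_model *m, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    m->online[cpu] = false; /* the lg_mask bit intentionally stays set */
}

/* Take (take = true) or release (take = false) every lock selected by mask. */
static void model_global_set(struct lg_model *m, const bool *mask, bool take)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (mask[cpu])
            m->held[cpu] = take;
    }
}

static int model_count_held(const struct lg_model *m)
{
    int n = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (m->held[cpu])
            n++;
    }
    return n;
}

/*
 * Global lock, CPU 2 goes offline, global unlock, CPU 2 comes back.
 * Returns how many per-CPU locks are still held afterwards.
 */
static int model_run(bool use_lg_mask)
{
    struct lg_model m;
    model_init(&m);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        model_cpu_up(&m, cpu);

    const bool *mask = use_lg_mask ? m.lg_mask : m.online;

    model_global_set(&m, mask, true);  /* global lock */
    model_cpu_down(&m, 2);             /* CPU 2 goes offline while locked */
    model_global_set(&m, mask, false); /* global unlock skips CPU 2 if mask is online */
    model_cpu_up(&m, 2);               /* CPU 2 returns */
    return model_count_held(&m);
}
```

In this model, `model_run(false)` (iterating the online mask) leaves the lock for CPU 2 held by an owner that believes it has released everything, while `model_run(true)` (iterating the separate mask) leaves no lock held.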
This patch is equivalent to 7837aec:
git describe --contains 7837aec
v3.2.14-rt24~7
Updating kernel-rt.spec to reflect this.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: The RT versions of lglock use for_each_online_cpu().
Consequence: Locks can be reactivated after a CPU comes online again, but with an owner that hasn't released it. This causes various problems, such as blocking the original owner of the lock.
Fix: Convert the RT versions to use the lglock-specific cpumasks.
Result: Locks will be taken for CPUs that are offline, but they are also released while the CPU is offline, so a lock can no longer be left with an owner that abandoned it.

This patch was added to 3.2.14-rt24 (upstream stable-rt).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-1282.html