Bug 1302389
Summary: | Deadlocks Occurring in Shared Mutex Code
---|---
Product: | Red Hat Enterprise Linux 6
Reporter: | Alan Matsuoka <alanm>
Component: | glibc
Assignee: | Torvald Riegel <triegel>
Status: | CLOSED WONTFIX
QA Contact: | qe-baseos-tools-bugs
Severity: | high
Priority: | unspecified
Version: | 6.7
CC: | ashankar, codonell, dkochuka, fweimer, metze, mnewsome, pandrade, pfrankli, triegel
Target Milestone: | rc
Target Release: | ---
Hardware: | x86_64
OS: | Linux
Story Points: | ---
Last Closed: | 2017-04-03 15:48:10 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
Category: | ---
oVirt Team: | ---
Cloudforms Team: | ---
Bug Blocks: | 1269194
Description
Alan Matsuoka
2016-01-27 17:05:07 UTC
Created attachment 1118834: reproducer
Summary:
========

RHEL 6.10 is in Production Phase 3, and only selected Urgent Priority bug fixes will be considered. The upstream bugs that have been fixed are all pre-existing issues in RHEL 6, not new regressions. Fixing these bugs in RHEL 6 could have significant destabilizing effects on the robust mutex implementation. With this in mind, I'm marking this issue as CLOSED/WONTFIX for RHEL 6. We will continue to enhance RHEL 7 with any fixes required for robust mutex support.

Technical details:
==================

The process-shared robust mutex support was audited by the Red Hat Platform Tools Base OS team (glibc team), which found several key defects that could cause deadlocks and hangs in the implementation under certain conditions. The team has since fixed several upstream issues, including:

Fix lost wakeups with robust mutex:
https://sourceware.org/bugzilla/show_bug.cgi?id=20973

Fix x86 robust mutex assembly code:
https://sourceware.org/bugzilla/show_bug.cgi?id=20985

Deadlock with robust shared mutex:
https://sourceware.org/bugzilla/show_bug.cgi?id=19402

Correction of compiler barriers around robust shared mutex code:
Commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e

The most complex problem here is that the existing x86 assembly-optimized robust mutex code is simply wrong with regard to condition handling. Its removal allows the use of the generic C version and reliance on modern compiler optimizations to achieve a good balance between provable correctness and performance. This kind of change is not possible in RHEL 6, since it would drastically change the behaviour of the robust mutex.

To compound things, there is still one more case which is too costly to fix without kernel help. One might call it a design flaw in the algorithm shared between the kernel and userspace:

"File corruption race condition in robust mutex unlocking"
https://sourceware.org/bugzilla/show_bug.cgi?id=14485

Bug 14485 is not intended to be alarmist, and the race window is considered exceedingly small, but it is still there, as are other latent bugs in the implementation that we don't yet know about.

Thus the amount of change required, and future design constraints, make this risky for RHEL 6. The risk is quite high that the changes could destabilize products already built upon RHEL 6. In RHEL 7 we have much more flexibility when it comes to fixing these issues (except 14485, which needs a broader fix upstream).

As a final note, here is Torvald Riegel's excellent technical summary of the current status:
https://sourceware.org/ml/libc-alpha/2016-12/msg00950.html

Again, we recommend working with Red Hat to ensure RHEL 7 meets your needs with respect to robust mutexes.
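
For readers less familiar with the feature under discussion, here is a minimal sketch of how a process-shared robust mutex is typically set up and how a waiter recovers after the owner dies. It is not the attached reproducer and does not trigger any of the defects listed above; it only illustrates the standard POSIX calls involved. The helper names (create_shared_robust_mutex, lock_robust, demo_owner_death) are hypothetical and chosen for illustration, and the anonymous shared mapping is an assumption about how the mutex would be shared between processes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for MAP_ANONYMOUS and the robust mutex API */
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: place one mutex in anonymous shared memory so that
   processes created by fork() after this call all see the same mutex. */
static pthread_mutex_t *create_shared_robust_mutex(void)
{
    pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc == 0) {
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (rc == 0)
            rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        if (rc == 0)
            rc = pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (rc != 0) {
        munmap(m, sizeof *m);
        return NULL;
    }
    return m;
}

/* Hypothetical helper: lock the mutex and, if the previous owner died while
   holding it, mark the mutex consistent again. Returns 0 with the lock held,
   or an error number. *recovered is set to 1 only when EOWNERDEAD was seen
   and the mutex was successfully made consistent. */
static int lock_robust(pthread_mutex_t *m, int *recovered)
{
    *recovered = 0;
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* A real application would repair the protected data here. */
        rc = pthread_mutex_consistent(m);
        if (rc == 0)
            *recovered = 1;
    }
    return rc;
}

/* Hypothetical demo: a child takes the lock and exits without unlocking.
   Returns 1 if the parent saw EOWNERDEAD and recovered, 0 if the lock was
   acquired without recovery, and -1 on any failure. */
static int demo_owner_death(void)
{
    pthread_mutex_t *m = create_shared_robust_mutex();
    if (m == NULL)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(m, sizeof *m);
        return -1;
    }
    if (pid == 0) {
        pthread_mutex_lock(m);
        _exit(0);            /* die while still holding the lock */
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    int recovered = 0;
    int rc = lock_robust(m, &recovered);
    if (rc == 0)
        pthread_mutex_unlock(m);
    pthread_mutex_destroy(m);
    munmap(m, sizeof *m);
    return rc == 0 ? recovered : -1;
}
```

On a system where the robust-futex mechanism works as intended, calling demo_owner_death() from main is expected to return 1. The defects discussed above concern cases where this hand-off between the kernel and userspace goes wrong, for example when a wakeup is lost and a waiter blocks indefinitely.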