Bug 107644
Summary: | futex lock implementation doesn't scale. | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Randy Pafford <rpafford> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED WORKSFORME | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | drepper, petrides |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2003-12-16 20:02:34 UTC | Type: | --- |
Description
Randy Pafford
2003-10-21 17:32:49 UTC
We are running on 4-way IBM servers with Xeon processors. The exact workload is a program that is being run internally by our company and cannot be released. Our investigation shows that there have been/still are known bugs in the futex implementation that appear to map directly to this problem, although we could not find a specific bug report. Here is part of one discussion about what appears to be the same problem:

-----------------------------------------------------
From: "Hu, Boris" <boris hu intel com>
To: "Bill Soudan" <bsoudan brass com>, "Perez-Gonzalez, Inaky" <inaky perez-gonzalez intel com>
Cc: <phil-list redhat com>, "Ramanujam, Ram" <ram ramanujam intel com>, "Ingo Molnar" <mingo elte hu>, "Jakub Jelinek" <jakub redhat com>, "John Levon" <levon movementarian org>
Subject: RE: Poor thread performance on Linux vs. Solaris
Date: Tue, 9 Sep 2003 15:38:57 +0800
--------------------------------------------------------------------------------
Try the futex_q_lock-0.2 patch. It is also against linux-2.6.0-test4. It does the following things:
* Remove the global futex_lock as the previous futex_q_lock patch did.
* Add bucket spinlock recursively check as Jakub mentioned.
* Move vcache_lock out of lock/unlock_futex_mm() and only to protect the actual vcache operations.
* Shrink some lock/unlock_futex_mm() scopes.

boris

--- linux-2.6.0-test4.orig/kernel/futex.c 2003-08-23 07:53:39.000000000 +0800
+++ linux-2.6.0-test4/kernel/futex.c 2003-09-09 14:15:02.000000000 +0800
@@ -57,9 +57,16 @@
 struct file *filp;
 };
-----------------------------------------------------

There was a glibc fix that avoids a livelock bug in the mutex code. This fix was not in taroon-beta2, it's only in taroon-final - if this is a vanilla -beta2 system then could you please upgrade to taroon-final (or just to latest taroon-glibc) and re-test? Do you still get the same problem?

This has been open with a request to retest with newer code for over a month and a half. Closing.
If this is still a problem, please reopen, including information on retesting.
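
For readers following the quoted patch description, the idea behind its first two items (dropping one global lock in favour of a lock per hash bucket) can be illustrated with a small user-space sketch. This is not the kernel code from the futex_q_lock patch: it assumes POSIX mutexes in place of kernel spinlocks, a counter in place of a real wait queue, and the names NBUCKETS, struct bucket, table_init, bucket_index, enqueue_waiter and total_waiters are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16 /* hypothetical table size, chosen for illustration */

/* One lock per bucket instead of a single global lock. */
struct bucket {
    pthread_mutex_t lock;  /* protects only this bucket's data */
    unsigned long waiters; /* stand-in for a per-bucket wait queue */
};

static struct bucket table[NBUCKETS];

/* Initialise every bucket's mutex and counter. */
static void table_init(void)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        int rc = pthread_mutex_init(&table[i].lock, NULL);
        assert(rc == 0);
        (void)rc; /* avoid an unused-variable warning under NDEBUG */
        table[i].waiters = 0;
    }
}

/* Map a key (for example a lock word's address) to a bucket.
 * The shift drops the low bits, which are zero for 4-byte-aligned words. */
static size_t bucket_index(uintptr_t key)
{
    return (size_t)((key >> 2) % NBUCKETS);
}

/* Record a waiter. Only the one bucket is locked, so threads whose
 * keys hash to different buckets do not contend with each other. */
static void enqueue_waiter(uintptr_t key)
{
    struct bucket *b = &table[bucket_index(key)];

    pthread_mutex_lock(&b->lock);
    b->waiters++;
    pthread_mutex_unlock(&b->lock);
}

/* Sum the waiters, taking each bucket lock in turn (never two at once). */
static unsigned long total_waiters(void)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < NBUCKETS; i++) {
        pthread_mutex_lock(&table[i].lock);
        sum += table[i].waiters;
        pthread_mutex_unlock(&table[i].lock);
    }
    return sum;
}
```

With a single global lock, every enqueue_waiter call would serialise on the same mutex regardless of key. With per-bucket locks, contention is limited to keys that share a bucket, which is the kind of scaling improvement on multi-processor machines that the quoted patch is aiming for.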