Bug 142551
Summary: | getrusage(RUSAGE_SELF) doesn't count threads
---|---
Product: | Red Hat Enterprise Linux 3
Component: | kernel
Version: | 3.0
Hardware: | i686
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | medium
Reporter: | Johan Walles <johan.walles>
Assignee: | Ernie Petrides <petrides>
QA Contact: | Brian Brock <bbrock>
CC: | dmaley, karlamrhein, mingo, nhorman, peterm, petrides, riel, roland, sct, tao, tburke, woodard
Doc Type: | Bug Fix
Last Closed: | 2005-05-17 03:18:59 UTC
Description
Johan Walles
2004-12-10 16:17:20 UTC
Created attachment 108325
Repro case
It seems as if I forgot to state the actual problem. Duh... The problem is that getrusage(RUSAGE_SELF) returns values appropriate for the current thread only, not for the whole process as the man page says it should. Without kernel help this is really hard to do.

This has already been changed in the upstream kernel as of 2.6.9, and that change will be in RHEL4.

The behavior of getrusage with regard to NPTL threads is a known limitation in RHEL3, and we do not anticipate changing the well-understood semantics of RHEL3 system calls in a bug-fix update. This and several other kernel issues regarding POSIX semantics of system calls in the presence of multiple threads under NPTL are being addressed in RHEL4.

Do you know of any good way to either probe for the current semantics or to request certain semantics of the getrusage() syscall? Or is what I'm doing in the repro case the best way to probe?

To my knowledge, no kernel that reports a version number below 2.6.9 has the new semantics. So you could just do a version test, though that is always in principle less reliable than an empirical feature test. Since the fixed getrusage also counts threads that have died, there is a different approach to the test you could take that is not timing dependent, i.e. not subject to false results in unusual scheduling situations and the like. That is, create a thread that chews a little CPU, samples itself with getrusage to make sure some progress has happened, and then dies. Then create a second thread that immediately calls getrusage. In an old kernel, the new thread will see almost no time on its counters, less than the total seen by the first thread; in a new kernel, it will always see a total at least as high as the sample the first thread took.

This change is guaranteed to break applications. I say "guaranteed" because the semantics change from RHEL3 to RHEL4 did break some of the benchmarks Shak has run, so clearly there is code out there that relies on the current behaviour in RHEL3. Strictly speaking, RHEL3 behaviour may be wrong, but we know for a fact that code exists that relies on it.

The consensus seems to be that this should be closed as WONTFIX. The correct behavior is already in RHEL4.
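
For anyone who needs to probe the semantics at run time, the dead-thread test suggested earlier in this thread could look roughly like the sketch below. It is not the attached repro case. The function names and the 50 ms threshold (MIN_BURN_USEC) are made up for illustration; the threshold is only an assumption about how much CPU time is enough to stand out from the second thread's own near-zero usage on an old kernel.

```c
#include <pthread.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical threshold: the first thread burns at least 50 ms of CPU. */
#define MIN_BURN_USEC 50000LL

/* User + system CPU time from getrusage(RUSAGE_SELF), in microseconds.
 * Returns -1 if getrusage fails. */
static long long self_cpu_usec(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000LL
         + ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
}

/* First thread: spin until getrusage shows enough CPU time, store the
 * last sample through arg, then exit. */
static void *burner(void *arg)
{
    long long *sample = arg;
    volatile unsigned long sink = 0;

    do {
        for (unsigned long i = 0; i < 1000000UL; i++)
            sink += i;
        *sample = self_cpu_usec();
    } while (*sample >= 0 && *sample < MIN_BURN_USEC);
    return NULL;
}

/* Second thread: call getrusage immediately and store the result. */
static void *sampler(void *arg)
{
    *(long long *)arg = self_cpu_usec();
    return NULL;
}

/* Returns 1 if RUSAGE_SELF appears to cover the whole process (including
 * threads that have exited), 0 if it appears to be per-thread, and -1 if
 * the probe itself failed. */
int rusage_self_counts_threads(void)
{
    pthread_t t;
    long long first = -1, second = -1;

    if (pthread_create(&t, NULL, burner, &first) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    if (pthread_create(&t, NULL, sampler, &second) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    if (first < 0 || second < 0)
        return -1;

    /* Old semantics: the second thread sees only its own tiny usage.
     * New semantics: it sees at least what the first thread sampled. */
    return second >= first;
}
```

A caller would treat a return value of 1 as process-wide accounting, 0 as the per-thread RHEL3 behaviour, and -1 as inconclusive. Build with -pthread. Because the two threads run one after the other, the result should not depend on how they happen to be scheduled.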