Bug 142989 - Terminated threads' resource usage is hidden from procps
Summary: Terminated threads' resource usage is hidden from procps
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Ingo Molnar
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 156322
 
Reported: 2004-12-15 16:48 UTC by Lev Makhlis
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-10-05 12:36:14 UTC
Target Upstream Version:
Embargoed:


Attachments
Test program (241 bytes, text/plain)
2004-12-15 16:50 UTC, Lev Makhlis
backport of those two upstream changesets to 2.6.9-5.EL (4.41 KB, patch)
2005-01-18 22:57 UTC, Roland McGrath


Links
System ID: Red Hat Product Errata RHSA-2005:514
Private: 0
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 2
Last Updated: 2005-10-05 04:00:00 UTC

Description Lev Makhlis 2004-12-15 16:48:28 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020

Description of problem:
In a multithreaded process, the resource usage of terminated threads
is tracked in ->signal->utime/stime/etc.  When the process dies, those
counters are rolled into the parent process' cutime/cstime/etc.  But
as long as the process is alive, getrusage(2) is the only way to see
them.  They don't show anywhere in /proc, nor consequently in ps(1) or
top(1).  This is a problem with processes that spawn many threads
which do work and then terminate.


Version-Release number of selected component (if applicable):
2.6.9-1.675_EL

How reproducible:
Always

Steps to Reproduce:
1. See attached rusage.c
2. cc rusage.c -lpthread
3. ./a.out & sleep 5 ; ps SH -p $! ; sleep 5 ; ps SH -p $! ; kill $!
    

Actual Results:
[1] 16894
  PID TTY      STAT   TIME COMMAND
16894 pts/1    Sl     0:00 ./a.out
16894 pts/1    Rl     0:05 ./a.out
  PID TTY      STAT   TIME COMMAND
16894 pts/1    S      0:00 ./a.out
[1]+  Terminated              ./a.out


Expected Results: The second "ps" should show 0:10

Additional info:

See also my original description of the problem at
http://www.uwsg.iu.edu/hypermail/linux/kernel/0409.0/0813.html.

Please consider merging Albert Cahalan's and my patches from 2.6.10-rc1:
http://linux.bkbits.net:8080/linux-2.5/cset@1.1988.75.34
http://linux.bkbits.net:8080/linux-2.5/cset@1.1988.75.35

Comment 1 Lev Makhlis 2004-12-15 16:50:06 UTC
Created attachment 108630
Test program

Comment 2 Jason Baron 2004-12-23 21:28:10 UTC
Looks like good U1 material. Adding to the U1 blocker list.

Comment 3 Roland McGrath 2005-01-18 04:34:24 UTC
Upstream patches by the bug reporter fixed this between 2.6.9 and 2.6.10:
http://linus.bkbits.net:8080/linux-2.5/user=Lev_Makhlis/cset@4174ac12low7VnmD6QeTWOGO0A06nw?nav=!-|index.html|stats|!+|index.html|ChangeSet@-6M

Comment 5 Roland McGrath 2005-01-18 22:57:28 UTC
Created attachment 109951
backport of those two upstream changesets to 2.6.9-5.EL

# /proc/TGID/stat with multithreaded CPU time totals (#142989)
%patch1702 -p1
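
For anyone checking the result from user space, the totals in question are the utime and stime values in /proc/<pid>/stat, which proc(5) documents as fields 14 and 15, measured in clock ticks. The following is a minimal sketch of reading them, not code from the patch; the function names parse_stat_cpu and read_self_stat_cpu are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Extract utime and stime (fields 14 and 15, in clock ticks) from the
 * text of one /proc/<pid>/stat line. Returns 0 on success, -1 on error. */
static int parse_stat_cpu(const char *line,
                          unsigned long *utime, unsigned long *stime)
{
    /* Field 2 (comm) is parenthesised and may itself contain spaces or
     * parentheses, so resume parsing after the last ')'. */
    const char *p = strrchr(line, ')');

    if (p == NULL)
        return -1;

    /* Skipped after ')': state (3), ppid pgrp session tty_nr tpgid (4-8),
     * flags minflt cminflt majflt cmajflt (9-13). Then utime, stime. */
    if (sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
               utime, stime) != 2)
        return -1;
    return 0;
}

/* Convenience wrapper: read the calling process's own stat line. */
static int read_self_stat_cpu(unsigned long *utime, unsigned long *stime)
{
    char buf[1024];
    FILE *f = fopen("/proc/self/stat", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_stat_cpu(buf, utime, stime);
}
```

Dividing the tick counts by sysconf(_SC_CLK_TCK) gives seconds, which can then be compared against what getrusage(2) reports for the same process.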

Comment 7 Elena Zannoni 2005-06-09 20:24:08 UTC
This should not be on Roland's plate anymore. He posted the patch to the list
back in January. I don't think there is anything else we can do to help.
 

Comment 13 Red Hat Bugzilla 2005-10-05 12:36:14 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html


