Following the changes for bug #152430, per-thread CPU usage is broken when using top.

Before the patch, the testcase below would show (running in top `ps -La|grep loop |grep -v grep|awk '{print "-p "$2}'`)

Tasks:   3 total,   2 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s): 10.6% us, 27.3% sy,  0.0% ni, 52.7% id,  7.7% wa,  0.1% hi,  1.7% si
Mem:    256044k total,    70552k used,   185492k free,    10908k buffers
Swap:   524280k total,        0k used,   524280k free,    30420k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2971 root      25   0 22576  400  312 R 45.6  0.2   0:06.64 loop
 2972 root      25   0 22576  400  312 R 39.7  0.2   0:06.70 loop
 2970 root      17   0 22576  400  312 S  0.0  0.2   0:00.00 loop

Now the testcase shows:

Tasks:   3 total,   2 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3% us,  0.3% sy,  0.0% ni, 99.3% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:    255980k total,   240732k used,    15248k free,    39940k buffers
Swap:   524280k total,        0k used,   524280k free,   147088k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6924 root      17   0 22524  396  312 S 93.5  0.2   0:43.59 loop
 6925 root      25   0 22524  396  312 R 93.5  0.2   0:43.59 loop
 6926 root      25   0 22524  396  312 R 93.5  0.2   0:43.59 loop

The process-wide CPU usage is now attributed to every thread, instead of being reported per thread as with the old behaviour.

Source code of the testcase attached.

One solution would be to backport the "per task display" features from the newer procps instead of changing the kernel (again).
Created attachment 121670 [details] loop.c
This very useful trouble-shooting functionality -- sadly missing in rhel4 compared to rhel3 -- can be emulated using ps(1), but it would be very handy to have it back in top(1).

% ps -o pcpu,pid,rtprio,nlwp,psr,comm,command aH | grep firefox-bin | sort -r
Mon Jan 23 16:58:11 2006

 0.0 31233      -    5   3 firefox-bin     /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 31233      -    5   3 firefox-bin     /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 31233      -    5   2 firefox-bin     /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 31233      -    5   2 firefox-bin     /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 31233      -    5   0 firefox-bin     /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 31228      -    1   1 run-mozilla.sh  /bin/sh /usr/lib/firefox-1.0.7/run-mozilla.sh /usr/lib/firefox-1.0.7/firefox-bin -UILocale en-US
 0.0 18024      -    1   2 grep            grep firefox-bin
 0.0 18022      -    1   1 sh              sh -c ps -o pcpu,pid,rtprio,nlwp,psr,comm,command aH | grep firefox-bin | sort -r
 0.0 18009      -    1   0 watch           watch --int=1 ps -o pcpu,pid,rtprio,nlwp,psr,comm,command aH | grep firefox-bin | sort -r
I can backport the "per-task display" feature (the 'H' key/option), but it will be a pretty invasive patch to the top command. This backport will *not* fix something like:

top `ps h -Le | grep loop | grep -v grep | awk '{print "-p "$2}'`

because the "-p" option expects a PID, not a thread ID! That means the example from comment #1 is an unexpected and wrong usage of the top command. This wrong usage probably appeared because the kernel bug has been fixed and there is now a real difference between a process and a thread (task).

From my point of view there is no bug here -- there is a missing feature which is available in newer procps versions (and will be available in FC5 and RHEL5).
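To illustrate the PID versus thread ID distinction, here is a minimal sketch (not part of procps or of any patch in this bug). It assumes the kernel exposes one directory per thread under /proc/<pid>/task, as described in proc(5). The function name list_thread_ids and the proc_root parameter are made up for this illustration.

```python
import os


def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs (TIDs) of process `pid`.

    Assumption: each thread of the process appears as a numeric
    directory under <proc_root>/<pid>/task (see proc(5)).
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    # Keep only numeric entries; each one is a thread ID.
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

For a multi-threaded process, the list contains the PID itself (the thread group leader) plus one further ID per additional thread. Only the first of these is a PID in the sense that "-p" expects.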
Being able to look at the resource consumption of each individual thread is indeed a feature that is missing in rhel4. Since it was there in rhel3 and will be there in rhel5, I consider its absence from rhel4 a regression. In case you cannot port the 'H' patch, I would appreciate it if you could at least provide a work-around that can help us troubleshoot existing deployed rhel4 systems with processes comprising hundreds of threads.
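One possible shape for such a work-around, offered only as a rough sketch: sample utime and stime for every thread twice and turn the difference into a percentage. It assumes that /proc/<pid>/task/<tid>/stat carries per-thread utime and stime in fields 14 and 15, as documented in proc(5); whether a given rhel4 kernel reports per-thread values there would need to be checked. All function names below are hypothetical, and the code is written for a current Python 3, not for the interpreter shipped with rhel4.

```python
import glob
import os
import time


def parse_task_stat(line):
    """Return (comm, utime_ticks, stime_ticks) from one stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return comm, int(fields[11]), int(fields[12])


def cpu_percent(before, after, interval_s, hz):
    """CPU usage in percent of one CPU between two parsed samples."""
    ticks = (after[1] + after[2]) - (before[1] + before[2])
    return 100.0 * ticks / (hz * interval_s)


def sample_threads(pid):
    """Map each thread ID of `pid` to its parsed stat sample."""
    samples = {}
    for path in glob.glob("/proc/%d/task/*/stat" % pid):
        tid = int(path.split("/")[-2])
        try:
            with open(path) as stat_file:
                samples[tid] = parse_task_stat(stat_file.read())
        except OSError:
            continue  # the thread exited between glob() and open()
    return samples


def print_thread_cpu(pid, interval_s=1.0):
    """Print thread ID, %CPU and command name for each thread of `pid`."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    first = sample_threads(pid)
    time.sleep(interval_s)
    second = sample_threads(pid)
    for tid in sorted(second):
        if tid in first:
            pct = cpu_percent(first[tid], second[tid], interval_s, hz)
            print("%6d %5.1f %s" % (tid, pct, second[tid][0]))
```

Calling print_thread_cpu() with the PID of the process of interest prints one line per thread that existed at both sampling points. Threads created or destroyed during the interval are skipped.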
Reopening.
In the upstream procps changelog:

---8<---
procps-3.2.5 --> procps-3.2.6
<snip>
top: can do per-task display -- thanks John Blackwood rh114012
---8<---

Attached patch is a backport.
Created attachment 131348 [details] procps-3.2.6-backport_Show_THREADS.patch
Hmm... Sorry, I originally thought that libproc didn't have useful functionality for tasks in version 3.2.3. The patch doesn't look so invasive, but it will still require massive testing. Thanks!
*** Bug 199992 has been marked as a duplicate of this bug. ***
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Bug 152430 and bug 199992, both referenced from this bug, are not accessible. This is a common problem. I cannot believe that these are all security-related. You should fix your process to be more open. There is a whole Open Source community out here that might help you if you didn't exclude it. (Actually, the bugzilla login requirement is already bad -- see how Debian does things if you want a good example.)
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0237.html