Description of problem:

top does not appear to count the processor time of child threads, but instead only looks at the thread group leader. Here is an example:

top - 09:24:58 up 1 day, 22 min,  4 users,  load average: 0.50, 0.19, 0.06
Tasks: 100 total,   1 running,  98 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.3% us,  0.5% sy,  0.0% ni, 97.8% id,  0.0% wa,  0.1% hi,  0.3% si
Mem:   8005640k total,  7948604k used,    57036k free,  6356664k buffers
Swap:  4192956k total,     6532k used,  4186424k free,   252676k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1900 iom_user  16   0     0    0    0 Z  0.0  0.0   0:00.30 startIOM.sh <defunct>
 1902 iom_user  16   0 29888 3388 2192 S  0.0  0.0   0:00.00 sendmail
 1942 iom_user  15   0 3600m 991m  29m S  0.0 12.7   0:19.62 java *****
 1965 iom_user  17   0  2716  528  412 S  0.0  0.0   1:07.97 startMonitor.sh
 8203 iom_user  16   0 38720  17m 1896 S  0.0  0.2   0:00.19 vim
11184 iom_user  16   0 36140 2856 2108 S  0.0  0.0   0:00.00 sshd
11185 iom_user  16   0 49860 1840 1092 S  0.0  0.0   0:00.10 csh
11207 iom_user  16   0 47392 1704 1304 S  0.0  0.0   0:00.04 bash
11277 iom_user  16   0 44968  572  468 S  0.0  0.0   0:00.03 tail
11278 iom_user  17   0 42320  784  588 S  0.0  0.0   0:01.99 grep
14103 iom_user  18   0 44932  556  468 S  0.0  0.0   0:00.00 sleep

I have marked the "java" process with asterisks. top shows it has used about 20 seconds of CPU time. However, if you use "ps auxwH | grep java" to see the per-thread usage, it is ***much higher***.

(java -version output, just in case:
java version "1.4.2_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)
)

From ps auxwH:

iom_user  1942  0.1 12.6 3687360 1015596 ?  Sl  05:10  0:19 /home/iom_user/bea/java/bin/java <snipped>
iom_user  1942  0.1 12.6 3687360 1015596 ?  Sl  05:10  0:30 /home/iom_user/bea/java/bin/java <snipped>
iom_user  1942  0.0 12.7 3687360 1017540 ?  Sl  05:10  0:12 /home/iom_user/bea/java/bin/java <snipped>

ps shows that about 51 seconds have been used, whereas top shows 20 seconds. Clearly, top is not counting the CPU time of threads within the process. Is this normal? (The -S flag makes no difference to top in this case.)

Version-Release number of selected component (if applicable):
procps-3.2.3-8.1

How reproducible:
Every time.

Steps to Reproduce:
1. ES 4, run java.
2. Run top, run ps.
3. Compare output.

Actual results:
top shows the CPU time of the thread leader only.

Expected results:
top should show total CPU usage for all threads within the process, not just the thread leader.

Additional info:
Also, top doesn't show any CPU usage (%CPU column) for the java process, despite the load average on the machine showing that java is doing something. This appears to stem from the same issue.
Over a year ago someone posted to the procps mailing list a trivial program that can be used to demonstrate this serious bug in top: http://sourceforge.net/mailarchive/forum.php?thread_id=4071694&forum_id=12454 Yet there seems to have been no response on that list to that post, nor to other people's periodic inquiries about the status of NPTL support. Last week someone posted a patch: http://sourceforge.net/mailarchive/message.php?msg_id=8763954 This patch also seems to have generated no interest. Maybe Sourceforge's mailing list archive is just broken.
It's a kernel problem: the kernel doesn't sum per-thread data into the process totals (see bug #114012 or bug #116783). It's fixed in upstream kernels >= 2.6.10. The bugfix is missing in 2.6.9-11.EL. Reassigning to the kernel guys.
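Until a fixed kernel is available, the per-thread figures can be summed by hand to cross-check what top reports. The following is only a rough sketch, not anything taken from procps or the kernel: it assumes a Linux system that exposes /proc/<pid>/task/<tid>/stat (as 2.6 kernels do), and the function names parse_cpu_ticks and thread_cpu_seconds are made up for illustration. It reads utime and stime (fields 14 and 15 of each stat line, per proc(5)) for every thread and converts clock ticks to seconds.

```python
import os
from pathlib import Path


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def thread_cpu_seconds(pid, proc_root="/proc"):
    """Map each thread ID of `pid` to its CPU time in seconds.

    Hypothetical helper: proc_root is a parameter only so the function
    can be pointed at a test directory instead of the real /proc.
    """
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    result = {}
    for task in Path(proc_root, str(pid), "task").iterdir():
        try:
            line = (task / "stat").read_text()
        except OSError:
            continue  # the thread exited while we were scanning
        result[int(task.name)] = parse_cpu_ticks(line) / ticks_per_sec
    return result


if __name__ == "__main__":
    # Demonstrate on the current process; substitute the PID of interest.
    per_thread = thread_cpu_seconds(os.getpid())
    for tid, secs in sorted(per_thread.items()):
        print(f"tid {tid}: {secs:.2f}s")
    print(f"total: {sum(per_thread.values()):.2f}s")
```

If the total printed here is well above the TIME+ value top shows for the same PID, that points to the same symptom as in this report: only the leader's time is being accounted for.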
*** This bug has been marked as a duplicate of 152430 ***