Description of problem: Only processing time spent in the main() thread gets counted. Version-Release number of selected component (if applicable): kernel-2.6.3-1.97 procps-3.1.15-5 How reproducible: Test case attached.
Created attachment 98029 [details] Test case Run this with 'time java proctest', then run again and watch it with ps and top
Here is a fix that works on my system. It only addresses the issue at hand... Before: [brian@selkirk /]$ ps -ef brian 5277 5276 0 Mar04 pts/0 00:00:00 java -server ... After: [brian@selkirk /]$ ps -ef brian 5277 5276 0 Mar04 pts/0 00:02:04 java -server ... Here is the diff: [root@selkirk proc]# diff readproc.h readproc.h.bak 215,217d214 < //add the task times to our process time < extern void add_task_info(PROCTAB *restrict const PT, const proc_t *restrict const p); < [root@selkirk proc]# diff readproc.h readproc.h.bak 215,217d214 < //add the task times to our process time < extern void add_task_info(PROCTAB *restrict const PT, const proc_t *restrict const p); < [root@selkirk proc]# diff readproc.c readproc.c.bak 554,558d553 < if (p->nlwp > 1) < { < add_task_info(PT, p); < } < 564,579d558 < static void add_task_info(PROCTAB *restrict const PT, const proc_t *restrict const p) < { < proc_t *task; < < for (;;) < { < task = readtask(PT, p, NULL); < if (task == NULL) < break; < p->utime += task->utime; < p->stime += task->stime; < p->utime += task->cutime; < p->stime += task->cstime; < } < } < BTW, these patches are against 3.2.0-1 Since I've done the hard work, please consider fixing this! :)
added patch to procps-3_2_0-3
Created attachment 98923 [details] Test to ensure procps works properly
Removed the patch because of comments from upstream maintainer. >+static void add_task_info(PROCTAB *restrict const PT, >> + const proc_t *restrict const p) >> +{ >> + proc_t *task; >> + for (;;) >> + { >> + task = readtask(PT, p, NULL); >> + if (task == NULL) >> + break; >> + p->utime += task->utime; >> + p->stime += task->stime; >> + p->utime += task->cutime; >> + p->stime += task->cstime; >> + } >> +} >> + >> ////////////////////////////////////////////////////////////////////////////////// >> // This reads /proc/*/task/* data, for one task. >> // p is the POSIX process (task group summary) (not needed by THIS implementation) >> --- procps-3.2.0/proc/readproc.h~ 2004-02-01 18:22:39.000000000 -0500 >> +++ procps-3.2.0/proc/readproc.h 2004-03-25 15:30:56.539717488 -0500 >> @@ -212,6 +212,9 @@ >> //fill out a proc_t for a single task >> extern proc_t * get_proc_stats(pid_t pid, proc_t *p); >> >> +//add the task times to our process time >> +extern void add_task_info(PROCTAB *restrict const PT, const proc_t *restrict const p); >> + >> // openproc/readproctab: >> // >> // Return PROCTAB* / *proc_t[] or NULL on error ((probably) "/proc" cannot be First, "static void" and "extern void" don't match. Second, you completely broke thread support. Third, performance has surely gotten worse. I can see what you're trying to do, but this really needs kernel support. For that matter, %CPU on Linux has never worked. The BSD and SysV kernels maintain a per-process decaying average; Linux only does this for system load. I'll attach a copy of the procps test suite. Untar it over the procps source, run "make" at the top level, then cd to the test directory and run "./runtests". Here are the failing tests: FAIL ps/thread-nosort-H err-1 FAIL ps/thread-nosort-HL err-1 FAIL ps/thread-nosort-L err-1 FAIL ps/thread-nosort-m eval-fn FAIL ps/thread-nosort-mL eval-fn FAIL ps/thread-sorted-H err-1 FAIL ps/thread-sorted-HL err-1 FAIL ps/thread-sorted-L err-1 FAIL ps/thread-sorted-m eval-fn FAIL ps/thread-sorted-mL eval-fn >+static void add_task_info(PROCTAB *restrict const PT, >> + const proc_t *restrict const p) >> +{ >> + proc_t *task; >> + for (;;) >> + { >> + task = readtask(PT, p, NULL); >> + if (task == NULL) >> + break; >> + p->utime += task->utime; >> + p->stime += task->stime; >> + p->utime += task->cutime; >> + p->stime += task->cstime; >> + } >> +} In addition to the problems listed in my other email: 4. huge memory leak 5. double-counting of the first task
Created attachment 98924 [details] Test cases for procps.
I'll outline a solution below. While non-kernel hacks are possible, they are unacceptably slow -- they nearly wipe out the whole reason for having threads in task directories. If I accept a user-space hack, there will be far less motivation for fixing the kernel. Here's part of an email I sent to somebody who was interested in solving this problem. He had plenty of code experience, but no kernel experience and little time. I think these instructions should be simple to follow. The only concerns are locking and cache line bouncing. Aside from that, the code change is trivial. ----------------------------------------------------------- > If you think there is something that I can do to help you in a 1 to 2 > day time frame, let me know. I will work on it at night and over the > weekend. No, but you can produce a 90%-correct hack for yourself in a day. This involves the kernel source, not procps. In fs/proc/base.c find the proc_pident_lookup function. Change this: case PROC_TID_STAT: case PROC_TGID_STAT: inode->i_fop = &proc_info_file_operations; ei->op.proc_read = proc_pid_stat; break; Into this: case PROC_TID_STAT: inode->i_fop = &proc_info_file_operations; ei->op.proc_read = proc_tid_stat; break; case PROC_TGID_STAT: inode->i_fop = &proc_info_file_operations; ei->op.proc_read = proc_tgid_stat; break; In fs/proc/array.c, copy proc_pid_stat to make a second function. Name one copy proc_tid_stat, and the other copy proc_tgid_stat. There's also an extern declaration that you need to duplicate in the fs/proc/base.c file. Use grep to see if I missed anything. Now you have separate per-process and per-thread functions, but they show the same thing. Modify the per-process (tgid) one to take CPU usage data from some new variables. You can put these variables in the task_struct with the others, causing non-leader tasks to have some wasted space. The fancy solution would involve the signal struct, but that requires different locking. That leaves only one thing left. You need to insure that your new struct members get updated. Find the places where this happens for the per-thread variables. Either use the group_leader pointer for this (in case you placed the new data in the task_stuct of the leader) or... you're on your own for using the signal struct. There. Done, except for some rare glitches when multiple threads try to update the counter at the exact same moment.
*** Bug 120460 has been marked as a duplicate of this bug. ***
Created attachment 103381 [details] kernel-side fix This should apply to a very recent kernel from Linus's BitKeeper tree, and maybe to the upcoming 2.6.9 release. There may need to be an additional small adjustment to the code that updates the values on the clock tick, to make sure that the per-process ones are done then instead of on thread death.
the 2.6.10 kernel has a fix
I'm running 2.6.11-1.14_FC3 and I still don't see per-thread CPU usage in top. Am I doing something wrong?
David, the problem wasn't in "per-thread CPU usage" in the top, but the problem was in the kernel <2.6.10. It doesn't include thread specific data into sum of information about process ("parent" of threads). It means thread specific CPU usage was hidden for top or ps. The top from procps <= 3.2.5 doesn't display per-thread (/proc/<pid>/tash/<tid>/*) data. It displays information about main process ("parent" of threads) only. The support for separate tasks is new in procps-3.2.6. And it will be available in FC5.