Red Hat Bugzilla – Bug 89959
"sa" shows wrong cpu time for killed processes
Last modified: 2007-11-30 17:06:55 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Description of problem:
For processes beeing killed (e.g. CTRL-C, KILL, TERM) "sa" utility shows
a "cpu" value about 5 times larger then the actual consumed cpu user and
system time, using an SMP kernel: 2.4.18-27.7.x.cernsmp (modified at CERN:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1.Start a cycling process consuming ~100% of one CPU
2.Run it for 10 minutes
3.Kill it with TERM, KILL or CTRL-C
Actual Results: "sa -u" shows 3067.60 cpu for the process
Expected Results: "sa -u" should show about 600.00 cpu
On our interactive nodes we use a daemon process "beniced" to kill user
processes consuming more then 30 minutes CPU time and now these processes are
accounted with about 5 times more CPU time, then actually consumed.
We are having the same problem, but it is worse than simply killed processes.
Kernel 2.4.18-10 is fine. Upgrading to 2.4.18-27.7.x has broken processing
Here is an example of it being broken:
> time openssl speed
314.796u 0.056s 5:21.43 97.9% 0+0k 0+0io 232pf+0w
> lastcomm |grep openssl
openssl dwb7 stdin 1611.81 secs Sun May 4 19:47
as you can see, openssl speed took about 5 minutes to run. But lastcom shows the
time being about 5 times that number. Reproducible 100% of the time on ANY
computer running 2.4.18-27.
This is a HUGE problem as it results in users being overcharged for cpu time.
We believe we've found the problem.
In 2.4.18-27, the HZ value is set to 512 instead of 100 for i686 and athlon (but
for some odd reason, still set to 100 for i386 and i586). You can confirm this
by grepping for "_HZ" in the configs for kernel source:
> grep "_HZ" *
For things like proc file output, a conversion is done so that you see the right
values in realtime:
but, if you look at acct.c, which records the accounting files, this conversion
is not done.
So, either acct.c needs to be patched to do the conversion, or you have to go
download later source for gnuacct which supports the --ahz option allowing you
to specify the hz value of 512 to get correct output. Or, you have to edit the
configs to define CONFIG_HZ as 100 for the appropriate archs instead of as 512.
*** Bug 114617 has been marked as a duplicate of this bug. ***
*** Bug 117817 has been marked as a duplicate of this bug. ***
This is fixed in FC3t3 and AS2.1. Moving this bug to RHEL3 so that it
gets fixed there, too.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.