From Bugzilla Helper: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Description of problem: For processes beeing killed (e.g. CTRL-C, KILL, TERM) "sa" utility shows a "cpu" value about 5 times larger then the actual consumed cpu user and system time, using an SMP kernel: 2.4.18-27.7.x.cernsmp (modified at CERN: http://www.cern.ch). Version-Release number of selected component (if applicable): psacct-6.3.2-19 How reproducible: Always Steps to Reproduce: 1.Start a cycling process consuming ~100% of one CPU 2.Run it for 10 minutes 3.Kill it with TERM, KILL or CTRL-C Actual Results: "sa -u" shows 3067.60 cpu for the process Expected Results: "sa -u" should show about 600.00 cpu Additional info: On our interactive nodes we use a daemon process "beniced" to kill user processes consuming more then 30 minutes CPU time and now these processes are accounted with about 5 times more CPU time, then actually consumed.
We are having the same problem, but it is worse than simply killed processes. Kernel 2.4.18-10 is fine. Upgrading to 2.4.18-27.7.x has broken processing accounting. Here is an example of it being broken: > time openssl speed (SNIP) 314.796u 0.056s 5:21.43 97.9% 0+0k 0+0io 232pf+0w > lastcomm |grep openssl openssl dwb7 stdin 1611.81 secs Sun May 4 19:47 as you can see, openssl speed took about 5 minutes to run. But lastcom shows the time being about 5 times that number. Reproducible 100% of the time on ANY computer running 2.4.18-27. This is a HUGE problem as it results in users being overcharged for cpu time.
We believe we've found the problem. In 2.4.18-27, the HZ value is set to 512 instead of 100 for i686 and athlon (but for some odd reason, still set to 100 for i386 and i586). You can confirm this by grepping for "_HZ" in the configs for kernel source: > grep "_HZ" * kernel-2.4.18-athlon.config:CONFIG_HZ=512 kernel-2.4.18-athlon.config:CONFIG_HZ=512 kernel-2.4.18-athlon-smp.config:CONFIG_HZ=512 kernel-2.4.18-athlon-smp.config:CONFIG_HZ=512 kernel-2.4.18-i386-BOOT.config:CONFIG_HZ=100 kernel-2.4.18-i386.config:CONFIG_HZ=100 kernel-2.4.18-i586.config:CONFIG_HZ=100 kernel-2.4.18-i586-smp.config:CONFIG_HZ=100 kernel-2.4.18-i686-bigmem.config:CONFIG_HZ=512 kernel-2.4.18-i686-bigmem.config:CONFIG_HZ=512 kernel-2.4.18-i686.config:CONFIG_HZ=512 kernel-2.4.18-i686.config:CONFIG_HZ=512 kernel-2.4.18-i686-debug.config:CONFIG_HZ=512 kernel-2.4.18-i686-debug.config:CONFIG_HZ=512 kernel-2.4.18-i686-smp.config:CONFIG_HZ=512 kernel-2.4.18-i686-smp.config:CONFIG_HZ=512 For things like proc file output, a conversion is done so that you see the right values in realtime: from linux-2.4.18-variablehz.patch: + hz_to_std(task->times.tms_utime), + hz_to_std(task->times.tms_stime), + hz_to_std(task->times.tms_cutime), + hz_to_std(task->times.tms_cstime), but, if you look at acct.c, which records the accounting files, this conversion is not done. So, either acct.c needs to be patched to do the conversion, or you have to go download later source for gnuacct which supports the --ahz option allowing you to specify the hz value of 512 to get correct output. Or, you have to edit the configs to define CONFIG_HZ as 100 for the appropriate archs instead of as 512.
*** Bug 114617 has been marked as a duplicate of this bug. ***
*** Bug 117817 has been marked as a duplicate of this bug. ***
This is fixed in FC3t3 and AS2.1. Moving this bug to RHEL3 so that it gets fixed there, too.
Erratum RHBA-2004:497-05
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-497.html