Bug 137583 - failures of the process accounting in ps, top, and time
Status: CLOSED UPSTREAM
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i686 Linux
Priority: medium Severity: medium
Assigned To: Rik van Riel
Reported: 2004-10-29 12:48 EDT by Allen Brown
Modified: 2007-11-30 17:07 EST (History)
Last Closed: 2004-11-01 15:51:18 EST
Description Allen Brown 2004-10-29 12:48:11 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040207 Firefox/0.8

Description of problem:
cat /etc/redhat-release 
Red Hat Enterprise Linux ES release 3.90 (Nahant)

Process timing measurement is incorrect.  Also note that top, ps, and
/proc will not charge processor time to tasks which complete their
load in less than 1/HZ (a "jiffy").
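
To illustrate the effect, here is a toy model of tick-sampled accounting. It is a sketch, not kernel code: it assumes HZ=1000, assumes the task that is on the CPU when a timer tick fires is charged one whole jiffy, and uses made-up function names (charged_ticks, simulate) and made-up task timings.

```python
def charged_ticks(start_us: int, run_us: int, hz: int = 1000) -> int:
    """Count the timer ticks that fire while a task is on the CPU.

    Ticks are assumed to fire at exact multiples of 1/HZ, and the task
    occupies the CPU during [start_us, start_us + run_us).
    """
    tick_us = 1_000_000 // hz
    end_us = start_us + run_us
    first = -(-start_us // tick_us)  # index of first tick at or after start
    last = -(-end_us // tick_us)     # index of first tick at or after end
    return max(0, last - first)


def simulate(n_tasks: int, offset_us: int, run_us: int, hz: int = 1000):
    """Run n_tasks short tasks, one per tick period, each starting
    offset_us after a tick. Returns (actual_us, charged_us)."""
    tick_us = 1_000_000 // hz
    actual_us = 0
    charged_us = 0
    for i in range(n_tasks):
        start_us = i * tick_us + offset_us
        actual_us += run_us
        charged_us += charged_ticks(start_us, run_us, hz) * tick_us
    return actual_us, charged_us


if __name__ == "__main__":
    actual, charged = simulate(n_tasks=1000, offset_us=100, run_us=800)
    print(f"CPU actually used: {actual} us, CPU charged: {charged} us")
```

With these assumed inputs, each task starts and finishes between two ticks, so the model reports 800000 us of CPU actually used and 0 us charged.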



Version-Release number of selected component (if applicable):
kernel-smp-2.6.8-1.528.2.10

How reproducible:
Always

Steps to Reproduce:
1. Run top (see output #1 below)
2. Run an app (eatcpu.linux)
3. See output #2 below
4. Kill eatcpu.linux
5. See output #3 below

Actual Results:  #1:
top - 12:36:24 up 2 days, 20:06,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  75 total,   1 running,  74 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.1% sy,  0.0% ni, 99.9% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   3976640k total,   399000k used,  3577640k free,   200328k buffers
Swap:  4096312k total,        0k used,  4096312k free,    96852k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13122 root      16   0  3228  928 1652 R  0.5  0.0   0:00.09 top
    1 root      16   0  3620  492 1396 S  0.0  0.0   0:01.37 init

--------------------------------------------------
#2:

top - 12:38:14 up 2 days, 20:08,  1 user,  load average: 0.66, 0.19, 0.06
Tasks:  76 total,   2 running,  74 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.1% us,  0.0% sy,  0.0% ni, 74.9% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   3976640k total,   399896k used,  3576744k free,   200328k buffers
Swap:  4096312k total,        0k used,  4096312k free,    97632k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13123 root      25   0  2208  688 2164 R 99.9  0.0   1:42.34 eatcpu.linux
    1 root      16   0  3620  492 1396 S  0.0  0.0   0:01.37 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.24 migration/0

-----------------------------------------------------
#3:

top - 12:39:27 up 2 days, 20:09,  1 user,  load average: 0.90, 0.37, 0.13
Tasks:  75 total,   1 running,  74 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.1% sy,  0.0% ni, 99.9% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   3976640k total,   399896k used,  3576744k free,   200344k buffers
Swap:  4096312k total,        0k used,  4096312k free,    97616k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      16   0  3620  492 1396 S  0.0  0.0   0:01.37 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.24 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:00.02 migration/1


Expected Results:  The CPU percentages reported in the Cpu(s) summary
line should match the total of the %CPU values reported per PID.



Additional info:
Comment 2 Allen Brown 2004-11-01 10:49:20 EST
Now that I look at my initial description, it was not accurate and 
didn't do the problem justice. We are running thousands of programs 
that are completing within a millisecond (or two). These processes 
consume a considerable amount of processing power but are not charged 
processor time because tasks which complete their load in less than 
2/HZ (two "jiffies") are not charged.
In previous kernel releases poll latency was defined as 1/HZ with one 
run queue. Now it appears process scheduling is done with two run 
queues, making the poll latency two "jiffies". 
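
One rough way to observe this from user space is sketched below. It forks many very short-lived children and compares the elapsed wall-clock time with the CPU time the kernel charged to them. This is not the reporter's sample code: the function name, task count, and loop size are illustrative assumptions, and it needs a Unix-like system with os.fork. On kernels with precise accounting, the two numbers may be much closer than on the kernel discussed here.

```python
import os
import time


def spawn_short_tasks(n_tasks: int = 200, spin: int = 2000):
    """Fork n_tasks children that each do a little work and exit.

    Returns (wall_seconds, cpu_seconds_charged_to_children).
    """
    before = os.times()
    t0 = time.monotonic()
    for _ in range(n_tasks):
        pid = os.fork()
        if pid == 0:
            # Child: burn a small amount of CPU, then exit immediately
            # without running the parent's cleanup handlers.
            x = 0
            for i in range(spin):
                x += i
            os._exit(0)
        # Parent: reap the child so its CPU time is added to our
        # children_user / children_system counters.
        os.waitpid(pid, 0)
    wall = time.monotonic() - t0
    after = os.times()
    charged = ((after.children_user - before.children_user)
               + (after.children_system - before.children_system))
    return wall, charged


if __name__ == "__main__":
    wall, charged = spawn_short_tasks()
    print(f"wall clock: {wall:.3f}s, CPU charged to children: {charged:.3f}s")
```

A charged figure far below the wall-clock figure on an otherwise idle machine would be consistent with the undercounting described above, although fork and wait overhead in the parent also contributes to the gap.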

If it will help, I might be able to provide some sample code to 
illustrate this. But based on my limited research, this is a known 
issue within the community. Albert Cahalan (the maintainer of top and 
ps) suggested we make some kernel hacks to get this working, but that 
would render our release unsupportable by Red Hat.

Please advise ... thanks.
Comment 3 Rik van Riel 2004-11-01 15:51:18 EST
Indeed, this is a known issue upstream.  I would be happy to work
upstream with you to get the issue fixed there.  You are right in that
the changes would probably be so invasive that such a hacked kernel
would not be Red Hat supportable...

Let me know if you want help fixing this issue in the community. IMHO
it is worth fixing, just not sure if Linus will agree ;)
Comment 4 Allen Brown 2004-11-02 16:30:41 EST
We would like to have this resolved upstream and your help would be 
greatly appreciated. We were told it would render our release 
unsupportable if we made the changes ourselves.
Please advise .... 
