Bug 805392 - idle & iowait ticks overflow
Summary: idle & iowait ticks overflow
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-03-21 06:44 UTC by colyli
Modified: 2013-01-15 14:46 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-01-15 14:46:02 UTC
Target Upstream Version:



Description colyli 2012-03-21 06:44:09 UTC
Description of problem:
In /proc/stat, the idle and iowait ticks overflow, which results in top/sar reporting high CPU utilization.
NOTE: This bug is fixed; the patch is mentioned in the information below.


Version-Release number of selected component (if applicable):
2.6.32-220.7.1

How reproducible:

This bug was introduced after we merged 3 upstream patches to fix an iowait accounting problem:
1) nohz: Fix update_ts_time_stat idle accounting
2) nohz: Make idle/iowait counter update conditional
3) Consider NO_HZ when printing idle and iowait times

After applying these patches, we observed abnormal CPU utilization numbers:
1) user/sys/idle/iowait are all 0%
2) user util% is more than 200%

After some debugging, it seems that once uptime exceeds 2 hours, the idle ticks reported in /proc/stat for some CPU cores are observed to decrease. On a 16-core machine, we observe the overflow at around 5-8 minutes.
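
One way to spot the symptom described above is to compare two snapshots of /proc/stat and flag any per-CPU idle or iowait counter that went backwards. The following is only a minimal sketch: the function names and the sample numbers are made up for illustration, and the only thing taken from /proc/stat itself is the field order (cpuN user nice system idle iowait ...).

```python
def parse_cpu_ticks(stat_text):
    """Map each per-CPU label (cpu0, cpu1, ...) to its (idle, iowait) ticks."""
    ticks = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like: cpuN user nice system idle iowait ...
        # The aggregate "cpu" line is skipped on purpose.
        if len(fields) >= 6 and fields[0].startswith("cpu") and fields[0] != "cpu":
            ticks[fields[0]] = (int(fields[4]), int(fields[5]))
    return ticks


def find_decreases(before, after):
    """Return (cpu, counter, old, new) for every counter that went backwards."""
    problems = []
    for cpu, (old_idle, old_iowait) in sorted(before.items()):
        if cpu not in after:
            continue
        new_idle, new_iowait = after[cpu]
        if new_idle < old_idle:
            problems.append((cpu, "idle", old_idle, new_idle))
        if new_iowait < old_iowait:
            problems.append((cpu, "iowait", old_iowait, new_iowait))
    return problems


# Made-up snapshots for illustration; on a live system, read /proc/stat twice.
SAMPLE_BEFORE = """\
cpu  1000 0 500 860000 2400 0 0 0 0
cpu0 500 0 250 430000 1200 0 0 0 0
cpu1 500 0 250 430000 1200 0 0 0 0
"""

SAMPLE_AFTER = """\
cpu  1050 0 515 432200 2425 0 0 0 0
cpu0 520 0 260 431000 1210 0 0 0 0
cpu1 530 0 255 1200 1215 0 0 0 0
"""

if __name__ == "__main__":
    before = parse_cpu_ticks(SAMPLE_BEFORE)
    after = parse_cpu_ticks(SAMPLE_AFTER)
    for cpu, counter, old, new in find_decreases(before, after):
        print(f"{cpu}: {counter} went backwards ({old} -> {new})")
```

Since these counters should only ever grow, any hit from such a check points at an accounting problem rather than at real load.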

Actual results:
After checking the code, it seems there is an idle and iowait ticks overflow, which was introduced by Michal's 3 fixes above. We also found that this issue is already fixed upstream.

The core fix is,
procfs: do not overflow get_{idle,iowait}_time for nohz

We backported this patch and the corresponding implementation of nsecs_to_jiffies64(), and ran the fix for more than 12 hours. No overflow and no mistaken CPU utilization numbers were reported.
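
For readers who want to see why a counter can go backwards, here is a rough arithmetic sketch. It rests on an assumption this report does not spell out: that the overflow comes from narrowing a 64-bit microsecond count to 32 bits before converting it to ticks, and that a 64-bit nanosecond conversion (which is what the name nsecs_to_jiffies64() suggests) avoids the narrowing. It is not kernel code; the function names and the HZ value are hypothetical.

```python
U32_MASK = 0xFFFFFFFF
USEC_PER_SEC = 1_000_000
NSEC_PER_SEC = 1_000_000_000
HZ = 1000  # assumed tick rate, for illustration only


def ticks_via_32bit_usecs(idle_usecs):
    """Convert to ticks after truncating the microsecond count to 32 bits."""
    truncated = idle_usecs & U32_MASK  # models the assumed narrowing
    return truncated * HZ // USEC_PER_SEC


def ticks_via_64bit_nsecs(idle_nsecs):
    """Convert a nanosecond count to ticks without narrowing it first."""
    return idle_nsecs * HZ // NSEC_PER_SEC


def wrap_period_minutes():
    """Minutes of accumulated time after which a 32-bit usec count wraps."""
    return (U32_MASK + 1) / USEC_PER_SEC / 60


if __name__ == "__main__":
    print(f"32-bit usec count wraps after {wrap_period_minutes():.1f} minutes")
    for minutes in (71, 72):
        usecs = minutes * 60 * USEC_PER_SEC
        lossy = ticks_via_32bit_usecs(usecs)
        exact = ticks_via_64bit_nsecs(usecs * 1000)
        print(f"{minutes} min idle: 32-bit path {lossy} ticks, 64-bit path {exact} ticks")
```

Under this assumption the lossy path drops sharply between 71 and 72 minutes of accumulated idle time, while the 64-bit path keeps increasing.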

Comment 3 colyli 2012-03-28 04:40:28 UTC
Running on 14 servers for 5+ days, NO report of idle & iowait ticks overflow.

Comment 4 RHEL Program Management 2012-05-03 05:24:49 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 5 colyli 2012-05-03 06:16:59 UTC
FYI, up to today, there are 20+ servers running 40+ days, NO report for idle & iowait ticks overflow

Comment 6 Prarit Bhargava 2013-01-15 14:46:02 UTC
(In reply to comment #5)
> FYI, up to today, there are 20+ servers running 40+ days, NO report for idle
> & iowait ticks overflow

Since this BZ hasn't been updated recently and this is the last comment from the reporter, I'm closing this as NOTABUG for now.

If this is still an issue, please reopen the BZ.

P.

