| Summary: | idle & iowait ticks overflow | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | colyli |
| Component: | kernel | Assignee: | Prarit Bhargava <prarit> |
| Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.2 | CC: | eguan, Jes.Sorensen, prarit |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-01-15 14:46:02 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
running on 14 servers for 5+ days, NO report for idle & iowait ticks overflow.

Since RHEL 6.3 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

FYI, up to today, there are 20+ servers running 40+ days, NO report for idle & iowait ticks overflow.

(In reply to comment #5)
> FYI, up to today, there are 20+ servers running 40+ days, NO report for idle & iowait ticks overflow

Since this BZ hasn't been updated recently and this is the last comment from the reporter, I'm closing this as NOTABUG for now. If this is still an issue, please reopen the BZ.

P.
Description of problem:
In /proc/stat, the idle and iowait tick counters overflow, which results in top/sar reporting high CPU utilization.

NOTE: This bug is fixed; the patch is mentioned in the information below.

Version-Release number of selected component (if applicable):
2.6.32-220.7.1

How reproducible:
This bug was introduced after we merged 3 upstream patches to fix an iowait accounting problem:
1) nohz: Fix update_ts_time_stat idle accounting
2) nohz: Make idle/iowait counter update conditional
3) Consider NO_HZ when printing idle and iowait times

After applying these patches, we observed abnormal CPU utilization numbers:
1) all of user/sys/idle/iowait are 0%
2) user util% is more than 200%

After some debugging, it seems that when uptime is more than 2 hours, the idle ticks of some CPU cores in /proc/stat are observed to decrease. On a 16-core machine, we observe the overflow around 5-8 minutes.

Actual results:
After checking the code, it seems there is an idle and iowait ticks overflow introduced by Michal's 3 fixes above. We found that this issue is already fixed upstream. The core fix is:
procfs: do not overflow get_{idle,iowait}_time for nohz

We backported this patch and the corresponding implementation of nsecs_to_jiffies64(), and ran the fix for more than 12 hours: no overflow and no mistaken CPU utilization numbers were reported.
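
The symptom described above (per-CPU idle or iowait ticks in /proc/stat going backwards between two readings) can be checked for with a small script. The following is only a minimal sketch and is not taken from this report: the function names `parse_cpu_ticks`, `find_backward_counters`, and `watch` are hypothetical, and it assumes the usual /proc/stat layout in which each `cpuN` line lists user, nice, system, idle, and iowait ticks in that order.

```python
import time


def parse_cpu_ticks(stat_text):
    """Return {cpu_name: (idle, iowait)} for each per-CPU line of /proc/stat text."""
    ticks = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 user nice system idle iowait ...".
        # The aggregate "cpu" line is skipped so only single cores are checked.
        if len(fields) >= 6 and fields[0].startswith("cpu") and fields[0] != "cpu":
            ticks[fields[0]] = (int(fields[4]), int(fields[5]))
    return ticks


def find_backward_counters(before_text, after_text):
    """List (cpu, counter, old, new) for idle/iowait counters that decreased."""
    before = parse_cpu_ticks(before_text)
    after = parse_cpu_ticks(after_text)
    problems = []
    for cpu in sorted(before):
        if cpu not in after:
            continue
        for name, old, new in zip(("idle", "iowait"), before[cpu], after[cpu]):
            # These counters are cumulative, so a smaller later value is suspicious.
            if new < old:
                problems.append((cpu, name, old, new))
    return problems


def watch(interval_seconds=5.0, path="/proc/stat"):
    """Take two samples of /proc/stat and report counters that went backwards."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval_seconds)
    with open(path) as f:
        after = f.read()
    return find_backward_counters(before, after)
```

Calling `watch()` periodically on an affected machine should return a non-empty list whenever a counter decreases between the two samples; on a kernel with the fix, it is expected to stay empty. Tools that derive utilization from the difference between samples would see a negative idle delta in that case, which is consistent with the nonsensical percentages reported above.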