Bug 650934
Summary: | Idle System has high load without visible cause | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Dennis Jacobfeuerborn <dennisml> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | medium | Docs Contact: | |
Priority: | low | ||
Version: | 14 | CC: | andreasfleig, covex, dougsland, fabsh, fedora, gansalmon, goeran, hidoufr, itamar, jeff, jonathan, jurek.bajor, kernel-maint, kmcmartin, madhu.chinakonda, mattdm, mike, misek, mishu, mrunge, ondrejj, orion, pbrobinson, rissko, rmj, sampos |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Fixed In Version: | kernel-2.6.35.10-72.fc14 | Doc Type: | Bug Fix |
Last Closed: | 2010-12-09 16:10:33 UTC | Type: | --- |
Description
Dennis Jacobfeuerborn
2010-11-08 12:29:02 UTC
Seems to me this is the same issue as:
https://bugzilla.redhat.com/show_bug.cgi?id=635813

(In reply to comment #1)
> Seems to me this is the same issue as:
> https://bugzilla.redhat.com/show_bug.cgi?id=635813

The profiles in that bug don't show the high impact of rescheduling interrupts, so this might be a separate issue.

Are you sure? Looks to me like the rescheduling is waking up the CPU a lot, which causes the high load.

(In reply to comment #3)
> Are you sure? Looks to me like the rescheduling is waking up the CPU a lot
> which causes the high load.

That certainly is part of the problem, but if you compare the profiles then "[Rescheduling interrupts] <kernel IPI>" only shows up significantly in mine and not in the others. I'm not enough of a kernel guru to determine if these are related though.

Oh, true. Never mind, ignore me.

Ok, so after booting with "nohz=off" my profile now looks like this:

Top causes for wakeups:
  73.3% (2002.0)  [kernel scheduler] Load balancing tick
  16.0% ( 437.7)  [Rescheduling interrupts] <kernel IPI>
   4.0% ( 110.6)  plugin-containe
   3.6% (  99.4)  pulseaudio
   0.7% (  19.2)  firefox
   0.6% (  17.0)  thunderbird-bin
   0.5% (  13.5)  [ICE1712] <interrupt>
   0.4% (  10.6)  skype
   0.2% (   4.6)  [eth0] <interrupt>
   0.1% (   2.4)  [sata_nv] <interrupt>

This looks much more like the profiles in the other bug. The net result is that the load now goes toward 0.0 after idling on the desktop for a few minutes. Not sure what to make of that though.

Ubuntu 10.04 had the same issue with Kernel 2.6.32:
http://ubuntuforums.org/showthread.php?t=1471010
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
The launchpad bug seems to be specific to EC2, but actually I noticed high system loads on every 10.04 desktop machine I encountered.

*** Bug 635062 has been marked as a duplicate of this bug. ***

Hi,
It looks like there is an ongoing discussion on the kernel mailing list about ~high~ load (0.60 with 2.6.36) when the CPU is idle (it looks like this issue started with 2.6.35).
http://lkml.indiana.edu/hypermail/linux/kernel/1010.3/00310.html
Regards
Hamidou Dia

http://kyle.fedorapeople.org/kernel/2.6.35.9-62.bz635813.fc14/x86_64/
Try this build, please?

(In reply to comment #10)
> http://kyle.fedorapeople.org/kernel/2.6.35.9-62.bz635813.fc14/x86_64/
>
> Try this build, please?

This build is noticeably better. What a nice Thanksgiving treat. Hope you had a good Thanksgiving and weren't busy on this kernel bug. Typing an e-mail or leaving the system at idle now results in the desired load averages (0.10 or lower). Load is also correct when running a demanding application, and it settles back down after stopping it. Good work! It's unfortunate Linus was more concerned with how his Adobe Flash was working than with the load average counters.

Is this an x86_64 issue only, or is there an i686 kernel I can test? I think I see this on my P4 box.

No, shouldn't be. I just can't be bothered building 32-bit images constantly, since it triples the amount of time it takes me.

http://kyle.fedorapeople.org/kernel/2.6.35.9-62.bz650934.fc14/
peterz sent out a new patch this morning, so please try this. I included i686 images this time.
NOTE: it will fix only the high running idle load average, and NOT any bugs about high cpu usage.

(In reply to comment #14)
> http://kyle.fedorapeople.org/kernel/2.6.35.9-62.bz650934.fc14/

Did you also capture the -devel package? I need it for my... special modules.

Looks good to me - got to below .1 quickly after boot with nothing running. Now at ~.4 with firefox running at 50% cpu.

no, sorry, i threw it out already. the src.rpm is there though.

My compiled build from comment 14 seems to be acting the same as the kernel from comment 10.
As noted by some of the folks in the LKML thread, it takes time (~5 minutes) of zero activity to see low idle load being reported for the last minute load counter. I would expect the last minute counter to reflect the last minute, but this is much better than the vanilla .35 kernel.

Kernel from comment 14 runs very well (a little better than the kernel from comment 10) and is much better than the last official kernel (kernel-2.6.35.6-48.fc14).

After booting to the desktop with the kernel from comment 14 and waiting for a few minutes, the load goes all the way down to 0.00 for me. Looks good.
I'm wondering though why it takes several minutes to do so. Given that once the desktop is loaded the system activity stays constant and the load value is defined as the average over one minute, I would expect the load to reach the minimum value pretty much exactly one minute after the system activity dies down.

Why was this bug closed as "RAWHIDE", even though it was filed for Fedora 14?
This still persists in Fedora 14 stable, reopening.

(In reply to comment #21)
> Why was this bug closed as "RAWHIDE", even though it was filed for Fedora 14?
> This still persists in Fedora 14 stable, reopening.

"Fixed in rawhide" is sometimes (often!) the best fix for bugs filed against a stable release, particularly when a bug isn't critical and the fix itself may cause disruption for other people. I'm not commenting on whether that's the case with this particular bug, but just noting that it's a reasonable action in general.

Because I fixed it there too.

(In reply to comment #23)
> Because I fixed it there too.

Well then. :)

(In reply to comment #21)
> Why was this bug closed as "RAWHIDE", even though it was filed for Fedora 14?
> This still persists in Fedora 14 stable, reopening.

Fixed in F14[1], too. Meaning: Wait for the next F14 kernel. ;)
Thanks, Kyle!

[1] http://pkgs.fedoraproject.org/gitweb/?p=kernel.git;a=commit;h=bed92c4e508998dbcbf358183f61892363277e15

Please, next time leave this open and let bodhi close it. Then it will not be closed before the fix is pushed to stable. Thank you.

No, because then bodhi will close things which are not appropriate. There's no way to flag them individually, so I have to do it by hand when fixes get committed.

kernel-2.6.35.10-68.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-68.fc14

kernel-2.6.35.10-69.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-69.fc14

kernel-2.6.35.10-72.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/kernel-2.6.35.10-72.fc14

kernel-2.6.35.10-72.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
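
On the question raised in the thread about why the one-minute load needs several minutes of idleness to reach 0.00: the reported value is not a plain average over the last 60 seconds. The kernel samples the number of active tasks roughly every five seconds and folds each sample into an exponentially damped average, so old activity fades gradually instead of dropping out after exactly one minute. The following is only a rough floating-point sketch of that textbook formula, not the kernel's fixed-point code, and it does not model the NO_HZ idle accounting that the patch in this bug addresses. The function names, the starting load of 1.0, and the assumption that anything below 0.005 is displayed as 0.00 are illustrative choices, not taken from the thread.

```python
import math

# Assumptions for this sketch: samples are taken every 5 seconds and the
# "one-minute" figure uses a 60-second time constant.
SAMPLE_INTERVAL_S = 5.0
WINDOW_S = 60.0
DECAY = math.exp(-SAMPLE_INTERVAL_S / WINDOW_S)


def decay_one_minute_load(start_load, active_tasks, seconds):
    """Apply the damped-average update once per sample for `seconds` seconds.

    Each step keeps a DECAY share of the old value and blends in the
    current number of active tasks with weight (1 - DECAY).
    """
    load = start_load
    for _ in range(int(seconds // SAMPLE_INTERVAL_S)):
        load = load * DECAY + active_tasks * (1.0 - DECAY)
    return load


def seconds_until_below(start_load, threshold):
    """Seconds of complete idleness until the value drops below `threshold`."""
    load = start_load
    steps = 0
    while load >= threshold:
        load *= DECAY  # idle system: zero active tasks in every sample
        steps += 1
    return steps * SAMPLE_INTERVAL_S


if __name__ == "__main__":
    # Hypothetical starting point: a load of 1.0 at the moment activity stops.
    print(round(decay_one_minute_load(1.0, 0, 60), 3))
    print(seconds_until_below(1.0, 0.1))
    print(seconds_until_below(1.0, 0.005))
```

Under these assumptions, a load of 1.0 is still at about 0.37 after a full idle minute, takes about 140 seconds to fall below 0.10, and about 320 seconds to fall below 0.005, which is in line with the "~5 minutes" observed in the comments above.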