Bug 866212 - power consumption raised significantly with 3.6.1 kernel
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance

Reported: 2012-10-14 13:02 EDT by Denis Auroux
Modified: 2014-01-06 11:50 EST
CC List: 45 users

Doc Type: Bug Fix
Last Closed: 2013-08-16 09:44:55 EDT
Type: Bug

Attachments: None
Description Denis Auroux 2012-10-14 13:02:36 EDT
Description of problem:
My thinkpad X220T used to run on about 11 watts of battery power when idle, using kernel 3.5.6-1.fc17.x86_64; after upgrading to 3.6.1-1.fc17.x86_64 it's running hot, the fan is at full speed, and powertop reports 23 watts of battery consumption; battery life went down from 5 hours to 2 hours.
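
As a rough sanity check on the figures above: if an idle draw of about 11 W gave about 5 hours, the usable battery capacity is roughly 55 Wh, and the same capacity at 23 W would last a little under 2.5 hours, which is in line with the reported drop to about 2 hours. A minimal sketch of that arithmetic, assuming a constant power draw (real batteries only approximate this):

```python
def runtime_hours(capacity_wh: float, draw_watts: float) -> float:
    """Estimated runtime at a constant power draw (a simplifying assumption)."""
    if draw_watts <= 0:
        raise ValueError("draw_watts must be positive")
    return capacity_wh / draw_watts


# Capacity implied by the report: about 11 W sustained for about 5 hours.
implied_capacity_wh = 11 * 5

print(f"implied capacity: {implied_capacity_wh} Wh")
print(f"runtime at 23 W: {runtime_hours(implied_capacity_wh, 23):.1f} h")
```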

top reports no processes using any significant amount of CPU, so I'm blaming the kernel.

(I'm using pcie_aspm=force i915.i915_enable_rc6=1 kernel options; not sure if these are still needed to avoid spurious power consumption).


Version-Release number of selected component (if applicable):
kernel 3.6.1-1.fc17.x86_64


How reproducible:
unsure: doesn't happen right at boot, but seems to happen after an hour of uptime and moderate work (thunderbird, text editor, ...).  It could be that the issue is with reverting to a low consumption CPU state, rather than right at boot.

Given the lack of precise directions for reproducing, I don't expect this bug report to be actionable as is, but I'm hoping others with a similar issue can help provide a clearer picture of the problem and this will eventually lead to a fix. (Or, if the problem is just with my config, that the lack of other corroborating bug reports will indicate so).
Comment 1 Joseph D. Wagner 2012-10-15 02:13:00 EDT
Can you confirm that the old behavior is restored simply by booting into a previous version of the kernel?

See here if you need previous versions.
http://koji.fedoraproject.org/koji/packageinfo?packageID=8
Comment 2 Josh Boyer 2012-10-15 06:13:26 EDT
I've seen something like this on my X220 as well, but only after resuming from suspend.  A cold power on runs fine.  Powertop reports about 22-29 watts when the issue hits.

Matthew, are you seeing anything like this on the X220?
Comment 3 Denis Auroux 2012-10-15 10:14:40 EDT
Joseph: booting back into 3.5.6-1 fixed things indeed. 

However, today I booted again into 3.6.1-1 (and suspended + resumed, Josh is right that I did this before the problem hit me the 2-3 times I encountered it) and am not encountering the issue anymore.

So it's unclear to me if this bug is reproducible enough to be understood and addressed.

Denis
Comment 4 Denis Auroux 2012-10-15 20:36:51 EDT
Issue is back now (after two suspend-resumes under kernel 3.6.1) -- 26 watts discharge rate. So there is certainly an issue somewhere, but since it's hard to figure out exactly what triggers the problem, I am not sure how you would go about figuring out how to fix it.

powertop 2.1 and /proc/cpuinfo report the CPU cores are mostly in C7 state, and mostly at 800 MHz, which seems to be normal as far as I can tell; so it's presumably not an issue with cpu frequency. Powertop reports an estimated 12 watts used by the laptop fan, which might or might not be correct, I'd imagine some of it comes from the CPU (it certainly feels like a lot of hot air being blown out, not just a laptop fan running crazy on its own); none of the other devices is using an abnormal amount of power according to powertop.

Denis
Comment 5 Joseph D. Wagner 2012-10-15 22:31:03 EDT
Since you provided specific steps to reproduce (resume from suspend) and two very close kernel versions, it should be a matter of figuring out 1) what code is called when that event happens, and 2) how that code changed between the two versions, especially in a way that might affect power.

Unfortunately, I can do neither of those.
Comment 6 Andrew Hutchings 2012-10-16 10:42:35 EDT
Can confirm this happens in my x220 too.  I believe this is something in the GPU.  After resume powertop shows the GPU in 100% Active state, not hitting RC6 at all.
Comment 7 Andrew Hutchings 2012-10-16 10:47:25 EDT
Also reproduced with the 3.6.0-3.fc18 kernel
Comment 8 Andrew Hutchings 2012-10-16 10:54:58 EDT
3.5.5-2.fc17 is good though, RC6 works after resume and power drain is at expected levels.  Please let me know if you want me to test any other versions.
Comment 9 Andrew Hutchings 2012-10-16 11:26:25 EDT
Sorry, just realised I missed out this info: by "reproduce" I mean a simple suspend/resume triggers it for me.
Comment 10 Andrew Hutchings 2012-10-17 15:29:23 EDT
This is the same bug in kernel bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=48791

I also believe this is related and has a bisect of the patch causing the problem:
https://bugzilla.kernel.org/show_bug.cgi?id=48721
Comment 11 Andrew Hutchings 2012-10-21 09:57:15 EDT
Also: https://bugs.freedesktop.org/show_bug.cgi?id=54089
Comment 12 Kahlil Hodgson 2012-11-07 16:24:53 EST
Been having same problem with my Lenovo X220.

kernel 3.6.3-1 makes matters even worse,
kernel 3.6.5-1 seems to resolve the problem for me for sleep/wake cycles,
but plug/unplug cycles are still a problem.
A subsequent sleep/wake cycle fixes the immediate problem, which is a lot better than a reboot.

K
Comment 13 Denis Auroux 2012-11-10 12:12:39 EST
I still have the problem with kernel 3.6.5-1 (and a sleep/wake cycle does not fix the problem when it occurs).

Denis
Comment 14 Robert Keersse 2012-11-11 13:43:14 EST
Same problem with kernel 3.6.6-1.fc17 on T420 i7-2620m 

Robert
Comment 15 Kahlil Hodgson 2012-11-11 14:10:02 EST
(In reply to comment #13)
> I still have the problem with kernel 3.6.5-1 (and a sleep/wake cycle does
> not fix the problem when it occurs).

Seems I was way too quick with the sleep/wake cycle trick.  Only occasionally works.
Comment 16 Kahlil Hodgson 2012-11-11 14:20:53 EST
The last time this happened, I checked powertop and noticed that my libvirtd network device (nic:virbr0) was consuming 16.5 W.  Very odd, given that I had no guests running, and haven't for some time.  I disabled libvirtd and rebooted,
but when I ran powertop I got similar power consumption for em1, even though I was running wireless.
Comment 17 John Poelstra 2012-11-13 14:36:36 EST
Experiencing this problem on Lenovo x220 4291-CL9

Let me know if I can help troubleshoot or test a fix
Comment 18 Felipe Aranda G 2012-11-14 12:34:17 EST
I am experiencing this problem on Lenovo x220 4291SWW too.

Kernel 3.6.6-1.fc17.x86_64

Felipe
Comment 19 mswal28462 2012-11-14 23:57:31 EST
Looks like there is a fix in process:  https://bugs.freedesktop.org/show_bug.cgi?id=54089#c41
Comment 20 Georg Sauthoff 2012-11-16 17:10:33 EST
I observe the same issue on a Thinkpad x220 (with kernel 3.6.6-1.fc17.x86_64).

With a 3.5.x (probably 3.5.4) kernel (and previous ones) the ondemand power management was quite perfect (long runtime on battery, low fan activity etc.). 

Powertop reports mostly > 90 % on C7 but all the time 100 % Active GPU.
Comment 21 dekellum 2012-11-20 14:53:09 EST
Same issue in T520, SandyBridge, Intel graphics for 3.6 series, including new test kernel-3.6.7-2.fc17.x86_64
Comment 22 Paul W. Frields 2012-11-27 05:59:50 EST
Not sure whether this needed to be stated outright, but for the benefit of users who aren't on the pre-release yet, this "i915 not returning to rc6" bug also exists in Fedora 18 (kernel 3.6.7-5.fc18.x86_64).  If needed, I can file separately for that release.  My x220 Sandybridge laptop is now on F18 and I'm willing to help test as needed.
Comment 23 Josh Boyer 2012-11-27 09:14:31 EST
(In reply to comment #22)
> Not sure whether this needed to be stated outright, but for the benefit of
> users who aren't on the pre-release yet, this "i915 not returning to rc6"
> bug also exists in Fedora 18 (kernel 3.6.7-5.fc18.x86_64).  If needed, I can
> file separately for that release.  My x220 Sandybridge laptop is now on F18
> and I'm willing to help test as needed.

There's no need to file a separate bug.

At the moment there's nothing really to test.  The "fix" queued up in the drm-next tree (headed for the 3.8 kernel) isn't an explicit fix for this, it just happens to solve it as a side-effect of something else.  It depends on other code being reworked, so that isn't a particularly good candidate for backporting.  There is mention of another fix being worked on in the upstream bug, but I have no idea what that is.  I'll try chasing it down today.
Comment 24 mswal28462 2012-11-27 09:21:30 EST
Thanks, Josh.  This is becoming a major bug for me.  I can not shut down everyday because the kernel won't handle suspend, forcing me to close all work in progress.  This is getting to be a significant issue for me ... and one that doesn't sound like it will be fixed in Fedora 17 .. even 18.
Comment 25 Gustav 2012-11-27 10:00:30 EST
Same issue for me with Thinkpad X220 and 3.6.7-4.fc17.x86_64.
My workaround is to hibernate instead of sleep/suspend.
Comment 26 fedora 2012-11-27 10:21:25 EST
Downgrading to 3.5.6-1.fc17.x86_64 worked for me. I will stay at this version until the bug gets fixed, as this really is a show stopper for me.
Comment 27 Josh Boyer 2012-11-27 12:13:46 EST
This is the other fix mentioned in the upstream bug report:

http://thread.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/16196/focus=16253

It fixes the RC6 issue, but it seems to cause performance issues on some machines and might not be restoring all the proper bits of state it needs to.  We'll keep an eye on it.
Comment 28 Gustav 2012-11-27 13:29:51 EST
Sorry, hibernation is not a workaround. Same problem after resuming from hibernation.
Comment 29 mswal28462 2012-12-10 07:12:22 EST
So it's been a couple of weeks since I've heard anything on this bug.  What is the outlook on getting this fixed?
Comment 30 Kahlil Hodgson 2012-12-10 17:31:34 EST
A little concerned here.  When the issue occurs on my x220 it gets very hot.
Heading into summer in Australia here and not sure how it's going to behave if
the issue occurs and I accidentally leave the laptop plugged in and the room temperature reaches 40 degrees C.  Is it going to toast my laptop?  Is it going to start a fire?
Comment 31 Josh Stone 2012-12-10 17:44:48 EST
FWIW, on my x220 I found I can cat /sys/class/drm/card0/power/rc6_residency_ms a
few times as a check, as the problem is intermittent.  If that number is climbing
quickly, then it's ok, but if it stays fixed then the driver is apparently in a
bad state.  (This is also how powertop will check for %-active GPU.)

I find usually one or two suspend/resume cycles will get it back in the right
state.  Annoying, but at least that's something of a workaround.
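
The check described above can be sketched as a small script: read the counter twice and see how much of the interval was spent in RC6. This is only an illustration. The sysfs path is the one quoted in this comment (the card index may differ on other machines), and the 5% threshold is an arbitrary assumption, not a value taken from the driver or from powertop.

```python
import time
from pathlib import Path

# Path quoted in the comment above; "card0" may differ on other machines.
RC6_COUNTER = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def read_counter(path: Path) -> int:
    """Read a sysfs residency counter (milliseconds spent in the state)."""
    return int(path.read_text().strip())


def rc6_residency_fraction(before_ms: int, after_ms: int, elapsed_s: float) -> float:
    """Fraction of the sampling window that the GPU spent in RC6."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (after_ms - before_ms) / (elapsed_s * 1000.0)


def gpu_reaches_rc6(path: Path = RC6_COUNTER, interval_s: float = 2.0,
                    min_fraction: float = 0.05) -> bool:
    """Sample the counter twice; a counter that barely moves suggests the bad state.

    min_fraction is an illustrative threshold chosen for this sketch.
    """
    before = read_counter(path)
    start = time.monotonic()
    time.sleep(interval_s)
    after = read_counter(path)
    elapsed = time.monotonic() - start
    return rc6_residency_fraction(before, after, elapsed) >= min_fraction
```

Calling gpu_reaches_rc6() on an idle desktop shortly after a resume should return True on a healthy system and False when the counter is stuck.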
Comment 32 Denis Auroux 2012-12-10 19:57:34 EST
I keep hoping things will get better with new kernel releases, but to no avail so far. From what I understand of the upstream discussion, the fix won't go in until 3.8... so, in the meantime, I shall stay with 3.5.6. The 3.6.x series is just completely unusable (I can't live without suspend, and I can't live without decent battery life). I'm quite disappointed that kernel releases can be labelled "stable" (twice: once by kernel developers, and a second time by Fedora testers) with such major regressions -- it broke some of my illusions about the thoroughness of the pre-release testing process.

Denis
Comment 33 mswal28462 2012-12-10 20:02:31 EST
I agree with you, Denis ... this is major functionality that broke.  Why wasn't the offending code backed out and reworked?
Comment 34 dekellum 2012-12-10 21:10:20 EST
The severity of the issue (for me on T520) and delayed resolution is disconcerting. This is compounded by the fact that the best workaround of the available 3.5 kernel is End-Of-Life, with its last patch update about 2 months ago.  

I just downloaded the last Fedora 17 kernel 3.4.6 src rpm, updated it to 3.4.23 (not EOL, latest patch release from today), built and I'm running that now.  

Hindsight is 20-20 of course, but perhaps it was a bit aggressive for even Fedora 16 to have been upgraded to kernel 3.6?  Would be nice if we had a more conservative but maintained kernel line (like 3.4) to fall back on when these sort of problems occur.
Comment 35 John Poelstra 2012-12-10 22:30:01 EST
My workaround was to revert to the Fedora 17 GA kernel and stay with it.  The other workaround for the broken kernel is to not suspend or resume and to cold boot each time.
Comment 36 Paul W. Frields 2012-12-11 10:03:24 EST
Actually, I found that adding 'i915.i915_enable_rc6=7' to my kernel boot line in /boot/grub2/grub.cfg takes care of the problem most of the time. If I resume and find the unit heating up (checking either with powertop or as in comment 31 above), I can suspend and resume again and almost invariably the problem is gone on the next resume.  It's an annoyance but certainly not a showstopper.
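
To confirm that the option actually reached the running kernel, one can look for it in /proc/cmdline. A minimal sketch; the helper names are hypothetical, and returning the last occurrence when a parameter is repeated is an assumption made for this example.

```python
from pathlib import Path
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Return the value of `name` from a kernel command line, or None if absent.

    If the parameter appears more than once, the last occurrence is returned
    (an assumption made for this sketch).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name and sep:
            value = val
    return value


def running_kernel_param(name: str, path: str = "/proc/cmdline") -> Optional[str]:
    """Look up `name` in the command line of the currently running kernel."""
    return kernel_param(Path(path).read_text(), name)
```

For example, running_kernel_param("i915.i915_enable_rc6") would return "7" after booting with the option above, or None if the option was not picked up.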
Comment 37 Paul W. Frields 2012-12-11 10:10:07 EST
Also, I should point out that the "=7" above allows GPUs to use all available power states, including rc6pp which is deepest sleep/lowest power on my system. So in that case I would cat /sys/class/drm/card0/power/rc6pp_residency_ms instead.  If you use "=1" (rc6 only) then stick with comment 31.
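
The relationship between the option value and the counter to watch can be sketched as follows. The values 1 (rc6 only) and 7 (everything, including rc6pp) come from this comment; treating the option as a bitmask with 2 meaning the intermediate "rc6p" state, and the rc6p_residency_ms file name, are assumptions to verify against `modinfo i915` and the sysfs directory on your own kernel.

```python
# Assumed bit meanings for i915_enable_rc6: 1 = rc6, 2 = rc6p, 4 = rc6pp.
RC6_STATES = [(1, "rc6"), (2, "rc6p"), (4, "rc6pp")]

# Directory quoted in the comments above; "card0" may differ on other machines.
SYSFS_DIR = "/sys/class/drm/card0/power"


def enabled_states(mask: int) -> list:
    """Names of the RC6 states that a given option value enables."""
    return [name for bit, name in RC6_STATES if mask & bit]


def residency_files(mask: int) -> list:
    """Residency counters worth watching for a given option value."""
    return [f"{SYSFS_DIR}/{name}_residency_ms" for name in enabled_states(mask)]


print(residency_files(7))
```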
Comment 38 Kahlil Hodgson 2012-12-18 01:59:31 EST
The situation seems to be getting worse for me with the newer kernels.  Reboot does not always solve the problem (only works half the time).  Just did more than 10 sleep/wake cycles before giving up and rebooting.
Comment 39 mswal28462 2012-12-18 08:27:18 EST
Some observations:

1)  i915.i915_enable_rc6=7 option did not help me. It worked about as often as without

2)  Starting with 3.6.9 I started, periodically, experiencing the ole heating up with a fresh reboot.  Not good!
Comment 40 Andrew Hutchings 2012-12-18 08:34:34 EST
I experienced graphics corruptions with i915.i915_enable_rc6=7.  I also occasionally have this happen on cold boots so I check powertop straight after I log in on every boot.
Comment 41 Kahlil Hodgson 2013-01-15 00:23:15 EST
Is anyone aware of any alternatives to a sequence of sleep/wake cycles to fix the issue?   Sometimes I get lucky and one or two cycles gets me there, but more than 5 is more common and I've had situations where it's taken more than 15 cycles. This is beginning to seriously impact my productivity.  I've been forced to disable screen lock so I can run through the cycles more quickly (obviously not good from a security perspective).
Comment 42 Petr Penicka 2013-01-16 09:39:41 EST
(In reply to comment #41)
> Is anyone aware of any alternatives to a sequence of sleep/wake cycles to
> fix the issue?

As mentioned in comments above, downgrading to a previous kernel version works around the issue. I've been running my X220 with 3.5.6-1.fc17 since December and avoiding any kernel updates. Works fine so far, but a fix in 3.6 would be much appreciated.
Comment 43 Andrew Hutchings 2013-01-16 10:48:42 EST
Don't try this at home kids, but I have put the Fedora 19 3.8.0 rc3 kernel into my Fedora 18 installation and judging by a quick test earlier it seems to have solved the problem for me.
Comment 44 Paul W. Frields 2013-01-16 16:24:27 EST
You should not expect a backport to 3.6 kernel -- the underlying subsystem changes are too drastic to do this according to the kernel devs I asked about it.  I was told the real fix was destined for 3.8, and comment 43 seems to support this.
Comment 45 Denis Auroux 2013-01-16 17:46:51 EST
Any chance then that the 3.8 kernel will make it into Fedora 17 once it is ready for public consumption? It is not reasonable to leave Fedora 17 users dangling with such a buggy kernel.
Comment 46 Kahlil Hodgson 2013-01-16 18:30:00 EST
(In reply to comment #44)
> You should not expect a backport to 3.6 kernel -- the underlying subsystem
> changes are too drastic to do this according to the kernel devs I asked
> about it.  I was told the real fix was destined for 3.8, and comment 43
> seems to support this.

Fedora 19 (with the 3.8 kernel) is still at least 4 months away. Any chance that 3.8 will hit Fedora 18 or Fedora 17 before then?
Comment 47 Josh Boyer 2013-01-16 19:34:41 EST
(In reply to comment #46)
> (In reply to comment #44)
> > You should not expect a backport to 3.6 kernel -- the underlying subsystem
> > changes are too drastic to do this according to the kernel devs I asked
> > about it.  I was told the real fix was destined for 3.8, and comment 43
> > seems to support this.
> 
> Fedora 19 (with the 3.8 kernel) is still at least 4 months away. Any chance
> that 3.8 will hit Fedora 18 or Fedora 17 before then?

3.8 will be brought back to F17 and F18 likely around the time it hits 3.8.1.  It's in rawhide now at 3.8-rc3 for those that wish to use that kernel.
Comment 48 mswal28462 2013-01-19 12:56:27 EST
I, like several others, have downloaded the 3.8.0 rc3 kernel from http://www.kernel.org/, compiled it using these directions  http://www.howopensource.com/2011/08/how-to-install-compile-linux-kernel-3-0-in-fedora-15-and-14/ (making sure to replace the kernel name) and it is working fine for me .... so great to have suspend/resume working again!
Comment 49 dekellum 2013-01-19 13:10:25 EST
FWIW: On T520, F17 I'm also seeing stability with my own build of kernel 3.4.26 released two days ago (the older kernel line that is not EOL). I've been having better luck with this kernel line since originally building 3.4.23. It would be nice if Fedora made available an older/stable non-EOL kernel line like this.
Comment 50 Kahlil Hodgson 2013-01-22 17:21:43 EST
> 3.8 will be brought back to F17 and F18 likely around the time it hits
> 3.8.1.  It's in rawhide now at 3.8-rc3 for those that wish to use that
> kernel.

Thanks Josh.  I've been running the rawhide kernel on F17 for 5 days now and it seems reasonably stable.  Suspend is no longer triggering the bug.  No more concerns about overheating.  Power consumption is better, but still not great. In the 3.6 kernel (without the bug being triggered) I could get a prediction of more than 9 hours from powertop.  With the 3.8 kernel I struggle to get more than 5 hours.
Comment 51 Kahlil Hodgson 2013-01-22 17:27:41 EST
(In reply to comment #48)
> I, like several others, have downloaded the 3.8.0 rc3 kernel from
> http://www.kernel.org/, compiled it using these directions
> http://www.howopensource.com/2011/08/how-to-install-compile-linux-kernel-3-0-in-fedora-15-and-14/
> (making sure to replace the kernel name) and it is
> working fine for me .... so great to have suspend/resume working again!

You might want to consider using rawhide:

    yum install fedora-release-rawhide
    yum update kernel --enablerepo=rawhide
Comment 52 Justin M. Forbes 2013-01-29 14:13:14 EST
Or, if you can't stand the debug being turned on, you can always use rawhide-nodebug:

https://fedoraproject.org/wiki/RawhideKernelNodebug
Comment 53 Andrew Hutchings 2013-02-04 05:08:03 EST
Finally managed to trigger this again today when undocking my X220 with F18 and 3.8.0-0.rc6.git0.2.fc19.x86_64 kernel :(

I normally have it docked closed with an HP ZR30w hooked to the dock DisplayPort (gnome-tweak-tool used to disable suspend on lid close, I prefer to manually suspend).  I'm guessing the switch of displays did it.

On a positive note a reboot fixed it, it wouldn't have on previous kernels.
Comment 54 Kahlil Hodgson 2013-02-04 18:30:52 EST
(In reply to comment #52)
> Or, if you can't stand the debug being turned on, you can always use
> rawhide-nodebug:
> 
> https://fedoraproject.org/wiki/RawhideKernelNodebug

Excellent tip. Running that now and getting much better performance.
Comment 55 dekellum 2013-02-11 13:21:19 EST
FWIW: On T520/Sandybridge/i915 with F17, I've finally achieved complete i915 graphics stability including no further occurrence of this particular power problem as of kernel 3.4.28.  The latest of that line, 3.4.30 includes additional i915 fixes. This is a first in terms of complete stability of this laptop over last 10 months since purchasing and installing F16.

Looks like I'll be waiting for 3.8.x to settle down before even considering an update to F18.

I would encourage anyone with similar hardware or experiencing this problem to build and try the latest 3.4.x kernel.
Comment 56 Mengxuan Xia 2013-02-13 14:34:40 EST
I tried dekellum's suggestion to build 3.4.30. On E430/Sandybridge/i915, the bug is still present on 3.4.30. I haven't tried 3.4.28. Also, on 3.4.30 there's a problem with RTL8188CE wifi after resuming from suspend.
Comment 57 dekellum 2013-02-13 21:39:29 EST
Xia, either there are multiple sub-classes of this power issue or 3.4.30 back ports the regression? On 3.4.29 as of today and haven't yet seen any issue with multiple suspend/resume cycles. I'm using Intel WiMAX 6250 (iwlwifi), which isn't an issue in my case.
Comment 58 Mark van Rossum 2013-02-26 16:46:21 EST
Hi 

I think this is the same problem:
https://bbs.archlinux.org/viewtopic.php?pid=1176363#p1176363
Basically, the CPU does not scale back to lower freqs.

Here is a temporary workaround that I use on my Lenovo X220:
you can disable SpeedStep in the BIOS altogether; this
will lock you in at the lowest CPU speed.

I also tried the latest kernel from 
fedora-rawhide-kernel-nodebug.repo
but got weird USB suspend problems (continuous suspending)
Comment 59 dekellum 2013-02-26 23:25:37 EST
I agree with Mark R. that the archlinux thread is the same issue (I've been following that one as well for a while.)

With T520/Sandybridge/i915 on F17, I continue to be stable (~12 suspend/resume cycles) on the latest 3.4.33 kernel:

Linux retro 3.4.33-1.dek.fc17.x86_64 #1 SMP Sun Feb 24 09:02:32 PST 2013 x86_64 x86_64 x86_64 GNU/Linux

I think I also saw some RTL wifi patches in the change log between 3.4.31 and 3.4.33, if that's an issue for you.

Ultimately I am sure Kernel 3.8+ is the way forward. I'm personally going to wait for release availability of such a Kernel on F18 and some more positive signs of stability before attempting the Fedora+Kernel update.
Comment 60 Kahlil Hodgson 2013-02-26 23:42:20 EST
I've been running 3.8.0 for awhile now, tracking updates via rawhide-nodebug. 
Currently on 3.8.0-0.rc7.git4.2.fc19.x86_64.  Has been reasonably stable. Still triggers the bug occasionally, but a single suspend/resume cycle seems to fix it.

Rawhide has now moved on to 3.9 kernels.  I'm nowhere near brave enough to try that out.  Hopefully we'll get 3.8.1 in F18 updates soon, but it's still not a real solution -- as I understand it, the code has just been shuffled, making the bug less likely to be triggered.
Comment 61 Andrew Hutchings 2013-02-27 02:02:02 EST
I have been running 3.8.0 for a while too, now on rawhide's 3.9.  Seeing the same thing as Kahlil.  Also undocking my X220 which has a DisplayPort monitor in the dock crashes the driver 50% of the time.  Still having that problem in 3.9 (also had Firefox crash the kernel in 3.9 which was interesting :)

Rolling back to 3.4/3.5 may be the answer for me.
Comment 62 Mark van Rossum 2013-02-27 07:34:39 EST
This is weird! Even with speedstep disabled, I have now occasionally seen 
high CPU speeds after resume.

(at least that is what i7z reported, and battery life suggested)

Bios version 8DET67WW (1.37 ) on X220
Comment 63 mswal28462 2013-02-27 08:03:06 EST
I HIGHLY recommend downloading Kernel 3.8.0 from http://www.kernel.org/ and compiling it using these directions:  http://www.howopensource.com/2011/08/how-to-install-compile-linux-kernel-3-0-in-fedora-15-and-14/ (making sure to replace the kernel name).  It really isn't difficult if you just follow the directions.  I've been running with 3.8.0 rc3 kernel that I downloaded and compiled for weeks without any issues on Fedora 17 and now Fedora 18 ... and Suspend is working fine.
Comment 64 fedora 2013-02-27 08:26:45 EST
Whatever - we definitely need an official kernel for F17 that fixes the issues. Especially in the light of security vulnerabilities/issues that have accumulated since the original FC17 kernel. I am not going back to rolling my own kernel on a machine I need for work.

By now I am really amazed at how such a regression does not get fixed over 5 or 6 major releases, given the "no regression" mantra. Shouldn't Intel be interested in fixing this? They also have all the information about the hardware.
Comment 65 Kahlil Hodgson 2013-03-04 07:29:57 EST
Just installed 3.8.1-201 for F18.  For the first time in many months, power consumption is back below 7W when my X220 is idle. Hopefully this is the end of the madness. :-)
Comment 66 Mengxuan Xia 2013-03-06 13:44:41 EST
Could anyone who built generic 3.4 kernel share their .config file with me?
Comment 67 dekellum 2013-03-15 13:47:52 EDT
=== 3.4 kernels ===

I'm now on Kernel 3.4.36, without issue thus far.

I'm building these by modifying the last kernel-3.4.6-2.fc17.src.rpm, available here or yum:

http://kojipkgs.fedoraproject.org//packages/kernel/3.4.6/2.fc17/src/kernel-3.4.6-2.fc17.src.rpm

And following instructions here:

http://fedoraproject.org/wiki/Building_a_custom_kernel

There are a bunch of fedora patches that need to be commented out in the spec given changes since 3.4.6.

=== 3.8 kernels ===

3.8.3 kernels are now available in testing for both F17 and F18

https://admin.fedoraproject.org/updates/search/kernel

There are some hopeful i915 changes:

https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.8.3

I'm hoping someone already committed to 3.8 with Sandybridge/i915 will test these and report stability over several suspend/resume cycles?
Comment 68 dekellum 2013-03-15 14:14:21 EDT
See also: https://bugzilla.redhat.com/show_bug.cgi?id=911986 which suggests that 3.8.1 at least, is not a complete fix.
Comment 69 Mengxuan Xia 2013-03-15 23:10:17 EDT
dekellum, would you be able and willing to share your srpm? I'd like to test on 3.4.36. With Linux 3.7.9-104.fc17.x86_64 on F17, this bug is less likely to get triggered when resuming from suspend.
Comment 70 dekellum 2013-03-16 14:29:00 EDT
For 3.4.36, here is the only custom part of minor interest:

kernel.spec: https://gist.github.com/dekellum/5177655

Otherwise, follow the instructions referenced above.
Comment 71 dekellum 2013-03-17 20:57:43 EDT
Bug 922304 isn't encouraging for Kernel 3.8.3.
Comment 72 Adam Williamson 2013-04-19 17:02:40 EDT
Both F17 and F18 are now on kernel 3.8: kernel-3.8.8-202.fc18 and kernel-3.8.4-102.fc17 are the current stable builds. Can someone please comment on the status of this bug with those kernels? Thanks.
Comment 73 John Poelstra 2013-04-19 17:06:26 EDT
I'm on Fedora 17 with 3.8.4-102.fc17.x86_64 on Lenovo x220.  I don't believe I'm seeing the original problem any more, but I'm not 100% sure.
Comment 74 Denis Auroux 2013-04-19 17:20:07 EDT
I'm also on 3.8.4-102.fc17.x86_64 on Fedora 17, and don't seem to be having the original problem anymore -- on the other hand my system now often crashes on resume, so I can't tell if that's an improvement...

Denis
Comment 75 Björn Ruberg 2013-04-19 17:25:44 EDT
Ditto on kernel 3.8.7-201.fc18.x86_64. Many crashes after resume.

I'm quite sure that I've seen extremely raised power consumption several times on kernel 3.8.6. The good thing is that it went away after another sleep and resume. I can't remember having seen the problem on kernel 3.8.7, but that may be purely random. And indeed, as the system often crashes at resume, there are far fewer resumes than before at which the problem might occur.
Comment 76 dekellum 2013-04-19 17:33:00 EDT
T520/Sandybridge/i915 and still stable on 3.4.41-1.dek.fc17.x86_64

Still waiting for more encouraging signs of stability on 3.8.x or 3.9 before re-joining the flock.
Comment 77 Jean-François Fortin Tam 2013-04-19 17:57:21 EDT
For the infamous Intel Sandybridge GPU power consumption issue (I'm not aware of other significant power management issues), this is not fixed with any of the latest kernels and it's been going on since the 3.6 kernel (last known good kernels were in the 3.5 series).


As hinted in upstream https://bugs.freedesktop.org/show_bug.cgi?id=54089 the kernel 3.8 does not fix the issue at all, at least not with Sandybridge Intel chipsets. It just makes the conditions to trigger it less deterministic, more random.

I've been testing *all* the kernels in the 3.8 series on Fedora 18 (including 3.8.3, .4, .5, .6, .7, etc.), and the issue is still hitting me on a regular basis. As I understand it, it's a race condition so you just need the right amount of chaos in the system to trigger it.

The way I monitor the problem is by having powertop running at all times and watching the GPU section of its 2nd tab ("Idle stats"). Randomly, on resume my GPU will stay at 100% "powered on" instead of going into the RC6 state. Another easy indicator is that your computer temperature and fan speed go crazy.

When it happens, I need to re-suspend and re-resume, sometimes 3 or 4 times before the GPU will finally accept going into RC6 state while idling on the desktop.

This is hell.
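
The re-suspend/re-resume routine described above can be sketched as a retry loop. Both callables are hypothetical placeholders supplied by the caller: gpu_in_rc6 would wrap whatever check you trust (powertop, or a sysfs residency counter), and suspend_and_resume would wrap however you suspend the machine. Neither is provided by any tool mentioned in this bug, and the cap of 5 cycles is arbitrary.

```python
from typing import Callable


def cycle_until_rc6(gpu_in_rc6: Callable[[], bool],
                    suspend_and_resume: Callable[[], None],
                    max_cycles: int = 5) -> int:
    """Repeat suspend/resume until the RC6 check passes.

    Returns the number of extra cycles performed, or -1 if the GPU was
    still stuck after max_cycles attempts.
    """
    if gpu_in_rc6():
        return 0
    for cycle in range(1, max_cycles + 1):
        suspend_and_resume()
        if gpu_in_rc6():
            return cycle
    return -1
```

Since this is a workaround for what is described as a race condition, the loop gives no guarantee of success; it only automates the manual retry.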
Comment 78 João Gomes 2013-04-19 18:05:45 EDT
What is described in comment 77 is exactly what happens to me.
I have been trying several versions of 3.8 until 3.8.7.
Tomorrow, I'll try version 3.8.8.
Comment 79 Adam Williamson 2013-04-19 18:16:34 EDT
Thanks, folks. Just so no-one gets their hopes up too far - I'm just the QA monkey, working on the Common Bugs page. I'm not a developer able to actually help fix the issue, I was just updating the commonbugs entry for this issue. I assume the kernel devs are still working to fix this, but I don't have any specific info/advice on it. Thanks again for the feedback.
Comment 80 João Gomes 2013-04-20 07:08:00 EDT
The problem is still present in kernel 3.8.8.
I also noticed that if I turn the laptop on and take some time to log in, it enters the same condition, with the GPU not going to rc6.
Comment 81 Kahlil Hodgson 2013-05-13 18:17:56 EDT
Does anyone know if this power issue affects systems with Ivy Bridge CPUs?
Hoping this issue will go away with a system upgrade.
Comment 82 Mark van Rossum 2013-06-19 06:40:18 EDT
Just a few comments:

- Still there in FC19, kernel 3.9.5-301.fc19.x86_64.

- I find i7z the most reliable tool to see if the bug occurs.

- Does anyone here NOT have a Lenovo?
Comment 84 fedora 2013-06-19 07:09:18 EDT
I have given up hope to ever be able to upgrade the kernel. I wonder why such a regression can be ignored for so long. Maybe we should send a Lenovo to Linus, so this bug gets his attention. I guess the bug would be fixed within days then...
Comment 85 Adam Williamson 2013-06-19 11:32:54 EDT
Well, it's not being ignored. Kernel work mostly happens upstream, so it seems sensible to follow the upstream bug, which was linked early on:

https://bugzilla.kernel.org/show_bug.cgi?id=48791

that has developer activity circa 05-06 - "Can you please all test whether https://patchwork.kernel.org/patch/2481431/ improves the rc6 behaviour?"

The answer seems to be 'no', but at least he's trying.
Comment 86 dekellum 2013-06-19 12:46:29 EDT
I'm still on Fedora 17 and my own kernel:

Linux retro 3.4.48-1.dek.fc17.x86_64 #1 SMP Mon Jun 10 18:34:02 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux

...which continues to be stable for me.  Would be nice if I could make it to kernel 3.10/Fedora 19 some day!

Does Lenovo just have the largest market share of Intel Graphics 3000/Sandy Bridge (SNB)-only laptops running Linux?  Has anyone reproduced this on Intel Graphics 4000/Ivy Bridge? Not that I've found.  Are all the kernel/drm/i915 developers using Ivy Bridge and not seeing this?
Comment 87 Adam Williamson 2013-06-19 13:02:47 EDT
Someone on the upstream bug has a Samsung, so it's apparently not Lenovo-specific.
Comment 88 Fedora End Of Life 2013-07-03 19:32:11 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 89 dekellum 2013-07-17 14:43:46 EDT
Sounds like they finally found a fix:

commit 7dcd2677ea912573d9ed4bcd629b0023b2d11505
Author: Konstantin Khlebnikov <khlebnikov@openvz.org>
Date:   Wed Jul 17 10:22:58 2013 +0400

    drm/i915: fix long-standing SNB regression in power consumption after resume

http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-fixes&id=7dcd2677ea912573d9ed4bcd629b0023b2d11505

Would love to know what kernels this will land in and when...
Comment 90 Joseph D. Wagner 2013-07-17 16:04:41 EDT
Also, if the patch cited in comment #89 resolves the issue, could it be backported?
Comment 91 Josh Boyer 2013-07-22 08:25:44 EDT
(In reply to dekellum from comment #89)
> Sounds like they finally found a fix:
> 
> commit 7dcd2677ea912573d9ed4bcd629b0023b2d11505
> Author: Konstantin Khlebnikov <khlebnikov@openvz.org>
> Date:   Wed Jul 17 10:22:58 2013 +0400
> 
>     drm/i915: fix long-standing SNB regression in power consumption after resume
> 
> http://cgit.freedesktop.org/~danvet/drm-intel/commit/?h=drm-intel-fixes&id=7dcd2677ea912573d9ed4bcd629b0023b2d11505
> 
> Would love to know what kernels this will land in and when...

Based on the branch name, probably 3.11.  If not, 3.12.  It's CC'd to stable, so it should make its way to 3.10.y eventually.
Comment 92 Paul W. Frields 2013-08-03 08:52:19 EDT
I imagine some people would like a simple desktop notification that lets them know when this bug has popped up, before the machine heats up and burns their legs. I will attach two files: a shell script "gpu-check.sh" which should live in /lib/systemd/system-sleep (needs to be executable, chmod +x), and a notifier that needs to run locally. You can put the notifier in $HOME/bin and use gnome-session-properties to add it to your startup programs. This is designed only for GNOME, and you need pygobject3 and dbus-python installed, but most systems should have this already (required by firewalld and other things). If you run into problems, PLEASE DO NOT REPORT THEM IN THIS BUG. Email me privately and I will help as much as I can. Caveat emptor of course.
Comment 93 Paul W. Frields 2013-08-03 09:12:36 EDT
Having thought better of it, I uploaded the scripts and instructions here: http://pfrields.fedorapeople.org/bz866212-i915-wake/
Comment 94 Josh Boyer 2013-08-03 09:17:10 EDT
FWIW, the supposed fixes for this bug should be included in the upcoming 3.10.5 release.
Comment 95 Paul W. Frields 2013-08-15 15:13:35 EDT
They are.  Note that Fedora is up to 3.10.6 on F19 so anyone on that release should hopefully be seeing this bug solved.
Comment 96 Josh Boyer 2013-08-16 09:44:55 EDT
We've had multiple reports that this issue seems resolved with the latest 3.10.y rebases.  I'm going to close this bug report out.  Thanks to all that helped, and thank you for your patience while upstream worked out the issues.
Comment 97 Kamil Páral 2013-08-20 04:52:58 EDT
I have tried 10 resumes in a row with 3.10.6 and the problem hasn't appeared. Thanks.
Comment 98 Mark van Rossum 2013-12-18 10:34:13 EST
This bug is back for me under FC20 with kernel 3.12.5-301.fc20.x86_64 #1 SMP.
Comment 99 Björn Ruberg 2013-12-28 07:15:08 EST
Confirmed, I occasionally see increased power usage on the X220 again.
Comment 100 Andrew Hutchings 2014-01-06 08:07:46 EST
Observed on my X220 too.  The GPU is spending 100% of its time in the Powered On state, with no time in any RC6 state, on that kernel.
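
For anyone who wants to check the same symptom without powertop, here is a minimal sketch of how RC6 residency could be sampled. It assumes the i915 driver exposes a cumulative counter at /sys/class/drm/card0/power/rc6_residency_ms; the card index and the attribute location are assumptions and may differ between kernels. The helper names (read_residency_ms, rc6_percent, sample) are made up for illustration and are not from any script posted in this bug.

```python
import time
from pathlib import Path

# Assumed sysfs location of the cumulative RC6 counter (milliseconds).
# The card index and attribute path may differ on your kernel.
RC6_PATH = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def read_residency_ms(path=RC6_PATH):
    """Return the cumulative RC6 residency counter in milliseconds."""
    return int(Path(path).read_text().strip())


def rc6_percent(before_ms, after_ms, elapsed_s):
    """Share of the sampling interval spent in RC6, clamped to 0..100."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    pct = 100.0 * (after_ms - before_ms) / (elapsed_s * 1000.0)
    return max(0.0, min(100.0, pct))


def sample(interval_s=5.0, path=RC6_PATH):
    """Read the counter twice and report the RC6 share in between."""
    start = time.monotonic()
    before = read_residency_ms(path)
    time.sleep(interval_s)
    after = read_residency_ms(path)
    return rc6_percent(before, after, time.monotonic() - start)


if __name__ == "__main__":
    if RC6_PATH.exists():
        print(f"RC6 residency: {sample():.1f}%")
    else:
        print(f"{RC6_PATH} not found; adjust the path for this system")
```

On an idle desktop a healthy system would be expected to show a high percentage; a value stuck near zero would match the "no RC6" behaviour described in this comment.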
Comment 101 Kamil Páral 2014-01-06 11:50:48 EST
Guys, could one of you please create a new bug report, add all the useful details there, and link it here? It might be a different problem, and this bug report is already long enough, so it would be best to report it separately. Thanks.
