Bug 532161 - Wrong ACPI temperature after suspend/resume
Summary: Wrong ACPI temperature after suspend/resume
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-10-30 23:53 UTC by Agustin Barto
Modified: 2013-04-16 18:42 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-23 16:24:08 UTC
Type: ---
Embargoed:


Attachments
Output of lspci >lspci.log 2>&1 (2.28 KB, application/octet-stream)
2009-12-03 18:36 UTC, Agustin Barto
Output of dmesg >dmesg.log 2>&1 (right after cold boot) (43.21 KB, application/octet-stream)
2009-12-03 18:37 UTC, Agustin Barto
Output of dmesg >dmesg_after.log 2>&1 (right after resume) (55.74 KB, application/octet-stream)
2009-12-03 18:38 UTC, Agustin Barto

Description Agustin Barto 2009-10-30 23:53:17 UTC
Description of problem:

With kernel 2.6.30.9-90.fc11.i686.PAE, ACPI reports 0 C for all temperature zones after a suspend/resume cycle. When this happens the cooling fans are never turned on, and the machine shuts down abruptly once temperatures reach a dangerous level.


Version-Release number of selected component (if applicable): 2.6.30.9-90.fc11.i686.PAE

With the previous stable kernel version:

[abarto@roadrunner ~]$ uname -r
2.6.30.8-64.fc11.i686.PAE
[abarto@roadrunner ~]$ sudo cat /proc/acpi/thermal_zone/TZ01/temperature
temperature:             39 C

(after suspend/resume)

[abarto@roadrunner ~]$ sudo cat /proc/acpi/thermal_zone/TZ01/temperature
temperature:             39 C

With the latest kernel version and all updates from the testing repo:

[abarto@roadrunner ~]$ uname -r
2.6.30.9-90.fc11.i686.PAE
[abarto@roadrunner ~]$ sudo cat /proc/acpi/thermal_zone/TZ01/temperature 
temperature:             39 C

(after suspend/resume)

[abarto@roadrunner ~]$ sudo cat /proc/acpi/thermal_zone/TZ01/temperature 
temperature:             0 C
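
The manual check above can be scripted. The following is a minimal sketch, assuming the procfs layout shown in the transcript (`/proc/acpi/thermal_zone/*/temperature`, one `temperature:  NN C` line per file); the function names are hypothetical and not part of any existing tool.

```python
import glob
import re

# Matches lines such as "temperature:             39 C" from the procfs file.
TEMP_LINE = re.compile(r"^temperature:\s+(-?\d+)\s+C\s*$")


def parse_temperature(text):
    """Return the temperature in degrees C found in one procfs file, or None."""
    for line in text.splitlines():
        match = TEMP_LINE.match(line)
        if match:
            return int(match.group(1))
    return None


def read_zones(pattern="/proc/acpi/thermal_zone/*/temperature"):
    """Map each thermal zone file path to its parsed temperature."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            readings[path] = parse_temperature(handle.read())
    return readings


def suspicious_zones(readings):
    """Return the zones that report exactly 0 C, the symptom in this report."""
    return [path for path, temp in readings.items() if temp == 0]
```

Running `suspicious_zones(read_zones())` once after a cold boot and once after resume would show whether any zone dropped to 0 C. Reading the files may need root, as in the `sudo cat` calls above.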

Comment 1 Agustin Barto 2009-11-06 12:11:25 UTC
Problem persists with 2.6.30.9-96.fc11.i686.PAE.

Comment 2 Agustin Barto 2009-11-19 12:28:41 UTC
The issue seems to be resolved in 2.6.31.5-127.fc12.x86_64

Comment 3 Agustin Barto 2009-11-30 23:21:52 UTC
Problem persists with 2.6.31.6-152.fc12.i686.PAE (from Koji).

Comment 4 Agustin Barto 2009-12-01 00:36:38 UTC
Problem also happens with 2.6.31.6-145.fc12.i686.PAE (latest from updates-testing).

Comment 5 Chuck Ebbert 2009-12-03 17:45:00 UTC
What kind of system (vendor and model) is this happening on?

Comment 6 Agustin Barto 2009-12-03 18:36:00 UTC
Created attachment 375870
Output of lspci >lspci.log 2>&1

Comment 7 Agustin Barto 2009-12-03 18:37:41 UTC
Created attachment 375871
Output of dmesg >dmesg.log 2>&1 (right after cold boot)

Comment 8 Agustin Barto 2009-12-03 18:38:48 UTC
Created attachment 375872
Output of dmesg >dmesg_after.log 2>&1 (right after resume)

Comment 9 Agustin Barto 2009-12-03 18:41:12 UTC
The system is a Dell Studio 1555 running 2.6.31.6-161.fc12.i686.PAE (from Koji). The problem happens in all other versions cited before.

Comment 10 Agustin Barto 2009-12-12 20:03:43 UTC
Is there a workaround for this? It's rather annoying not being able to suspend the machine.

Comment 11 Agustin Barto 2009-12-16 00:59:53 UTC
Problem persists with 2.6.31.8-169.fc12.i686.PAE (from Koji)

Comment 12 Agustin Barto 2009-12-31 03:04:39 UTC
Problem persists with 2.6.31.9-174.fc12.x86_64.

Comment 13 Agustin Barto 2010-01-07 18:38:48 UTC
Problem persists with kernel-PAE-2.6.32.3-10.fc12.i686 (from Koji)

Comment 14 Agustin Barto 2010-01-21 13:02:28 UTC
Problem persists with kernel-PAE-2.6.31.12-174.2.3.fc12.i686.

Comment 15 Agustin Barto 2010-01-23 14:48:04 UTC
I tried to reproduce the issue with an Ubuntu 9.10 live-cd (which also uses 2.6.31) and I couldn't, so it seems it's a Fedora specific bug.

Comment 16 tucsonjohn 2010-01-25 18:59:05 UTC
I have the same problem and so does at least one other person.

Here is a link to another report.
http://forums.fedoraforum.org/showthread.php?p=1322435

The details are in the link above. But here is some data.
hardware: Dell Studio 1537
dist: Debian unstable
kernel: 2.6.32.5
video driver: fglrx 9.12
suspend software: pm-suspend v 1.2.6.1

I find that unplugging the AC (it can then be plugged in immediately)
fixes the problem. But this is, of course, not the best solution.

Comment 17 Agustin Barto 2010-01-25 20:09:39 UTC
Plugging/unplugging the power cord doesn't work in my case.

Comment 18 tucsonjohn 2010-02-08 21:11:09 UTC
Now running 2.6.31.9, I do not have the problem (but lid close no longer works; pm-suspend does work). It may have to do with kernel configuration options, but sifting through them and comparing them without knowing what you are looking for is extremely tedious and time-consuming.

Since Agustin has the problem with 2.6.31.9 and I do not, I suppose it could be some configuration problem.

Among problems with recent kernels are: filesystem corruption on every shutdown, wifi dropping regularly, backlight coming up stuck in some random state. I am trying to find one kernel that has none of the worst ones. But, for every problem, I can find a kernel that does not have it.

Comment 19 Agustin Barto 2010-02-21 22:31:57 UTC
Problem persists with 2.6.32.8-58.fc12.i686.PAE. FANTASTIC.

Comment 20 maximilian.mehnert 2010-02-25 10:32:26 UTC
Problem persists with 2.6.33-rc8

Comment 21 Agustin Barto 2010-03-02 11:46:43 UTC
Problem persists with 2.6.32.9-67.fc12.i686.PAE. AWESOME.

Comment 22 maximilian.mehnert 2010-03-02 12:39:23 UTC
Since I see this on a custom built kernel on debian, I filed 
http://bugzilla.kernel.org/show_bug.cgi?id=15425

Comment 23 Jonathan Larmour 2010-03-08 17:18:04 UTC
I can confirm it still exists with my Dell Studio 17 laptop after some suspends. For me, as tucsonjohn suggested, unplugging the AC adaptor does bring the temperature readings back, and the fan starts working again. Unplugging the adaptor while suspended does not help. Closing/opening the lid does correctly suspend/restore. If the AC adaptor was already disconnected when suspending, then reconnecting it after resume doesn't help; it's specifically the act of disconnecting the AC adaptor that jogs it back to reality.

My libsensors reports 5 temperatures altogether: temp{1,2,3} and core{0,1}. I believe the temps are CPU, north bridge and GPU respectively, but I'm not sure. Of course, with no fan, none of those get cooled now. Interestingly temps 1/2/3 become 0, but cores 0/1 report seemingly sane temperatures. (Perhaps worth mentioning that, for some unknown reason, gnome-applet-sensors never reports any change, even when I load the CPU. It does normally.)

Command line sensors does however show the core temperature changing:
$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +0.0°C  (crit = +100.0°C)                  
temp2:        +0.0°C  (crit = +100.0°C)                  
temp3:        +0.0°C  (crit = +100.0°C)                  

coretemp-isa-0000
Adapter: ISA adapter
Core 0:      +72.0°C  (high = +105.0°C, crit = +105.0°C)  

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +63.0°C  (high = +105.0°C, crit = +105.0°C)  

Also of interest:
$ cat /sys/devices/platform/coretemp.*/temp1_input 
72000
63000
$ cat /proc/acpi/thermal_zone/TZ*/temperature
temperature:             0 C
temperature:             0 C
temperature:             0 C
(and acpitool -t obviously concurs)
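
The comparison above (coretemp still moving while every ACPI zone reads 0) can be expressed as a small heuristic. This is only a sketch: the coretemp values are taken to be millidegrees, as the `72000` / `63000` output suggests, and the `floor_c` threshold and function names are assumptions made for illustration.

```python
def millidegrees_to_celsius(raw):
    """Convert a value such as '72000' (millidegrees) to degrees C."""
    return int(raw.strip()) / 1000.0


def acpi_looks_stale(acpi_temps, core_raw_values, floor_c=1.0):
    """Heuristic: every ACPI zone reads below floor_c while some core reads above it.

    acpi_temps: list of ACPI zone temperatures in degrees C.
    core_raw_values: list of raw coretemp strings in millidegrees.
    floor_c: assumed cut-off below which a reading is treated as bogus.
    """
    cores = [millidegrees_to_celsius(value) for value in core_raw_values]
    all_zones_low = bool(acpi_temps) and all(t < floor_c for t in acpi_temps)
    some_core_warm = any(c > floor_c for c in cores)
    return all_zones_low and some_core_warm
```

With the readings quoted above, `acpi_looks_stale([0, 0, 0], ["72000", "63000"])` flags the mismatch, while a healthy zone reading such as 39 C does not.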

I'm amazed this is a low priority - this can damage hardware. I don't think it would be good to have reports of "Fedora fried my laptop".

Comment 24 Agustin Barto 2010-03-08 17:39:00 UTC
Jonathan, I reported it with low priority 'cause it had a workaround (using regular halt instead of suspend). After 4 months of cold reboots, I realize it was a mistake :)

Anyway, AFAIK this bug is in the vanilla kernel, so it would be "Linux fried my laptop". Luckily, most systems turn themselves off immediately when they get too hot. I know my Studio 1555 does.

Comment 25 Jonathan Larmour 2010-03-10 14:21:06 UTC
Agustin, I'm sure it will be a surprise to many people who expect suspend to just work (as it appears to). The only apparent symptom is when it turns off due to reaching dangerous temperatures, unless you've installed the sensors. It's a shame you can't increase the priority of the bug - if I opened another of higher priority it would just be closed as a dup of this one.

Not only do high temperatures shorten life-span, but AFAIK only the CPU reaching a critical temperature (100degC I think) will cause it to turn itself off. The north bridge and GPU are also cooled by the fan (at least in my laptop, but I suspect many others), and I'm not at all sure they will turn off - in which case they could be damaged. 

Anyway more useful info is here: http://bugzilla.kernel.org/show_bug.cgi?id=14667 showing that Dell laptops appear to need burst mode (don't be confused by a number of people there reporting different suspend problems on different laptops). A fix is to reverse the commit in http://bugzilla.kernel.org/show_bug.cgi?id=14667#c11 specifically:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.32.y.git;a=blobdiff;f=drivers/acpi/ec.c;h=788db781a5197b9b5616013c5b1fad4c77ca2cb8;hp=839b542d508746a928409dcfc9799d9794f14794;hb=6a63b06f3c494cc87eade97f081300bda60acec7;hpb=2a84cb9852f52c0cd1c48bca41a8792d44ad06cc

As per http://bugzilla.kernel.org/show_bug.cgi?id=14667#c36 it seems that the complete fix will instead want to detect both MSI and Dell from the DMI info. Until that's done, just reverting the above patch may well be adequate.

Any Fedora kernel developer able to include that change?
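
To see whether a given machine would fall under the "MSI or Dell" DMI check mentioned above, the vendor string can be inspected from userspace. This sketch is not the kernel patch and does not reproduce its matching rules; it reads `/sys/class/dmi/id/sys_vendor`, and the substrings `dell` and `micro-star` are assumptions about what those vendors report.

```python
# Assumed substrings of the DMI vendor field; not taken from the kernel patch.
AFFECTED_VENDOR_HINTS = ("dell", "micro-star")


def vendor_is_listed(vendor, hints=AFFECTED_VENDOR_HINTS):
    """Return True if the vendor string contains one of the assumed hints."""
    name = vendor.strip().lower()
    return any(hint in name for hint in hints)


def read_sys_vendor(path="/sys/class/dmi/id/sys_vendor"):
    """Return the DMI system vendor string, or '' if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""
```

`vendor_is_listed(read_sys_vendor())` then gives a rough answer for the running machine.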

Comment 26 Agustin Barto 2010-03-16 20:17:18 UTC
It seems the patch from http://bugzilla.kernel.org/show_bug.cgi?id=14667#c89 fixes the problem. I'll try it as soon as possible.

Comment 27 Agustin Barto 2010-03-18 12:29:24 UTC
I can't confirm that the patch fixes the problem because the system won't suspend while using vanilla kernel 2.6.34-rc1. *sigh*

Comment 28 Agustin Barto 2010-03-23 20:47:05 UTC
Could this patch be backported to 2.6.32?

Comment 29 Chuck Ebbert 2010-04-21 14:31:07 UTC
This patch is in 2.6.32.11-108:

  https://bugzilla.kernel.org/show_bug.cgi?id=14667#c97

Comment 30 Agustin Barto 2010-04-28 02:10:52 UTC
Yey! The defect seems to be missing in 2.6.32.12-114.fc12.i686.PAE.

Comment 31 Fedora Update System 2010-04-28 04:35:41 UTC
kernel-2.6.32.12-114.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-114.fc12

Comment 32 Fedora Update System 2010-05-17 05:50:39 UTC
kernel-2.6.32.12-115.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-115.fc12

Comment 33 Fedora Update System 2010-05-18 21:59:01 UTC
kernel-2.6.32.12-115.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 34 Agustin Barto 2010-05-18 22:06:59 UTC
I've been using kernel-2.6.32.12-115.fc12 since it entered updates-testing with no problems.

Comment 35 Phil V 2010-06-05 19:28:27 UTC
My Compaq Presario A916NR Notebook still displays this bug even with kernel 2.6.32.12-115.fc12.x86_64 #1 SMP Fri Apr 30 19:46:25 UTC 2010.

After resume, the first two reported temperatures below are perpetually and falsely fixed at room temperature; the Core temperatures do vary reasonably.

[pv@localhost ~]$ cat /proc/acpi/thermal_zone/TZ01/temperature ; sensors
temperature:             27 C
acpitz-virtual-0
Adapter: Virtual device
temp1:       +26.8°C  (crit = +100.0°C)                  

coretemp-isa-0000
Adapter: ISA adapter
Core 0:      +44.0°C  (high = +100.0°C, crit = +100.0°C)  

coretemp-isa-0001
Adapter: ISA adapter
Core 1:      +48.0°C  (high = +100.0°C, crit = +100.0°C)  

(This is an improvement over previous kernels, which had the additional bug that after resume from suspend/hibernate even idle CPU cores would run extremely hot, even though the fan would run in my case. See https://bugzilla.redhat.com/show_bug.cgi?id=554625 )
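
A reading that is "perpetually fixed" can be spotted by taking several `sensors` snapshots and checking which values never move. The sketch below assumes the text layout shown in the output above (a chip name line, an `Adapter:` line, then `label: +NN.N°C` lines, with blank lines between chips); the `tolerance` value and function names are hypothetical.

```python
import re

# Matches reading lines such as "temp1:       +26.8°C  (crit = +100.0°C)".
READING = re.compile(r"^(?P<label>[^:]+):\s+(?P<value>[+-]?\d+(?:\.\d+)?)°C")


def parse_sensors(text):
    """Return {(chip, label): degrees C} parsed from one `sensors` snapshot."""
    readings = {}
    chip = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            chip = None  # a blank line ends the current chip block
            continue
        if chip is None:
            chip = stripped  # first line of a block is the chip name
            continue
        match = READING.match(stripped)
        if match:
            readings[(chip, match.group("label"))] = float(match.group("value"))
    return readings


def stuck_labels(samples, tolerance=0.5):
    """Return keys whose value varied by at most `tolerance` across all samples."""
    keys = set(samples[0]).intersection(*samples[1:])
    stuck = []
    for key in sorted(keys):
        values = [sample[key] for sample in samples]
        if max(values) - min(values) <= tolerance:
            stuck.append(key)
    return stuck
```

Snapshots taken under different CPU loads make the check more meaningful, since an idle machine can legitimately show a steady temperature.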

Comment 36 Phil V 2010-07-15 21:15:00 UTC
Correction to #35 -- my fan does NOT turn back on after resume. 
Before suspend the fan will vary based on CPU loading.
After resume it stays off.

 /proc/acpi/fan/ is empty both before and after resume.

Any advice?

Comment 37 Chuck Ebbert 2010-07-16 14:55:47 UTC
(In reply to comment #35)
> My Compaq Presario A916NR Notebook still displays this bug even with 
>  with kernel 2.6.32.12-115.fc12.x86_64    #1 SMP Fri Apr 30 19:46:25 UTC 2010 
> 
> After resume, the first two reported temperatures below are perpetually and
> falsely fixed at room temperature; the Core temperatures do vary reasonably.

This could be caused by the sensors code conflicting with ACPI. Does it still happen with the sensors drivers not being loaded? (Note: don't load them and then unload them -- make sure they never load at all from cold boot.)
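
One way to verify that a driver was not loaded is to look at `/proc/modules`, where the first field of each line is a module name. This is a minimal sketch with hypothetical function names. Which drivers are meant is not stated in this comment; `coretemp` in the usage note is only an example taken from the `sensors` output earlier in this report. Drivers built into the kernel do not appear in `/proc/modules`.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # the first column is the module name
    return names


def read_loaded_modules(path="/proc/modules"):
    """Read and parse the live module list from procfs."""
    with open(path) as handle:
        return loaded_modules(handle.read())
```

For example, `"coretemp" in read_loaded_modules()` checked right after a cold boot shows whether that module is currently loaded.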

Comment 38 Phil V 2010-07-17 18:01:30 UTC
Please specify explicitly how I am to ensure these drivers (which ones?) never load at all.

Comment 39 Phil V 2010-09-22 19:30:28 UTC
My laptop works now with recent kernels. Thank you!

