Bug 532161
Summary: | Wrong ACPI temperature after suspend/resume
---|---
Product: | [Fedora] Fedora
Component: | kernel
Status: | CLOSED ERRATA
Severity: | high
Priority: | low
Version: | 12
Hardware: | All
OS: | Linux
Reporter: | Agustin Barto <abarto>
Assignee: | Kernel Maintainer List <kernel-maint>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | dougsland, gansalmon, itamar, jifl-bugzilla, jvpgomes, kernel-maint, lapeyre.math122a, maximilian.mehnert, pv.bugzilla
Doc Type: | Bug Fix
Last Closed: | 2010-09-23 16:24:08 UTC
Description
Agustin Barto
2009-10-30 23:53:17 UTC
Problem persists with 2.6.30.9-96.fc11.i686.PAE.

The issue seems to be resolved in 2.6.31.5-127.fc12.x86_64.

Problem persists with 2.6.31.6-152.fc12.i686.PAE (from Koji).

Problem also happens with 2.6.31.6-145.fc12.i686.PAE (latest from updates-testing).

What kind of system (vendor and model) is this happening on?

Created attachment 375870
Output of lspci >lspci.log 2>&1

Created attachment 375871
Output of dmesg >dmesg.log 2>&1 (right after cold boot)

Created attachment 375872
Output of dmesg >dmesg_after.log 2>&1 (right after resume)
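
The two dmesg captures attached above (cold boot and after resume) are easier to compare once the shared boot messages are filtered out. The following is a minimal sketch of one way to do that, not part of the original report. The function names `strip_timestamps` and `new_messages` are made up for illustration, and the timestamp format `[  123.456789] ` is an assumption about how the kernel was configured; if the logs carry no timestamps, the `sed` step simply leaves them unchanged.

```shell
#!/bin/sh
# Remove a leading "[  123.456789] " printk timestamp, if present, so that
# the same message logged at different times compares as equal.
strip_timestamps() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9][0-9]*] //' "$1"
}

# Print the lines of the after-resume log that do not occur anywhere in the
# cold-boot log. Usage: new_messages BEFORE_LOG AFTER_LOG
new_messages() {
    before=$1
    after=$2
    tmp=$(mktemp) || return 1
    strip_timestamps "$before" > "$tmp"
    # -F: fixed strings, -x: match whole lines, -v: invert, -f: patterns from file.
    # grep exits 1 when nothing is new; that is not treated as an error here.
    strip_timestamps "$after" | grep -Fxv -f "$tmp" || true
    rm -f "$tmp"
}
```

With the attachment file names used above, `new_messages dmesg.log dmesg_after.log` would list only what was logged after resume. One limitation: a message that is printed at boot and printed again, unchanged, after resume is filtered out as well.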
The system is a Dell Studio 1555 running 2.6.31.6-161.fc12.i686.PAE (from Koji). The problem happens in all other versions cited before.

Is there a workaround for this? It's rather annoying not being able to suspend the machine.

Problem persists with 2.6.31.8-169.fc12.i686.PAE (from Koji).

Problem persists with 2.6.31.9-174.fc12.x86_64.

Problem persists with kernel-PAE-2.6.32.3-10.fc12.i686 (from Koji).

Problem persists with kernel-PAE-2.6.31.12-174.2.3.fc12.i686. I tried to reproduce the issue with an Ubuntu 9.10 live CD (which also uses 2.6.31) and I couldn't, so it seems to be a Fedora-specific bug.

I have the same problem and so does at least one other person. Here is a link to another report: http://forums.fedoraforum.org/showthread.php?p=1322435

The details are in the link above, but here is some data:

hardware: Dell Studio 1537
dist: Debian unstable
kernel: 2.6.32.5
video driver: fglrx 9.12
suspend software: pm-suspend v 1.2.6.1

I find that unplugging the AC (it can then be plugged in immediately) fixes the problem. But this is, of course, not the best solution.

Plugging/unplugging the power cord doesn't work in my case.

Now running 2.6.31.9, I do not have the problem (but lid close no longer works; pm-suspend does work). It may have to do with kernel configuration options, but sifting through them and comparing them without knowing what you are looking for is extremely tedious and time-consuming. Since Agustin has the problem with 2.6.31.9 and I do not, I suppose it could be some configuration problem. Among the problems with recent kernels are filesystem corruption on every shutdown, wifi dropping regularly, and the backlight coming up stuck in some random state. I am trying to find one kernel that has none of the worst ones. But, for every problem, I can find a kernel that does not have it.

Problem persists with 2.6.32.8-58.fc12.i686.PAE. FANTASTIC.

Problem persists with 2.6.33-rc8.

Problem persists with 2.6.32.9-67.fc12.i686.PAE. AWESOME.
Since I see this on a custom-built kernel on Debian, I filed http://bugzilla.kernel.org/show_bug.cgi?id=15425

I can confirm it still exists with my Dell Studio 17 laptop after some suspends. For me, as tucsonjohn suggested, unplugging the AC adaptor does bring the temperature readings back, and the fan starts working again. Unplugging the adaptor while suspended does not help. Closing/opening the lid does correctly suspend/restore. If the AC adaptor was already disconnected when suspending, then when restoring, turning it back on doesn't help - it's specifically the act of turning the AC adaptor off which jogs it back to reality.

My libsensors reports 5 temperatures altogether: temp{1,2,3} and core{0,1}. I believe the temps are CPU, north bridge and GPU respectively, but I'm not sure. Of course, with no fan, none of those get cooled now. Interestingly, temps 1/2/3 become 0, but cores 0/1 report seemingly sane temperatures. (Perhaps worth mentioning that, for some unknown reason, gnome-applet-sensors never reports any change, even when I load the CPU. It does normally.) Command-line sensors does, however, show the core temperature changing:

    $ sensors
    acpitz-virtual-0
    Adapter: Virtual device
    temp1: +0.0°C (crit = +100.0°C)
    temp2: +0.0°C (crit = +100.0°C)
    temp3: +0.0°C (crit = +100.0°C)

    coretemp-isa-0000
    Adapter: ISA adapter
    Core 0: +72.0°C (high = +105.0°C, crit = +105.0°C)

    coretemp-isa-0001
    Adapter: ISA adapter
    Core 1: +63.0°C (high = +105.0°C, crit = +105.0°C)

Also of interest:

    $ cat /sys/devices/platform/coretemp.*/temp1_input
    72000
    63000
    $ cat /proc/acpi/thermal_zone/TZ*/temperature
    temperature: 0 C
    temperature: 0 C
    temperature: 0 C

(and acpitool -t obviously concurs)

I'm amazed this is a low priority - this can damage hardware. I don't think it would be good to have reports of "Fedora fried my laptop".

Jonathan, I reported it with low priority 'cause it had a workaround (using regular halt instead of suspend).
After 4 months of cold reboots, I realize it was a mistake :) Anyway, AFAIK this bug is in the vanilla kernel, so it would be "Linux fried my laptop". Luckily, most systems turn themselves off immediately when they get too hot. I know my Studio 1555 does.

Agustin, I'm sure it will be a surprise to many people who expect suspend to just work (as it appears to). Unless you've installed the sensors, the only apparent symptom is the machine turning off when it reaches dangerous temperatures. It's a shame you can't increase the priority of the bug - if I opened another of higher priority it would just be closed as a dup of this one.

Not only do high temperatures shorten life-span, but AFAIK only the CPU reaching a critical temperature (100°C I think) will cause it to turn itself off. The north bridge and GPU are also cooled by the fan (at least in my laptop, but I suspect many others), and I'm not at all sure they will turn off - in which case they could be damaged.

Anyway, more useful info is here: http://bugzilla.kernel.org/show_bug.cgi?id=14667 showing that Dell laptops appear to need burst mode (don't be confused by a number of people there reporting different suspend problems on different laptops).

A fix is to reverse the commit in http://bugzilla.kernel.org/show_bug.cgi?id=14667#c11, specifically:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.32.y.git;a=blobdiff;f=drivers/acpi/ec.c;h=788db781a5197b9b5616013c5b1fad4c77ca2cb8;hp=839b542d508746a928409dcfc9799d9794f14794;hb=6a63b06f3c494cc87eade97f081300bda60acec7;hpb=2a84cb9852f52c0cd1c48bca41a8792d44ad06cc

As per http://bugzilla.kernel.org/show_bug.cgi?id=14667#c36 it seems that the complete fix will instead want to detect both MSI and Dell from the DMI info. But until that's done, just reverting the above patch may well be adequate for now. Is any Fedora kernel developer able to include that change?
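
The symptom reported above, ACPI thermal zones reading 0 C while coretemp still reports plausible values, can be checked from a script after each resume. What follows is only a rough sketch under stated assumptions: the function names are hypothetical, the file formats are taken from the `cat` output quoted in this report ("temperature: N C" for the ACPI zone, millidegrees for `temp1_input`), and the rule "zone at or below 0 while a core is above 0" is a crude heuristic, not an established test.

```shell
#!/bin/sh
# Print the whole-degree reading from an ACPI zone file whose content
# looks like "temperature:             0 C".
acpi_zone_celsius() {
    awk '/^temperature:/ { print $2 }' "$1"
}

# Print a coretemp temp1_input reading, converted from millidegrees
# Celsius to whole degrees (72000 -> 72).
coretemp_celsius() {
    awk '{ print int($1 / 1000) }' "$1"
}

# Succeed (exit 0) when the ACPI zone reads 0 C or less while the core
# reads above 0 C. Usage: zone_looks_stuck ZONE_FILE CORETEMP_FILE
zone_looks_stuck() {
    zone=$(acpi_zone_celsius "$1")
    core=$(coretemp_celsius "$2")
    [ -n "$zone" ] && [ -n "$core" ] && [ "$zone" -le 0 ] && [ "$core" -gt 0 ]
}

# Possible use on an affected machine (paths as quoted in this report):
#   for z in /proc/acpi/thermal_zone/TZ*/temperature; do
#       zone_looks_stuck "$z" /sys/devices/platform/coretemp.0/temp1_input \
#           && echo "$z looks stuck"
#   done
```

This only catches the zero-reading case. A zone frozen at some other value would need a different check, for example sampling the zone repeatedly under load and looking for a reading that never changes.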
It seems the patch from http://bugzilla.kernel.org/show_bug.cgi?id=14667#c89 fixes the problem. I'll try it as soon as possible.

I can't confirm that the patch fixes the problem because the system won't suspend while using vanilla kernel 2.6.34-rc1. *sigh*

Could this patch be backported to 2.6.32?

This patch is in 2.6.32.11-108: https://bugzilla.kernel.org/show_bug.cgi?id=14667#c97

Yey! The defect seems to be missing in 2.6.32.12-114.fc12.i686.PAE.

kernel-2.6.32.12-114.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-114.fc12

kernel-2.6.32.12-115.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.12-115.fc12

kernel-2.6.32.12-115.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.

I've been using kernel-2.6.32.12-115.fc12 since it entered updates-testing with no problems.

My Compaq Presario A916NR Notebook still displays this bug even with kernel 2.6.32.12-115.fc12.x86_64 #1 SMP Fri Apr 30 19:46:25 UTC 2010

After resume, the first two reported temperatures below are perpetually and falsely fixed at room temperature; the Core temperatures do vary reasonably.

    [pv@localhost ~]$ cat /proc/acpi/thermal_zone/TZ01/temperature ; sensors
    temperature: 27 C
    acpitz-virtual-0
    Adapter: Virtual device
    temp1: +26.8°C (crit = +100.0°C)

    coretemp-isa-0000
    Adapter: ISA adapter
    Core 0: +44.0°C (high = +100.0°C, crit = +100.0°C)

    coretemp-isa-0001
    Adapter: ISA adapter
    Core 1: +48.0°C (high = +100.0°C, crit = +100.0°C)

(This is an improvement over previous kernels, which had the additional bug that after resume from suspend/hibernate even idle CPU cores would run extremely hot -- even though the fan would run in my case. See https://bugzilla.redhat.com/show_bug.cgi?id=554625 )

Correction to #35 -- my fan does NOT turn back on after resume. Before suspend the fan will vary based on CPU loading.
After resume it stays off. /proc/acpi/fan/ is empty both before and after resume. Any advice?

(In reply to comment #35)
> My Compaq Presario A916NR Notebook still displays this bug even with
> kernel 2.6.32.12-115.fc12.x86_64 #1 SMP Fri Apr 30 19:46:25 UTC 2010
>
> After resume, the first two reported temperatures below are perpetually and
> falsely fixed at room temperature; the Core temperatures do vary reasonably.

This could be caused by the sensors code conflicting with ACPI. Does it still happen with the sensors drivers not being loaded? (Note: don't load them and then unload them -- make sure they never load at all from cold boot.)

Please specify explicitly how I am to ensure these drivers (which ones?) never load at all.

My laptop works now with recent kernels. Thank you!
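
On the unanswered question above about confirming that the sensors drivers are not loaded: the thread never names the drivers, so the sketch below only shows how to test whether a given module is currently loaded by reading `/proc/modules`. The helper name `module_loaded` is hypothetical, and `coretemp` is a guess at one relevant driver, based solely on the `coretemp-isa-*` entries in the sensors output quoted in this report.

```shell
#!/bin/sh
# Succeed (exit 0) if the named module appears in the first column of a
# /proc/modules-style listing.
# Usage: module_loaded MODULE_NAME [MODULES_FILE]
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Possible use right after a cold boot and again after resume:
#   if module_loaded coretemp; then
#       echo "coretemp is loaded"
#   else
#       echo "coretemp is not loaded"
#   fi
```

This shows only the state at the moment it runs. It cannot tell whether a module was loaded and then unloaded earlier in the same boot, which is the case the comment above asks to avoid.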