Bug 856047 - laptop powering off due to overheating
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-09-11 00:51 EDT by Abhay
Modified: 2013-04-05 15:18 EDT
Doc Type: Bug Fix
Last Closed: 2013-04-05 15:18:55 EDT
Type: Bug

Attachments
/var/log/messages (3.63 MB, text/plain), 2012-09-11 00:54 EDT, Abhay
smolt_profile (3.44 KB, text/plain), 2012-09-11 00:58 EDT, Abhay
dmesg (79.64 KB, text/plain), 2012-09-11 00:59 EDT, Abhay
lspci (2.25 KB, text/plain), 2012-09-11 01:00 EDT, Abhay
var/log/Xorg.0.log (41.21 KB, text/x-log), 2012-09-11 01:04 EDT, Abhay
/var/log/Xorg.1.log (20.42 KB, text/x-log), 2012-09-11 01:05 EDT, Abhay

Description Abhay 2012-09-11 00:51:51 EDT
Description of problem:
The laptop powers off due to overheating.


Version-Release number of selected component (if applicable):
kernel-3.6.0-0.rc4.git2.1.fc18.x86_64


How reproducible:
Always


Steps to Reproduce:
1. Start the laptop.
2. Work for about half an hour (sometimes less), for example browsing the web or checking mail in Evolution.
3. The laptop powers off, killing all running processes.
  
Actual results:
The running processes are forcibly terminated and the laptop turns itself off.


Expected results:
The laptop should not overheat or power off under this light workload.


Additional info:
Comment 1 Abhay 2012-09-11 00:54:23 EDT
Created attachment 611643 [details]
/var/log/messages
Comment 2 Abhay 2012-09-11 00:58:30 EDT
Created attachment 611644 [details]
smolt_profile
Comment 3 Abhay 2012-09-11 00:59:38 EDT
Created attachment 611645 [details]
dmesg
Comment 4 Abhay 2012-09-11 01:00:41 EDT
Created attachment 611646 [details]
lspci
Comment 5 Abhay 2012-09-11 01:04:42 EDT
Created attachment 611649 [details]
var/log/Xorg.0.log
Comment 6 Abhay 2012-09-11 01:05:31 EDT
Created attachment 611657 [details]
/var/log/Xorg.1.log
Comment 7 Josh Boyer 2012-09-11 08:56:00 EDT
Normally, fan control is handled by the firmware or an embedded controller (EC).  The kernel you're running has all of the debug options turned on and might be adding load to the system, but there isn't much the kernel can do if the system cooling isn't sufficient.

You might try recreating with a non-debug kernel.  It might also be worthwhile to see if there is a large build-up of dust around the cooling vents or in the fans.

Sep 11 09:29:07 abhay-laptop kernel: [ 2862.708614] CPU3: Core temperature above threshold, cpu clock throttled (total events = 2020)
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.709594] CPU1: Core temperature/speed normal
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.709635] CPU3: Core temperature/speed normal
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.746693] CPU0: Core temperature above threshold, cpu clock throttled (total events = 2862)
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.746763] CPU2: Core temperature above threshold, cpu clock throttled (total events = 2861)
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.749751] CPU0: Core temperature/speed normal
Sep 11 09:29:07 abhay-laptop kernel: [ 2862.749772] CPU2: Core temperature/speed normal
Sep 11 09:30:44 abhay-laptop kernel: [ 2958.829066] mce: [Hardware Error]: Machine check events logged
Sep 11 09:30:44 abhay-laptop mcelog[674]: mcelog: Unsupported new Family 6 Model 25 CPU: only decoding architectural errors
Sep 11 09:30:44 abhay-laptop mcelog[674]: Hardware event. This is not a software error.
Sep 11 09:30:44 abhay-laptop mcelog[674]: MCE 0
Sep 11 09:30:44 abhay-laptop mcelog[674]: CPU 1 THERMAL EVENT TSC 646599f6dcc
Sep 11 09:30:44 abhay-laptop mcelog[674]: TIME 1347335947 Tue Sep 11 09:29:07 2012
Sep 11 09:30:44 abhay-laptop mcelog[674]: Processor 1 heated above trip temperature. Throttling enabled.
Sep 11 09:30:44 abhay-laptop mcelog[674]: Please check your system cooling. Performance will be impacted
Sep 11 09:30:44 abhay-laptop mcelog[674]: STATUS 880003c3 MCGSTATUS 0
Sep 11 09:30:44 abhay-laptop mcelog[674]: MCGCAP c09 APICID 4 SOCKETID 0
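
To see how often each core is being throttled, the "Core temperature above threshold" lines in /var/log/messages can be tallied per CPU. The following is a minimal sketch, not part of the original report: the function name summarize_throttling and the regular expression are illustrative assumptions based only on the message format shown in the excerpt above, and other kernel versions may word the message differently.

```python
import re

# Assumed message format, taken from the log excerpt in this bug report.
THROTTLE_RE = re.compile(
    r"CPU(?P<cpu>\d+): Core temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<total>\d+)\)"
)


def summarize_throttling(lines):
    """Return {cpu_number: highest 'total events' counter seen for that CPU}."""
    totals = {}
    for line in lines:
        match = THROTTLE_RE.search(line)
        if match is None:
            continue  # not a throttle message (e.g. "temperature/speed normal")
        cpu = int(match.group("cpu"))
        total = int(match.group("total"))
        totals[cpu] = max(totals.get(cpu, 0), total)
    return totals


# Sample lines copied from the excerpt above.
sample = [
    "Sep 11 09:29:07 abhay-laptop kernel: [ 2862.708614] CPU3: Core temperature above threshold, cpu clock throttled (total events = 2020)",
    "Sep 11 09:29:07 abhay-laptop kernel: [ 2862.709594] CPU1: Core temperature/speed normal",
    "Sep 11 09:29:07 abhay-laptop kernel: [ 2862.746693] CPU0: Core temperature above threshold, cpu clock throttled (total events = 2862)",
    "Sep 11 09:29:07 abhay-laptop kernel: [ 2862.746763] CPU2: Core temperature above threshold, cpu clock throttled (total events = 2861)",
]

summary = summarize_throttling(sample)
for cpu in sorted(summary):
    print(f"CPU{cpu}: {summary[cpu]} throttle events")
```

To run this over a full log, pass an open file object instead of the sample list, for example summarize_throttling(open("/var/log/messages")). Thousands of events within the first hour of uptime, as in this excerpt, suggest the cooling cannot keep up with the load.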
Comment 8 Abhay 2012-09-13 04:32:25 EDT
I tried running the non-debug kernel 3.3.4-5.fc17.x86_64, and my laptop has stayed up for around 3 hours so far...

Thanks,
Abhay

(In reply to comment #7)
Comment 9 Fedora End Of Life 2013-04-03 11:40:55 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could also affect bugs from before the Fedora 19 development
cycle. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
