My Thinkpad X201 has now shut down "randomly" a handful of times: on each occasion I catch a glimpse of an "overheat at 100 degrees" or similar message from the kernel.

This problem has not always happened with this laptop. I have no recollection of it before the new year. I upgraded to Fedora 14 on 3rd Jan 2011. This may just be coincidence, but I don't believe I experienced the problem before the upgrade.

I have also noticed the area around the fan outlet (and presumably CPU) is sometimes very hot to the touch.

  echo level full-speed > /proc/acpi/ibm/fan

demonstrates that the fan is capable of spinning up to a high (and noisy) speed. The fan is definitely *not* even approaching these speeds before the emergency thermal shutdowns occur: there is no audible hint that the fan is spinning above its usual background speed.

I ran a short test using watch on both /proc/acpi/ibm/fan and /proc/acpi/ibm/thermal, gradually increasing the CPU load with the fan set to "auto" (the default).

- under light load, the fan increases from speed ~3450 (RPM?) to ~3700
- under increasing load /proc/acpi/ibm/thermal shows the temperature increasing through 70, then 80, then 90 with no increase in fan speed
- when the temperature hit 93 I set /proc/acpi/ibm/fan to "full-speed", the fan speed increased to ~6700 and the temperature was brought down to the mid-70s

It seems to me that the fan speed is not being increased despite the high system temperatures.

One thing to note, however, is that the output of /proc/acpi/ibm/thermal as I sit writing this (i.e. under light load) is:

  temperatures: 55 0 0 0 0 0 0 0

which in itself seems a bit high to my inexpert eyes.

  Linux pootle 2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23 16:04:50 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
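For anyone wanting to repeat the watch-based test in a more recordable form, here is a minimal sketch that parses the two files. It assumes the output formats quoted in this report ("speed: ...", "level: ...", "temperatures: ..."); the function names parse_fan, parse_thermal and sample are made up for illustration and are not part of thinkpad_acpi.

```python
import re
from pathlib import Path

# Paths as quoted in this report; both are provided by the thinkpad_acpi module.
FAN_PATH = Path("/proc/acpi/ibm/fan")
THERMAL_PATH = Path("/proc/acpi/ibm/thermal")


def parse_fan(text):
    """Return the 'key: value' fields of the fan file as a dict.

    The 'commands:' lines are skipped because that key repeats.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() != "commands":
            fields[key.strip()] = value.strip()
    return fields


def parse_thermal(text):
    """Return the sensor readings from the thermal file as a list of ints."""
    match = re.search(r"temperatures:\s*(.*)", text)
    if match is None:
        return []
    return [int(token) for token in match.group(1).split()]


def sample():
    """Read both files once and return (fan speed, fan level, hottest reading)."""
    fan = parse_fan(FAN_PATH.read_text())
    temps = parse_thermal(THERMAL_PATH.read_text())
    return int(fan["speed"]), fan["level"], max(temps, default=None)
```

Calling sample() every few seconds while increasing the load, and logging the result, would give a table of temperature against fan speed that can be compared between kernels or BIOS versions.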
The fan speeds also seem a bit misaligned here:

  echo level full-speed > /proc/acpi/ibm/fan  ->  speed: 6613
  echo level 7 > /proc/acpi/ibm/fan  ->  speed: 4208

yet:

  commands: level <level> (<level> is 0-7, auto, disengaged, full-speed)

i.e. the fan speed at level 7 (in theory the maximum) is considerably lower than at "full-speed". (As expected, attempting to set the level to >7, e.g. 8, is invalid.)

I'm changing the fan speed having enabled fan_control=1 for the thinkpad_acpi module.
Same happening here on a T500 after the recent kernel update:

  [root@localhost ibm]# rpm -q kernel
  kernel-2.6.35.11-83.fc14.x86_64
  [root@localhost ibm]# cat /proc/acpi/ibm/fan
  status: enabled
  speed: 0
  level: auto
Additional info: the notebook stays in passive cooling mode no matter what the system load is. CPU freq scaling works correctly.

  [root@localhost THM0]# cat temperature
  temperature: 68 C

^^ this is with barely any load
Another observation: the Fn-PgUp activation of the "ThinkLight" doesn't work any more either (I can't turn it on). But the light turns on and off when pressing these keys during BIOS startup. I wonder whether this might be a BIOS incompatibility issue? The issues above are with BIOS 1.32 (13/01/2011). I've downgraded the BIOS on the ThinkPad to 1.16 (June 2010 vintage, which is around when I bought the laptop) and the thermal regulation now seems much better: I can watch /proc/acpi/ibm/fan and /proc/acpi/ibm/thermal and see the fan speed increase (above the ~3700 limit I was seeing previously) and decrease to regulate temperature. However, the laptop is still showing 65 C at low load and the fan doesn't really go above 4000 until the temp is 99, which seems a bit hot to me (and feels very hot on the outside of the case!).
Spoke too soon: just had a shutdown due to reaching critical temperature (no warning) with the older bios and 2.6.35.11-83.fc14.x86_64 I also note that, even with the older bios, the thinklight still can't be turned on/off.
Kevin, If you blacklist the thinkpad-acpi module, do you still have the problem with the thinklight?
(In reply to comment #6)
> If you blacklist the thinkpad-acpi module, do you still have the problem with
> the thinklight?

Hmmm. Something more subtle is going on with the thinklight. With thinkpad-acpi disabled, the thinklight is, as I sit here, working. However, a little earlier it wasn't... kind of: I pressed the key combo to turn it on, nothing. Then pressed it a few more times. Nothing (for at least 20 seconds). However a while later (2 mins? More?) I came back to the laptop and the thinklight was on. It was daylight, so I can't say exactly when it turned on, but it certainly wasn't reacting immediately as it is now.

Furthermore, after rebooting with thinkpad-acpi on, the thinklight seems to work there too. I do use suspend; there's a part of me that's suspicious I may see more problems after resuming, but I don't have enough data to back that up. (Would modprobing thinkpad-acpi in/out be as effective as rebooting?)

On the fan
----------

I've noticed that the fan does speed up eventually, when /proc/acpi/thermal_zone/THM0/temperature reads 98 or 99 degrees. It doesn't reach full speed, but this could be because shortly after it reaches this point I'm either acting to cool it down, or the laptop is turning itself off :(

It would be really useful to know how the cooling fan is meant to work at various temperatures, and how I can test whether this is functioning properly. How can I find the "normal" expected temperature? As I sit here with not much running, /proc/acpi/thermal_zone/THM0/temperature is 67 C.

I would really like to ascertain whether the problem is:

1) software
2) hardware (e.g. temp sensor)
3) was software, but is now hardware

My perception (backed a little by the 67 C I see now vs. the 55 C in comment #0) is that the laptop is now running hotter, and needs a smaller increase in load to hit 100 degrees and shut down.
I'm concerned that, having hit these temperatures numerous times, any thermal compound will now be in bad shape and operating well below optimum efficiency (based on experience of desktop PCs, not laptops, I don't know for sure there's thermal compound in there...) I've taken to running the fan at level 7 and manually going to full-speed when there's a bit of load, but this hits battery life and doesn't work particularly effectively anyway...
If it's any help for you, I'm not having any of these issues on a Thinkpad X201 Tablet. My kernel version is kernel-2.6.35.11-83.fc14.x86_64 and thinkpad_acpi is loaded. I can't force the system over 75 C under any kind of load. Freq scaling and fan speeds work as expected. Here are the measurements under a light load:

  [root@pazuzu ~]# cat /proc/acpi/thermal_zone/THM0/temperature
  temperature: 54 C
  [root@pazuzu ~]# cat /proc/acpi/ibm/fan
  status: enabled
  speed: 1959
  level: auto
(In reply to comment #8) Thank you, that's useful. I'll arrange for an engineer to check the hardware and start the warranty process etc. Out of interest, does your thinklight work?
(In reply to comment #9) > I'll arrange for an engineer to check the hardware > and start the warranty process etc. My laptop motherboard has been swapped out under warranty. This has both lowered the idle temperatures (currently sitting at 49 under normal use) and the Thinklight is now activated/deactivated immediately upon keypress. If others are experiencing this exact problem I suppose there may be a manufacturing defect affecting this series? My X201 type number is 3249-CTO. Unless others with the problem think otherwise, this bug can be closed?
Closing this out as it seems it was HW related.
I continued to suffer this problem in some form. I think, in hindsight, that the hardware swap outs (two so far) only reduced the problem temporarily -- perhaps the new components are initially more efficient before wearing in. There also seems to be a step change for the worse after the laptop has got so hot that it shut down -- I can imagine getting that hot doesn't do the thermal compound much good. Unfortunately the engineers who fitted the replacements could only do that -- they weren't willing or able to try and get to the bottom of the problem. I mention it here again as there's an Ubuntu bug (linked in the external trackers field) which seems to throw up numerous similar reports. It doesn't seem to provide any answers, though. I'm still not clear whether there could be a mis-calibration in the kernel or if it's a pure BIOS bug.
Poor fan control under Linux seems to be quite a general problem for Thinkpads these days as far as I can tell: I see it on both my T500 and T420s, and basically find myself turning up the fan manually when doing heavy builds, etc. I can't claim to understand all the background but it seems surprising that the issue has not been addressed yet upstream...
I also have a Lenovo X201 computer with 8GB RAM and 120GB SSD, and also have a major overheating problem. Whenever I do any of the following it dramatically shuts down without warning:

- run at full power with CPU-intensive processes (except outside on a cold day)
- run the computer on an insulating surface (mattress or blanket), even with power saving at maximum

Sometimes the fan comes on, but seldom at full power. Sometimes it pathetically switches to full power about 10 seconds

However, this computer is configured entirely with 64-bit Windows 7 -- no Linux whatsoever. I suggest this may be a Lenovo BIOS or hardware problem.

More information is here: http://forums.lenovo.com/t5/X-Series-ThinkPad-Laptops/x201-random-shutdown/td-p/227471/

One fix suggested there is to use a vacuum cleaner to pull dust off the heat sinks.

-Tom
As mentioned in earlier comments, my heatsink-fan components were replaced (twice). This did indeed bring temperatures down somewhat, I imagine in a similar way to using a vacuum cleaner. However, the fan still didn't ramp up to full speed, as it does e.g. during the self-check after a new BIOS install, or when issuing "disengaged" to /proc/acpi/ibm/fan. The fan is provably capable of higher RPM.

The fan and heatsink may be more efficient when clean, but a relatively higher and more persistent load eventually hit thermal shutdown -- and when this has happened once, it seems to get progressively worse (degradation of thermal compound, perhaps?). I'm also aware of, and act against, the consequences of inadequate laptop ventilation (e.g. blocking the intakes with a mattress or blanket).

So I think the core issue here is that the fan doesn't spin up to full speed when reaching high temperatures, which it should do even if the heatsink is clogged with dust, or the laptop is inappropriately tucked up in bed.

It may well be a firmware<->kernel bug. Any advice on how to pinpoint which end is the root cause of the problem would be really appreciated (and any thoughts as to who/how to escalate with Lenovo if appropriate -- consumer level support assumes a Linux problem, so more evidence would be helpful to prove that it is not, if indeed it isn't).

A couple of thoughts:

- "level 7" (max, speed 4208) is a slower RPM than "level disengaged" (speed 5632) in /proc/acpi/ibm/fan. Is this expected behavior?
- if it's not a fan control problem, could it be a temperature sensing problem? What kernel reported temperatures should I monitor?
make sure the gpu power saving is enabled (parameter enable_rc6 to the i915 module).
scratch that, that doesn't apply on Ironlake chipsets.
My /var/log/messages is peppered with "MCP limit exceeded" warnings -- could this be related? e.g.

  Feb 28 14:55:05 pootle kernel: [139789.398399] intel ips 0000:00:1f.6: MCP limit exceeded: Avg power 37596, limit 35000
  Feb 28 14:56:30 pootle kernel: [139874.252075] intel ips 0000:00:1f.6: MCP limit exceeded: Avg power 36471, limit 35000

I'm getting quite a lot of them, e.g. 30 yesterday, 57 today. Searching through, they seem to go back as far as I have logs.

And also some:

  Feb 28 14:56:16 pootle kernel: [139860.388948] thinkpad_acpi: EC reports that Thermal Table has changed

Several of these -- 138 since 27th Jan.

And, on boot I think, these:

  /var/log/messages:Feb 25 13:23:23 pootle kernel: [ 19.457912] intel ips 0000:00:1f.6: CPU TDP doesn't match expected value (found 25, expected 29)
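To put numbers on how often and by how much the limit is exceeded, something like the following sketch could be run over the log text. It assumes only the message format quoted above; mcp_overages is a made-up helper name, and no unit is assumed for the power values.

```python
import re

# Matches the "MCP limit exceeded" warning format quoted in this comment.
MCP_RE = re.compile(r"MCP limit exceeded: Avg power (\d+), limit (\d+)")


def mcp_overages(log_text):
    """Return (avg_power, limit, percent_over_limit) for each MCP warning."""
    results = []
    for match in MCP_RE.finditer(log_text):
        avg, limit = int(match.group(1)), int(match.group(2))
        percent_over = round((avg - limit) / limit * 100, 1)
        results.append((avg, limit, percent_over))
    return results
```

For the two lines quoted above this gives overages of roughly 7.4% and 4.2%; len(mcp_overages(text)) gives the count per log file.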
I have a Thinkpad R61 and I can say I have the same problem. I currently use the "disengage" trick when I need to run some simulation on my laptop, but this is obviously not a good solution.

I was having trouble finding out where exactly to get the source for thinkpad_acpi, but I did find something that seemed legit:
http://www.mjmwired.net/kernel/Documentation/laptops/thinkpad-acpi.txt#1211
The authors seem to warn against "disengaging" the fan on line 1221.

That aside, I found some weird things in the code.
http://kerneldox.com/d6/d28/thinkpad__acpi_8c-source.html
The fan code starts on line 7141.

I have a feeling that when the fan module was written, levels 0 to 7 were the only ones available. Now with modern MOBOs, much higher levels might be available, meaning that they should be accessed. The code that sets the fan speed seems to be at line 7515, but I believe that the author slipped up near line 7536:

    /* safety net should the EC not support AUTO
     * or FULLSPEED mode bits and just ignore them */
    if (level & TP_EC_FAN_FULLSPEED)
        level |= 7; /* safety min speed 7 */
    else if (level & TP_EC_FAN_AUTO)
        level |= 4; /* safety min speed 4 */

The line of code level |= 4; does not clamp the level to a minimum of 4; it maps the fan levels in the following manner:

    0 -> 4
    1 -> 5
    2 -> 6
    3 -> 7
    4 -> 4
    5 -> 5
    6 -> 6
    7 -> 7

That is just one bug, but I think the more major one is the fact that the maximum speed should be 0xF and not 0x7. (Properly implementing a minimum is always a plus.)

I would spend time trying to fix it, but
1. I've never compiled a kernel module before.
2. I don't know how to download the current version of thinkpad_acpi.

Could this be the page for the source? It seems very old.
http://ibm-acpi.sourceforge.net/
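The mapping described above is easy to check outside the kernel. The sketch below only reproduces the arithmetic on the low three bits (levels 0 to 7); or_floor and clamp_floor are illustrative names, not functions from thinkpad_acpi.c, and nothing here says how the embedded controller actually treats the value.

```python
def or_floor(level, floor):
    """What `level |= floor` computes: it sets the floor's bits and keeps the rest."""
    return level | floor


def clamp_floor(level, floor):
    """A conventional lower bound: the level is raised to `floor` and no further."""
    return max(level, floor)


# Compare both approaches for every level 0..7 with a floor of 4.
for level in range(8):
    print(level, "->", or_floor(level, 4), "(OR)", clamp_floor(level, 4), "(max)")
```

With a floor of 4 the OR never yields less than 4, but levels 1 to 3 come out as 5 to 7, whereas max() would give 4 for all of them. Whether that difference matters in practice depends on when the driver takes this branch, which the excerpt alone does not show.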
@Mark: nice digging, sounds like you may be onto something there.

It should not be so hard to build a fedora kernel locally for testing:

  $ sudo yum install fedpkg ## if you don't have it :)
  $ fedpkg clone -a kernel ## use "-b f16" for f16 branch say
  $ cd kernel
  $ fedpkg prep
  $ emacs kernel-3.2.fc18/linux-3.3.0-0.rc5.git3.1.fc18.x86_64/drivers/platform/x86/thinkpad_acpi.c
  $ gendiff kernel-3.2.fc18 \~ > my-thinkpad-acpi.patch
  $ emacs kernel.spec ## add and apply my-thinkpad-acpi.patch
  $ fedpkg local
  $ sudo yum localinstall x86_64/kernel*...

or of course you can download the kernel srpm, add the patch and build with rpm-build.

I use a clunky gtk gui hack called thinkpad fan control (tpfc?) and usually just set speed to 7 manually when the cpu is overheating... This problem has existed for a couple of years now I believe, boggle.
@Jens Thanks for the instructions, I'm going to try modifying it soon, just got to read through it first to make sure I understand what it does. (Also got to squeeze the time in between homework....) But man does this driver show its age. The latest model that is listed in there is the T60 or X60... Those models are 4 years old....
[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.
I've actually only recently started to see this problem, within the last couple of days. I'm running F17 with kernel updates, so I'm running 3.3.0-5 now. I will go back to 3.3.0-4 and see if that resolves my issue. This is also a Thinkpad X201, and it does log the 100C message in syslog before shutting down. I've also seen suspend failing: the moon-shaped LED keeps blinking without the laptop actually going to sleep, which also started to happen only in the last few days.
I can no longer reproduce this on 3.3.1-5, so I'm assuming I was indeed running an older kernel when kernel compiles shut down my machine for overheating.
Just happened again on 3.3.1-5. I will test 3.3.2-1.fc17
confirmed 3.3.2-1.fc17 overheats and shuts down too....
and 3.3.0-0.rc7.git1.2.fc17.x86_64. So this leads me to believe this is not a kernel issue, as I did not have this problem until a few days ago, and 3.3.0.0 is a kernel from a month ago. So some userland component must have caused this.
It seems the only way to regain fan control is to reload thinkpad_acpi with

  modprobe thinkpad_acpi experimental=1 fan_control=1

The default kernels give me no fan control whatsoever, and I assume any power/heat monitoring tool also cannot perform any corrections without this.
Fan control is implemented entirely in the firmware on modern Thinkpads.
my firmware did not change since red hat gave me the laptop. I can double check the bios settings but would be surprised if they have been changed. the fan is clearly not speeding up when the laptop is overheating, and i do feel the heat coming out on the side, so it is not a false positive. With the fan_control, at least I can force it to the max. If this would not be related to any software, I don't have an explanation why this started happening in the last few days only....
My wife's laptop started doing this, and it turned out to be a specific web page she was leaving open in chromium that would peg the CPU at 100% and after a few hours cause the overheat because the fan just couldn't keep up.
I can confirm the exact same problem on my X201 with 3.3.2 kernel on F17. To successfully build e.g. WebKit without my laptop shutting off I need to apply the disengaged fan level trick as explained before in this thread. My thinklight is not working.
F17 is still in beta.... is it really ok to be posting the bug report here?
If the Thinklight isn't working then it sounds like the EC is very unhappy. Does it work in the BIOS? If so, does it work after booting the kernel with init=/bin/bash ? If it does then there's something that's happening after kernel init that's confusing it, and we just have to track down what...
You can probably try disabling the nvidia driver. It likes to set certain things like the hotkeys mask. I kinda explain it here: http://forums.fedoraforum.org/showthread.php?t=277541 The easier fix is to go to the line that has like "ibm" "hotkeys" and replace the "s" by a " " (you probably need the address of the later code to not be changed, so you shouldn't just delete the line). That will cause the nvidia driver to not change the hotkeys. It fixes things like my thinkvantage button and other Fn keys on my laptop. (Though my thinklight worked regardless.)
The X201 has Intel graphics.
It appears my computer's overheating was caused by dust on the heat sink. I blasted it from outside the case with a can of compressed air and it's been fine since. (A less aggressive attempt helped somewhat, but the problem didn't disappear until I spent a minute or so pushing air through it from all directions.) I think there's also a problem with fan speed in the firmware--the fan seldom runs at top speed before the box resets (this is an OEM Windows 7 64-bit installation using Lenovo drivers). The Thinkpad light failure is common on X201s. My light works only when it's turned on and I squeeze the case immediately below the light. As soon as I let go, the light goes out.
(In reply to comment #36) > If the Thinklight isn't working then it sounds like the EC is very unhappy. > Does it work in the BIOS? If so, does it work after booting the kernel with > init=/bin/bash ? If it does then there's something that's happening after > kernel init that's confusing it, and we just have to track down what... Sorry, the thinklight problem might be a different one, since it doesn't appear to work even in the BIOS.
Tom: spraying canned air might have just cooled the inside temporarily, hiding the problem. Are you sure it actually solved the problem, or did it happen again after a couple of hours?
Paul: I'm quite certain it wasn't a temporary fix. Before the canned air, I could reliably get my X201 to crash in less than a minute by putting it on full power and running a computationally-complex task. Even when I wasn't working it hard, it would crash several times per day. Since the canned air, it hasn't crashed once, even when I give it 15-minute CPU-saturating work on full power. -Tom
The underlying problem is still there: the fan is not reaching its maximum speed when needed. Obviously removing all the dust from the fan will help cool the computer down, but the bug is still there.
This is still a problem in Fedora 17 with kernel-3.3.7-1.fc17.x86_64
I have this problem with a Thinkpad edge 15 0301 series and it isn't fixed in kernel-3.4.0-1.fc17. I tried blacklisting thinkpad_acpi, but that doesn't fix the problem. I've booted on my windows partition, and there are no problems with fan control there. I did some digging around and I'm getting weird RPM readings from the /proc/acpi/ibm/fan file. When running in auto, it gives me a reading of 598 RPM, but when I write 'level 7' into it, the reading drops to 418 RPM, even though I can hear the fan spinning faster.
This is the first time I have hit this problem after more than 1 year with my Thinkpad T410 on RHEL 6.2, kernel 2.6.32-220.19.1.el6.x86_64. I installed GKrellM to monitor the CPU temperature, and just by opening a big program such as the IBM SM GUI I'm reaching 90 °C.
This comment may not be related, but I will mention it anyway. I have a Thinkpad T61 and I had constant fan whirring and CPU temperatures ranging from 60 to 90 degrees C. I looked at the top output and tracker-extract was taking a lot of CPU. I looked at the tracker-extract.log file and it was trying to extract some files in my Desktop folder. I moved the files from my Desktop folder to another temporary folder and logged in again. The CPU temperatures are back to normal (less than 50 deg C, if that is normal) and the noise from the fan is almost gone.
In reply to https://bugzilla.redhat.com/show_bug.cgi?id=675433#c9 and the T-series bugs: the X201t doesn't have a Thinklight. I haven't had these kinds of bugs with either a T400 or the X201 Tablet I'm currently using, in either F16 or F17 x86_64. The only problems have been graphics related glitches and hangs due to bugs in the Intel gpu driver. Perhaps this is a model or BIOS version related problem. I'm running the newest BIOS on both devices. The model number of this X201t is 3093-95G. The T400 model is 6475.
I just had my laptop overheat and shut down on me again. The fan was NOT blowing at any audible speed at all. This was using kernel 3.4.0. My only non-standard setting was using modprobe iwlwifi wd_disable=1 to work around wifi lockup bugs. At the time I was compiling. Logs show:

  Jun 22 21:44:07 thinkpad kernel: [304451.373516] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
  Jun 22 21:44:07 thinkpad kernel: [304451.373523] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
  Jun 22 21:44:07 thinkpad kernel: [304451.374561] CPU2: Core temperature/speed normal
  Jun 22 21:44:07 thinkpad kernel: [304451.374563] CPU3: Core temperature/speed normal
  Jun 22 21:44:14 thinkpad kernel: [304457.597633] thermal_sys: Critical temperature reached (128 C), shutting down
  Jun 22 21:44:14 thinkpad kernel: [304457.608087] thermal_sys: Critical temperature reached (128 C), shutting down
  Jun 22 21:44:14 thinkpad kernel: [304457.684044] BUG: Bad rss-counter state mm:ffff88012ef7c380 idx:1 val:-1
  Jun 22 21:44:14 thinkpad kernel: [304457.684052] BUG: Bad rss-counter state mm:ffff88012ef7c380 idx:2 val:1
  Jun 22 21:44:14 thinkpad kernel: [304457.686265] BUG: Bad rss-counter state mm:ffff88012de58700 idx:1 val:-1
  Jun 22 21:44:14 thinkpad kernel: [304457.686272] BUG: Bad rss-counter state mm:ffff88012de58700 idx:2 val:1
  Jun 22 21:44:14 thinkpad kernel: [304457.696697] BUG: Bad rss-counter state mm:ffff88000566f800 idx:1 val:-1
  Jun 22 21:44:14 thinkpad kernel: [304457.696710] BUG: Bad rss-counter state mm:ffff88000566f800 idx:2 val:1

There is nothing in the logs about fan speed.
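When comparing shutdowns across kernels it may help to pull just the thermal events out of the log. This is a rough sketch that assumes the two message formats shown in the log above; summarize_thermal_events is a made-up name.

```python
import re

# Patterns taken from the kernel messages quoted in this comment.
CRITICAL_RE = re.compile(r"Critical temperature reached \((\d+) C\)")
THROTTLE_RE = re.compile(r"(CPU\d+): Core temperature above threshold")


def summarize_thermal_events(log_text):
    """Collect critical-shutdown temperatures and the set of throttled CPUs."""
    critical = [int(m.group(1)) for m in CRITICAL_RE.finditer(log_text)]
    throttled = sorted({m.group(1) for m in THROTTLE_RE.finditer(log_text)})
    return {"critical_temps": critical, "throttled_cpus": throttled}
```

On the log above this would report two critical events at 128 C and throttling on CPU2 and CPU3, which makes the roughly seven second gap between throttling and shutdown easier to spot when the timestamps are read alongside.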
I have a ThinkPad Edge and it also overheats and shuts down :S

A "solution" would be to set the power_profile to "low" (in my case)... other options are: ‘mid’, ‘high’, ‘default’, ‘auto’

  # echo low > /sys/class/drm/card0/device/power_profile

Without doing anything my CPUs run at 90C; after the power_profile change, they go down to 60C... however this does not seem to be a proper solution; it seems that the problem is with the driver for the graphics card.

For those that are looking for answers, you also might want to take a look here:

  cpupower frequency-info
  cpupower -c all frequency-set -g ondemand
  cpupower -c all frequency-set --min 800MHz --max 2.30GHz

Is there a real solution for this issue? Is anyone looking into getting this problem fixed? Do PCs with NVidia GPUs work OK?
Are you using the nouveau driver ( http://nouveau.freedesktop.org/wiki/) or the one from NVidia (easiest thing is to install them from RPMFusion http://rpmfusion.org/)??? I would recommend against the Nouveau driver, I feel like it can only use the GPU to like 20% as opposed to the linux drivers (and X11) that use closer to 80-90% of the full functionality of the GPU (including low power features). I did have issues with tracker-miner, you may want to install tracker-ui-tools (fedora) sudo yum install tracker-ui-tools and run 'tracker-preferences' (it should create a menu entry for it too) and go through the menus to essentially disable it. (Uninstalling it used to remove some key parts of the desktop so I would recommend against it). Since Fedora 17, I haven't really had problems with it though.... (Lenovo R61 + Fedora 17)
I am not using any non-standard VGA driver (I have an intel chipset on this thinkpad)
Installing thinkfan allowed the fan to go beyond 3800 rpm, up to 4500 rpm. That helped a lot and I don't think I'll hit 100+ °C again. "First time I have hit this problem after more than 1 year with my Thinkpad T410@RHEL 6.2 Kernel 2.6.32-220.19.1.el6.x86_64 I installed gtkRell to monitor CPU temperature and by opening a big program as IBM SM GUI I'm reaching 90 °C"
I'm having the same problem on my Thinkpad X201s running Fedora 17 running 3.4.4-5.fc17.x86_64. I know it's not hardware because I was having hardware problems and IBM just replaced my mobo on warranty (it was over-heating even under windows). Under windows 7 64bit I don't seem to have problems. After getting it back, I fired up Folding@Home on windows for about 10 minutes and it never broke 93°C. Right now with a light load, it's running at 45°C which is a decent temperature. However, when running under a heavy load, it seems to pulse. The fan will run a little, then when it hits about 96-97°C, it goes to full for a few seconds, just long enough to cool things down to about 85°C, and then goes back down to a fairly low speed. During the peak of these pulses, it would hit 100°C or 101°C, but the kernel seemed to think it was ok. Eventually, after doing this for about 20 or 30 minutes, one of the pulses broke the ceiling and Fedora shut me down. In my non-kernel-developer opinion, there seem to be 2 issues. First is that it's not catching the high temperature early enough. I noticed this before IBM changed out my motherboard. The kernel will wait until the cpu is in the high 90's before it spins the fan up and by that point the temperature is already climbing and the fan simply can't catch it. It seems to me as if a more gradual approach would be better and more stable (Not sure if all that pulsing is good on the fan.) The second thing I noticed while running under windows was that it seemed to be using CPU-throttling along with the fan in order to keep things under control. Is linux doing this? Should it be? There's my 2 cents worth. I hope it's useful. Thanks for all your hard work. BTW: I've been seeing these same issues for probably a year now in both Fedora 16 and Fedora 17 with every kernel that's come down the pike, so I don't think it's a kernel version issue.
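To illustrate what a "more gradual approach" could look like if done from user space, here is a sketch of a stepped fan curve with hysteresis, similar in spirit to what daemons such as thinkfan do. Everything in it is an assumption for illustration: the temperature thresholds are invented, not taken from any firmware or from Windows, and writing to /proc/acpi/ibm/fan only works when thinkpad_acpi is loaded with fan_control=1.

```python
# Hypothetical fan curve: (fan level, lower temp, upper temp) in degrees C.
# Neighbouring ranges overlap on purpose; the overlap is the hysteresis that
# stops the fan flapping between two levels around a single threshold.
STEPS = [
    (0, 0, 55),
    (2, 50, 65),
    (4, 60, 75),
    (7, 70, 85),
    ("full-speed", 80, 200),
]

FAN_PATH = "/proc/acpi/ibm/fan"  # writable only with thinkpad_acpi fan_control=1


def next_step(index, temp):
    """Return the new index into STEPS for the current temperature.

    Step up while temp has reached the upper bound of the current step,
    step down while temp has fallen to the lower bound of the current step.
    """
    while index < len(STEPS) - 1 and temp >= STEPS[index][2]:
        index += 1
    while index > 0 and temp <= STEPS[index][1]:
        index -= 1
    return index


def set_level(level, path=FAN_PATH):
    """Write 'level <level>' to the fan file, like the echo commands in this report."""
    with open(path, "w") as fan:
        fan.write(f"level {level}\n")
```

A control loop would read the temperature every few seconds, call next_step, and call set_level only when the index changes. With these made-up numbers the fan is already at level 7 by 75 C instead of waiting for the high 90s, which is the behaviour being asked for above.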
The fan is under the control of the embedded controller, so the kernel isn't the one making the decision to change the fan speed in response to the temperature. Throttling should be handled by the kernel once the temperature reaches any passive trip points.
Our company made HDD encryption mandatory. After using LUKS encryption the use of CPU is a lot higher. So my problem started here. After removing the speed fan limit up to 4500RPM it was ok. However, updating to RHEL 6.3 with newer versions of my mail client and messenger made the problem bigger. When I open 3 big programs the temperature gets crazy. In less than 5 seconds I have my T410 shut down.
Hi Luis, I've been tracking this issue for the last 3-4 years now; my Lenovo R61 always seemed to run hotter under Linux (Ubuntu at the time, and now Fedora) than on Windows 7. I think the reason why the issue still hasn't been fixed is that it is hard to make a direct comparison of the fan speed at different temperatures between the two operating systems. Until this issue gets fixed, you can try to disengage the fan. I documented how to do this in this post: http://forums.fedoraforum.org/showpost.php?p=1559106&postcount=8 I don't think disengaging is the best thing for your fan, but it should let it spin as fast as possible and cool your computer. This issue really needs to get fixed. It is basically a deal breaker. Mark
Hello Mark! Thanks for your comment. Once I updated to 6.3 the problem was so big that I started to look through my company's forums. I found out that the rtvscan process (Symantec Antivirus) was using more than 200% (4 cores) of my processor when opening big programs. After applying an internal fix to the antivirus configuration my problem has gone. I was blaming the HDD encryption, however my computer is as fast and cold as it was before encryption.
(In reply to comment #55)
> The fan is under the control of the embedded controller, so the kernel isn't
> the one making the decision to change the fan speed in response to the
> temperature. Throttling should be handled by the kernel once the temperature
> reaches any passive trip points.

Thanks Matthew, this is useful. Could you give any more detail of how the kernel and the embedded controller interact for both controlling and reporting fan speed and temperature?

I think I see some changes of late. When running the laptop stressed, even though my Thinkpad feels very hot, it hasn't recently emergency shut down, and the reported temperatures are much lower. I've been getting kernel updates (now 3.4.6-2.fc17) and I've also updated my BIOS to 1.39. Anecdotally I thought I could hear the fan bursting up in speed more often after the BIOS update, but I don't think I'm noticing that now, so it could have been coincidental.

Some other observations:

- setting the fan to "full-speed" now really does get the fan up to full speed, which is reported as the same speed as "disengaged", and indeed /proc reports the level as disengaged even when full-speed is issued.
- my fan is "idling" at a higher speed (~35000)
- when monitoring /proc/acpi/ibm/thermal with the laptop under stress, the reported temperature is maintained at a good level (~70C). But if I set the fan to "full-speed", the fan does spin up to its maximum, yet the reported temperature doesn't fall -- if anything it fluctuates up (reaching 80C more regularly than beforehand).

So while the running of the laptop seems better, I wonder if this is actually a temperature reporting problem? Was it ever really hitting 100C which forced the emergency shut down? Is that temperature reported to the kernel via the embedded controller?
Same problem here -- X201 thermal shutdown ("thermal_sys: Critical temperature reached (100 C), shutting down"), without bothering to reach maximum fan speed. In my case the full-CPU-load temperature is marginally around 100 C, so the occurrence of shutdowns depends on ambient conditions. Undusting the fan (removing the keyboard and applying spray) helped a little bit.

There are many hints about the embedded controller in the Linux thinkpad-acpi driver (Documentation/laptops/thinkpad-acpi.txt and drivers/platform/x86/thinkpad_acpi.c).

Note that there are two different types of thermal sensors involved: the in-core sensors of the CPU, and those reported by ACPI (lm_sensors shows both). The latter lag behind the former on my X201. I don't know whether the ACPI readouts use external sensors on the system board (as they did in older ThinkPad models, before Lenovo started slashing costs and quality), or just report some running average of the in-core sensors.

If I get this right:

- Thermal throttling is controlled by the in-core sensor
- Thermal shutdown is controlled by ACPI, presumably according to the ACPI-reported temperature
- Fan speed, in the default (auto) mode, is controlled arbitrarily by the embedded controller; traditionally the ThinkPad firmware takes the same temperatures ACPI reports, and then applies some buggy thresholding and hysteresis algorithm...

BTW, "full-speed" and "disengaged" modes are two names for the same thing (see thinkpad_acpi.c line 8265).
# Mass update to all open bugs. Kernel 3.6.2-1.fc16 has just been pushed to updates. This update is a significant rebase from the previous version. Please retest with this kernel, and let us know if your problem has been fixed. In the event that you have upgraded to a newer release and the bug you reported is still present, please change the version field to the newest release you have encountered the issue with. Before doing so, please ensure you are testing the latest kernel update in that release and attach any new and relevant information you may have gathered. If you are not the original bug reporter and you still experience this bug, please file a new report, as it is possible that you may be seeing a different problem. (Please don't clone this bug, a fresh bug referencing this bug in the comment is sufficient).
Fedora 16 changed to end-of-life (EOL) status on 2013-02-12. Fedora 16 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
Still happening on F19 with 3.8.0-0.rc7.git0.2.fc19.x86_64
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
This issue is still present in current Fedora 19 (thinkpad_acpi: ThinkPad ACPI Extras v0.24 on kernel 3.10.3-300.fc19.x86_64) and leads to overheating and emergency power off when the system is running under full load.

$ more /proc/acpi/ibm/fan
status:         enabled
speed:          3444
level:          auto
commands:       level <level> (<level> is 0-7, auto, disengaged, full-speed)
commands:       enable, disable
commands:       watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))

looks essentially the same whatever the actual temperature may be. Even when the core temperature reached 95 °C, the fan speed would not increase from this level to its maximum possible value of about 4200 rpm, which can be enforced after booting the system with thinkpad_acpi.fan_control=1. The system is a ThinkPad T400 with an energy-efficient P8600 processor which has a TDP of a mere 25 W.
I am regularly hitting this on my T410. For good measure, I completely disabled the Nvidia chip in the BIOS, and that helped somewhat. However, compiling any non-trivial project makes the CPU hit 100° and shut down. For the record, I can manually put the fan in full-speed mode and it goes up to 7000 rpm, whilst the maximum speed it usually reaches is 4500 rpm.
I have the same problem with Fedora 19 and Thinkpad T420. Kernel 3.10.4-300.fc19.x86_64
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.11.1-200.fc19. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 2 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
This issue is still present in current Fedora 21 (rawhide) (thinkpad_acpi: ThinkPad ACPI Extras v0.25 on 3.15.0-0.rc4.git0.1.fc21.x86_64) and leads to overheating when the system is running under full load.

$ more /proc/acpi/ibm/fan
status:         enabled
speed:          3415
level:          auto

looks essentially the same whatever the actual temperature may be, as reported for Fedora 19 in comment 65. Even when the core temperature reached 75 °C, the fan speed would not increase from this level to its maximum possible value of about 4200 rpm, which can be enforced after booting the system with thinkpad_acpi.fan_control=1. The system is a ThinkPad T400 with an energy-efficient P8600 processor which has a TDP of a mere 25 W.
I am re-opening this bug because the problem is still present.

Lenovo T61
Fedora 20
Kernel 3.14.4-200.fc20.i686

When the level in /proc/acpi/ibm/fan is auto:
Speed in the 3000s
Temp in the 80s

When the level in /proc/acpi/ibm/fan is disengaged:
Speed in the 5000s
Temp 55~65

======
What information do you need?
Still an issue for my Lenovo ThinkPad T400 running the current Fedora 21 development tree with kernel 3.16.0-0.rc5.git0.1.fc21.x86_64. System load is less than 75 %, and still the core temperatures reach the unpleasant values listed below, even though the installed P8600 processor is an efficient 25 W model!

$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +87.0°C  (crit = +127.0°C)
temp2:        +85.0°C  (crit = +100.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3434 RPM
temp1:        +87.0°C
temp2:        +59.0°C
temp3:        +40.0°C
temp4:        +87.0°C
temp5:        +43.0°C
temp6:            N/A
temp7:        +38.0°C
temp8:            N/A
temp9:        +46.0°C
temp10:       +58.0°C
temp11:       +57.0°C
temp12:           N/A
temp13:           N/A
temp14:           N/A
temp15:           N/A
temp16:           N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +85.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:       +84.0°C  (high = +105.0°C, crit = +105.0°C)
I still have the issue on my trusty T60, and it has happened to me at random about 4 times today. I tried to trigger the problem in a controlled way using "stress" but couldn't get the temperature above 75. It seems to shut down whenever it wants to.

Nov 16 19:59:24 thinkpad kernel: thermal thermal_zone0: critical temperature reached(128 C),shutting down

By the way, 128C? Seriously? Like in "signed byte maximum value"? It seems to be more of a glitch than anything else. The heatsink is clean, and the fan is brand new. This shouldn't happen.

Here's more info:

[jerther@thinkpad ~]$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +52.0°C  (crit = +127.0°C)
temp2:        +50.0°C  (crit = +99.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3843 RPM
temp1:        +52.0°C
temp2:        +41.0°C
temp3:        +34.0°C
temp4:       +107.0°C
temp5:        +32.0°C
temp6:            N/A
temp7:        +30.0°C
temp8:            N/A
temp9:        +38.0°C
temp10:       +55.0°C
temp11:       +52.0°C
temp12:           N/A
temp13:           N/A
temp14:           N/A
temp15:           N/A
temp16:           N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +50.0°C  (crit = +100.0°C)
Core 1:       +50.0°C  (crit = +100.0°C)

[jerther@thinkpad ~]$ uname -a
Linux thinkpad 3.17.2-200.fc20.i686 #1 SMP Tue Nov 4 18:28:00 UTC 2014 i686 i686 i386 GNU/Linux

[jerther@thinkpad ~]$ cat /etc/fedora-release
Fedora release 20 (Heisenbug)
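A note on the 128 C figure: the largest value a signed byte can hold is 127, while 128 is the bit pattern 0x80, which reads as -128 when interpreted as signed. The thinkpad-acpi documentation says that readings from unavailable sensors are returned as -128, so one possibility (an assumption on my part, not something confirmed for the ACPI thermal zone in this log) is that a "no reading" byte was taken as an unsigned temperature. The sketch below only shows the two interpretations of the same byte; the helper names and the plausibility range are hypothetical.

```python
import struct


def decode_byte(raw):
    """Return (unsigned, signed) interpretations of a single byte."""
    unsigned_value = struct.unpack("B", raw)[0]
    signed_value = struct.unpack("b", raw)[0]
    return unsigned_value, signed_value


def looks_plausible(temp_c, low=0, high=120):
    """Crude sanity check; the 0..120 range is an arbitrary assumption."""
    return low <= temp_c <= high


unsigned_value, signed_value = decode_byte(bytes([0x80]))
print(unsigned_value, signed_value)  # prints: 128 -128
print(looks_plausible(unsigned_value), looks_plausible(signed_value))
```

If the shutdowns really come from such a value, they would be unrelated to the actual temperature, which would fit with "stress" not being able to reproduce them.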
Well, if you look at my previous comment, you'll see something I only noticed a bit later: the temp4 value, which I believe is the GPU temperature. It's quite high. Too high. Could it be THIS VALUE that rises up to 128 and causes the shutdown? That would make more sense.

Anyway, I remembered the lousy soft thermal pad that fills the gap between the GPU and the heat sink unit. I guess that, just like ordinary paste, this sort of thermal paste/pad has to be replaced at some point. So I opened the laptop and replaced the soft pad on the GPU with a solid, monolithic piece of aluminium I had lying around, coated it with some good classic thermal paste, reassembled the whole thing, booted Fedora and bam: temp4 is down to 70. (By the way, I believe a folded piece of aluminium foil would not cut it, since every ply makes a thermal bridge. Google for more info.)

I've also noticed a significant increase in overall performance, probably due to some throttling not happening anymore, and no forced shutdown has happened since.

One question though: has the fan speed level always been managed like this? I mean, was the disengaged level ever possible without custom settings?

It could be that our laptops are almost the same age and so the thermal paste went bad on all of them at the same time, when a certain release of Fedora came out... Or maybe I just watch too many movies... Oh well. Cheers.
This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
(In reply to Joachim Frieben from comment #72)
A ThinkPad T400 with a low-power CPU (P8600, 25 W) running the latest Fedora rawhide nodebug kernel 4.7.0-0.rc7.git4.2.fc25 shows exactly the same behaviour as before: high temperature under load with virtually no effect on the fan speed.

::::::::::::::
sensors-hi.txt
::::::::::::::
acpitz-virtual-0
Adapter: Virtual device
temp1:        +84.0°C  (crit = +127.0°C)
temp2:        +81.0°C  (crit = +100.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3426 RPM
temp1:        +84.0°C
temp2:        +52.0°C
temp3:        +40.0°C
temp4:        +79.0°C
temp5:        +40.0°C
temp6:            N/A
temp7:        +37.0°C
temp8:            N/A
temp9:        +42.0°C
temp10:       +53.0°C
temp11:       +56.0°C
temp12:           N/A
temp13:           N/A
temp14:           N/A
temp15:           N/A
temp16:           N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +81.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:       +80.0°C  (high = +105.0°C, crit = +105.0°C)

::::::::::::::
sensors-lo.txt
::::::::::::::
acpitz-virtual-0
Adapter: Virtual device
temp1:        +50.0°C  (crit = +127.0°C)
temp2:        +49.0°C  (crit = +100.0°C)

thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3425 RPM
temp1:        +50.0°C
temp2:        +45.0°C
temp3:        +36.0°C
temp4:        +66.0°C
temp5:        +29.0°C
temp6:            N/A
temp7:        +28.0°C
temp8:            N/A
temp9:        +37.0°C
temp10:       +47.0°C
temp11:       +47.0°C
temp12:           N/A
temp13:           N/A
temp14:           N/A
temp15:           N/A
temp16:           N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +49.0°C  (high = +105.0°C, crit = +105.0°C)
Core 1:       +50.0°C  (high = +105.0°C, crit = +105.0°C)
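To put a number on "no effect on the fan speed", the two dumps can be compared mechanically. The following is a small sketch, assuming the usual `sensors` text layout (a chip name line, then "label: value" lines); the HI and LO strings are excerpts of the figures above, and parse_sensors/compare are names chosen for this example only.

```python
import re

# Excerpts from sensors-hi.txt and sensors-lo.txt in this comment.
HI = """thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3426 RPM
temp1:        +84.0°C
temp6:            N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +81.0°C  (high = +105.0°C, crit = +105.0°C)
"""

LO = """thinkpad-isa-0000
Adapter: ISA adapter
fan1:        3425 RPM
temp1:        +50.0°C
temp6:            N/A

coretemp-isa-0000
Adapter: ISA adapter
Core 0:       +49.0°C  (high = +105.0°C, crit = +105.0°C)
"""

VALUE_RE = re.compile(r"\s*\+?(-?\d+(?:\.\d+)?)\s*(?:°C|RPM)")


def parse_sensors(text):
    """Map 'chip/label' to a float for every temperature or fan line."""
    readings = {}
    chip = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ":" not in line:
            chip = line  # chip header such as 'thinkpad-isa-0000'
            continue
        label, _, rest = line.partition(":")
        match = VALUE_RE.match(rest)
        # 'Adapter: ...' and 'N/A' lines carry no number and are skipped.
        if chip and match:
            readings[f"{chip}/{label.strip()}"] = float(match.group(1))
    return readings


def compare(lo, hi):
    """Return {key: hi - lo} for readings present in both dumps."""
    return {key: hi[key] - lo[key] for key in lo.keys() & hi.keys()}


delta = compare(parse_sensors(LO), parse_sensors(HI))
for key in sorted(delta):
    print(f"{key}: {delta[key]:+.1f}")
```

For these excerpts temp1 rises by 34 °C and Core 0 by 32 °C between the idle and loaded dumps, while fan1 changes by a single RPM.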
Please open a new bug. Don't reopen a bug report for completely different hardware that has been closed for a year and a half.
(In reply to Josh Boyer from comment #78) The notebook for which I have reported the current status is actually the very same for which I had posted comment 65 ff. back in 2013, and the issue has not changed a bit. Fan control problems related to ThinkPads running the Linux kernel are a generic IBM ACPI issue which affects (at least) most R/T/X models built around 2010 including T61, T400, X201, etc.