Description of problem:

Various O/S commands and dmesg(1M) output indicate problems with high CPU temperatures and CPU speeds. This is urgently concerning. I can't run the laptop this way. Please help.

=============
THE SETUP:
=============

- O/S: Fedora 19. Kernel version: 3.12.6-200.fc19.x86_64
- The system is a "Clevo P170HM" Laptop with 8-Core Intel CPU.
- I only SSH into the laptop to do things, so no windowing system is ever started (i.e. the laptop display always shows the TTY console login prompt, and the lid is always closed). Again, I only ssh into it remotely. No attached monitor either.
- The laptop is *very* well ventilated (lots of room to breathe) and also sits on top of a dedicated external cooling fan I purchased for it.

============================
THE PROBLEMS / INDICATORS:
============================

############################################
(1) *Relatively* high temperatures reported *immediately* after booting up. And they vary non-trivially on consecutive probes. Here are two consecutive runs of the sensors(1) command:

user@linux$ sensors -f
acpitz-virtual-0
Adapter: Virtual device
temp1:         +107.6°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 1:         +96.8°F  (high = +186.8°F, crit = +212.0°F)
Core 2:         +95.0°F  (high = +186.8°F, crit = +212.0°F)
Core 3:         +87.8°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +107.6°F

===

user@linux$ sensors -f
acpitz-virtual-0
Adapter: Virtual device
temp1:         +107.6°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +105.8°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +105.8°F  (high = +186.8°F, crit = +212.0°F)  <==== Example.
Core 2:        +103.0°F  (high = +186.8°F, crit = +212.0°F)  <==== Example.
Core 3:         +87.8°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +109.4°F
############################################

############################################
(2) Each core is running at (scaled to) a different CPU frequency. I'm not certain, but I don't believe this is correct.

user@linux$ cat /proc/cpuinfo | grep MHz
cpu MHz : 894.628
cpu MHz : 1188.867
cpu MHz : 1148.828
cpu MHz : 1127.539
cpu MHz : 1259.082
cpu MHz : 1216.601
cpu MHz : 1270.898
cpu MHz : 1248.535
############################################

############################################
(3) As the O/S is booting up, I see the following messages scroll by, and they also appear constantly in dmesg(1M) and "/var/log/messages" output. Why is this being reported when I have only just turned the laptop on?

user@plinux$ sudo grep -i temperature /var/log/messages
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652909] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652910] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652913] CPU4: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652960] CPU5: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652961] CPU1: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652962] CPU6: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652963] CPU2: Package temperature above threshold, cpu clock throttled (total events = 10)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652965] CPU7: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.652966] CPU3: Package temperature above threshold, cpu clock throttled (total events = 11)
Jan 8 11:08:12 p170hm-nic kernel: [ 967.653115] CPU0: Package temperature above threshold, cpu clock throttled (total events = 11)
############################################

############################################
(4) Without doing anything substantial, the CPU temperature skyrockets. Below, for example, I run the sensors(1M) command, then start a lightweight LXC container (running FC19 and the same kernel as the host laptop), then run sensors(1M) again immediately after the container finishes booting. Watch the CPU temperatures skyrocket. But why? The LXC O/S is not even configured to do anything (it was just installed with minimal packages). After I start the container, the internal fans *immediately* spin up to high speeds. And it's not just when I run containers; it happens when I do other things, too (for example, when I start "dropbox").

user@linux$ sensors -f        <--- laptop is idle here. Only my SSH session.
acpitz-virtual-0
Adapter: Virtual device
temp1:         +107.6°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +105.8°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +105.8°F  (high = +186.8°F, crit = +212.0°F)
Core 2:        +105.8°F  (high = +186.8°F, crit = +212.0°F)
Core 3:         +86.0°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +107.6°F

user@plinux$ sudo lxc-start -n vps0        <--- start a lightweight LXC container.
[ ... snip ... ]
Fedora release 19 (Schrödinger’s Cat)
Kernel 3.12.6-200.fc19.x86_64 on an x86_64 (console)

vps1 login:

NOTE: Again, the container does nothing. It's a minimal install and hasn't been set up to do anything at all. Yet one second after starting it, sensors(1M) shows temperatures that completely SKYROCKET and REMAIN THERE, and the fans immediately spin up.

user@plinux$ sensors -f
acpitz-virtual-0
Adapter: Virtual device
temp1:         +194.0°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +194.0°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +150.8°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +145.4°F  (high = +186.8°F, crit = +212.0°F)
Core 2:        +156.2°F  (high = +186.8°F, crit = +212.0°F)
Core 3:        +194.0°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +194.0°F

user@plinux$ sensors -f        <--- Again 5 secs later. I'm not even doing anything.
acpitz-virtual-0
Adapter: Virtual device
temp1:         +206.6°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +206.6°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +181.4°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +206.6°F  (high = +186.8°F, crit = +212.0°F)
Core 2:        +163.4°F  (high = +186.8°F, crit = +212.0°F)
Core 3:        +131.0°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +206.6°F

Then, when I stop the container, the temperatures go down significantly right away:

user@plinux$ lxc-shutdown -n vps0
user@plinux$ sensors -f
acpitz-virtual-0
Adapter: Virtual device
temp1:         +123.8°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +123.8°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +122.0°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +122.0°F  (high = +186.8°F, crit = +212.0°F)
Core 2:        +122.0°F  (high = +186.8°F, crit = +212.0°F)
Core 3:        +122.0°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +123.8°F
############################################

Full logs are attached. Again, I wanted to try upgrading to FC20, but cannot due to this (unrelated) Bug ID: 1048404.

Something is very wrong here. Please help. I'm concerned about ruining the laptop and reducing its MTBF.

Status: I can't run it like this. Thank you!
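Editor's aside: the readings above are in Fahrenheit because of the -f flag, which makes them harder to compare with the Celsius figures most CPU documentation uses. For reference, the highest reading reported, +206.6°F, is about 97°C, against a critical threshold of +212.0°F (100°C). The sketch below is only an illustration: it converts a single reading and condenses the throttle messages from item (3) into one line per CPU. The function names f_to_c and throttle_summary are made up for this example, and the parsing assumes the exact message wording shown in the log excerpt.

```shell
#!/bin/sh
# Hypothetical helpers for reading the output shown in this report.

# Convert one Fahrenheit reading (as printed by `sensors -f`) to Celsius.
f_to_c() {
    awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

# Read kernel log lines on stdin and print, for every CPU and message kind
# (Core or Package), the highest "total events" counter that was logged.
# Assumes the message format seen above:
#   CPUn: <Kind> temperature above threshold, ... (total events = N)
throttle_summary() {
    awk '
        /temperature above threshold/ {
            cpu = ""; kind = ""
            for (i = 1; i < NF; i++) {
                if ($i ~ /^CPU[0-9]+:$/) {
                    cpu = $i
                    sub(/:$/, "", cpu)
                    kind = $(i + 1)
                    break
                }
            }
            if (cpu == "") next
            n = $NF                 # last field looks like "11)"
            sub(/\)$/, "", n)
            key = cpu " " kind
            if (n + 0 > max[key]) max[key] = n + 0
        }
        END { for (k in max) print k, max[k] }
    ' | sort
}
```

Possible usage: `f_to_c 206.6` for a single reading, or `sudo grep -i temperature /var/log/messages | throttle_summary` to see which CPUs logged throttle events and how many.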
Created attachment 847264: dmesg output, followed by /proc/cpuinfo output ...
Additional information...

The reason I'm trying LXC containers is that I currently run multiple fully-virtualized KVM virtual machines (around 4 to 6 CentOS VMs, depending on the development work I'm doing). Again, there are no graphics running on the host Fedora-19 laptop or the CentOS6 KVMs... just daemons: ssh, Hadoop, Storm (by twitter), Cassandra, etc. Not all at once, but things like that.

But since this is my own environment, I don't require all the security and isolation that KVM VMs provide. All I need are separate IPs for each guest, so using LXC containers with different IPs can be more lightweight and efficient. But I ran into the CPU TEMP / CPU CLOCK SPEED / FAN SPEED issue previously described. And that was just running one (qty. 1) LXC container.

Ironically, when I run two (qty. 2) full KVMs the issue doesn't appear. Have a look:

root@p170hm# virsh start centOS6-vm0
root@p170hm# virsh start centOS6-vm1
root@p170hm# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     centOS6-vm0                    running    <--- Here
 3     centOS6-vm1                    running    <--- and Here
 -     centOS6-vm2                    shut off
 -     centOS6-vm3                    shut off
 -     centOS6-vm4                    shut off
 -     centOS6-vm5                    shut off
 -     centOS6-vm6                    shut off
 -     centOS6-vm7                    shut off

root@p170hm# sensors -f        <--- Reasonable temps in this scenario. No spikes. And fans aren't blowing hard as they did with LXC.
======================================
acpitz-virtual-0
Adapter: Virtual device
temp1:         +107.6°F  (crit = +309.2°F)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 0:        +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 1:        +104.0°F  (high = +186.8°F, crit = +212.0°F)
Core 2:        +107.6°F  (high = +186.8°F, crit = +212.0°F)
Core 3:         +91.4°F  (high = +186.8°F, crit = +212.0°F)

pkg-temp-0-virtual-0
Adapter: Virtual device
temp1:         +109.4°F
======================================

IMPORTANT NOTE: This is just more information for sleuths. It doesn't explain why (as shown previously) the CPU core speeds differ (their speeds are never in sync, even in this KVM scenario), or why running "dropbox" (in CLI mode) or other lightweight things makes the CPU temperatures skyrocket within a second or two.
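Editor's aside on the differing core speeds: with per-CPU frequency scaling enabled, each logical CPU can report a different current frequency at any instant, so values that are "never in sync" are not by themselves evidence of a fault. The sketch below is a hedged illustration, not a diagnosis. The function names mhz_range and show_governors are invented for this example; the sysfs path is the standard cpufreq location, but whether it exists depends on the scaling driver in use.

```shell
#!/bin/sh
# Hypothetical helpers for looking at per-CPU frequency scaling.

# Read /proc/cpuinfo-style text on stdin and print how many "cpu MHz"
# entries were seen together with the lowest and highest value.
mhz_range() {
    awk -F: '
        /^cpu MHz/ {
            v = $2 + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END { if (n) printf "cpus=%d min=%.3f max=%.3f\n", n, min, max }
    '
}

# Print the active scaling governor of every CPU that exposes one.
# Prints nothing when the cpufreq sysfs files are not present.
show_governors() {
    for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -r "$f" ]; then
            printf '%s %s\n' "$f" "$(cat "$f")"
        fi
    done
    return 0
}
```

Possible usage: `mhz_range < /proc/cpuinfo` a few times in a row, then `show_governors` to see which policy is choosing those frequencies.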
See also bug #924570
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs. Fedora 19 has now been rebased to 3.13.5-100.fc19. Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.
Hello Again:

RE-OPENING THIS BUG.

This bug was closed citing insufficient data, but I didn't get an alert email requesting it.

This bug, originally opened for Fedora 19 and closed without resolution, is being re-opened for Fedora-20 (since that is the version I'm now running), with the latest yum updates and this kernel:

Kernel: 3.15.7-200.fc20.x86_64

Several months ago, I dropped my plan to use LXCs and reverted to KVMs because of this issue, hoping that it would be resolved by Fedora-20, but it isn't. This needs looking into because LXCs are much more efficient to use, maintain, deploy, and run.

This issue happens consistently on all of my Fedora-20 based computers (just as it did when they were on FC19)... (a) the original Clevo P170HM laptop with which this bug was filed, and (b) now on a heavy-duty "Digital Storm" model computer with 64GB RAM, SSD, RAID, i7 CPU (with 6 cores, two threads each). And a third computer as well.

============================================================
It's the same exact issue as well documented above:
============================================================
(1) Starting a basic LXC container, which is not configured to do anything at all, *immediately* (and without delay) raises the temperature of one of the cores *substantially*.
(2) Starting a second LXC container (also not configured to do anything) does the same as (1), but on a different core.
============================================================

===========================================================
Demonstration Output:
===========================================================

dstorm$ # No LXCs running.
dstorm$ sensors -f        (All is normal).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +77.0°F  (high = +176.0°F, crit = +194.0°F)
Core 0:         +71.6°F  (high = +176.0°F, crit = +194.0°F)
Core 1:         +73.4°F  (high = +176.0°F, crit = +194.0°F)
Core 2:         +75.2°F  (high = +176.0°F, crit = +194.0°F)
Core 3:         +69.8°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +73.4°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +73.4°F  (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f        (All is normal).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +80.6°F  (high = +176.0°F, crit = +194.0°F)
Core 0:         +73.4°F  (high = +176.0°F, crit = +194.0°F)
Core 1:         +73.4°F  (high = +176.0°F, crit = +194.0°F)
Core 2:         +75.2°F  (high = +176.0°F, crit = +194.0°F)
Core 3:         +66.2°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +71.6°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +73.4°F  (high = +176.0°F, crit = +194.0°F)

dstorm$ sudo lxc-start -d -n vps00        (Start a container).
dstorm$ sensors -f        (**Immediate 27-degree jump for Core-1**).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +100.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 0:         +84.2°F  (high = +176.0°F, crit = +194.0°F)
Core 1:        +100.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 2:         +82.4°F  (high = +176.0°F, crit = +194.0°F)
Core 3:         +71.6°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +75.2°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +80.6°F  (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +100.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 0:         +86.0°F  (high = +176.0°F, crit = +194.0°F)
Core 1:        +100.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 2:         +84.2°F  (high = +176.0°F, crit = +194.0°F)
Core 3:         +71.6°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +77.0°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +80.6°F  (high = +176.0°F, crit = +194.0°F)

dstorm$ sudo lxc-start -d -n vps01        (Start a second container).
dstorm$ sensors -f        (Temperatures are even higher now).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +109.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 0:         +89.6°F  (high = +176.0°F, crit = +194.0°F)
Core 1:        +111.2°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 2:        +107.6°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 3:         +75.2°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +80.6°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +84.2°F  (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +111.2°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 0:         +91.4°F  (high = +176.0°F, crit = +194.0°F)
Core 1:        +109.4°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 2:        +111.2°F  (high = +176.0°F, crit = +194.0°F)  <-- spike
Core 3:         +75.2°F  (high = +176.0°F, crit = +194.0°F)
Core 4:         +78.8°F  (high = +176.0°F, crit = +194.0°F)
Core 5:         +84.2°F  (high = +176.0°F, crit = +194.0°F)

=====================

At this point the fans are noticeably faster, and the temperature LED read-out on the "Digital Storm" computer reads ~410 degrees, where it normally reads ~320.

Here is what each LXC container is doing (not much); and btw, they are also running Fedora-20 with the same kernel:

dstorm$ lxc-ps --lxc
CONTAINER    PID  TTY        TIME  CMD
vps00       9616  ?      00:00:00  systemd
vps00       9646  ?      00:35:36  systemd-journal
vps00       9654  ?      00:00:00  systemd-udevd
vps00       9976  ?      00:00:00  firewalld
vps00       9979  ?      00:00:00  rsyslogd
vps00       9983  ?      00:00:00  dbus-daemon
vps00       9986  ?      00:00:00  systemd-logind
vps00       9993  pts/4  00:00:00  agetty
vps00       9995  pts/2  00:00:00  agetty
vps00       9998  pts/5  00:00:00  agetty
vps00       9999  pts/3  00:00:00  agetty
vps00      10006  pts/6  00:00:00  agetty
vps00      10012  ?      00:00:00  sshd
vps01      10754  ?      00:00:00  systemd
vps01      10784  ?      00:35:05  systemd-journal
vps01      10789  ?      00:00:00  systemd-udevd
vps01      11204  ?      00:00:00  firewalld
vps01      11206  ?      00:00:00  rsyslogd
vps01      11207  ?      00:00:00  dbus-daemon
vps01      11211  ?      00:00:00  systemd-logind
vps01      11232  pts/10 00:00:00  agetty
vps01      11233  pts/8  00:00:00  agetty
vps01      11234  pts/11 00:00:00  agetty
vps01      11235  pts/9  00:00:00  agetty
vps01      11236  pts/12 00:00:00  agetty
vps01      11264  ?      00:00:00  sshd
vps00      11908  ?      00:00:00  systemd
vps00      11910  ?      00:00:00  (sd-pam)
vps01      11965  ?      00:00:00  systemd
vps01      11967  ?      00:00:00  (sd-pam)

[Final side note]: Although this will not solve the issue (it will only shift the issue around), I plan on setting the affinity of each LXC instance to a different core. Above, both instances share Core-1 and Core-2. I will try to change this in the LXC config file for each instance. But again, this is just to improve how CPU load is distributed. The temperature issue is still a problem.

Try LXC and you will see the issue. It's easily reproducible.

Can we continue this and find a resolution? I'm concerned about the impact these warmer temperatures will have on the life of the computer. (please and thank you). :)
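Editor's aside: in the lxc-ps listing above, every process shows 00:00:00 of accumulated CPU time except systemd-journal, which shows 00:35:36 in vps00 and 00:35:05 in vps01. That is only a hint about where the CPU time went, not a confirmed cause. The sketch below filters a ps-style listing for entries at or above a CPU-time threshold. The function name busy_procs is made up for this example, and it assumes the TIME column uses the HH:MM:SS form shown above (longer-running processes that ps prints as D-HH:MM:SS are not handled).

```shell
#!/bin/sh
# Hypothetical helper: list processes whose accumulated CPU time is at
# least a given number of seconds (default 60), busiest first.
# Input: ps-style text on stdin containing one HH:MM:SS field per line.
busy_procs() {
    awk -v min="${1:-60}" '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+:[0-9][0-9]:[0-9][0-9]$/) {
                    split($i, t, ":")
                    secs = t[1] * 3600 + t[2] * 60 + t[3]
                    # Prefix the line with its CPU time in seconds.
                    if (secs >= min) print secs, $0
                    break
                }
            }
        }
    ' | sort -rn
}
```

Possible usage: `lxc-ps --lxc | busy_procs 60`, or `ps -eo pid,time,comm | busy_procs 60` on the host. A one-shot `top -b -n 1` taken while the fans are running high would show which of those processes is consuming CPU at that moment.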
Hello... Any ideas on this? Thanks!
Can someone please look at this/fix this? I need to use containers (instead of KVM) but can't because of this temperature & FAN issue -- which shouldn't be happening just because I spin up a do-nothing/idle LXC. If this can't be fixed, then LXCs may as well not exist, as their adverse effect on CPU temps and fan speeds (fully described above) is a show-stopper. Please!
Can someone help with this? Anyone please. Thanks.
Closed by original submitter (nmvega). No one paid attention to this urgent issue -- which prohibits using Fedora as a host to LXC containers -- despite repeated requests on *this* bug ID. Opening a new ticket for this same issue.