Hello:

This bug was originally opened for Fedora 19 and closed without resolution (when an entire batch of bugs was closed with the request that 'if the problem persisted, please re-open it').

This problem is *100%* reproducible no matter what the computer type (from high-end laptops to super high-end servers), and across all YUM-updated versions of Fedora 19 and 20, including all kernels that have been released for them.

This issue is *urgent* because it has -- for a long time now -- prohibited Fedora from being used as a data center HOST to guest LXC Containers. In fact, many months ago we abandoned our intention to use more efficient LXCs and reverted to heavier KVMs because of this issue, hoping that by Fedora-20 (and certainly by now) this would be resolved; but it isn't.

============================================================
Again the issue (as originally stated):
============================================================

(1) Starting a basic LXC container, which is not configured to do anything at all, *immediately* (and without delay) raises the temperature of one of the cores *substantially*.

(2) Starting a second LXC container (also not configured to do anything) does the same as (1), but on a different core (i.e. the one that the second LXC uses).

(3) and so on ...

============================================================
Demonstration Output:
============================================================

dstorm$ # No LXCs running.
dstorm$ sensors -f (All is normal).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +77.0°F (high = +176.0°F, crit = +194.0°F)
Core 0: +71.6°F (high = +176.0°F, crit = +194.0°F)
Core 1: +73.4°F (high = +176.0°F, crit = +194.0°F)
Core 2: +75.2°F (high = +176.0°F, crit = +194.0°F)
Core 3: +69.8°F (high = +176.0°F, crit = +194.0°F)
Core 4: +73.4°F (high = +176.0°F, crit = +194.0°F)
Core 5: +73.4°F (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f (All is normal).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +80.6°F (high = +176.0°F, crit = +194.0°F)
Core 0: +73.4°F (high = +176.0°F, crit = +194.0°F)
Core 1: +73.4°F (high = +176.0°F, crit = +194.0°F)
Core 2: +75.2°F (high = +176.0°F, crit = +194.0°F)
Core 3: +66.2°F (high = +176.0°F, crit = +194.0°F)
Core 4: +71.6°F (high = +176.0°F, crit = +194.0°F)
Core 5: +73.4°F (high = +176.0°F, crit = +194.0°F)

dstorm$ sudo lxc-start -d -n vps00 (Start a container).
dstorm$ sensors -f (**Immediate 27-degree jump for Core-1**).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +100.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 0: +84.2°F (high = +176.0°F, crit = +194.0°F)
Core 1: +100.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 2: +82.4°F (high = +176.0°F, crit = +194.0°F)
Core 3: +71.6°F (high = +176.0°F, crit = +194.0°F)
Core 4: +75.2°F (high = +176.0°F, crit = +194.0°F)
Core 5: +80.6°F (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +100.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 0: +86.0°F (high = +176.0°F, crit = +194.0°F)
Core 1: +100.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 2: +84.2°F (high = +176.0°F, crit = +194.0°F)
Core 3: +71.6°F (high = +176.0°F, crit = +194.0°F)
Core 4: +77.0°F (high = +176.0°F, crit = +194.0°F)
Core 5: +80.6°F (high = +176.0°F, crit = +194.0°F)

dstorm$ sudo lxc-start -d -n vps01 (Start a second container).
dstorm$ sensors -f (Temperatures are even higher now).
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +109.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 0: +89.6°F (high = +176.0°F, crit = +194.0°F)
Core 1: +111.2°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 2: +107.6°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 3: +75.2°F (high = +176.0°F, crit = +194.0°F)
Core 4: +80.6°F (high = +176.0°F, crit = +194.0°F)
Core 5: +84.2°F (high = +176.0°F, crit = +194.0°F)

dstorm$ sensors -f
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +111.2°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 0: +91.4°F (high = +176.0°F, crit = +194.0°F)
Core 1: +109.4°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 2: +111.2°F (high = +176.0°F, crit = +194.0°F) <-- spike
Core 3: +75.2°F (high = +176.0°F, crit = +194.0°F)
Core 4: +78.8°F (high = +176.0°F, crit = +194.0°F)
Core 5: +84.2°F (high = +176.0°F, crit = +194.0°F)

=====================

At this point the fans are noticeably faster, and the temperature LED read-out on the "Digital Storm" computer reads ~410 degrees, where it normally reads ~320.

Here is what each LXC container is doing (not much); and btw, they are also running Fedora-20 with the same kernel:

dstorm$ lxc-ps --lxc
CONTAINER    PID  TTY     TIME      CMD
vps00       9616  ?       00:00:00  systemd
vps00       9646  ?       00:35:36  systemd-journal
vps00       9654  ?       00:00:00  systemd-udevd
vps00       9976  ?       00:00:00  firewalld
vps00       9979  ?       00:00:00  rsyslogd
vps00       9983  ?       00:00:00  dbus-daemon
vps00       9986  ?       00:00:00  systemd-logind
vps00       9993  pts/4   00:00:00  agetty
vps00       9995  pts/2   00:00:00  agetty
vps00       9998  pts/5   00:00:00  agetty
vps00       9999  pts/3   00:00:00  agetty
vps00      10006  pts/6   00:00:00  agetty
vps00      10012  ?       00:00:00  sshd
vps01      10754  ?       00:00:00  systemd
vps01      10784  ?       00:35:05  systemd-journal
vps01      10789  ?       00:00:00  systemd-udevd
vps01      11204  ?       00:00:00  firewalld
vps01      11206  ?       00:00:00  rsyslogd
vps01      11207  ?       00:00:00  dbus-daemon
vps01      11211  ?       00:00:00  systemd-logind
vps01      11232  pts/10  00:00:00  agetty
vps01      11233  pts/8   00:00:00  agetty
vps01      11234  pts/11  00:00:00  agetty
vps01      11235  pts/9   00:00:00  agetty
vps01      11236  pts/12  00:00:00  agetty
vps01      11264  ?       00:00:00  sshd
vps00      11908  ?       00:00:00  systemd
vps00      11910  ?       00:00:00  (sd-pam)
vps01      11965  ?       00:00:00  systemd
vps01      11967  ?       00:00:00  (sd-pam)

Try to launch an LXC and you will see the issue. It's easily reproducible. Can team Fedora help us with this? (please and thank you). I am happy to work to provide additional information, although, again, you will be able to reproduce this problem on your computers (laptops even), too.

Again, the impact of this long-running issue is that we are not able to use Fedora as a HOST to Fedora LXC Container Guests, which has wide implications: we would have to rebuild (and test, and operate) servers on a different HOST O/S distribution in order to be able to safely use LXCs (not trivial). From all indications, Fedora 19 & 20 (with any kernel) will burn out our systems; therefore this is urgent.

Thank you again!
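In a listing this long, the few entries with non-zero accumulated CPU time are easy to overlook. A minimal sketch for filtering them out, assuming input in the column layout shown above (CONTAINER PID TTY TIME CMD); the function name `busy_procs` and the default threshold of 60 seconds are made up for illustration:

```shell
#!/bin/sh
# busy_procs [MIN_SECONDS]
# Reads a process listing on stdin in the layout
#   CONTAINER PID TTY TIME CMD
# and prints "container pid command seconds" for every process whose
# accumulated CPU time is at least MIN_SECONDS (default: 60).
busy_procs() {
    min_seconds="${1:-60}"
    awk -v min="$min_seconds" '
        $1 == "CONTAINER" { next }            # skip the header row
        NF < 5            { next }            # skip blank or short lines
        {
            days = 0
            hms = $4
            # ps-style TIME may carry a day prefix, e.g. 1-02:03:04
            if (index(hms, "-") > 0) {
                split(hms, d, "-")
                days = d[1]
                hms = d[2]
            }
            if (split(hms, t, ":") != 3) next # not HH:MM:SS, ignore
            secs = days * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
            if (secs >= min) print $1, $2, $5, secs
        }
    '
}
```

It would be used as `lxc-ps --lxc | busy_procs 60`. Run against the listing above, the only processes it reports are the two systemd-journal entries (PIDs 9646 and 10784).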
Just for completeness, can you please specify the LXC package versions you are using?

And I might be missing something, but is 110°F (43°C) really to be considered a high CPU temperature? Admittedly, that's a substantial increase compared to 70°F (21°C). However, the cores in my workstation here (Core2 Quad Q9550, an old but low-TDP CPU) are never colder than 34°C (93°F).

Anyway, I can contact upstream about this, although I am not sure they can do anything about it, as LXC is the userspace part. You filed bug 1050106 against the kernel component, which is in principle the right thing to do...
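For reference, the Fahrenheit readings quoted in this report can be converted with C = (F - 32) * 5 / 9. A minimal sketch; the helper name `f_to_c` is made up for illustration:

```shell
#!/bin/sh
# f_to_c FAHRENHEIT
# Prints the temperature in degrees Celsius, rounded to one decimal place.
f_to_c() {
    awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

f_to_c 110   # prints 43.3
f_to_c 194   # prints 90.0 (the "crit" threshold in the sensors output)
```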
Hello Thomas:

(1) Here are the LXC RPMs (the latest of them), although as mentioned the issue has persisted across many iterations of RPMs (lxc, kernels, etc).

user@linux$ rpm -qa | egrep 'lxc'
lxc-doc-1.0.5-5.fc20.noarch
lxc-templates-1.0.5-5.fc20.x86_64
libvirt-daemon-driver-lxc-1.1.3.5-2.fc20.x86_64
python3-lxc-1.0.5-5.fc20.x86_64
clxclient-3.6.1-9.fc20.x86_64
lxc-1.0.5-5.fc20.x86_64
lxc-extra-1.0.5-5.fc20.x86_64
lxc-devel-1.0.5-5.fc20.x86_64
lua-lxc-1.0.5-5.fc20.x86_64
lxc-libs-1.0.5-5.fc20.x86_64

(2) No one paid attention to the former bug, though I pleaded. Also, on this bug, the dropdown did not let me select 'kernel'. I think a collaborative effort (LXC and kernel) is optimal.

(3) Every computer (a wide variety of them) on which we tried to run even just one or two LXCs jumped drastically in temperature (as you see) -- for an LXC or two that are essentially idle. Correspondingly, all of those computers' fans think there is a problem because, in each case, they speed up and get noticeably loud.

Take this well-equipped server:
- 64GB RAM @ 2600MHz
- i7 x 3GHz x 12 Cores
- 2TB SSD (RAID-0 H/W stripe of 1TB pair)
No monitor

Host Fedora O/S is optimized to run only what is necessary. Everything is disabled (no 'sendmail', no 'cron'). It's very tight.

Running one idle LXC causes a spike in temperature; run two, and the fans start speeding up. Yet nothing is really happening.

On the other hand, on that very same machine I can run 5 *fully virtualized* CentOS6 KVM guests (on Fedora-20 Host), each with 11GB RAM assigned to them; and on them run distributed Apache Hadoop/HDFS, Apache Spark and Apache Kafka to perform Real-Time distributed Machine Learning -- so those KVMs are truly doing a lot! Yet for the amount of real-time work that the KVM-based cluster is doing (again, full virtualization now), there is very little increase in temperature, and zero increase in fan speed.

Also note that there is an 'overall' temperature LED on the front of that computer.
It reads ~320 when Fedora Host is booted up and idle. I can launch those 5 KVMs, and it goes up to about ~340; but launching 1 or 2 *idle* LXCs causes a jump to above ~410 immediately. Why?

So it's not just 'sensors -f' output. There are LED and FAN increase indications, too. So something is definitely going on with LXC & Kernel, and because there is, we have to assume that the temperature jump can be even higher than shown -- to protect the systems.

I think one of the underlying components used in creating the virtual container is causing a problem (kernel iptables, chroot, resource management, etc.), or maybe a kernel mutex is spinning, or something. But this behavior is definitely problematic.

Again, we really want to use LXC because we can get better utilization from every server that way. But we are stuck.

Thank you again!
This is a known existing problem with systemd-journald in containers.

If you look at the CPU time of the processes in those containers, you will notice systemd-journald is in a runaway condition and consuming 100% CPU. If you were to run "top", you would see that your load average has shot through the roof and that multiple systemd-journald processes are camped out on the CPUs, consuming the processors.

The problem relates to having /dev/kmsg symlinked to /dev/console in the containers, which is common in a lot of cases with sysvinit or upstart but causes problems with systemd-journald: journald reads from kmsg and writes to console, creating a messaging loop which it then fails to detect.

This problem is going to be addressed in some patches to be released shortly, both for templates supporting systemd-based distros and by attempting to intercept the affected containers at startup with default settings. Existing containers running systemd-journald will need to be updated with a couple of minor changes...

To address this problem in an affected container...

1) Shut down the container.

2) Edit the container config file and add the following line...

lxc.kmsg = 0

3) Remove the existing symlink for the container /dev. Because, for systemd, this is a persistent subdirectory under the /dev/.lxc in the host devtmpfs area, it should be removed like this:

rm -f /var/lib/lxc/{container-name}/rootfs.dev/kmsg

4) Restart the container.
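Steps 2 and 3 above could be scripted per container roughly as follows. This is only a sketch under stated assumptions: the container config is assumed to live at /var/lib/lxc/{container-name}/config (the usual default, but not stated above), and the function name `fix_kmsg` and the `LXC_ROOT` override variable are made up for illustration. Stopping and restarting the container (steps 1 and 4) is left to the operator.

```shell
#!/bin/sh
# fix_kmsg CONTAINER_NAME
# Applies the two offline changes to one stopped container:
#   - adds "lxc.kmsg = 0" to the container config (step 2)
#   - removes the stale kmsg symlink from rootfs.dev (step 3)
# LXC_ROOT is a hypothetical override; it defaults to /var/lib/lxc.
# ASSUMPTION: the config file is $LXC_ROOT/<name>/config.
fix_kmsg() {
    name="$1"
    root="${LXC_ROOT:-/var/lib/lxc}"
    config="$root/$name/config"

    if [ ! -f "$config" ]; then
        echo "fix_kmsg: no config file at $config" >&2
        return 1
    fi

    # Step 2: append the setting only if no lxc.kmsg line exists yet.
    # An existing lxc.kmsg line with another value is left untouched
    # and should be reviewed by hand.
    if ! grep -Eq '^[[:space:]]*lxc\.kmsg[[:space:]]*=' "$config"; then
        # Leading newline guards against a config without a final newline.
        printf '\nlxc.kmsg = 0\n' >> "$config"
    fi

    # Step 3: remove the persistent kmsg entry for this container.
    rm -f "$root/$name/rootfs.dev/kmsg"
}
```

For example, after stopping the containers, something like `for c in vps00 vps01; do fix_kmsg "$c"; done` run as root would cover both containers from the original report. Running it twice on the same container is harmless, since the config line is added only once.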
Hi Michael:

Thank you for taking the time to articulate the issue as you did (appreciated!). And there is good news, too.

I made the adjustments you prescribed above to each of the 5 LXC containers, started them, and everything looks as expected, including the front-display LED temperature reading (only ~330).

root@linux# lxc-ls --active
vps00 vps01 vps02 vps03 vps04

root@linux# sensors -f
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +77.0°F (high = +176.0°F, crit = +194.0°F)
Core 0: +73.4°F (high = +176.0°F, crit = +194.0°F)
Core 1: +77.0°F (high = +176.0°F, crit = +194.0°F)
Core 2: +77.0°F (high = +176.0°F, crit = +194.0°F)
Core 3: +66.2°F (high = +176.0°F, crit = +194.0°F)
Core 4: +71.6°F (high = +176.0°F, crit = +194.0°F)
Core 5: +75.2°F (high = +176.0°F, crit = +194.0°F)

This is finally SOLVED. \o/

Thank you very much Michael & Thomas.
*** Bug 1195945 has been marked as a duplicate of this bug. ***
Fixed in commit e8a16654, will be in 1.0.8.
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.