|Summary:||Performance of version 4.4.4 drastically reduced from prior versions|
|Product:||[Fedora] Fedora||Reporter:||Jason H. <cakersq>|
|Component:||kernel||Assignee:||Kernel Maintainer List <kernel-maint>|
|Status:||CLOSED ERRATA||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||23||CC:||cakersq, cunio, gansalmon, itamar, jonathan, jonrwads, kernel-maint, labbott, lquerel, madhu.chinakonda, mchehab, prash.n.rao, rosand86|
|Fixed In Version:||kernel-4.5.0-302.fc24 kernel-4.4.6-301.fc23||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2016-04-02 15:54:00 UTC||Type:||Bug|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
Description Jason H. 2016-03-12 22:21:39 UTC
Created attachment 1135699 [details]
OpenSSL Speed Test on 4.2.3-300

Description of problem:
The performance of the 4.4.4-301 kernel is significantly reduced (to about 18%) from release version 4.2.3-300. This manifests itself as laggy application performance and general system sluggishness.

Version-Release number of selected component (if applicable):
kernel-4.4.4-301.fc23.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install the Fedora 23 release.
2. Upgrade all system components to current versions, except for kernel-*.
3. Reboot.
4. Run the "openssl speed" test, and save the results.
5. Upgrade the kernel to 4.4.4-301.
6. Re-run the "openssl speed" test, and save the results.

Actual results:
See attachments. Speed test results after the kernel upgrade are approximately 18% of the values prior to the upgrade.

Expected results:
Similar speed test results.

Additional info:
CPU, quad core 64-bit: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
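The before/after comparison in the steps above could be made repeatable by saving each run under the running kernel's release string. The following is only a minimal sketch, assuming `openssl` is on the PATH; the function names and the file-name pattern are hypothetical and not part of the original report.

```python
import platform
import subprocess
from pathlib import Path


def result_path(release: str, directory: str = ".") -> Path:
    """Build a per-kernel file name (hypothetical naming scheme)."""
    return Path(directory) / f"openssl-speed-{release}.txt"


def save_speed_test(directory: str = ".") -> Path:
    """Run `openssl speed` and store its output for the running kernel."""
    release = platform.release()  # same string that `uname -r` prints
    # Merge stderr into stdout so the saved file holds everything printed.
    completed = subprocess.run(
        ["openssl", "speed"],
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    path = result_path(release, directory)
    path.write_text(completed.stdout)
    return path
```

Calling `save_speed_test()` once per kernel (before and after the upgrade) would leave two files that can be compared side by side. Note that a full `openssl speed` run takes several minutes.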
Comment 1 Jason H. 2016-03-12 22:22:28 UTC
Created attachment 1135700 [details] OpenSSL Speed Test on 4.4.4-301
Comment 2 Jason H. 2016-03-12 22:23:14 UTC
Created attachment 1135705 [details] SysBench Test on 4.2.3-300
Comment 3 Jason H. 2016-03-12 22:23:45 UTC
Created attachment 1135706 [details] SysBench Test on 4.4.4-301
Comment 4 Jason H. 2016-03-13 18:28:24 UTC
I just tested against all prior kernel versions released through "updates". Everything up through version 4.4.3-300 worked normally, while version 4.4.4-300 is where the slowdown occurs. I have also tested against version 4.4.5-301, currently in the "updates-testing" repository, and the problem persists.
Comment 5 Jason H. 2016-03-14 11:31:44 UTC
This seems to have to do with the Intel SpeedStep governors. In 4.4.3 and prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz. In 4.4.4, the processor fluctuates between 400 and 700 MHz, according to /proc/cpuinfo.

Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to "performance", instead of the default "powersave", forces the CPU to 2400 MHz and improves performance greatly, but still not to the same level as in 4.4.3.

Attached /proc/cpuinfo from both kernels, while under load.
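For anyone wanting to capture the same observation, here is a rough sketch of reading the per-CPU clock from /proc/cpuinfo text and the current governor from the sysfs path mentioned above. The function names are hypothetical, and the `cpu MHz` field is assumed to be present, as it is on x86 systems.

```python
import re
from pathlib import Path
from typing import List

# Path taken from the comment above; other policies (policy1, ...) may exist.
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpufreq/policy0/scaling_governor")


def parse_cpu_mhz(cpuinfo_text: str) -> List[float]:
    """Return the 'cpu MHz' value reported for each logical CPU."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(value) for value in matches]


def read_governor(path: Path = GOVERNOR_PATH) -> str:
    """Return the active scaling governor, e.g. 'powersave' or 'performance'."""
    return path.read_text().strip()
```

On a live system, `parse_cpu_mhz(Path("/proc/cpuinfo").read_text())` sampled while under load would show whether the clocks stay in the low range described here.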
Comment 6 Jason H. 2016-03-14 11:32:33 UTC
Created attachment 1136112 [details] /proc/cpuinfo for kernel 4.4.3-300
Comment 7 Jason H. 2016-03-14 11:33:02 UTC
Created attachment 1136113 [details] /proc/cpuinfo for kernel 4.4.4-301
Comment 8 Jon W. 2016-03-14 18:47:05 UTC
(In reply to Jason H. from comment #5)
> This seems to have to do with the Intel SpeedStep governors. In 4.4.3 and
> prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz. In
> 4.4.4, the processor would fluctuate between 400 and 700 MHz, according to
> /proc/cpuinfo.
>
> Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to
> performance, instead of the default "powersave" forces the CPU to 2400 MHz,
> and improves performance greatly, but still not to the same level as in
> 4.4.3.
>
> Attached /proc/cpuinfo from both kernels, while under load.

This is also affecting me, and has been ever since I installed the updates for this kernel. The above helped me get to usable performance, but it is still degraded.
Comment 9 Laura Abbott 2016-03-14 21:48:21 UTC
I'm not seeing any degradation on sysbench when I run on my machine with 4.4.5. I'm also not seeing any major changes to SpeedStep between 4.4.3 and 4.4.4 or any other system that jumps out at me. Can you share the following:

1) dmesg from working and non-working kernels
2) lspci

You can also try the bisection scripts at https://pagure.io/fedbisect to see which commit between 4.4.3 and 4.4.4 may have broken it.
Comment 11 Jon W. 2016-03-14 21:58:03 UTC
Created attachment 1136317 [details] dmesg 4.4.4-301.fc23.x86_64
Comment 12 Jon W. 2016-03-14 21:59:06 UTC
At this point I can only get the dmesg from what is currently running (4.4.4), as I am still using the computer. I will have to get the old dmesg later.
Comment 16 Jason H. 2016-03-14 22:45:47 UTC
The Bisect scripts failed miserably:

$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 5, in <module>
    import koji_cli
  File "/home/jason/fedbisect/scripts/koji_cli.py", line 29, in <module>
    import koji
ImportError: No module named koji

After installing "koji" package:

$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 8, in <module>
    from git import Repo
ImportError: No module named git

I installed every package for python that includes the name "git", and couldn't get it working. Please include instructions on all required prerequisites.
Comment 17 Laura Abbott 2016-03-15 01:59:00 UTC
Sorry, the docs are missing the dependencies. I'm going to update them. The packages you should need are koji GitPython fedpkg hmaccalc pesign gcc
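A quick way to confirm the Python side of these prerequisites before starting a bisect might look like the sketch below. It only checks the two modules named in the tracebacks above (`koji`, and `git`, which GitPython provides); the helper name is hypothetical, and the remaining packages in the list are not Python modules and are not covered by this check.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules taken from the ImportError messages in the tracebacks above.
REQUIRED = ["koji", "git"]
```

Running `missing_modules(REQUIRED)` with the same Python interpreter that the scripts use would list whatever still needs installing; an empty list means both imports should succeed.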
Comment 18 Jon W. 2016-03-15 15:25:14 UTC
Created attachment 1136645 [details] dmesg 4.4.3-300.fc23.x86_64
Comment 19 Jason H. 2016-03-16 13:25:15 UTC
I ran through fedbisect, and it returned "Found your commit!". I did a little research about git bisect to get the details. I would recommend adding a section in your script to output the last bad commit details when it is found, or adding to your readme what to do when "Found your commit" is displayed.

"git bisect log" returns:

# first bad commit: [774ac8b7eff69e0786970157de2157e68b22f456] Thermal: initialize thermal zone device correctly

"git bisect visualize" returns:

commit 774ac8b7eff69e0786970157de2157e68b22f456
Author: Zhang Rui <firstname.lastname@example.org>
Date:   Fri Oct 30 16:31:47 2015 +0800

    Thermal: initialize thermal zone device correctly

    commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 upstream.

    After thermal zone device registered, as we have not read any
    temperature before, thus tz->temperature should not be 0,
    which actually means 0C, and thermal trend is not available.
    In this case, we need specially handling for the first
    thermal_zone_device_update().

    Both thermal core framework and step_wise governor is
    enhanced to handle this. And since the step_wise governor
    is the only one that uses trends, so it's the only thermal
    governor that needs to be updated.

    Tested-by: Manuel Krause <email@example.com>
    Tested-by: szegad <firstname.lastname@example.org>
    Tested-by: prash <email@example.com>
    Tested-by: amish <firstname.lastname@example.org>
    Tested-by: Matthias <email@example.com>
    Reviewed-by: Javi Merino <firstname.lastname@example.org>
    Signed-off-by: Zhang Rui <email@example.com>
    Signed-off-by: Chen Yu <firstname.lastname@example.org>
    Signed-off-by: Greg Kroah-Hartman <email@example.com>
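For readers unfamiliar with what a bisect does, the idea behind the "first bad commit" result can be sketched as a binary search. This is only an illustration of the general technique, not fedbisect's or git's actual implementation; it assumes a linear history ordered oldest to newest, a known-good first entry, a known-bad last entry, and that every commit after the first bad one is also bad. The function name and the commit labels in the usage note are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit between a known-good start and known-bad end."""
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good and commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]
```

With six commits `c0` to `c5` where the regression lands in `c3`, the search tests `c2` (good) and then `c3` (bad) and stops, so only two builds are needed instead of four. In a kernel bisect, `is_bad` corresponds to building, booting, and benchmarking each candidate.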
Comment 20 Jason H. 2016-03-16 15:17:59 UTC
FedBisect also requires the following dependencies (from a base install of Fedora): openssl-devel bc gcc m4 net-tools
Comment 21 Laura Abbott 2016-03-16 19:11:01 UTC
Thanks for using the bisect scripts. They are still a work in progress, so I will update them to handle the "Found your commit" case. What you found is a good candidate for causing a perf drop. It's the first in a series, so I can't revert it by itself to test. Can you test http://koji.fedoraproject.org/koji/taskinfo?taskID=13369275 when it finishes? This reverts the thermal series on top of 4.4.5.
Comment 22 Jason H. 2016-03-16 22:01:05 UTC
I just tested kernel-4.4.5-300.perfdropreverts.fc23.x86_64, and it works great, no performance issues experienced. Thanks for all the work on this!
Comment 23 Laura Abbott 2016-03-16 22:55:57 UTC
The upstream developers want to know if this happens on 4.5 as well, can you test this? http://koji.fedoraproject.org/koji/buildinfo?buildID=744823
Comment 24 Jason H. 2016-03-16 23:24:35 UTC
kernel-4.5.0-300.fc24.x86_64 also experiences the same significant performance decrease. In addition, with this kernel, my system completely freezes any time X is launched, so I had to start in "run level" 3. This is unrelated to the original problem, so I'm not going to look into it further; it's just interesting.
Comment 25 Laura Abbott 2016-03-17 00:55:11 UTC
Thanks for testing. More requests for info:

1) The output of "grep . /sys/class/thermal/*/*" on both the working and the non-working kernel (preferably using the 4.4.5 scratch build I gave as the working one, since that will be fairly close)
2) On the bad kernel, the output of "grep . /sys/devices/system/cpu/intel_pstate/*"
3) Do you still see the problem if you set /sys/class/thermal/thermal_zone*/mode to "disabled"?
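The `grep .` commands requested here print every non-empty line of every matching file, prefixed with the file name. A rough Python equivalent, useful for saving the output from a script, is sketched below; the function name is hypothetical, and unlike grep it silently skips directories and unreadable sysfs entries instead of printing an error.

```python
import glob
from typing import List


def grep_dot(pattern: str) -> List[str]:
    """Return 'path:line' for each non-empty line in files matching pattern."""
    results = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as handle:
                for line in handle:
                    text = line.rstrip("\n")
                    if text:  # `grep .` only matches lines with a character
                        results.append(f"{path}:{text}")
        except OSError:
            # Directories and some sysfs attributes cannot be read; skip them.
            continue
    return results
```

For example, `"\n".join(grep_dot("/sys/class/thermal/*/*"))` written to a file on each kernel would give two snapshots that can be diffed.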
Comment 26 Prash 2016-03-17 09:15:49 UTC
Created attachment 1137332 [details]
prash-openssl-speed

I can't reproduce the problem on my system. For reference, I'm one of the original reporters of the bug in the handling of the thermal subsystem. My affected device is an HP ProBook 4410s laptop running Archlinux. The patches by Rui Zhang and Chen Yu fixed my problem, and I have been running patched kernels for a year now. I did not notice a drop in performance at any time.

For this bug report, I ran "openssl speed" like Jason H. I tested three different kernel versions: 4.1.19 (LTS), 4.5.0-rc6-g18558ca, and 4.5.0, the last two of which include the patches by Rui Zhang and Chen Yu. The 4.5.x kernels are 0.02% slower than the LTS kernel, but on my system I can't say whether that difference is (1) significant or (2) attributable to these patches. It seems like these patches are incompatible with newer processors or chipsets.
Comment 27 Jason H. 2016-03-17 10:25:08 UTC
Laura, I will get you that information tonight.

Prash, since you mentioned Arch: I did test against different versions of the kernel on Arch as well. The "good" baseline was a little slower than the comparable Fedora kernel, but nothing critical. With the 4.4.4 kernel from Arch, performance dropped by around 25%. That is not as significant a decrease as in Fedora (~82% drop), but it is still a noticeable decrease for me. I'm sure someone smarter than I will find a way to solve both our issues!
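The two ways of quoting the regression in this report ("about 18% of the prior values" and "~82% drop") describe the same ratio. A small sketch of the arithmetic follows, with the baseline normalized to 100 as a hypothetical value rather than a measured one.

```python
def percent_drop(baseline: float, measured: float) -> float:
    """Percentage by which `measured` falls short of `baseline`."""
    return (1.0 - measured / baseline) * 100.0


# Hypothetical normalized scores: baseline 100, Fedora 4.4.4 at 18, Arch 4.4.4 at 75.
fedora_drop = percent_drop(100.0, 18.0)  # roughly 82
arch_drop = percent_drop(100.0, 75.0)    # 25
```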
Comment 28 Jason H. 2016-03-17 20:18:07 UTC
Created attachment 1137504 [details] /sys/devices/system/cpu/intel_pstate for 4.4.4-301.fc23.x86_64 (slow) Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for slow kernel 4.4.4.
Comment 29 Jason H. 2016-03-17 20:19:30 UTC
Created attachment 1137505 [details] /sys/devices/system/cpu/intel_pstate for 4.4.5-300.perfdropreverts.fc23.x86_64 (fast) Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for working kernel
Comment 30 Jason H. 2016-03-17 20:20:47 UTC
Created attachment 1137507 [details] /sys/class/thermal for 4.4.4-301.fc23.x86_64 (slow) Output of "grep . /sys/class/thermal/*/*" for slow kernel.
Comment 31 Jason H. 2016-03-17 20:21:36 UTC
Created attachment 1137508 [details] /sys/class/thermal for 4.4.5-300.perfdropreverts.fc23.x86_64 Output of "grep . /sys/class/thermal/*/*" command for working kernel.
Comment 32 Jason H. 2016-03-17 20:23:16 UTC
I have attached the requested outputs of /sys/class/thermal and /sys/devices/system/cpu/intel_pstate. Setting /sys/class/thermal/thermal_zone0/mode to "disabled" had no effect on the performance issues with the 4.4.4 kernel. My thermal_zone1 does not have a "mode" parameter to set.
Comment 33 Jacek Pawlyta 2016-03-20 13:17:47 UTC
*** Bug 1317147 has been marked as a duplicate of this bug. ***
Comment 34 Laura Abbott 2016-03-21 17:43:48 UTC
The patch authors gave a fix that someone else confirmed fixes the performance issue for them. The output from the thermal files here shows the same trip point weirdness so it looks like the same issue. I pulled in the patch to the tree. It should be available when 4.4.7 comes out (later this week or next). Thanks again for reporting and following up.
Comment 35 Fedora Update System 2016-03-31 15:58:18 UTC
kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
Comment 36 Fedora Update System 2016-03-31 16:02:05 UTC
kernel-4.4.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
Comment 37 Jason H. 2016-03-31 20:18:57 UTC
(In reply to Fedora Update System from comment #35) > kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. > https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e I have tested kernel-4.4.6-301.fc23 x86_64, and it fixes my performance issues. Thanks!
Comment 38 Fedora Update System 2016-04-01 01:55:36 UTC
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
Comment 39 Fedora Update System 2016-04-01 15:22:57 UTC
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
Comment 40 Fedora Update System 2016-04-01 20:57:10 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81fd1b03aa
Comment 41 Fedora Update System 2016-04-02 00:44:04 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 42 Fedora Update System 2016-04-02 15:51:55 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 43 Fedora Update System 2016-04-08 15:52:05 UTC
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 44 Fedora Update System 2016-04-08 20:19:57 UTC
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.