Bug 1317190 - Performance of version 4.4.4 drastically reduced from prior versions
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1317147

Reported: 2016-03-12 17:21 EST by Jason H.
Modified: 2016-04-08 16:19 EDT (History)
Fixed In Version: kernel-4.5.0-302.fc24 kernel-4.4.6-301.fc23
Doc Type: Bug Fix
Last Closed: 2016-04-02 11:54:00 EDT
Type: Bug


Attachments
OpenSSL Speed Test on 4.2.3-300 (4.03 KB, text/plain), 2016-03-12 17:21 EST, Jason H.
OpenSSL Speed Test on 4.4.4-301 (4.03 KB, text/plain), 2016-03-12 17:22 EST, Jason H.
SysBench Test on 4.2.3-300 (2.25 KB, text/plain), 2016-03-12 17:23 EST, Jason H.
SysBench Test on 4.4.4-301 (2.26 KB, text/plain), 2016-03-12 17:23 EST, Jason H.
/proc/cpuinfo for kernel 4.4.3-300 (7.94 KB, text/plain), 2016-03-14 07:32 EDT, Jason H.
/proc/cpuinfo for kernel 4.4.4-301 (7.93 KB, text/plain), 2016-03-14 07:33 EDT, Jason H.
lspci (Jon W.) (2.02 KB, text/plain), 2016-03-14 17:55 EDT, Jon W.
dmesg 4.4.4-301.fc23.x86_64 (75.01 KB, text/plain), 2016-03-14 17:58 EDT, Jon W.
dmesg 4.2.3-300 (69.53 KB, text/plain), 2016-03-14 18:20 EDT, Jason H.
dmesg 4.4.4-301 (68.24 KB, text/plain), 2016-03-14 18:21 EDT, Jason H.
lspci (1.67 KB, text/plain), 2016-03-14 18:21 EDT, Jason H.
dmesg 4.4.3-300.fc23.x86_64 (75.67 KB, text/plain), 2016-03-15 11:25 EDT, Jon W.
prash-openssl-speed (49.48 KB, application/vnd.oasis.opendocument.spreadsheet), 2016-03-17 05:15 EDT, Prash
/sys/devices/system/cpu/intel_pstate for 4.4.4-301.fc23.x86_64 (slow) (257 bytes, text/plain), 2016-03-17 16:18 EDT, Jason H.
/sys/devices/system/cpu/intel_pstate for 4.4.5-300.perfdropreverts.fc23.x86_64 (fast) (257 bytes, text/plain), 2016-03-17 16:19 EDT, Jason H.
/sys/class/thermal for 4.4.4-301.fc23.x86_64 (slow) (3.31 KB, text/plain), 2016-03-17 16:20 EDT, Jason H.
/sys/class/thermal for 4.4.5-300.perfdropreverts.fc23.x86_64 (3.31 KB, text/plain), 2016-03-17 16:21 EDT, Jason H.

Description Jason H. 2016-03-12 17:21:39 EST
Created attachment 1135699 [details]
OpenSSL Speed Test on 4.2.3-300

Description of problem: The performance of the 4.4.4-301 kernel is significantly reduced (to about 18%) compared with release version 4.2.3-300.  This manifests itself as laggy application performance and general system sluggishness.


Version-Release number of selected component (if applicable): kernel-4.4.4-301.fc23.x86_64


How reproducible: Always


Steps to Reproduce:
1. Install Fedora 23 release
2. Upgrade all system components to current versions, except for kernel-*
3. Reboot
4. Run "openssl speed" test, and save results.
5. Upgrade kernel to 4.4.4-301
6. Re-Run "openssl speed" test, and save results

Actual results:
See attachments.  Speed test results after the kernel upgrade are approximately 18% of the values prior to the upgrade.


Expected results:
Similar speed test results.


Additional info:
CPU, quad core 64-bit: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
Comment 1 Jason H. 2016-03-12 17:22 EST
Created attachment 1135700 [details]
OpenSSL Speed Test on 4.4.4-301
Comment 2 Jason H. 2016-03-12 17:23 EST
Created attachment 1135705 [details]
SysBench Test on 4.2.3-300
Comment 3 Jason H. 2016-03-12 17:23 EST
Created attachment 1135706 [details]
SysBench Test on 4.4.4-301
Comment 4 Jason H. 2016-03-13 14:28:24 EDT
I just tested against all prior kernel versions released through "updates". Everything up through version 4.4.3-300 worked normally, while version 4.4.4-300 is where the slowdown occurs.  I have also tested against version 4.4.5-301, currently in the "updates-testing" repository, and the problem persists.
Comment 5 Jason H. 2016-03-14 07:31:44 EDT
This seems to have to do with the Intel SpeedStep governors.  In 4.4.3 and prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz.  In 4.4.4, the processor would fluctuate between 400 and 700 MHz, according to /proc/cpuinfo.

Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to performance, instead of the default "powersave" forces the CPU to 2400 MHz, and improves performance greatly, but still not to the same level as in 4.4.3.

Attached /proc/cpuinfo from both kernels, while under load.
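
For anyone who wants to collect the same readings, here is a minimal Python sketch. It assumes an x86 Linux system where /proc/cpuinfo has "cpu MHz" lines and where the cpufreq policy path quoted above exists. The names core_frequencies and read_governor are illustrative only and do not come from any tool mentioned in this report.

```python
import re
from pathlib import Path

# Matches lines such as "cpu MHz\t\t: 2400.000" in x86 /proc/cpuinfo output.
CPU_MHZ_RE = re.compile(r"^cpu MHz\s*:\s*([0-9.]+)", re.MULTILINE)


def core_frequencies(cpuinfo_text):
    """Return the per-core "cpu MHz" values found in cpuinfo text."""
    return [float(value) for value in CPU_MHZ_RE.findall(cpuinfo_text)]


def read_governor(policy="policy0"):
    """Return the active cpufreq governor for one policy, or None if unreadable."""
    path = Path("/sys/devices/system/cpu/cpufreq") / policy / "scaling_governor"
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("MHz per core:", core_frequencies(cpuinfo.read_text()))
        print("governor:", read_governor())
```

Running it once on each kernel while the benchmark is under load gives two lists of frequencies that can be compared directly.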
Comment 6 Jason H. 2016-03-14 07:32 EDT
Created attachment 1136112 [details]
/proc/cpuinfo for kernel 4.4.3-300
Comment 7 Jason H. 2016-03-14 07:33 EDT
Created attachment 1136113 [details]
/proc/cpuinfo for kernel 4.4.4-301
Comment 8 Jon W. 2016-03-14 14:47:05 EDT
(In reply to Jason H. from comment #5)
> This seems to have to do with the Intel SpeedStep governors.  In 4.4.3 and
> prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz.  In
> 4.4.4, the processor would fluctuate between 400 and 700 MHz, according to
> /proc/cpuinfo.
> 
> Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to
> performance, instead of the default "powersave" forces the CPU to 2400 MHz,
> and improves performance greatly, but still not to the same level as in
> 4.4.3.
> 
> Attached /proc/cpuinfo from both kernels, while under load.

This is also affecting me and has ever since I installed the updates for this kernel.  The above helped me get to usable performance but it is still degraded.
Comment 9 Laura Abbott 2016-03-14 17:48:21 EDT
I'm not seeing any degradation on sysbench when I run on my machine with 4.4.5. I'm also not seeing any major changes to SpeedStep, or to any other subsystem, between 4.4.3 and 4.4.4 that jump out at me. Can you share the following:

1) dmesg from working and non-working kernels
2) lspci

You can also try the bisection scripts at https://pagure.io/fedbisect to see which commit between 4.4.3 and 4.4.4 may have broken it.
Comment 10 Jon W. 2016-03-14 17:55 EDT
Created attachment 1136316 [details]
lspci (Jon W.)
Comment 11 Jon W. 2016-03-14 17:58 EDT
Created attachment 1136317 [details]
dmesg 4.4.4-301.fc23.x86_64
Comment 12 Jon W. 2016-03-14 17:59:06 EDT
At this point I can only get what is currently running at 4.4.4 as I am still using the computer.  I will have to get the old dmesg later.
Comment 13 Jason H. 2016-03-14 18:20 EDT
Created attachment 1136322 [details]
dmesg 4.2.3-300
Comment 14 Jason H. 2016-03-14 18:21 EDT
Created attachment 1136323 [details]
dmesg 4.4.4-301
Comment 15 Jason H. 2016-03-14 18:21 EDT
Created attachment 1136324 [details]
lspci
Comment 16 Jason H. 2016-03-14 18:45:47 EDT
The Bisect scripts failed miserably:
$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 5, in <module>
    import koji_cli
  File "/home/jason/fedbisect/scripts/koji_cli.py", line 29, in <module>
    import koji
ImportError: No module named koji


After installing "koji" package:

$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 8, in <module>
    from git import Repo
ImportError: No module named git


I installed every package for python that includes the name "git", and couldn't get it working.

Please include instructions on all required prerequisites.
Comment 17 Laura Abbott 2016-03-14 21:59:00 EDT
Sorry, the docs are missing the dependencies. I'm going to update them. The
packages you should need are

   koji
   GitPython
   fedpkg
   hmaccalc
   pesign
   gcc
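
The two ImportError tracebacks in comment 16 could be caught up front with a small prerequisite check. The sketch below is not part of fedbisect. It only covers the two Python modules named in those tracebacks, mapped to the package names given in this thread. The remaining packages in the list (fedpkg, hmaccalc, pesign, gcc) are not Python modules and are not checked here. The function name missing_packages is made up for illustration.

```python
import importlib.util

# Python modules imported in the tracebacks above, mapped to the package
# named in this thread that provides each one.
REQUIRED_MODULES = {
    "koji": "koji",
    "git": "GitPython",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the packages whose Python module cannot be found."""
    return sorted(
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", " ".join(missing))
    else:
        print("All Python prerequisites found.")
```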
Comment 18 Jon W. 2016-03-15 11:25 EDT
Created attachment 1136645 [details]
dmesg 4.4.3-300.fc23.x86_64
Comment 19 Jason H. 2016-03-16 09:25:15 EDT
I ran through fedbisect, and it returned "Found your commit!".  I did a little research about git bisect to get the details.  I would recommend adding a section in your script to output the first bad commit details when it is found, or adding to your readme what to do when "Found your commit" is displayed.


"git bisect log" returns:

# first bad commit: [774ac8b7eff69e0786970157de2157e68b22f456] Thermal: initialize thermal zone device correctly

"git bisect visualize" returns:

commit 774ac8b7eff69e0786970157de2157e68b22f456
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Fri Oct 30 16:31:47 2015 +0800

    Thermal: initialize thermal zone device correctly
    
    commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 upstream.
    
    After thermal zone device registered, as we have not read any
    temperature before, thus tz->temperature should not be 0,
    which actually means 0C, and thermal trend is not available.
    In this case, we need specially handling for the first
    thermal_zone_device_update().
    
    Both thermal core framework and step_wise governor is
    enhanced to handle this. And since the step_wise governor
    is the only one that uses trends, so it's the only thermal
    governor that needs to be updated.
    
    Tested-by: Manuel Krause <manuelkrause@netscape.net>
    Tested-by: szegad <szegadlo@poczta.onet.pl>
    Tested-by: prash <prash.n.rao@gmail.com>
    Tested-by: amish <ammdispose-arch@yahoo.com>
    Tested-by: Matthias <morpheusxyz123@yahoo.de>
    Reviewed-by: Javi Merino <javi.merino@arm.com>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Chen Yu <yu.c.chen@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
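
For readers unfamiliar with how a bisect arrives at a "first bad commit", the idea is a binary search over an ordered history. The sketch below is a generic illustration under stated assumptions and is not the code used by git or fedbisect. It assumes the commits are ordered oldest to newest, the first is known good, the last is known bad, and every commit after the first bad one is also bad. The commit names are placeholders.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    low, high = 0, len(commits) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid
        else:
            low = mid
    return commits[high]


if __name__ == "__main__":
    # Hypothetical history: placeholder names, not real commit hashes.
    history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
    bad = {"c5", "c6", "c7"}
    print(first_bad(history, lambda commit: commit in bad))
```

Each call to is_bad stands in for one build-boot-benchmark cycle, which is why halving the range at every step matters: eight commits need only three tests.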
Comment 20 Jason H. 2016-03-16 11:17:59 EDT
FedBisect also requires the following dependencies (from a base install of Fedora):
openssl-devel
bc
gcc
m4
net-tools
Comment 21 Laura Abbott 2016-03-16 15:11:01 EDT
Thanks for using the bisect scripts. They are still a work in progress, so I will update them to handle the "Found your commit" case.

What you found is a good candidate for causing a perf drop. It's first in a series so I can't revert it by itself to test. Can you test http://koji.fedoraproject.org/koji/taskinfo?taskID=13369275 when it finishes? This reverts the thermal series on top of 4.4.5.
Comment 22 Jason H. 2016-03-16 18:01:05 EDT
I just tested kernel-4.4.5-300.perfdropreverts.fc23.x86_64, and it works great, no performance issues experienced.

Thanks for all the work on this!
Comment 23 Laura Abbott 2016-03-16 18:55:57 EDT
The upstream developers want to know if this happens on 4.5 as well, can you test this? http://koji.fedoraproject.org/koji/buildinfo?buildID=744823
Comment 24 Jason H. 2016-03-16 19:24:35 EDT
kernel-4.5.0-300.fc24.x86_64 also experiences the same significant performance decrease.

In addition, with this kernel, my system completely freezes any time X is launched; I had to start in "run level" 3.  This is unrelated to the original problem, so I'm not going to look into it further, but it is interesting.
Comment 25 Laura Abbott 2016-03-16 20:55:11 EDT
Thanks for testing. More requests for info:

the output of "grep . /sys/class/thermal/*/*" on both the working and the non-working kernel (preferably using the 4.4.5 scratch build I gave as the working one, since that will be fairly close)

On the bad kernel, the output of grep . /sys/devices/system/cpu/intel_pstate/*

Do you still see the problem if you set /sys/class/thermal/thermal_zone*/mode to "disabled"?
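
If it is easier to capture these outputs from a script, the following Python sketch produces roughly the same "path:line" listing as the grep commands above. It is an approximation, not a replacement for grep. The function name dump_tree and its root parameter are illustrative. Unreadable files are skipped silently, which is an assumption about what is wanted here.

```python
from pathlib import Path


def dump_tree(pattern, root="/"):
    """Return "path:line" strings for every readable file matching pattern.

    Roughly mirrors `grep . <pattern>`: empty lines are skipped and files
    that cannot be read are ignored.
    """
    lines = []
    for path in sorted(Path(root).glob(pattern)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue
        lines.extend(f"{path}:{line}" for line in text.splitlines() if line)
    return lines
```

Calling dump_tree("sys/class/thermal/*/*") or dump_tree("sys/devices/system/cpu/intel_pstate/*") and writing the joined result to a file gives one dump per kernel to attach.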
Comment 26 Prash 2016-03-17 05:15 EDT
Created attachment 1137332 [details]
prash-openssl-speed

I can't reproduce the problem on my system.

For reference, I'm one of the original reporters of the bug in the handling of the thermal subsystem. My affected device is an HP ProBook 4410s laptop running Arch Linux. The patches by Rui Zhang and Chen Yu fixed my problem, and I have been running patched kernels for a year now. I did not notice a drop in performance at any time.

For this bug report, I ran "openssl speed" like Jason H. I tested three different kernel versions: 4.1.19 (LTS), 4.5.0-rc6-g18558ca, and 4.5.0, the last two of which include the patches by Rui Zhang and Chen Yu. The 4.5.x kernels are 0.02% slower than the LTS kernel, but for my system, I can't say whether the difference is (1) significant and (2) attributable to these patches.

Seems like these patches are incompatible with newer processors or chipsets.
Comment 27 Jason H. 2016-03-17 06:25:08 EDT
Laura, I will get you that information tonight.

Prash, since you mentioned Arch, I did test against different versions of the kernel on Arch as well, and the "good" baseline was a little slower than the comparable Fedora kernel, but nothing critical.  While using the 4.4.4 kernel from Arch, performance dropped by around 25%.  Not as significant a decrease as in Fedora (~82% drop), but still a noticeable decrease for me.

I'm sure someone smarter than I will find a way to solve both our issues!
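
To make the two ways of quoting the slowdown in this report easy to reconcile ("about 18% of the prior values" in the description versus a "drop" here), a small arithmetic sketch follows. The helper names are illustrative, and the inputs can be any pair of throughput figures taken from the same benchmark on two kernels.

```python
def percent_of_baseline(baseline, current):
    """Return current throughput as a percentage of the baseline."""
    return 100.0 * current / baseline


def percent_drop(baseline, current):
    """Return how far current throughput fell below the baseline, in percent."""
    return 100.0 - percent_of_baseline(baseline, current)
```

A result that is 18% of the baseline is therefore the same thing as an 82% drop.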
Comment 28 Jason H. 2016-03-17 16:18 EDT
Created attachment 1137504 [details]
/sys/devices/system/cpu/intel_pstate for 4.4.4-301.fc23.x86_64 (slow)

Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for slow kernel 4.4.4.
Comment 29 Jason H. 2016-03-17 16:19 EDT
Created attachment 1137505 [details]
/sys/devices/system/cpu/intel_pstate for 4.4.5-300.perfdropreverts.fc23.x86_64 (fast)

Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for working kernel
Comment 30 Jason H. 2016-03-17 16:20 EDT
Created attachment 1137507 [details]
/sys/class/thermal for 4.4.4-301.fc23.x86_64 (slow)

Output of "grep . /sys/class/thermal/*/*" for slow kernel.
Comment 31 Jason H. 2016-03-17 16:21 EDT
Created attachment 1137508 [details]
/sys/class/thermal for 4.4.5-300.perfdropreverts.fc23.x86_64

Output of "grep . /sys/class/thermal/*/*" command for working kernel.
Comment 32 Jason H. 2016-03-17 16:23:16 EDT
I have attached the requested outputs of /sys/class/thermal and /sys/devices/system/cpu/intel_pstate.

Setting /sys/class/thermal/thermal_zone0/mode to "disabled" had no effect on the performance issues on the 4.4.4 kernel.  My thermal_zone1 does not have a "mode" parameter to set.
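
For comparing the slow and fast dumps attached above, a plain text diff is usually enough to surface the entries that differ between kernels. The sketch below uses Python's standard difflib. The function name diff_dumps and the "slow kernel" / "fast kernel" labels are illustrative, and the inputs are assumed to be the saved text of two dumps taken with the same command.

```python
import difflib


def diff_dumps(slow_text, fast_text):
    """Return unified-diff lines between two saved sysfs dumps."""
    return list(difflib.unified_diff(
        slow_text.splitlines(),
        fast_text.splitlines(),
        fromfile="slow kernel",
        tofile="fast kernel",
        lineterm="",
    ))
```

Lines prefixed with "-" appear only in the slow kernel's dump and lines prefixed with "+" only in the fast one.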
Comment 33 Jacek Pawlyta 2016-03-20 09:17:47 EDT
*** Bug 1317147 has been marked as a duplicate of this bug. ***
Comment 34 Laura Abbott 2016-03-21 13:43:48 EDT
The patch authors gave a fix that someone else confirmed fixes the performance issue for them. The output from the thermal files here shows the same trip point weirdness so it looks like the same issue. I pulled in the patch to the tree. It should be available when 4.4.7 comes out (later this week or next).

Thanks again for reporting and following up.
Comment 35 Fedora Update System 2016-03-31 11:58:18 EDT
kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
Comment 36 Fedora Update System 2016-03-31 12:02:05 EDT
kernel-4.4.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
Comment 37 Jason H. 2016-03-31 16:18:57 EDT
(In reply to Fedora Update System from comment #35)
> kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23.
> https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

I have tested kernel-4.4.6-301.fc23 x86_64, and it fixes my performance issues.

Thanks!
Comment 38 Fedora Update System 2016-03-31 21:55:36 EDT
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb
Comment 39 Fedora Update System 2016-04-01 11:22:57 EDT
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e
Comment 40 Fedora Update System 2016-04-01 16:57:10 EDT
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81fd1b03aa
Comment 41 Fedora Update System 2016-04-01 20:44:04 EDT
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 42 Fedora Update System 2016-04-02 11:51:55 EDT
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 43 Fedora Update System 2016-04-08 11:52:05 EDT
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 44 Fedora Update System 2016-04-08 16:19:57 EDT
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
