Bug 1507027
| Summary: | [ESXi][RHEL7.6]x86/vmware: Add paravirt sched clock | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Daniele <dconsoli> |
| Component: | kernel | Assignee: | Vitaly Kuznetsov <vkuznets> |
| kernel sub component: | ESXi | QA Contact: | ldu <ldu> |
| Status: | CLOSED ERRATA | Docs Contact: | Jiri Herrmann <jherrman> |
| Severity: | medium | ||
| Priority: | medium | CC: | ailan, boyang, cavery, grzegorz.halat, jherrman, ldu, leiwang, mtessun, pasik, vkuznets, yacao |
| Version: | 7.6 | Keywords: | FutureFeature |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | kernel-3.10.0-883.el7 | Doc Type: | Release Note |
| Doc Text: |
Paravirtualized clock added to Red Hat Enterprise Linux VMs
With this update, the paravirtualized `sched_clock()` function has been integrated into the Red Hat Enterprise Linux kernel. This improves the performance of Red Hat Enterprise Linux virtual machines (VMs) running on VMware hypervisors.
Note that the function is enabled by default. To disable it, add the "no-vmw-sched-clock" option to the kernel command line.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-10-30 08:19:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Daniele
2017-10-27 12:40:24 UTC
Hi Vitaly,
I tried LKP on RHEL, but RHEL is not a supported system for LKP.
I ran another performance tool, unixbench, on the guest instead.
The test results show that this update brought some performance improvement.
The detailed test results are below:
Without the sched patch kernel:
=======================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: bootp-73-199-91.lab.eng.pek2.redhat.com: GNU/Linux
OS: GNU/Linux -- 3.10.0-862.el7.x86_64 -- #1 SMP Wed Mar 21 18:14:51 EDT 2018
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (4400.0 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (4400.0 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
10:34:15 up 16:42, 2 users, load average: 0.18, 0.12, 0.09; runlevel 3
------------------------------------------------------------------------
Benchmark Run: Fri Apr 13 2018 10:34:15 - 11:02:19
2 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 29005066.9 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4330.7 MWIPS (9.8 s, 7 samples)
Execl Throughput 2525.3 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 491421.2 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 129891.5 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1609588.0 KBps (30.0 s, 2 samples)
Pipe Throughput 664258.2 lps (10.0 s, 7 samples)
Pipe-based Context Switching 136408.7 lps (10.0 s, 7 samples)
Process Creation 7613.7 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 5460.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1048.3 lpm (60.0 s, 2 samples)
System Call Overhead 618401.1 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 29005066.9 2485.4
Double-Precision Whetstone 55.0 4330.7 787.4
Execl Throughput 43.0 2525.3 587.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 491421.2 1241.0
File Copy 256 bufsize 500 maxblocks 1655.0 129891.5 784.8
File Copy 4096 bufsize 8000 maxblocks 5800.0 1609588.0 2775.2
Pipe Throughput 12440.0 664258.2 534.0
Pipe-based Context Switching 4000.0 136408.7 341.0
Process Creation 126.0 7613.7 604.3
Shell Scripts (1 concurrent) 42.4 5460.2 1287.8
Shell Scripts (8 concurrent) 6.0 1048.3 1747.2
System Call Overhead 15000.0 618401.1 412.3
========
System Benchmarks Index Score 908.7
------------------------------------------------------------------------
Benchmark Run: Fri Apr 13 2018 11:02:19 - 11:30:24
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 57995632.4 lps (10.0 s, 7 samples)
Double-Precision Whetstone 8648.0 MWIPS (9.8 s, 7 samples)
Execl Throughput 4966.4 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 874465.9 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 227793.2 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2870742.7 KBps (30.0 s, 2 samples)
Pipe Throughput 1325348.0 lps (10.0 s, 7 samples)
Pipe-based Context Switching 270309.4 lps (10.0 s, 7 samples)
Process Creation 16341.7 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 7689.6 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1087.2 lpm (60.0 s, 2 samples)
System Call Overhead 1161983.8 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 57995632.4 4969.6
Double-Precision Whetstone 55.0 8648.0 1572.4
Execl Throughput 43.0 4966.4 1155.0
File Copy 1024 bufsize 2000 maxblocks 3960.0 874465.9 2208.2
File Copy 256 bufsize 500 maxblocks 1655.0 227793.2 1376.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 2870742.7 4949.6
Pipe Throughput 12440.0 1325348.0 1065.4
Pipe-based Context Switching 4000.0 270309.4 675.8
Process Creation 126.0 16341.7 1297.0
Shell Scripts (1 concurrent) 42.4 7689.6 1813.6
Shell Scripts (8 concurrent) 6.0 1087.2 1811.9
System Call Overhead 15000.0 1161983.8 774.7
========
System Benchmarks Index Score 1618.3
################################################################################
With the sched patch kernel:
Pipe Throughput 669672.4 lps (10.0 s, 7 samples)
Pipe-based Context Switching 145046.6 lps (10.0 s, 7 samples)
Process Creation 7889.8 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 5320.7 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1039.9 lpm (60.0 s, 2 samples)
System Call Overhead 618191.2 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 29015828.0 2486.4
Double-Precision Whetstone 55.0 4326.6 786.7
Execl Throughput 43.0 2497.3 580.8
File Copy 1024 bufsize 2000 maxblocks 3960.0 491846.6 1242.0
File Copy 256 bufsize 500 maxblocks 1655.0 130984.0 791.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 1630917.6 2811.9
Pipe Throughput 12440.0 669672.4 538.3
Pipe-based Context Switching 4000.0 145046.6 362.6
Process Creation 126.0 7889.8 626.2
Shell Scripts (1 concurrent) 42.4 5320.7 1254.9
Shell Scripts (8 concurrent) 6.0 1039.9 1733.2
System Call Overhead 15000.0 618191.2 412.1
========
System Benchmarks Index Score 914.9
------------------------------------------------------------------------
Benchmark Run: Fri Apr 13 2018 14:38:17 - 15:06:21
2 CPUs in system; running 2 parallel copies of tests
Dhrystone 2 using register variables 58034919.9 lps (10.0 s, 7 samples)
Double-Precision Whetstone 8632.3 MWIPS (9.8 s, 7 samples)
Execl Throughput 4786.9 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 874006.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 229655.3 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2960246.9 KBps (30.0 s, 2 samples)
Pipe Throughput 1325788.9 lps (10.0 s, 7 samples)
Pipe-based Context Switching 279446.8 lps (10.0 s, 7 samples)
Process Creation 16720.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 7744.1 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1088.8 lpm (60.0 s, 2 samples)
System Call Overhead 1169166.7 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 58034919.9 4973.0
Double-Precision Whetstone 55.0 8632.3 1569.5
Execl Throughput 43.0 4786.9 1113.2
File Copy 1024 bufsize 2000 maxblocks 3960.0 874006.4 2207.1
File Copy 256 bufsize 500 maxblocks 1655.0 229655.3 1387.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 2960246.9 5103.9
Pipe Throughput 12440.0 1325788.9 1065.7
Pipe-based Context Switching 4000.0 279446.8 698.6
Process Creation 126.0 16720.9 1327.1
Shell Scripts (1 concurrent) 42.4 7744.1 1826.4
Shell Scripts (8 concurrent) 6.0 1088.8 1814.7
System Call Overhead 15000.0 1169166.7 779.4
========
System Benchmarks Index Score 1628.0
======= Script description and score comparison completed! =======
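The size of the improvement in the results above can be worked out from the quoted numbers. The following is a minimal sketch, not part of the original test procedure: the helper name `percent_change` and the selection of metrics are illustrative choices, and it assumes that the first patched-kernel run (whose header is missing from the paste) is the single-copy run.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (unpatched, patched) pairs copied from the unixbench output above.
# Assumption: the first patched run is the 1-copy run; its header was truncated.
scores = {
    "Index score, 1 copy": (908.7, 914.9),
    "Index score, 2 copies": (1618.3, 1628.0),
    "Pipe-based Context Switching (lps), 1 copy": (136408.7, 145046.6),
    "Pipe-based Context Switching (lps), 2 copies": (270309.4, 279446.8),
}

changes = {name: percent_change(b, a) for name, (b, a) in scores.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

With these inputs the overall index scores differ by less than 1%, while the context-switching test shows a gain of a few percent. Each kernel was measured once here, so small differences may not exceed run-to-run variation.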
(In reply to ldu from comment #6)
> Hi Vitaly,
> I tried LKP on RHEL, but RHEL is not a supported system for LKP.
> I ran another performance tool, unixbench, on the guest instead.
> The test results show that this update brought some performance improvement.

Thank you Lily, I'll go ahead with the patchset.

Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Patch(es) available on kernel-3.10.0-883.el7

Verified this bug on RHEL 7.6 with kernel kernel-3.10.0-915.el7.
Checking dmesg shows the "sched" related log:

[root@bootp-73-199-225 ~]# dmesg |grep vmware
[ 0.000000] vmware: TSC freq read from hypervisor : 2199.998 MHz
[ 0.000000] vmware: Host bus clock speed read from hypervisor : 66000000 Hz
[ 0.000000] vmware: using sched offset of 7509058513 ns

Performance also seems slightly improved with unixbench, so I am changing the status to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3083 |
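The dmesg check used during verification could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the log line format is taken from the dmesg output quoted in this report, and the option name comes from the Doc Text. Reading `/proc/cmdline` or the live dmesg buffer is left to the caller.

```python
import re
from typing import Optional

# Pattern matching the log line quoted in the verification comment.
SCHED_OFFSET_RE = re.compile(r"vmware: using sched offset of (\d+) ns")


def find_sched_offset_ns(dmesg_text: str) -> Optional[int]:
    """Return the sched offset in ns if the vmware log line is present, else None."""
    match = SCHED_OFFSET_RE.search(dmesg_text)
    return int(match.group(1)) if match else None


def sched_clock_disabled(cmdline: str) -> bool:
    """True if the kernel command line carries the no-vmw-sched-clock option."""
    return "no-vmw-sched-clock" in cmdline.split()


# Sample input copied from the dmesg output in this report.
sample = """\
[ 0.000000] vmware: TSC freq read from hypervisor : 2199.998 MHz
[ 0.000000] vmware: Host bus clock speed read from hypervisor : 66000000 Hz
[ 0.000000] vmware: using sched offset of 7509058513 ns
"""

print(find_sched_offset_ns(sample))
```

On a real guest, the text passed to `sched_clock_disabled` would typically be the contents of `/proc/cmdline`.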