Bug 1772894

Summary: kvm nx_huge_pages_recovery_ratio=0 is needed to meet KVM-RT low latency requirement
Product: Red Hat Enterprise Linux 7
Reporter: Paolo Bonzini <pbonzini>
Component: kernel-rt
Assignee: Tom Rix <trix>
kernel-rt sub component: KVM
QA Contact: Pei Zhang <pezhang>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: high
CC: bhu, chayang, dhoward, jinzhao, juzhang, knoel, lgoncalv, lilu, lmiksik, pbonzini, peterx, pezhang, trix, virt-bugs, virt-maint, williams
Version: 7.8
Keywords: ZStream
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1772738
: 1781156 1781157
Environment:
Last Closed: 2020-03-31 19:51:08 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1772738
Bug Blocks: 1672377, 1715542, 1779458, 1781156, 1781157

Comment 2 Beth Uptagrafft 2019-11-15 15:57:59 UTC
*** Bug 1772651 has been marked as a duplicate of this bug. ***

Comment 10 Clark Williams 2019-12-04 20:33:06 UTC
*** Bug 1779455 has been marked as a duplicate of this bug. ***

Comment 14 Paolo Bonzini 2019-12-05 00:25:15 UTC
The mitigation itself does not cause jitter, it only slows down the guest somewhat. Real-time is not about running things fast, but "fast enough" and predictably.

Comment 15 Peter Xu 2019-12-05 01:06:37 UTC
(In reply to Paolo Bonzini from comment #14)
> The mitigation itself does not cause jitter, it only slows down the guest
> somewhat. Real-time is not about running things fast, but "fast enough" and
> predictably.

Makes sense. Sorry, I was confused about this; of course we can't turn off a mitigation by default.

Then we probably need to add kvm.nx_huge_pages=0 manually when testing the mitigations=off case for kvm-rt, because IIUC the guest could otherwise stop using huge pages for some page table entries without notice.

Comment 16 Paolo Bonzini 2019-12-06 14:42:24 UTC
mitigations=off already disables nx_huge_pages (the same effect as kvm.nx_huge_pages=0), so we're safe there.
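
To check this on a given host, one can compare the kernel command line with what the kvm module reports. The sketch below is only an illustration and not part of the fix. It assumes the parameters are exposed under /sys/module/kvm/parameters (the values systool prints), and the helper names are made up for this example.

```python
from pathlib import Path

# Assumed location of the kvm module parameters (the values systool prints).
KVM_PARAMS_DIR = "/sys/module/kvm/parameters"


def read_text_or_none(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cmdline_options(cmdline):
    """Split a kernel command line into {option: value}; bare flags map to ''."""
    options = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        options[key] = value
    return options


def mitigations_off(cmdline):
    """True if the command line carries mitigations=off."""
    return cmdline_options(cmdline).get("mitigations") == "off"


def read_kvm_param(name, params_dir=KVM_PARAMS_DIR):
    """Return a kvm module parameter as a string, or None if it is not exposed."""
    return read_text_or_none(Path(params_dir) / name)


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    print("mitigations=off:", mitigations_off(cmdline))
    for name in ("nx_huge_pages", "nx_huge_pages_recovery_ratio"):
        print(name, "=", read_kvm_param(name))
```

If the behaviour described in this comment holds, a host booted with mitigations=off should report nx_huge_pages as N without kvm.nx_huge_pages=0 being passed explicitly.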

Comment 21 Pei Zhang 2020-01-03 09:08:32 UTC
*** Bug 1779455 has been marked as a duplicate of this bug. ***

Comment 25 Pei Zhang 2020-02-04 01:17:11 UTC
Verified with kernel-rt-3.10.0-1125.rt56.1091.el7.x86_64:

1. Default nx_huge_pages_recovery_ratio is 0. This is expected.
# systool -vm kvm | grep nx_hug
    nx_huge_pages_recovery_ratio= "0"
    nx_huge_pages       = "Y"


2. KVM-RT acceptance testing gets PASS. The max latency from the 1-hour cyclictest runs is 21us.

==Results==
(1)Single VM with 1 rt vCPU:
# Min Latencies: 00006
# Avg Latencies: 00007
# Max Latencies: 00020

(2)Single VM with 8 rt vCPUs:
# Min Latencies: 00006 00008 00008 00008 00008 00008 00008 00008
# Avg Latencies: 00007 00008 00008 00008 00008 00008 00008 00008
# Max Latencies: 00021 00020 00019 00018 00018 00019 00019 00018

(3)Multiple VMs each with 1 rt vCPU:
- VM1
# Min Latencies: 00006
# Avg Latencies: 00007
# Max Latencies: 00020

- VM2
# Min Latencies: 00006
# Avg Latencies: 00007
# Max Latencies: 00018

- VM3
# Min Latencies: 00006
# Avg Latencies: 00007
# Max Latencies: 00019

- VM4
# Min Latencies: 00005
# Avg Latencies: 00007
# Max Latencies: 00018
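
The 21us figure in step 2 is the largest value across all of the "# Max Latencies" lines above. A small sketch of that arithmetic follows. It assumes the summary-line format shown in these results, and the function name is made up for this example.

```python
import re


def max_latency_us(report):
    """Return the largest value found on any '# Max Latencies:' line."""
    values = []
    for line in report.splitlines():
        match = re.match(r"#\s*Max Latencies:\s*(.*)", line.strip())
        if match:
            # Values are zero-padded decimal microseconds, one per vCPU.
            values.extend(int(v) for v in match.group(1).split())
    if not values:
        raise ValueError("no '# Max Latencies:' lines found")
    return max(values)


# The Max Latencies lines copied from the results above, in order.
RESULTS = """\
# Max Latencies: 00020
# Max Latencies: 00021 00020 00019 00018 00018 00019 00019 00018
# Max Latencies: 00020
# Max Latencies: 00018
# Max Latencies: 00019
# Max Latencies: 00018
"""

print(max_latency_us(RESULTS))  # 21
```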



==Versions==
qemu-kvm-tools-rhev-2.12.0-43.el7.x86_64
microcode_ctl-2.1-61.el7.x86_64
rt-tests-1.5-9.el7.x86_64
kernel-rt-3.10.0-1125.rt56.1091.el7.x86_64
qemu-kvm-rhev-2.12.0-43.el7.x86_64
qemu-kvm-common-rhev-2.12.0-43.el7.x86_64
tuned-2.11.0-8.el7.noarch
libvirt-4.5.0-32.el7.x86_64


Beaker job: https://beaker.engineering.redhat.com/jobs/4048311

So this bug has been fixed. Moving to 'VERIFIED'.

Comment 27 errata-xmlrpc 2020-03-31 19:51:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1070