Bug 1779455

Field | Value
---|---
Summary: | There is a latency spike (99us) with kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=Y
Product: | Red Hat Enterprise Linux 7
Component: | kernel-rt
kernel-rt sub component: | KVM
Status: | CLOSED DUPLICATE
Severity: | high
Priority: | high
Version: | 7.8
Keywords: | Reopened
Reporter: | Pei Zhang <pezhang>
Assignee: | Luis Claudio R. Goncalves <lgoncalv>
QA Contact: | Pei Zhang <pezhang>
CC: | bhu, chayang, jinzhao, juzhang, knoel, lcapitulino, lgoncalv, mtosatti, pbonzini, peterx, trix, virt-maint, williams
Hardware: | Unspecified
OS: | Unspecified
Type: | Bug
Target Milestone: | rc
Last Closed: | 2020-01-03 09:08:32 UTC
Clones: | 1779458 (view as bug list)
Bug Blocks: | 1672377, 1779458
Description: Pei Zhang, 2019-12-04 02:30:32 UTC
Hi Peter, Luiz, Marcelo, Paolo,

With kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=N, the latency looks good: the 6h cyclictest max latency is 29us (the Meltdown/Spectre mitigations were enabled in these runs: pti_enable=1 ibpb_enabled=1 ibrs_enabled=0 retp_enabled=1). 3/3 PASS.

Run1: 6h cyclictest testing results
(2) Single VM with 8 rt vCPUs (max latency is 22):
# Min Latencies: 00006 00006 00006 00006 00006 00006 00006 00006
# Avg Latencies: 00008 00007 00006 00007 00007 00007 00007 00007
# Max Latencies: 00022 00020 00018 00018 00018 00018 00019 00019

Run2: 6h cyclictest testing results (max latency is 29)
(2) Single VM with 8 rt vCPUs:
# Min Latencies: 00006 00007 00007 00007 00007 00007 00006 00006
# Avg Latencies: 00008 00007 00007 00007 00007 00007 00007 00007
# Max Latencies: 00023 00019 00020 00022 00019 00019 00019 00029

Run3: 6h cyclictest testing results (max latency is 23)
(2) Single VM with 8 rt vCPUs:
# Min Latencies: 00006 00007 00007 00007 00007 00006 00006 00006
# Avg Latencies: 00008 00007 00007 00007 00007 00007 00007 00006
# Max Latencies: 00023 00019 00018 00019 00019 00019 00018 00018

Beaker jobs:
https://beaker.engineering.redhat.com/jobs/3939499
https://beaker.engineering.redhat.com/jobs/3939500
https://beaker.engineering.redhat.com/jobs/3938095

So besides nx_huge_pages_recovery_ratio=0, we also need nx_huge_pages=N for the expected latency results.

I'll submit 24h jobs to confirm this conclusion. Testing results will be updated soon.
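The runs above exercise two KVM module parameters plus the RHEL 7 runtime mitigation switches. As a point of reference, the sketch below shows where those knobs live; the sysfs/debugfs paths are the standard ones, but the exact cyclictest command line used in these Beaker jobs is not recorded in this bug, so the flags shown are an assumption.

```sh
# KVM parameters under test (standard sysfs locations)
grep . /sys/module/kvm/parameters/nx_huge_pages \
       /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio

# RHEL 7 Meltdown/Spectre runtime switches, matching the values quoted
# above (pti=1, ibpb=1, ibrs=0, retpoline=1); requires debugfs mounted
grep . /sys/kernel/debug/x86/pti_enabled \
       /sys/kernel/debug/x86/ibpb_enabled \
       /sys/kernel/debug/x86/ibrs_enabled \
       /sys/kernel/debug/x86/retp_enabled

# Illustrative 6h cyclictest run in the guest, one thread per rt vCPU;
# with -h, cyclictest prints the "# Min/Avg/Max Latencies" summary lines
# seen in the results above (flags are an assumption, not the QE command)
cyclictest -m -q -p95 -t 8 -a 0-7 -D 6h -i 200 -h 60
```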
> Well, a guest switching code between 2MB -> 4K (say khugepaged), without a
> TLB flush, can cause this condition, as I understand.
>
> And in that case, a buggy guest can crash the host.
>
> Am I missing something?
That would also be true of a bare-metal Linux system, and the memory management subsystem was audited. So it's only about untrusted guests. Still, the default should be nx_huge_pages=1.
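For context, nx_huge_pages is KVM's mitigation for the iTLB multihit issue sketched in the quote above: a guest remapping executable code from a 2MB to a 4K page without a TLB flush. On kernels carrying the mitigation, its status and both module parameters are visible and writable at runtime; a minimal sketch, assuming the standard sysfs paths:

```sh
# Mitigation status for the issue discussed above; reports
# "KVM: Mitigation: Split huge pages" while nx_huge_pages is active
cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit

# Both parameters are runtime-writable, so a latency test can flip
# them without reloading kvm.ko or rebooting
echo N > /sys/module/kvm/parameters/nx_huge_pages
echo 0 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
```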
Karen,

The latency spike happened because the upstream patch to set nx_huge_pages_recovery_ratio=0 on realtime kernels was not backported. Now it has been backported, which means the spike should be gone.

What is worrying is the fact that all instruction pages are now cached by 4K TLB entries, and the recovery thread is disabled. This might slow down certain workloads.

Regarding whether or not to enable the security mitigation, Intel mentions:

"Once these updates are applied, it may be appropriate for some customers to consider additional steps. This includes customers who cannot guarantee that trusted software is running on their system(s) and are using Simultaneous Multi-Threading (SMT). In these cases, customers should consider how they utilize SMT for their particular workload(s), guidance from their OS and VMM software providers, and the security threat model for their particular environment. Because these factors will vary considerably by customer, Intel is not recommending that Intel® HT be disabled, and it's important to understand that doing so does not alone provide protection against MDS."

I think providing a "trusted_code=Y/N" tunable to control Spectre/Meltdown and the vulnerability above is useful. I don't think untrusted code runs on most of these Telco deployments.
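The trusted_code=Y/N tunable proposed above does not exist in any kernel; it is only a suggestion in this comment. Purely to illustrate the intended semantics, a hypothetical shell helper that gangs the existing knobs together might look like the sketch below (the paths are the real RHEL 7 interfaces; the helper, its name, and the restored default values are assumptions):

```sh
#!/bin/sh
# Hypothetical "trusted_code" helper: one switch for the mitigations
# discussed in this bug. Not an existing kernel parameter.
trusted_code() {
    case "$1" in
    Y)  # All code on the host is trusted: drop the costly mitigations.
        echo N > /sys/module/kvm/parameters/nx_huge_pages
        echo 0 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
        echo 0 > /sys/kernel/debug/x86/pti_enabled
        echo 0 > /sys/kernel/debug/x86/ibpb_enabled
        ;;
    N)  # Untrusted code may run: restore the protective defaults
        # (60 is the usual nx_huge_pages_recovery_ratio default).
        echo Y > /sys/module/kvm/parameters/nx_huge_pages
        echo 60 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
        echo 1 > /sys/kernel/debug/x86/pti_enabled
        echo 1 > /sys/kernel/debug/x86/ibpb_enabled
        ;;
    esac
}

trusted_code "${1:-N}"
```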
(In reply to Pei Zhang from comment #14)
> So besides nx_huge_pages_recovery_ratio=0, we also need nx_huge_pages=N for
> the expected latency results.

Testing with kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=N, the 24h cyclictest testing looks good. Max latency is 34us, no more spikes.

==Results==

(1) Single VM with 1 rt vCPU:
# Min Latencies: 00006
# Avg Latencies: 00008
# Max Latencies: 00023

(2) Single VM with 8 rt vCPUs:
# Min Latencies: 00006 00006 00006 00006 00006 00006 00006 00006
# Avg Latencies: 00008 00007 00007 00007 00007 00007 00007 00008
# Max Latencies: 00021 00019 00019 00018 00018 00018 00019 00028

(3) Multiple VMs, each with 1 rt vCPU:
- VM1
# Min Latencies: 00006
# Avg Latencies: 00008
# Max Latencies: 00023
- VM2
# Min Latencies: 00006
# Avg Latencies: 00008
# Max Latencies: 00020
- VM3
# Min Latencies: 00006
# Avg Latencies: 00008
# Max Latencies: 00021
- VM4
# Min Latencies: 00006
# Avg Latencies: 00008
# Max Latencies: 00034

==Versions==
kernel-rt-3.10.0-957.43.1.rt56.957.el7.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.7.x86_64
rt-tests-1.0-12.el7.x86_64
qemu-kvm-common-rhev-2.12.0-18.el7_6.7.x86_64
tuned-2.10.0-6.el7_6.4.noarch
microcode_ctl-2.1-47.12.el7_6.x86_64
qemu-kvm-tools-rhev-2.12.0-18.el7_6.7.x86_64
libvirt-4.5.0-10.el7_6.15.x86_64

Thanks a lot for testing, Pei!

Scenarios summary:
(1) With kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=Y: max latency is 99us
(2) With kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=N: max latency is 34us

With scenario (1), enabling the nx_huge_pages mitigation causes the max latency to be 99us, which is not acceptable. Following Marcelo's suggestion by mail, I agree to re-open this bz for further discussion.

(In reply to Pei Zhang from comment #26)
> (1) With kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=Y: max latency is 99us
> (2) With kvm nx_huge_pages_recovery_ratio=0 nx_huge_pages=N: max latency is 34us

Pei, I think the conclusion is that we want nx_huge_pages=N in addition to nx_huge_pages_recovery_ratio=0. Correct? If yes, then would you open a new BZ and keep this one as a dupe? I think this got too confusing at this point (not your fault!!) and just asking for nx_huge_pages=N will simplify it.

*** This bug has been marked as a duplicate of bug 1772894 ***
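If the outcome is a standing request for nx_huge_pages=N plus nx_huge_pages_recovery_ratio=0, the setting would normally be made persistent rather than echoed into sysfs after every boot. A sketch of the two usual mechanisms (the modprobe.d file name is arbitrary):

```sh
# Option 1: module options, applied whenever kvm.ko is loaded
cat > /etc/modprobe.d/kvm-nx-huge-pages.conf <<'EOF'
options kvm nx_huge_pages=N nx_huge_pages_recovery_ratio=0
EOF

# Option 2: kernel command line, if kvm is built in or loaded early;
# append to GRUB_CMDLINE_LINUX in /etc/default/grub, then regenerate:
#   kvm.nx_huge_pages=N kvm.nx_huge_pages_recovery_ratio=0
grub2-mkconfig -o /boot/grub2/grub.cfg
```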