Bug 1687578 - Incorrect CVE vulnerabilities reported on Cascade Lake cpus
Summary: Incorrect CVE vulnerabilities reported on Cascade Lake cpus
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Eduardo Habkost
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-11 19:11 UTC by Joe Mario
Modified: 2019-09-26 17:13 UTC (History)

Fixed In Version: qemu-kvm-3.1.0-22.module+el8.0.1+3032+a09688b9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-07 10:41:10 UTC
Type: Bug
Target Upstream Version:
Embargoed:
yuhuang: needinfo-


Attachments
Software version info, and CVE vulnerabilities info for both host and guest. (4.63 KB, text/plain)
2019-03-11 19:11 UTC, Joe Mario


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:2395 0 None None None 2019-08-07 10:41:22 UTC

Description Joe Mario 2019-03-11 19:11:01 UTC
Created attachment 1542999 [details]
Software version info, and CVE vulnerabilities info for both host and guest.

Description of problem:
Cascade Lake CPUs fix three recent CVE vulnerabilities in hardware.
On the host, the files in /sys/devices/system/cpu/vulnerabilities/ correctly report the fixes.  A KVM guest booted with host-passthrough reports incorrect information in those files.

Version-Release number of selected component (if applicable):
Lenovo reported this to us in their RHEL-8 testing.
I duplicated it on RHEL-8.  I assume the problem also exists on RHEL-7.6.

How reproducible:

Steps to Reproduce:
1. Boot a RHEL-8 guest on a RHEL-8 host on a system using Intel's Cascade Lake CPUs.
2. On the host, cat the contents of /sys/devices/system/cpu/vulnerabilities/* to see the correct CVE mitigation status.  spectre_v2, l1tf, and meltdown are resolved in hardware.
3. Then look at the same file contents in the guest.  The contents are erroneous.  See below.

See the attached files for the host and guest information, including kernel, libvirt, qemu versions.

Here's the host output, with the correct information:
# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling

Here's the incorrect guest output:
# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB, IBRS_FW

Notice that:
 a) l1tf is missing
 b) meltdown says it's mitigated with PTI
 c) spectre_v2 says it's mitigated with retpolines, and IBRS_FW.
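
The comparison above can also be done mechanically. The following is only a minimal sketch: it assumes the `grep .` output from the host and from the guest has been captured as text, and the function names are made up for illustration. The sample data is a subset of the output shown above.

```python
def parse_vulnerabilities(grep_output):
    """Map each vulnerability name to its reported status.

    Expects lines such as
    /sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
    """
    status = {}
    for line in grep_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the echoed shell command
        path, _, value = line.partition(":")
        status[path.rsplit("/", 1)[-1]] = value.strip()
    return status


def diff_vulnerabilities(host, guest):
    """Return {name: (host_status, guest_status)} for every mismatch.

    A file that is missing on one side shows up as None on that side.
    """
    names = sorted(set(host) | set(guest))
    return {
        name: (host.get(name), guest.get(name))
        for name in names
        if host.get(name) != guest.get(name)
    }


HOST = """\
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling
"""
GUEST = """\
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB, IBRS_FW
"""

mismatches = diff_vulnerabilities(
    parse_vulnerabilities(HOST), parse_vulnerabilities(GUEST))
for name, (on_host, in_guest) in mismatches.items():
    print(f"{name}: host={on_host!r} guest={in_guest!r}")
```

With this sample input, all three entries (l1tf, meltdown, spectre_v2) are reported as mismatches, with l1tf absent on the guest side.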

See the attached files for more complete system info.

Comment 4 Paolo Bonzini 2019-03-29 13:16:32 UTC
We don't include any ARCH_CAPABILITIES bits in the CascadeLake CPU models. Joe, if your QEMU version is 3.1.0, can you try with "-cpu host"?

Comment 5 Eduardo Habkost 2019-03-29 14:04:59 UTC
On QEMU 3.1.0, you might need "-cpu host,migratable=off".  arch-capabilities is going to be migratable only on QEMU 4.0.
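
As a rough illustration of that version dependency, the sketch below picks a `-cpu` value from the text printed by `qemu-kvm --version`. The helper names are hypothetical, and the 4.0 cut-off is taken only from the statement above; downstream builds with backports may behave differently.

```python
import re


def parse_qemu_version(version_output):
    """Extract (major, minor, micro) from `qemu-kvm --version` output."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no QEMU version found in output")
    return tuple(int(part) for part in match.groups())


def passthrough_cpu_option(version):
    """Suggest a -cpu value that exposes arch-capabilities to the guest.

    Assumption: arch-capabilities is migratable from QEMU 4.0 on, so
    plain "host" is enough there; older versions may need migratable=off.
    """
    if version >= (4, 0, 0):
        return "host"
    return "host,migratable=off"


version = parse_qemu_version("QEMU emulator version 3.1.0")
print("-cpu", passthrough_cpu_option(version))
```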

Comment 6 Eduardo Habkost 2019-03-30 21:11:09 UTC
I will ask libvirt developers if they are willing to make host-passthrough use "-cpu host,migratable=off".  But even if they do that, we should backport the following commits to 8.0.1 so users and management software can manually enable arch-capabilities, and to allow libvirt to include it when expanding mode=host-model.


commit 485b1d256bcb0874bcde0223727c159b6837e6f8
Author: Eduardo Habkost <ehabkost>
Date:   Fri Jan 25 20:06:05 2019 -0200

    i386: kvm: Disable arch_capabilities if MSR can't be set
    
    KVM has two bugs in the handling of MSR_IA32_ARCH_CAPABILITIES:
    
    1) Linux commit commit 1eaafe91a0df ("kvm: x86: IA32_ARCH_CAPABILITIES
       is always supported") makes GET_SUPPORTED_CPUID return
       arch_capabilities even if running on SVM.  This makes "-cpu
       host,migratable=off" incorrectly expose arch_capabilities on CPUID on
       AMD hosts (where the MSR is not emulated by KVM).
    
    2) KVM_GET_MSR_INDEX_LIST does not return MSR_IA32_ARCH_CAPABILITIES if
       the MSR is not supported by the host CPU.  This makes QEMU not
       initialize the MSR properly at kvm_put_msrs() on those hosts.
    
    Work around both bugs on the QEMU side, by checking if the MSR
    was returned by KVM_GET_MSR_INDEX_LIST before returning the
    feature flag on kvm_arch_get_supported_cpuid().
    
    This has the unfortunate side effect of making arch_capabilities
    unavailable on hosts without hardware support for the MSR until bug #2
    is fixed on KVM, but I can't see another way to work around bug #1
    without that side effect.
    
    Signed-off-by: Eduardo Habkost <ehabkost>
    Message-Id: <20190125220606.4864-2-ehabkost>
    Signed-off-by: Eduardo Habkost <ehabkost>

commit 014018e19b3c54dd1bf5072bc912ceffea40abe8
Author: Eduardo Habkost <ehabkost>
Date:   Fri Jan 25 20:06:06 2019 -0200

    i386: Make arch_capabilities migratable
    
    Now that kvm_arch_get_supported_cpuid() will only return
    arch_capabilities if QEMU is able to initialize the MSR properly,
    we know that the feature is safely migratable.
    
    Signed-off-by: Eduardo Habkost <ehabkost>
    Message-Id: <20190125220606.4864-3-ehabkost>
    Signed-off-by: Eduardo Habkost <ehabkost>
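
The workaround described in the first commit can be modelled in a few lines. This is a Python sketch of the logic only, not the QEMU C code: the function name is made up, and the inputs stand in for what GET_SUPPORTED_CPUID and KVM_GET_MSR_INDEX_LIST would return. The MSR index (0x10A) and the CPUID bit (leaf 7, subleaf 0, EDX bit 29) are the architectural values.

```python
MSR_IA32_ARCH_CAPABILITIES = 0x10A           # MSR index
CPUID_7_0_EDX_ARCH_CAPABILITIES = 1 << 29    # CPUID.(EAX=7,ECX=0):EDX bit 29


def filter_cpuid_7_0_edx(reported_edx, msr_index_list):
    """Hide arch_capabilities unless KVM also lists the MSR.

    reported_edx   -- EDX bits as reported by GET_SUPPORTED_CPUID
    msr_index_list -- MSR indices as reported by KVM_GET_MSR_INDEX_LIST
    """
    if MSR_IA32_ARCH_CAPABILITIES not in msr_index_list:
        # KVM cannot set the MSR here, so do not advertise the feature.
        return reported_edx & ~CPUID_7_0_EDX_ARCH_CAPABILITIES
    return reported_edx


SPEC_CTRL = 1 << 26  # an unrelated bit in the same register, kept as-is

# Host where CPUID claims the feature but the MSR is not listed (bug 1).
edx = filter_cpuid_7_0_edx(SPEC_CTRL | CPUID_7_0_EDX_ARCH_CAPABILITIES,
                           msr_index_list=[0x48])
print(bool(edx & CPUID_7_0_EDX_ARCH_CAPABILITIES))
```

The same check also produces the side effect the commit message mentions: on a host where the MSR is missing from the list for the reason given in bug 2, the feature is hidden as well.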

Comment 7 Joe Mario 2019-03-31 10:56:39 UTC
Paolo wrote:
 > Joe, if your QEMU version is 3.1.0, can you try with "-cpu host"?

My xml file had:
  <cpu mode='host-passthrough' check='none'>
which results in "-cpu host" being passed on the qemu-kvm command line.
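
For anyone checking their own guests, the CPU mode can be read from the domain XML with a few lines of Python. This is a sketch only: the surrounding `<domain>` element and the guest name are placeholders, and only the `<cpu>` line comes from my configuration.

```python
import xml.etree.ElementTree as ET


def cpu_mode(domain_xml):
    """Return the mode attribute of the domain's <cpu> element, or None."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        return None
    return cpu.get("mode")


# Placeholder domain; only the <cpu> element matches the real XML.
EXAMPLE_XML = """\
<domain type='kvm'>
  <name>example-guest</name>
  <cpu mode='host-passthrough' check='none'/>
</domain>
"""

print(cpu_mode(EXAMPLE_XML))
```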

My qemu version looks older than 3.1.0, but isn't it what rhel-8 ships with?
# /usr/libexec/qemu-kvm --version
QEMU emulator version 2.12.0 (qemu-kvm-2.12.0-63.module+el8+2833+c7d6d092)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

It's not my Cascade Lake system, but that version appears to be what's installed with the rhel-8 install.

The repos are pointing to latest-RHEL8, and the date of qemu-kvm is late Feb.

# ls -l /usr/libexec/qemu-kvm
-rwxr-xr-x 1 root root 13513176 Feb 26 13:50 /usr/libexec/qemu-kvm

Joe

Comment 11 Danilo de Paula 2019-04-15 18:47:15 UTC
Fix included in qemu-kvm-3.1.0-22.module+el8.0.1+3032+a09688b9

Comment 13 Yumei Huang 2019-04-17 07:10:29 UTC
Verify:
qemu-kvm-3.1.0-22.module+el8.0.1+3042+caff5060

Host using Cascade Lake cpu:

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              192
On-line CPU(s) list: 0-191
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           4
NUMA node(s):        4
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz
Stepping:            6
CPU MHz:             3599.825
CPU max MHz:         3900.0000
CPU min MHz:         1000.0000
BogoMIPS:            4800.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0-23,96-119
NUMA node1 CPU(s):   24-47,120-143
NUMA node2 CPU(s):   48-71,144-167
NUMA node3 CPU(s):   72-95,168-191
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni flush_l1d arch_capabilities
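
A quick way to confirm that the relevant flags are present is to parse the `Flags:` line. The sketch below is illustrative: the function names and the choice of required flags (arch_capabilities and ibrs_enhanced, both visible in the output above) are assumptions, not part of any tool.

```python
def cpu_flags(lscpu_output):
    """Return the set of flags from the 'Flags:' line of lscpu output."""
    for line in lscpu_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Flags":
            return set(value.split())
    return set()


def missing_flags(lscpu_output,
                  required=("arch_capabilities", "ibrs_enhanced")):
    """List the required flags that lscpu does not report."""
    present = cpu_flags(lscpu_output)
    return [flag for flag in required if flag not in present]


SAMPLE = """\
Model:               85
Flags:               fpu vme ibrs ibpb stibp ibrs_enhanced arch_capabilities
"""

print(missing_flags(SAMPLE))
```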

Check on host,
# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling


Boot a rhel8 guest with -cpu host:

# /usr/libexec/qemu-kvm -m 8G -smp 8 -cpu host ...

Check the contents of /sys/devices/system/cpu/vulnerabilities/* in the guest; the result is the same as on the host.

#  grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling

Comment 14 Eduardo Habkost 2019-06-08 23:27:25 UTC
(In reply to Yumei Huang from comment #13)
> Verify:
> qemu-kvm-3.1.0-22.module+el8.0.1+3042+caff5060

Thanks!  What was the host kernel version used to verify this BZ?

Comment 15 Yumei Huang 2019-06-10 02:17:21 UTC
(In reply to Eduardo Habkost from comment #14)
> (In reply to Yumei Huang from comment #13)
> > Verify:
> > qemu-kvm-3.1.0-22.module+el8.0.1+3042+caff5060
> 
> Thanks!  What was the host kernel version used to verify this BZ?

It's kernel-4.18.0-80.15.el8.x86_64. 

I will include the kernel version the next time I verify bugs. Thanks for the reminder.

Comment 16 Guo, Zhiyi 2019-08-02 07:01:27 UTC
Tested against a Cascade Lake host using the latest rhel8.1 environment; the mitigation status seems incorrect in the VM.

My pkgs:
4.18.0-124.el8.x86_64(host & VM)
qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3.x86_64
libvirt-client-5.5.0-2.module+el8.1.0+3773+7dd501bf.x86_64
microcode_ctl-20190618-1.el8.x86_64

My qemu cpu options:
...
-machine pc-q35-rhel8.0.0,accel=kvm,usb=off,dump-guest-core=off -cpu Cascadelake-Server,arch-capabilities=on,stibp=on,hypervisor=on,mpx=off,pku=on
...
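
To make it easier to see which properties were actually requested, the `-cpu` value can be split into the model name and its properties. This is a minimal sketch with a hypothetical helper; it assumes the plain `name=value` form used above and treats a bare property name as `on`.

```python
def parse_cpu_option(value):
    """Split a -cpu value into (model, {property: setting})."""
    model, *props = value.split(",")
    features = {}
    for prop in props:
        name, sep, setting = prop.partition("=")
        # Assumption: a property given without "=value" means "on".
        features[name] = setting if sep else "on"
    return model, features


model, features = parse_cpu_option(
    "Cascadelake-Server,arch-capabilities=on,stibp=on,"
    "hypervisor=on,mpx=off,pku=on")
print(model)
for name, setting in sorted(features.items()):
    print(f"  {name}={setting}")
```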

On the host, the mitigation status is:
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/mds:Not affected
/sys/devices/system/cpu/vulnerabilities/meltdown:Not affected
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling

Inside the VM, the mitigation status is:
# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/l1tf:Mitigation: PTE Inversion
/sys/devices/system/cpu/vulnerabilities/mds:Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling

For l1tf and meltdown, they should be not affected?
For spectre_v2, this should be Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling?

Paolo, can you help to check this behavior?

Comment 18 errata-xmlrpc 2019-08-07 10:41:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2395

Comment 19 Eduardo Habkost 2019-09-26 17:13:44 UTC
(In reply to Guo, Zhiyi from comment #16)
> For l1tf and meltdown, they should be not affected?
> For spectre_v2, this should be Mitigation: Enhanced IBRS, IBPB: conditional,
> RSB filling?
> 
> Paolo, can you help to check this behavior?

The BZ was already marked as verified, but for the record: arch-capabilities bits are never copied automatically from the host unless using "-cpu host".  If using a named CPU model, the arch-capabilities bits need to be configured manually (or you could use libvirt mode=host-model to do it).
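
To illustrate which bits are involved, here is a small sketch that decodes an IA32_ARCH_CAPABILITIES value into the bit names used in Intel's documentation. The function name is made up, and only the low six bits are listed. Which of these bits a given QEMU version lets you set on a named model, and under what property names, is not covered here and should be checked against that version.

```python
# Bit positions in the IA32_ARCH_CAPABILITIES MSR (index 0x10A).
ARCH_CAPABILITIES_BITS = {
    0: "RDCL_NO",             # not affected by Meltdown / L1TF
    1: "IBRS_ALL",            # enhanced IBRS available
    2: "RSBA",
    3: "SKIP_L1DFL_VMENTRY",
    4: "SSB_NO",
    5: "MDS_NO",              # not affected by MDS
}


def decode_arch_capabilities(msr_value):
    """Return the names of the known bits set in msr_value."""
    return [name
            for bit, name in sorted(ARCH_CAPABILITIES_BITS.items())
            if msr_value & (1 << bit)]


# A guest that sees the MSR with no bits set gets an empty list, so its
# kernel has no hardware indication that the CPU is unaffected.
print(decode_arch_capabilities(0))
print(decode_arch_capabilities(0b100011))
```

A guest kernel that sees none of these bits set would be expected to fall back to software mitigations such as PTI and retpolines, which is consistent with the VM output in comment #16.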

