+++ This bug was initially created as a clone of Bug #1686895 +++
+++ This bug was initially created as a clone of Bug #1558558 +++

Description of problem:
With the CVE patches, when "Copy host CPU configuration" is used, the CPU type in the VM does not match the host CPU. The host in this configuration is Broadwell, and with the CPU patches, Broadwell-IBRS is available for the VM.

Here is the output of the qemu process, which shows the CPU as Broadwell-IBRS:

qemu      2053     1 88 Mar16 ?  3-05:26:28 /usr/libexec/qemu-kvm -name guest=rhel7.4,debug-threads=on -S ... -machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Broadwell-IBRS -m 122880 -realtime mlock=off -smp 64,sockets=64,cores=1,threads=1 ...

When the host CPU option is checked, the CPU model is automatically set to Skylake-Client-IBRS, as shown in the qemu process output:

qemu     21674     1  8 09:08 ?  00:02:14 /usr/libexec/qemu-kvm -name guest=rhel7.4,debug-threads=on -S ... -machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Skylake-Client-IBRS,ss=on,hypervisor=on,tsc_adjust=on,pdpe1gb=on,mpx=off,xsavec=off,xgetbv1=off -m 122880 -realtime mlock=off -smp 64,sockets=64,cores=1,threads=1 ...

Version-Release number of selected component (if applicable):
kernel-3.10.0-693.19.1.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.13.x86_64
libvirt-daemon-driver-qemu-3.2.0-14.el7_4.9.x86_64
qemu-img-rhev-2.9.0-16.el7_4.13.x86_64
ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
qemu-kvm-common-rhev-2.9.0-16.el7_4.13.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Install the software bits on a Broadwell host
2. Set the CPU type to host passthrough ("Copy host CPU configuration")
3. Check the qemu process

Actual results:
VM shows CPU type Skylake-Client-IBRS

Expected results:
VM should show Broadwell-IBRS

Additional info:

--- Additional comment from Eduardo Habkost on 2018-03-23 19:15:55 UTC ---

Moving to libvirt, as this is libvirt choosing Skylake-Client instead of Broadwell.
The CPU definition generated by libvirt is not entirely wrong (because Skylake-Client is the same as Broadwell+MPX+XSAVEC+XGETBV1), but I understand it is confusing.

It looks like the Broadwell CPU model in libvirt's cpu_map.xml is missing the following features: ABM, ARAT, F16C, RDRAND, VME, XSAVEOPT.

--- Additional comment from jiyan on 2019-01-08 03:01:19 UTC ---

Reproduced this bug on the following components.

Version:
kernel-3.10.0-693.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.13.x86_64
libvirt-3.2.0-14.el7_4.9.x86_64

Steps:
1. Related info:

# lscpu
...
Model name:            Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
...

# virsh capabilities
<capabilities>
  <host>
    ...
    <model>Broadwell</model>
    ...

# virsh domcapabilities
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Skylake-Client</model>   *********
      ...
    </mode>
    <mode name='custom' supported='yes'>
      ...
      <model usable='no'>Skylake-Client</model>         *********
      ...
      <model usable='yes'>Broadwell</model>             *********
    </mode>
  </cpu>

2.
In "virt-manager", enable "Copy host CPU configuration":

# virsh domstate test1
shut off

# virsh dumpxml test1 --inactive | grep "<cpu" -A3
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

# virsh start test1
Domain test1 started

# virsh dumpxml test1 | grep "<cpu" -A17
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Client</model>     *********
    <vendor>Intel</vendor>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='mpx'/>
    <feature policy='disable' name='xsavec'/>
    <feature policy='disable' name='xgetbv1'/>
  </cpu>

# ps -ef | grep test1
... -cpu Skylake-Client,ss=on,hypervisor=on,tsc_adjust=on,pdpe1gb=on,mpx=off,xsavec=off,xgetbv1=off -m 1024 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 ...

--- Additional comment from Jiri Denemark on 2019-02-14 16:12:33 UTC ---

The difference between QEMU's and libvirt's specification of a given CPU model is not an issue, and in fact it is inevitable, since some CPU models in QEMU provide different features depending on the chosen machine type.

If users ask for Broadwell via libvirt, they will always get what QEMU understands as Broadwell; libvirt's definition is irrelevant here. The definitions in libvirt are mostly used for detecting which CPU model corresponds to the host CPU. However, they are partially also used for checking guest vs. host CPU compatibility, and that's the reason we don't change CPU model definitions in libvirt: users wouldn't like their domain to suddenly fail to start after a libvirt upgrade, in case their host CPU does not support one of the additional features added to the CPU model they use.

The problem (which is not exactly easy to fix) is that detecting a CPU model just from a set of CPU features is not 100% reliable.
Thus, we try to match family/model numbers, but we only have a limited list of CPU signatures (in fact we just use those from QEMU's CPU models), while in the real world a single CPU model ships with several different family/model numbers. Unfortunately, there seems to be no official documentation from Intel which would let us list all CPU signatures for each CPU model. I'll try to talk to them about this, but without such knowledge, we can't really do much.

That said, if host-model results in an incorrect CPU model, users can always specify a CPU model explicitly.

--- Additional comment from Andrew Theurer on 2019-02-15 12:51:45 UTC ---

(In reply to Jiri Denemark from comment #11)
> The difference between QEMU's and libvirt's specification of a given CPU
> model is not an issue and in fact it is inevitable since some CPU models in
> QEMU provide different features depending on the chosen machine type.
>
> If users ask for Broadwell via libvirt, they will always get what QEMU
> understands as Broadwell, libvirt's definition for this is irrelevant.

So, should this problem be fixed in QEMU? And why would QEMU select "Skylake-IBRS" when we want "Broadwell-IBRS"? Do we know exactly what it is within the Skylake-IBRS definition that causes the performance degradation? Shared cache definitions? A missing instruction?

--- Additional comment from Paolo Bonzini on 2019-02-15 15:34:52 UTC ---

QEMU does not make any decision here. "host-model" only exists at the libvirt level.

> Do we know exactly what it is within the Skylake-IBRS definition that causes the performance degradation?

What performance degradation?
Paolo,

We first saw this when we tested the CVE mitigation code and compared against results on a KVM kernel without mitigations. When we set the flag to host-model, it set the KVM CPU type to Skylake-Client-IBRS, and we saw performance degradation compared to not using the flag. With the CPU type set to Broadwell-IBRS, we saw a 5% drop compared to a baseline obtained on a kernel without mitigation, but when the host-model box was checked, the wrong CPU type was set and we saw a 25% drop compared to the baseline.

--- Additional comment from Bob Sibley on 2019-02-18 22:51:42 UTC ---

The issue has nothing to do with mitigation. When:

  <cpu mode='host-model' check='none'>
    <model fallback='allow'/>
  </cpu>

is selected, before the guest is started the CPU mode in the XML is host-model. After starting the guest:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Client</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='mpx'/>
    <feature policy='disable' name='xsavec'/>
    <feature policy='disable' name='xgetbv1'/>
  </cpu>

the CPU model in the XML is now "Skylake-Client". This shouldn't matter, except that we're seeing performance degradation (as stated in comment 20). We're only seeing this on Intel Broadwell; the CPU models Broadwell, Broadwell-IBRS, Broadwell-noTSX, and Broadwell-noTSX-IBRS, when selected explicitly, didn't change.
--- Additional comment from Paolo Bonzini on 2019-02-19 00:59:59 UTC ---

Ugh, IBRS indeed:

bool spec_ctrl_cond_enable_ibrs(bool full_retp)
{
        if (cpu_has_spec_ctrl() && (is_skylake_era() || !full_retp) &&
            !noibrs_cmdline) {
                if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED))
                        spec_ctrl_enable_ibrs_enhanced();
                else
                        set_spec_ctrl_pcp_ibrs();
                return true;
        }
        return false;
}

I think libvirt should be changed to never use a CPU model that has a newer family/model than the host one. Jiri, does that sound doable?

--- Additional comment from Jiri Denemark on 2019-02-19 11:40:37 UTC ---

Well, IMHO it's essentially what I suggested in comment #11, unless there's a way to actually check which CPU is newer by comparing family/model numbers without listing all possible combinations along individual CPU models in the cpu_map. As far as I can tell, the family/model numbers are assigned pretty randomly, which makes such comparisons pretty much impossible. Please correct me if I'm wrong; it would make me happy.

--- Additional comment from Jiri Denemark on 2019-02-27 13:31:33 UTC ---

Patches sent upstream for review:
https://www.redhat.com/archives/libvir-list/2019-February/msg01528.html

--- Additional comment from Jiri Denemark on 2019-03-05 18:58:03 UTC ---

This is fixed upstream since

commit 4ff74a806ad42820eef3877c8ec146770914d8df
Refs: v5.1.0-74-g4ff74a806a
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri Feb 22 14:45:24 2019 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Mar 5 14:47:49 2019 +0100

    cpu_map: Add more signatures for Broadwell CPU models

    This fixes several CPUs which were incorrectly detected as
    Skylake-Client.
    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>

--- Additional comment from Jiri Denemark on 2019-03-08 11:44:29 UTC ---

One more patch is needed to fix a bug in the original series:

commit 62cb9c335c43a722e81ac0a1ed6e1111ba1d428b
Refs: v5.1.0-144-g62cb9c335c
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Mar 7 14:17:01 2019 +0100
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Mar 7 15:30:40 2019 +0100

    cpu: Don't access invalid memory in virCPUx86Translate

    The problem is that if there are no signatures for a CPU, we still
    allocate cpu->signatures (even though with size 0). Later, we access
    cpu->signatures[0] if cpu->signatures is not NULL.

      Invalid read of size 4
         at 0x5F439D7: virCPUx86Translate (cpu_x86.c:2930)
         by 0x5F3C239: virCPUTranslate (cpu.c:927)
         by 0x57CE7A1: qemuProcessUpdateGuestCPU (qemu_process.c:5870)
         ...
       Address 0xf752d40 is 0 bytes after a block of size 0 alloc'd
         at 0x4C30EC6: calloc (vg_replace_malloc.c:711)
         by 0x5DBDE4E: virAllocN (viralloc.c:190)
         by 0x5F3E4FA: x86ModelCopySignatures (cpu_x86.c:990)
         by 0x5F3E60F: x86ModelCopy (cpu_x86.c:1008)
         by 0x5F3E7CB: x86ModelFromCPU (cpu_x86.c:1068)
         by 0x5F4397E: virCPUx86Translate (cpu_x86.c:2922)
         by 0x5F3C239: virCPUTranslate (cpu.c:927)
         by 0x57CE7A1: qemuProcessUpdateGuestCPU (qemu_process.c:5870)
         ...

    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Jiri Denemark <jdenemar>
Verified this bug on libvirt-5.0.0-10.module+el8.0.1+3363+49e420ce.x86_64.

Version:
libvirt-5.0.0-10.module+el8.0.1+3363+49e420ce.x86_64
qemu-kvm-3.1.0-27.module+el8.0.1+3253+c5371cb3.x86_64
kernel-4.18.0-80.el8.x86_64

Steps:
1. Check the CPU info in the output of "virsh capabilities" and "virsh domcapabilities":

# virsh capabilities
<capabilities>
  <host>
    <uuid>4f11c612-e27d-11e7-9a7d-0894ef59df54</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Broadwell</model>
      ...

# virsh domcapabilities
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Broadwell</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='f16c'/>
      ...
    </mode>
    <mode name='custom' supported='yes'>
      <model usable='yes'>qemu64</model>
      ...
    </mode>

2. Prepare a shut-off VM with the following configuration:

# virsh domstate test
shut off

# virsh edit test
Domain test XML configuration not changed.

# virsh dumpxml test --inactive | grep "<cpu" -A3
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

3. Start the VM, then check the XML and the qemu command line:

# virsh start test
Domain test started

# virsh dumpxml test | grep "<cpu" -A20
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Broadwell</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
  </cpu>

# ps -ef | grep test
qemu      6521     1 99 23:51 ?
00:00:19 /usr/libexec/qemu-kvm -name guest=test,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-test/master-key.aes -machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Broadwell,vme=on,ss=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc_adjust=on,umip=on,xsaveopt=on,pdpe1gb=on,abm=on,rtm=on,hle=on ...

The result is as expected; moving this bug to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2395