Bug 1018251

Summary: Enabling <cpu mode="host-model"> does not use correct cpuid level, causes kernel panics
Product: Red Hat Enterprise Linux 7
Reporter: Jiri Denemark <jdenemar>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Luyao Huang <lhuang>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.0
CC: amit.shah, berrange, cfergeau, clalancette, crobinso, dwmw2, dyuan, gsun, itamar, jdenemar, jforbes, juzhang, knoel, laine, lhuang, libvirt-maint, mzhan, pbonzini, rbalakri, rjones, scottt.tw, veillard, virt-maint, xfu, xuzhang, yalzhang
Target Milestone: rc
Keywords: Upstream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-3.2.0-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1014682
Environment:
Last Closed: 2017-08-01 17:06:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 824989
Bug Blocks: 910269, 1199452, 1401400
Attachments:
Description (Flags)
x86info output of host, reply to #C12 (none)
out put of cpu-gather.sh, reply to #c12 (none)
out put of cpu-gather.sh 2, reply to #c12 (none)

Description Jiri Denemark 2013-10-11 14:36:02 UTC
+++ This bug was initially created as a clone of Bug #1014682 +++

+++ This bug was initially created as a clone of Bug #870071 +++

Description of problem:

This is using a SandyBridge CPU which has AVX instructions:
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions

I'm booting a guest using <cpu mode="host-model"/>.  Inside
the guest, when initializing an mdadm device (yes, this guest
has RAID arrays inside), we see the trace attached below.

I think what is happening here:

 (a) CPU flags are copied from host to guest, advertising 'avx'
 (b) Guest tries to use 'avx'.
 (c) KVM doesn't emulate it, so it all falls in a hole.

Perhaps libvirt should filter flags based on what KVM can actually do?

Version-Release number of selected component (if applicable):

qemu-1.2.0-16.fc18.x86_64
libvirt-0.10.2-3.fc18.x86_64
kernel-3.6.2-2.fc18.x86_64

How reproducible:

100%

Steps to Reproduce:
1. in libguestfs test suite: make -C tests/md check

--- Additional comment from Paolo Bonzini on 2013-10-04 12:12:50 UTC ---

Ironically, right now for AVX to work you do not require the AVX CPUID bit (though applications will probably not use it unless the CPUID bit is required).  AVX works if the XCR0 register's bit 2 is set.  This requires:

- the XSAVE CPUID feature, otherwise the kernel will not try to set the OSXSAVE bit in CR4

- the OSXSAVE CPUID feature, otherwise the processor will not enable the XSETBV instruction that writes to XCR0.  This feature however is ignored on the command line.  KVM sets it when the kernel writes 1 to the OSXSAVE bit of CR4

- the bit 2 of EAX to be set in CPUID leaf EAX=0xD/ECX=0 (in current RHEL7 QEMU this is always true; later it will be keyed on the AVX CPUID bit, bug 1005695), otherwise the processor will not enable AVX instructions.

- CPUID level to be 13 or higher, otherwise the CPUID leaf is not available.

Thus, "-cpu Westmere,+xsave,+avx,level=13" is required to enable AVX.  Current QEMU will enable it even if you omit "+avx" but that's not future-proof.
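The conditions above can be sketched as a small check on a -cpu option string. This is only an illustration: the helper name avx_usable is hypothetical, and it assumes that only flags spelled out explicitly in the string count (it knows nothing about the features or CPUID level a named model already implies). It follows the future-proof form and insists on an explicit +avx.

```python
def avx_usable(cpu_option: str) -> bool:
    """Return True if a -cpu string explicitly enables xsave and avx
    and sets the CPUID level to 13 (0xd) or higher.

    Hypothetical helper; model defaults are deliberately ignored.
    """
    _model, *flags = cpu_option.split(",")
    enabled = set()
    level = None
    for flag in flags:
        if flag.startswith("+"):
            enabled.add(flag[1:])
        elif flag.startswith("-"):
            enabled.discard(flag[1:])
        elif flag.startswith("level="):
            # base 0 accepts both "13" and "0xd"
            level = int(flag.split("=", 1)[1], 0)
    return {"xsave", "avx"} <= enabled and level is not None and level >= 13


print(avx_usable("Westmere,+xsave,+avx,level=13"))
print(avx_usable("Westmere,+xsave,+avx"))
```

The second call lacks level=13, which is the situation this bug is about: the feature bits are requested but the leaf that describes them is out of range.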

--- Additional comment from Jiri Denemark on 2013-10-04 13:40:02 UTC ---

Oh cool, finally someone who knows something about this area :-) Thanks Paolo. Libvirt currently doesn't model anything but the CPU model and features. And when detecting what the host CPU is, we only use CPU features, which means we may easily detect the CPU as an older model plus additional features. Thus host-model can select a model+features combination that does not actually work. We need to make host CPU probing smarter (and we plan to involve QEMU in the process, see bug 824989) so that the CPU it creates is always usable. Until we do that, using "host-model" is fine if it works for you, but it's too fragile to be generally recommended. The same applies to a full copy of the host CPU from capabilities XML. I'd suggest using either of the following:

- host-passthrough CPU mode
- just the CPU model from capabilities XML without the additional features; it should be possible to force-add features that QEMU is able to emulate, e.g., <feature name="x2apic" policy="force"/> but I'm not sure if that's safe for all CPU models or not.
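As a rough sketch of the second option, the snippet below takes only the <model> from a capabilities XML document, drops the extra host features, and optionally force-adds named features. The function name guest_cpu_from_capabilities and the sample input are made up for illustration; whether forcing a given feature is safe is exactly the open question raised above.

```python
import xml.etree.ElementTree as ET


def guest_cpu_from_capabilities(caps_xml: str, forced=()) -> str:
    """Build a custom-mode guest <cpu> element from the host CPU model only.

    Hypothetical helper: additional host <feature> elements are ignored on
    purpose; names in `forced` are added with policy="force".
    """
    host_model = ET.fromstring(caps_xml).findtext("./host/cpu/model")
    if host_model is None:
        raise ValueError("no <host><cpu><model> in capabilities XML")
    cpu = ET.Element("cpu", {"mode": "custom", "match": "exact"})
    model = ET.SubElement(cpu, "model", {"fallback": "allow"})
    model.text = host_model
    for name in forced:
        ET.SubElement(cpu, "feature", {"policy": "force", "name": name})
    return ET.tostring(cpu, encoding="unicode")


# Made-up capabilities fragment, not taken from a real host.
sample_caps = """
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>Westmere</model>
      <feature name='avx'/>
    </cpu>
  </host>
</capabilities>
"""
print(guest_cpu_from_capabilities(sample_caps, forced=["x2apic"]))
```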

--- Additional comment from Jiri Denemark on 2013-10-10 13:03:04 UTC ---

So all we can do for 7.0 is to better document how fragile host-model is and finally make it better once we have bug 824989 fixed. I'll keep this bug for the documentation changes and clone it for the real work later.

Comment 1 Paolo Bonzini 2013-11-27 12:49:16 UTC
As a quick hack, adding level=13 whenever xsave is enabled should work.

Comment 6 Jiri Denemark 2017-03-03 19:27:15 UTC
This should be finally fixed by (in combination with QEMU 2.9.0):

commit 2a586b4402a7637e0bef9a2876d065c0ce6bfef1
Refs: v3.1.0-9-g2a586b440
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:10:22 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemucapstest: Update test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar>

commit 0bde051f3de02b1be25ea4a4d9f062abfa3d1397
Refs: v3.1.0-10-g0bde051f3
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:10:49 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    domaincapstest: Add test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar>

commit d2f8f3052d48f284d56e27c98ce7a2ce6c656e59
Refs: v3.1.0-11-gd2f8f3052
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 15 10:18:53 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    docs: Update description of the host-model CPU mode

    Signed-off-by: Jiri Denemark <jdenemar>

commit 4c0723a1d75b981e8939c4c5b6bde7607fc7301e
Refs: v3.1.0-12-g4c0723a1d
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Jan 30 16:30:13 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Rename hostCPU/feature element in capabilities cache

    The element will be generalized in the following commits.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 03a34f6b84da009291e8651aba71df8a6761d081
Refs: v3.1.0-13-g03a34f6b8
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 15:46:47 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Prepare for more types in qemuMonitorCPUModelInfo

    Signed-off-by: Jiri Denemark <jdenemar>

commit 2fc215dd2ad4b88c1054da804c4c45b3d4e5c2fa
Refs: v3.1.0-14-g2fc215dd2
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 16:01:30 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Store more types in qemuMonitorCPUModelInfo

    While query-cpu-model-expansion returns only boolean features on s390,
    x86_64 reports some integer and string properties which we are
    interested in.

    Signed-off-by: Jiri Denemark <jdenemar>

commit d7f054a512a911a386d9bbeec51379e4bb843ca5
Refs: v3.1.0-15-gd7f054a51
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 22 16:51:50 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Probe "max" CPU model in TCG

    Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0
    introduces a new "max" CPU model which can be used to ask QEMU what the
    best CPU it can provide to a TCG domain is.

    Signed-off-by: Jiri Denemark <jdenemar>

commit f0138289920d5204c1654bc9b17115d1a315d62e
Refs: v3.1.0-16-gf01382899
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Jan 11 14:36:34 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Get host CPU model from QEMU on x86_64

    Until now host-model CPU mode tried to enable all CPU features supported
    by the host CPU even if QEMU/KVM did not support them. This caused a
    number of issues and made host-model quite unreliable. Asking QEMU for
    the CPU it can provide on the current host makes host-model much more
    robust.

    This commit fixes the following bugs:

        https://bugzilla.redhat.com/show_bug.cgi?id=1018251
        https://bugzilla.redhat.com/show_bug.cgi?id=1371617
        https://bugzilla.redhat.com/show_bug.cgi?id=1372581
        https://bugzilla.redhat.com/show_bug.cgi?id=1404627
        https://bugzilla.redhat.com/show_bug.cgi?id=870071

    In addition to that, the following bug should be mostly limited to cases
    when an unsupported feature is explicitly requested:

        https://bugzilla.redhat.com/show_bug.cgi?id=1335534

    Signed-off-by: Jiri Denemark <jdenemar>

commit be3d59754b1a1da174ff1796882a0ceb35e198e8
Refs: v3.1.0-17-gbe3d59754
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Jan 31 13:44:00 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use enum for CPU model expansion type

    Signed-off-by: Jiri Denemark <jdenemar>

commit bb3363c90b5b19c37f8e5b8f512eb00014d2dae4
Refs: v3.1.0-18-gbb3363c90
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Feb 23 13:53:51 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use full CPU model expansion on x86

    The static CPU model expansion is designed to return only canonical
    names of all CPU properties. To maintain backwards compatibility libvirt
    is stuck with different spelling of some of the features, but we need to
    use the full expansion to get the additional spellings. In addition to
    returning all spelling variants for all properties the full expansion
    will contain properties which are not guaranteed to be migration
    compatible. Thus, we need to combine both expansions. First we need to
    call the static expansion to limit the result to migratable properties.
    Then we can use the result of the static expansion as an input to the
    full expansion to get both canonical names and their aliases.

    Signed-off-by: Jiri Denemark <jdenemar>
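
The two-step expansion described in this commit message can be sketched at the QMP level as follows. This is not libvirt's code: `send` is a hypothetical callable that delivers one QMP command and returns the decoded reply, and the reply shape ({"return": {"model": {...}}}) is assumed to match what query-cpu-model-expansion returns.

```python
def expansion_command(kind, model):
    """Build a query-cpu-model-expansion command of the given type."""
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": kind, "model": model},
    }


def combined_expansion(send, name="host"):
    """Run the static expansion first, then feed its result to the full one.

    `send` is a hypothetical transport function (command dict -> reply dict).
    """
    # Step 1: static expansion limits the result to migratable properties.
    static = send(expansion_command("static", {"name": name}))
    # Step 2: full expansion of that result adds the alias spellings.
    full = send(expansion_command("full", static["return"]["model"]))
    return full["return"]["model"]
```

Passing the static result as the input model of the full expansion is what keeps non-migratable properties out while still exposing every spelling of the remaining ones.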

commit 2f882dbfa92c14d585a786a42d284b63ffdca4e3
Refs: v3.1.0-19-g2f882dbfa
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Feb 23 14:31:23 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Make virQEMUCapsInitCPUModel testable

    Signed-off-by: Jiri Denemark <jdenemar>

commit d065934cd07c01fbb29f25bbb223eb4ce126a90e
Refs: v3.1.0-20-gd065934cd
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Feb 1 17:48:41 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Switch host CPU data scripts to model expansion

    Instantiating "host" CPU and querying it using qom-get has been the only
    way of probing host CPU via QEMU until 2.9.0 implemented
    query-cpu-model-expansion for x86_64. Even though libvirt never really
    used the old way its result can be easily converted into the one
    produced by query-cpu-model-expansion. Thus we can reuse the original
    test data and possibly get new data from hosts where QEMU does not
    support the new QMP command.

    Signed-off-by: Jiri Denemark <jdenemar>

commit d46a1aa4d8caafe977cc41a80ef86af1d10e60b7
Refs: v3.1.0-21-gd46a1aa4d
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 14:59:42 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Convert all json data files to query-cpu-model-expansion

    Converted by running the following command, renaming the files as
    *.new, and committing only the *.new files.

        (cd tests/cputestdata; ./cpu-convert.py *.json)

    Signed-off-by: Jiri Denemark <jdenemar>

commit a19696b5924e7512dcca4f30d15147036708389e
Refs: v3.1.0-22-ga19696b59
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 10:33:52 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Test virQEMUCapsInitCPUModel

    The original test didn't use family/model numbers to make better
    decisions about the CPU model and thus mis-detected the model in the two
    cases which are modified in this commit. The detected CPU models now
    match those obtained from raw CPUID data.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 5e4fc2ef993343643587f2b079b63f2c9f038e6f
Refs: v3.1.0-23-g5e4fc2ef9
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 15:04:38 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop obsolete CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar>

commit 8907204cd83f0ca29c48d19bbf2778132d8578a2
Refs: v3.1.0-24-g8907204cd
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Feb 13 15:06:35 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop .new suffix from CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar>

Comment 7 Jiri Denemark 2017-03-14 11:28:15 UTC
The following additional commits are required to fix this issue with QEMU older than 2.9.0:

commit 5cbc247e0d89ffa6b5d570cccc741dd8eb4f9dc6
Refs: v3.1.0-116-g5cbc247e0
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Jul 24 10:15:38 2013 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    Do not format <arch> in guest CPU XML

    This element is only allowed for host CPUs.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 23a3f5f50c71fd4b1576ce70c1617fc19df89323
Refs: v3.1.0-117-g23a3f5f50
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Mar 6 21:35:49 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    cpu: Replace cpuNodeData with virCPUGetHost

    cpuNodeData has always been followed by cpuDecode as no hypervisor
    driver is really interested in raw CPUID data for a host CPU. Let's
    create a new CPU driver API which returns virCPUDefPtr directly.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 5677b9b3364b232844d86f6676ebfd3e8f9e3ec0
Refs: v3.1.0-118-g5677b9b33
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Mar 7 11:38:38 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    cpu: Add virCPUType parameter to virCPUGetHost

    The parameter can be used to request either VIR_CPU_TYPE_HOST (which has
    been assumed so far) or VIR_CPU_TYPE_GUEST definition.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 79a78c13ecb837518bbfc2c549884bab09ee8568
Refs: v3.1.0-119-g79a78c13e
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Mar 7 12:20:01 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    cpu: Add list of allowed CPU models to virCPUGetHost

    When creating host CPU definition usable with a given emulator, the CPU
    should not be defined using an unsupported CPU model. The new @models
    and @nmodels parameters can be used to limit CPU models which can be
    used in the result.

    Signed-off-by: Jiri Denemark <jdenemar>

commit 4f23862f46529333c9db0f98d640881594c6113c
Refs: v3.1.0-120-g4f23862f4
Author:     Jiri Denemark <jdenemar>
AuthorDate: Tue Mar 7 19:33:37 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    qemu: Refactor virQEMUCapsInitCPU

    The function is now called virQEMUCapsProbeHostCPU. Both the refactoring
    and the change of the name is done for consistency with a new function
    which will be introduced in the following commit.

    Signed-off-by: Jiri Denemark <jdenemar>

commit e958fb5b15ec24477101814467945db56cb75e12
Refs: v3.1.0-121-ge958fb5b1
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Mar 8 13:32:46 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    qemu: Report better host-model CPUs in domain caps

    One of the main reasons for introducing host-model CPU definition in a
    domain capabilities XML was the inability to express disabled features
    in a host capabilities XML. That is, when a host CPU is, e.g., Haswell
    without x2apic support, host capabilities XML will have to report it as
    Westmere + a bunch of additional features, but we really want to use
    Haswell - x2apic when creating a host-model CPU.

    Unfortunately, I somehow forgot to do the last step and the code would
    just copy the CPU definition found in the host capabilities XML. This
    changed recently for new QEMU versions which allow us to query host CPU,
    but any slightly older QEMU will not benefit from any change I did. This
    patch makes sure the right CPU model is filled in the domain
    capabilities even with old QEMU.

    The issue was reported in
    https://bugzilla.redhat.com/show_bug.cgi?id=1426456

    Signed-off-by: Jiri Denemark <jdenemar>

commit 065564c8401d7db24057e7eaabfba7037aee4d96
Refs: v3.1.0-122-g065564c84
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri Mar 3 16:29:16 2017 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon Mar 13 23:49:57 2017 +0100

    cputest: New test for Intel Core i7-4510U

    Signed-off-by: Jiri Denemark <jdenemar>

Comment 9 Luyao Huang 2017-06-14 03:21:43 UTC
Hi jirka,

I found a problem when I tried to verify this bug:

1. prepare an Intel host:

# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz
Stepping:              9
CPU MHz:               2734.296
CPU max MHz:           3500.0000
CPU min MHz:           1600.0000
BogoMIPS:              6185.99
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

virsh # capabilities 
<capabilities>

  <host>
    <uuid>a6989909-d9a7-11e2-9275-9296ddfd6bef</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>IvyBridge</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='1'/>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
      <feature name='pbe'/>
      <feature name='dtes64'/>
      <feature name='monitor'/>
      <feature name='ds_cpl'/>
      <feature name='vmx'/>
      <feature name='smx'/>
      <feature name='est'/>
      <feature name='tm2'/>
      <feature name='xtpr'/>
      <feature name='pdcm'/>
      <feature name='pcid'/>
      <feature name='osxsave'/>
      <feature name='arat'/>
      <feature name='xsaveopt'/>
      <feature name='invtsc'/>

2. install an old QEMU for testing (since old QEMU does not support emulating all CPU flags):

# rpm -q qemu-kvm
qemu-kvm-1.5.3-141.el7.x86_64

3. start a guest with host-model:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>

4. check guest xml:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>IvyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='disable' name='ds'/>
    <feature policy='disable' name='acpi'/>
    <feature policy='require' name='ss'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='tm'/>
    <feature policy='disable' name='pbe'/>
    <feature policy='disable' name='dtes64'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='ds_cpl'/>
    <feature policy='disable' name='vmx'/>
    <feature policy='disable' name='smx'/>
    <feature policy='disable' name='est'/>
    <feature policy='disable' name='tm2'/>
    <feature policy='disable' name='xtpr'/>
    <feature policy='disable' name='pdcm'/>
    <feature policy='require' name='pcid'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='arat'/>    <<------ not valid flags
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='hypervisor'/>


qemu log:

CPU feature arat not found
CPU feature arat not found


You can see that libvirt automatically adds the arat flag, but that flag is not supported by the host, guest, or QEMU. Could you please help check this problem? Thanks a lot for your answer.

Comment 11 yalzhang@redhat.com 2017-06-15 01:38:18 UTC
Hi Jiri, 

Please help check the scenario below. The guest fails to get the 'xsave' flag because of "no CPUID level 0xd" in step 3. Do you think the 'xsave' policy should change to "disable" in step 2?

Tested on:
libvirt-3.2.0-10.el7.x86_64
qemu-kvm-rhev-2.9.0-10.el7.x86_64

Prepare a host with the following CPU:

# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 23
Model name:            Intel(R) Core(TM)2 Quad CPU    Q9500  @ 2.83GHz
Stepping:              10
CPU MHz:               2833.000
CPU max MHz:           2833.0000
CPU min MHz:           2000.0000
BogoMIPS:              5652.44
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              3072K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm

# virsh domcapabilities
 <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Penryn</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='tsc-deadline'/>
      <feature policy='require' name='xsave'/>      
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
    </mode>

1.
# virsh dumpxml rhel7.4 | grep /cpu -B6
  <cpu mode='host-model' check='full'>
    <model fallback='allow'/>
    <numa>
      <cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
      <cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
    </numa>
  </cpu>

2. 
# virsh start rhel7.4
Domain rhel7.4 started

# virsh dumpxml rhel7.4 | grep /cpu -B15
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Penryn</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='xsave'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <numa>
      <cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
      <cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
    </numa>
  </cpu>

3. check cpu flags on guest

# lscpu | grep xsave  ========> no xsave
#
# dmesg | grep xsave
[    0.004094] CPU: CPU feature xsave disabled, no CPUID level 0xd
[    0.002000] CPU: CPU feature xsave disabled, no CPUID level 0xd
[    0.002000] CPU: CPU feature xsave disabled, no CPUID level 0xd
[    0.002000] CPU: CPU feature xsave disabled, no CPUID level 0xd
[    0.002000] CPU: CPU feature xsave disabled, no CPUID level 0xd
[    0.002000] CPU: CPU feature xsave disabled, no CPUID level 0xd

Comment 12 Paolo Bonzini 2017-06-15 14:32:24 UTC
Luyao, regarding comment 9, that QEMU is a bit too old.  QEMU 2.4.0+ can support ARAT unconditionally, even if the host doesn't support it, so the warning is okay.

The one in comment 11 is a weird CPU.  Can you run "x86info -a", and also what is the kernel version?

Comment 13 Luyao Huang 2017-06-16 02:12:32 UTC
(In reply to Paolo Bonzini from comment #12)
> Luyao, regarding comment 9 that QEMU is a bit too old.  QEMU 2.4.0+ can
> support ARAT unconditionally, even if the host doesn't support it, so the
> warning is okay.

Hi Paolo,

Thanks for your comment. In comment 9, if arat is not supported by QEMU, then there is no reason for libvirt to still show the arat feature with the require policy. Even with that old QEMU, libvirt can still use qom-get to check the guest CPU data:

virsh # qemu-monitor-command r7 '{"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]", "property": "filtered-features"}}'
{"return":[{"cpuid-register":"EAX","cpuid-input-ecx":1,"cpuid-input-eax":13,"features":0},{"cpuid-register":"EDX","cpuid-input-eax":2147483658,"features":0},{"cpuid-register":"EAX","cpuid-input-eax":1073741825,"features":0},{"cpuid-register":"EDX","cpuid-input-eax":3221225473,"features":0},{"cpuid-register":"ECX","cpuid-input-eax":2147483649,"features":0},{"cpuid-register":"EDX","cpuid-input-eax":2147483649,"features":0},{"cpuid-register":"EDX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":0},{"cpuid-register":"ECX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":0},{"cpuid-register":"EBX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":0},{"cpuid-register":"ECX","cpuid-input-eax":1,"features":134267388},{"cpuid-register":"EDX","cpuid-input-eax":1,"features":2959081472}],"id":"libvirt-29"}


virsh # qemu-monitor-command r7 '{"execute":"qom-get","arguments":{"path":"/machine/unattached/device[0]", "property": "feature-words"}}'
{"return":[{"cpuid-register":"EAX","cpuid-input-ecx":1,"cpuid-input-eax":13,"features":1},{"cpuid-register":"EDX","cpuid-input-eax":2147483658,"features":0},{"cpuid-register":"EAX","cpuid-input-eax":1073741825,"features":16777467},{"cpuid-register":"EDX","cpuid-input-eax":3221225473,"features":0},{"cpuid-register":"ECX","cpuid-input-eax":2147483649,"features":1},{"cpuid-register":"EDX","cpuid-input-eax":2147483649,"features":672139264},{"cpuid-register":"EDX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":0},{"cpuid-register":"ECX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":0},{"cpuid-register":"EBX","cpuid-input-ecx":0,"cpuid-input-eax":7,"features":641},{"cpuid-register":"ECX","cpuid-input-eax":1,"features":4156170755},{"cpuid-register":"EDX","cpuid-input-eax":1,"features":260832255}],"id":"libvirt-30"}

I think libvirt can detect that arat is not in the guest CPU feature list, but it still shows it with the wrong status, which makes me think there is a bug somewhere.

Comment 14 Jiri Denemark 2017-06-16 15:47:29 UTC
Comments 9 and 13:

Yes, apparently there is a small bug in libvirt in the code which checks what features were disabled by QEMU. It just disables all features it finds in the filtered-features. However, the old QEMU does not know anything about CPU feature 'arat' and it naturally cannot list arat in filtered-features. Thus libvirt should also disable features which are not mentioned in feature-words.

Could you file a new BZ for this issue?
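
The rule proposed here (a feature counts as enabled only if its CPUID word is reported in feature-words, its bit is set there, and the bit is not listed in filtered-features) can be sketched as below. The two lists are excerpts of the qom-get replies in comment 13; the function names are hypothetical, and the bit positions (xsave = CPUID.1:ECX bit 26, osxsave = bit 27, arat = CPUID.6:EAX bit 2) come from the x86 CPUID layout rather than from this report.

```python
def find_word(words, leaf, register, subleaf=None):
    """Return the 'features' value for one CPUID word, or None if absent."""
    for w in words:
        if (w["cpuid-input-eax"] == leaf
                and w["cpuid-register"] == register
                and w.get("cpuid-input-ecx") == subleaf):
            return w["features"]
    return None


def feature_enabled(feature_words, filtered_features, leaf, register, bit,
                    subleaf=None):
    """Hypothetical check implementing the rule described above."""
    word = find_word(feature_words, leaf, register, subleaf)
    if word is None:
        return False  # QEMU does not report this word at all (e.g. arat)
    filtered = find_word(filtered_features, leaf, register, subleaf) or 0
    mask = 1 << bit
    return bool(word & mask) and not (filtered & mask)


# Excerpts from the qom-get output in comment 13.
feature_words = [
    {"cpuid-register": "EBX", "cpuid-input-ecx": 0, "cpuid-input-eax": 7, "features": 641},
    {"cpuid-register": "ECX", "cpuid-input-eax": 1, "features": 4156170755},
]
filtered_features = [
    {"cpuid-register": "EBX", "cpuid-input-ecx": 0, "cpuid-input-eax": 7, "features": 0},
    {"cpuid-register": "ECX", "cpuid-input-eax": 1, "features": 134267388},
]

for name, leaf, reg, bit in [("xsave", 1, "ECX", 26),
                             ("osxsave", 1, "ECX", 27),
                             ("arat", 6, "EAX", 2)]:
    print(name, feature_enabled(feature_words, filtered_features, leaf, reg, bit))
```

With this data the old QEMU reports no word for CPUID leaf 6, so arat falls into the "not mentioned in feature-words" case.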


Comment 11 (and 12):

Could you provide the data Paolo asked for in the second part of comment 12 and attach the output of http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-gather.sh script (after installing cpuid tool; note, you don't need to run the script as root)?

Comment 15 Jiri Denemark 2017-06-16 15:50:57 UTC
(In reply to Jiri Denemark from comment #14)
> Comments 9 and 13:
> ...
> Could you file a new BZ for this issue?

Actually, we already have bug 1371617 for exactly this issue, so just return it back as unfixed.

Comment 16 Luyao Huang 2017-06-19 01:47:05 UTC
(In reply to Jiri Denemark from comment #15)
> (In reply to Jiri Denemark from comment #14)
> > Comments 9 and 13:
> > ...
> > Could you file a new BZ for this issue?
> 
> Actually, we already have bug 1371617 for exactly this issue, so just return
> it back as unfixed.

Thanks for your reply, I have moved that bug to unfixed status.

Comment 17 Luyao Huang 2017-06-19 01:56:55 UTC
(In reply to Jiri Denemark from comment #14)
> Comments 9 and 13:
> 
> Yes, apparently there is a small bug in libvirt in the code which checks
> what features were disabled by QEMU. It just disables all features it finds
> in the filtered-features. However, the old QEMU does not know anything about
> CPU feature 'arat' and it naturally cannot list arat in filtered-features.
> Thus libvirt should also disable features which are not mentioned in
> feature-words.
> 
> Could you file a new BZ for this issue?
> 
> 
> Comment 11 (and 12):
> 
> Could you provide the data Paolo asked for in the second part of comment 12
> and attach the output of
> http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-
> gather.sh script (after installing cpuid tool; note, you don't need to run
> the script as root)?

Moving this needinfo to Yalan since this hardware belongs to her.

Hi Yalan, please check the needinfo for you in comment 14.

Thanks.

Comment 18 yalzhang@redhat.com 2017-06-21 07:26:00 UTC
Created attachment 1289961 [details]
x86info output of host, reply to #C12

# uname -a
Linux server74 3.10.0-671.el7.x86_64 #1 SMP Mon May 22 22:48:01 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

# rpm -q libvirt
libvirt-3.2.0-11.el7.x86_64

Comment 19 yalzhang@redhat.com 2017-06-22 08:31:20 UTC
Created attachment 1290570 [details]
out put of cpu-gather.sh, reply to #c12

Comment 20 yalzhang@redhat.com 2017-06-22 08:34:57 UTC
Created attachment 1290573 [details]
out put of cpu-gather.sh 2, reply to #c12

I ran the script http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-gather.sh several times and got different results; see the attachments:

# diff output_cpu_gather1 output_cpu_gather2
4c4
<    0x00000001 0x00: eax=0x0001067a ebx=0x03040800 ecx=0x0c08e3bd edx=0xbfebfbff
---
>    0x00000001 0x00: eax=0x0001067a ebx=0x01040800 ecx=0x0c08e3bd edx=0xbfebfbff
35c35
< {"timestamp": {"seconds": 1498119894, "microseconds": 712479}, "event": "SHUTDOWN", "data": {"guest": false}}
---
> {"timestamp": {"seconds": 1498119899, "microseconds": 305765}, "event": "SHUTDOWN", "data": {"guest": false}}
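
A possible reading of the first difference, offered as an assumption rather than something confirmed in this report: in CPUID leaf 1, EBX bits 31:24 hold the initial APIC ID of the logical CPU executing the instruction, so the value depends on which CPU the script happened to run on. The sketch below decodes the two EBX values from the diff using the standard field layout; the function name is hypothetical.

```python
def decode_leaf1_ebx(ebx: int) -> dict:
    """Split CPUID.1:EBX into its documented fields."""
    return {
        "brand_index": ebx & 0xFF,
        "clflush_line_bytes": ((ebx >> 8) & 0xFF) * 8,
        "logical_processors": (ebx >> 16) & 0xFF,
        "initial_apic_id": (ebx >> 24) & 0xFF,
    }


run1 = decode_leaf1_ebx(0x03040800)
run2 = decode_leaf1_ebx(0x01040800)
# Print only the fields that differ between the two runs.
print({k: (run1[k], run2[k]) for k in run1 if run1[k] != run2[k]})
```

The second difference in the diff is only the timestamp of the QEMU SHUTDOWN event.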

Comment 21 Luyao Huang 2017-06-22 09:41:35 UTC
Test with libvirt-3.2.0-14.el7.x86_64 and qemu-kvm-rhev-2.9.0-12.el7.x86_64:

S1: prepare a host which does not support xsave and avx:

1.

virsh # domcapabilities 
...
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>SandyBridge</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='pcid'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='pdpe1gb'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='disable' name='xsave'/>
      <feature policy='disable' name='avx'/>
    </mode>
...

2. start a guest with check set to partial:

# virsh start r7
Domain r7 started

# virsh dumpxml r7
...
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='xsave'/>
    <feature policy='disable' name='avx'/>
    <feature policy='disable' name='xsaveopt'/>
...

3. log in to the guest and check the CPU features:

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes hypervisor lahf_lm tsc_adjust arat

4. require the xsave and avx features and set check to full in the XML:

  <cpu mode='host-model' check='full'>
    <model fallback='allow'/>
    <feature policy='require' name='xsave'/>
    <feature policy='require' name='avx'/>

5. restart the guest:

# virsh start r7
error: Failed to start domain r7
error: operation failed: guest CPU doesn't match specification: missing features: xsave,avx,xsaveopt
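As a rough illustration of why step 5 is expected to fail, the sketch below reads the host-model block from the domcapabilities output in step 1 and lists which of the required features are marked policy='disable' there. This is not how libvirt performs its check (with check='full' it compares the requested CPU against the one the hypervisor actually created); the function names are hypothetical, and the sketch does not reproduce the extra xsaveopt entry in the error message.

```python
import xml.etree.ElementTree as ET

# host-model block copied from the domcapabilities output in S1 step 1.
HOST_MODEL_XML = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>SandyBridge</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='vme'/>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='pcid'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='arat'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='pdpe1gb'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='disable' name='xsave'/>
  <feature policy='disable' name='avx'/>
</mode>
"""


def disabled_features(mode_xml):
    """Return the names of features the host-model block marks as disabled."""
    root = ET.fromstring(mode_xml.strip())
    return {
        feature.get("name")
        for feature in root.findall("feature")
        if feature.get("policy") == "disable"
    }


def unsatisfiable(required, mode_xml):
    """Return required features that the host model explicitly disables."""
    return sorted(set(required) & disabled_features(mode_xml))


print(unsatisfiable(["xsave", "avx"], HOST_MODEL_XML))
```

Both features requested in step 4 are disabled in the host model, which matches the xsave and avx entries in the "missing features" error.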


S2: prepare a host which supports xsave and avx:

1.

virsh # domcapabilities 
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Opteron_G5</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='tsc-deadline'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='bmi1'/>
      <feature policy='require' name='mmxext'/>
      <feature policy='require' name='fxsr_opt'/>
      <feature policy='require' name='cmp_legacy'/>
      <feature policy='require' name='cr8legacy'/>
      <feature policy='require' name='osvw'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='disable' name='rdtscp'/>
      <feature policy='disable' name='svm'/>
    </mode>
...

2. start a guest with check set to partial:

# virsh start r7
Domain r7 started

# virsh dumpxml r7
...
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Opteron_G5</model>
    <vendor>AMD</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='bmi1'/>
    <feature policy='require' name='mmxext'/>
    <feature policy='require' name='fxsr_opt'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='require' name='cr8legacy'/>
    <feature policy='require' name='osvw'/>
    <feature policy='disable' name='rdtscp'/>
    <feature policy='disable' name='svm'/>
...

3. log in to the guest and check the features:

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm art rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 tbm tsc_adjust bmi1 arat

4. disable these features in the XML:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
    <feature policy='disable' name='xsave'/>
    <feature policy='disable' name='avx'/>

5. start the guest and check:

# virsh start r7
Domain r7 started

# virsh dumpxml r7

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Opteron_G5</model>
    <vendor>AMD</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='bmi1'/>
    <feature policy='require' name='mmxext'/>
    <feature policy='require' name='fxsr_opt'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='require' name='cr8legacy'/>
    <feature policy='require' name='osvw'/>
    <feature policy='disable' name='rdtscp'/>
    <feature policy='disable' name='svm'/>
    <feature policy='disable' name='xsave'/>
    <feature policy='disable' name='avx'/>

IN GUEST:

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm art rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes f16c hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 tbm tsc_adjust bmi1 arat
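The two guest flag lists in S2 (step 3 and step 5) are long enough that the difference is easy to miss by eye. A small sketch that compares them as sets, using the flag strings copied from the output above (variable names are arbitrary):

```python
# Guest CPU flags from S2 step 3 (before disabling xsave and avx).
FLAGS_BEFORE = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm art "
    "rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 "
    "x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm "
    "cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 "
    "tbm tsc_adjust bmi1 arat"
)

# Guest CPU flags from S2 step 5 (after disabling xsave and avx).
FLAGS_AFTER = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm art "
    "rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 "
    "x2apic popcnt tsc_deadline_timer aes f16c hypervisor lahf_lm "
    "cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 "
    "tbm tsc_adjust bmi1 arat"
)

before = set(FLAGS_BEFORE.split())
after = set(FLAGS_AFTER.split())

removed = before - after  # flags that disappeared from the guest
added = after - before    # flags that newly appeared in the guest

print("removed:", sorted(removed))
print("added:", sorted(added))
```

The only flags that disappear are xsave and avx, and nothing new appears, which is the expected outcome of step 4.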

For more testing related to these patches, see bug 822148 comment 28.

Comment 22 Jiri Denemark 2017-06-23 10:13:04 UTC
S1: the xsaveopt issue is similar to the one described in https://bugzilla.redhat.com/show_bug.cgi?id=1371617#c65

S2: everything seems to be working as expected here or did I miss anything?

Comment 23 Luyao Huang 2017-06-26 06:47:49 UTC
(In reply to Jiri Denemark from comment #22)
> S1: the xsaveopt issue is similar to the one described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1371617#c65
> 
> S2: everything seems to be working as expected here or did I miss anything?

Hi Jirka,

Comment 22 is a comment which will be used to verify this bug. And yes, the S2 test results are as expected.

The needinfo was set in comment 18; it is related to another problem found by Yalan (see comment 11), and she added the debug information in comments 18, 19, and 20.

Comment 24 yalzhang@redhat.com 2017-06-27 01:23:52 UTC
Setting this bug to VERIFIED as the main functionality works; we will keep tracking the problem in comment 22 and file a new bug if necessary.

Comment 25 Jiri Denemark 2017-06-27 09:34:39 UTC
So after looking at https://bugzilla.redhat.com/attachment.cgi?id=1290573, the CPU model created by libvirt seems to be correct. QEMU even advertises xsave as enabled, so I'm not sure what's going on there. Could you file a new BZ for the issue in comment #11, where we can continue discussing it? Don't forget to attach the output of cpu-gather.sh and x86info.

Comment 26 errata-xmlrpc 2017-08-01 17:06:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
