Bug 1765445

Summary: Cmd "virsh hypervisor-cpu-compare" outputs a wrong result with a VM's active dumpxml as input because of topoext
Product: Red Hat Enterprise Linux 9
Reporter: jiyan <jiyan>
Component: libvirt
Assignee: Jiri Denemark <jdenemar>
libvirt sub component: General
QA Contact: yalzhang <yalzhang>
Status: CLOSED MIGRATED
Docs Contact:
Severity: medium
Priority: medium
CC: dyuan, fjin, jdenemar, jsuchane, virt-maint, xuzhang, yafu, yalzhang
Version: 9.0
Keywords: MigratedToJIRA, Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-09-22 12:19:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description jiyan 2019-10-25 06:32:01 UTC
Description of problem:
Cmd "virsh hypervisor-cpu-compare" outputs a wrong result with a VM's active dumpxml as input because of topoext

Version-Release number of selected component (if applicable):
qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57.x86_64
libvirt-5.6.0-5.virtcov.el8.x86_64
kernel-4.18.0-141.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Check domcapabilities; the host-model CPU does not list the topoext feature
# virsh domcapabilities
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='tsc-deadline'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='cmp_legacy'/>
      <feature policy='require' name='perfctr_core'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='require' name='virt-ssbd'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='disable' name='monitor'/>
      <feature policy='disable' name='svm'/>
    </mode>

2. Prepare a shut-off VM with the following CPU configuration
# virsh domstate test_1 
shut off

# virsh dumpxml test_1 --inactive |grep "<cpu" -A2
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

4. Start the VM and check its active dumpxml; topoext is enabled here
# virsh start test_1
Domain test_1 started

# virsh dumpxml test_1  |grep "<cpu" -A20
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <vendor>AMD</vendor>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='require' name='perfctr_core'/>
    <feature policy='require' name='virt-ssbd'/>
    <feature policy='require' name='skip-l1dfl-vmentry'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='svm'/>
    <feature policy='require' name='topoext'/>
  </cpu>

5. Use the active VM's dumpxml as input to "virsh hypervisor-cpu-compare"
# virsh dumpxml test_1 > test_1.xml

# virsh hypervisor-cpu-compare test_1.xml 
CPU described in test_1.xml is incompatible with the CPU provided by hypervisor on the host
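
For illustration only, the difference between the step 1 and step 4 listings can be found mechanically. The sketch below is not part of the original report and does not call libvirt; it copies the two XML fragments from the steps above and diffs the features with policy='require' using Python's standard library. The helper name required_features is made up for this example.

```python
import xml.etree.ElementTree as ET

# host-model definition from "virsh domcapabilities" (step 1)
HOST_MODEL_XML = """
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>EPYC-IBPB</model>
  <vendor>AMD</vendor>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='tsc-deadline'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='cmp_legacy'/>
  <feature policy='require' name='perfctr_core'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='virt-ssbd'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='disable' name='monitor'/>
  <feature policy='disable' name='svm'/>
</mode>
"""

# CPU element from the active "virsh dumpxml test_1" (step 4)
LIVE_CPU_XML = """
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>EPYC-IBPB</model>
  <vendor>AMD</vendor>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='tsc-deadline'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='cmp_legacy'/>
  <feature policy='require' name='perfctr_core'/>
  <feature policy='require' name='virt-ssbd'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='disable' name='monitor'/>
  <feature policy='disable' name='svm'/>
  <feature policy='require' name='topoext'/>
</cpu>
"""


def required_features(xml_text):
    """Return the names of all <feature policy='require'> children of the root."""
    root = ET.fromstring(xml_text)
    return {
        feature.get("name")
        for feature in root.findall("feature")
        if feature.get("policy") == "require"
    }


host_model = required_features(HOST_MODEL_XML)
live = required_features(LIVE_CPU_XML)

# Features required by the running guest but absent from the host-model listing
only_in_live = sorted(live - host_model)
# Features listed in the host-model but not required by the running guest
only_in_host_model = sorted(host_model - live)

print("only in live XML:", only_in_live)
print("only in host-model:", only_in_host_model)
```

With the fragments above, topoext is the only required feature that appears in the live XML but not in the host-model listing. The listings also differ in the other direction: invtsc is present in step 1 but not in step 4.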

Actual results:
As step-5 shows

Expected results:
Since the VM starts successfully on this host, its CPU configuration should not be reported as incompatible.

Additional info:
1> This issue can be reproduced on both RHEL-8.1 AV and RHEL-8.1.
2> The reproducing step above is for RHEL-8.1 AV, and the reproducing steps for RHEL-8.1 can be seen in this link: https://bugzilla.redhat.com/show_bug.cgi?id=1619798#c9

Comment 3 yalzhang@redhat.com 2021-04-08 02:56:57 UTC
The guest cannot start with <cpu mode='host-model' check='full'/> on an EPYC-IBPB system
# rpm -q libvirt-libs qemu-kvm
libvirt-libs-7.0.0-12.module+el8.4.0+10596+32ba7df3.x86_64
qemu-kvm-5.2.0-14.module+el8.4.0+10425+ad586fa5.x86_64

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  2
Core(s) per socket:  32
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
BIOS Vendor ID:      AMD
CPU family:          23
Model:               1
Model name:          AMD EPYC 7601 32-Core Processor
BIOS Model name:     AMD EPYC 7601 32-Core Processor   
...
2. Prepare a VM with the following CPU setting:
<cpu mode='host-model' check='full'/>

3. Try to start the VM; it fails to start
# virsh start rhel
error: Failed to start domain 'rhel'
error: operation failed: guest CPU doesn't match specification: extra features: topoext
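
The contrast with comment 0, where the same kind of guest started with check='partial', can be pictured with a toy model. This is a sketch under stated assumptions, not libvirt's implementation: it assumes that check='full' compares the CPU the hypervisor actually created against the specification and rejects any extra feature, while check='partial' leaves that comparison to the hypervisor. The function name check_guest_cpu and the abbreviated feature sets are hypothetical.

```python
def check_guest_cpu(spec, actual, check):
    """Toy post-start check: return an error string, or None if the guest may run.

    spec   -- feature names requested in the CPU specification
    actual -- feature names the hypervisor actually enabled
    check  -- value of the check attribute ('partial' or 'full')
    """
    if check != "full":
        # Assumption: anything other than 'full' skips the post-start comparison.
        return None
    extra = sorted(actual - spec)
    if extra:
        return ("guest CPU doesn't match specification: extra features: "
                + ", ".join(extra))
    return None


# Abbreviated, hypothetical feature sets; only topoext matters here.
spec = {"x2apic", "hypervisor", "cmp_legacy"}
actual = spec | {"topoext"}  # the hypervisor adds topoext on its own

partial_result = check_guest_cpu(spec, actual, "partial")
full_result = check_guest_cpu(spec, actual, "full")

print(partial_result)
print(full_result)
```

Under these assumptions the 'partial' case starts without complaint, while the 'full' case produces a message of the same shape as the error shown above.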

Comment 4 John Ferlan 2021-09-08 13:30:23 UTC
Bulk update: Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

Comment 5 yalzhang@redhat.com 2022-04-22 09:01:50 UTC
I tested with the latest libvirt-8.2.0-1.el9.x86_64 on a host with an EPYC-Milan CPU; the issue in comment 0 can still be reproduced.

Comment 6 Jiri Denemark 2022-11-16 12:50:45 UTC
So apparently -cpu host does not enable topoext and thus libvirt does not show
it in the host-model definition in domain capabilities. But once libvirt
starts QEMU with that cpu-model, topoext is enabled by QEMU. Most likely
because the QEMU definition of EPYC-IBPB contains topoext. Thus it appears in
the live XML as

    <feature policy='require' name='topoext'/>

Checking this CPU definition with hypervisor-cpu-compare fails because libvirt
thinks the host cannot provide topoext (as it was not reported as enabled by
QEMU for -cpu host).

So it looks like topoext is just another magic feature which is only enabled
in some cases? But I'm not sure why -cpu host would refuse to enable it when a
named CPU model enables it by itself without an explicit request.
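
The chain of events described in this comment can be written down as a small simulation. It is only a sketch of the reasoning, not libvirt or QEMU code: the feature sets are abbreviated and hypothetical, MODEL_IMPLICIT encodes the "most likely" assumption that the named model brings topoext with it, and the verdict strings are illustrative labels rather than the tool's exact output.

```python
# What QEMU reports as enabled for "-cpu host" (abbreviated; no topoext).
HOST_REPORTED = {"x2apic", "hypervisor", "cmp_legacy"}

# Assumption: features the named CPU model enables by itself.
MODEL_IMPLICIT = {"topoext"}


def host_model_definition():
    """host-model as shown in domain capabilities: derived from -cpu host."""
    return set(HOST_REPORTED)


def live_definition(requested):
    """Features in the live XML: what was requested plus what the model adds."""
    return requested | MODEL_IMPLICIT


def compare(required):
    """Toy compatibility check against what the host is believed to provide."""
    missing = sorted(required - HOST_REPORTED)
    verdict = "incompatible" if missing else "compatible"
    return verdict, missing


requested = host_model_definition()
live = live_definition(requested)

before_start = compare(requested)  # host-model definition checked directly
after_start = compare(live)        # live XML fed back into the comparison

print(before_start)
print(after_start)
```

In this model the definition is compatible before the guest starts and incompatible once the live XML is fed back in, solely because topoext was added behind libvirt's back.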

Comment 7 yalzhang@redhat.com 2023-04-03 05:32:10 UTC
The bug can still be reproduced on RHEL 9.2 with libvirt-9.0.0-10.el9_2.x86_64 and qemu-kvm-7.2.0-11.el9_2.x86_64.
Since there is no explicit resolution for it, extend the stale date by 6 months.

Comment 10 RHEL Program Management 2023-09-22 12:18:20 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 11 RHEL Program Management 2023-09-22 12:19:32 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

Comment 12 Red Hat Bugzilla 2024-01-21 04:25:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days