Bug 1815572 - VM live migration fails: the CPU is incompatible with host CPU: Host CPU does not provide required features: virt-ssbd
Summary: VM live migration fails: the CPU is incompatible with host CPU: Host CPU does...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-20 15:39 UTC by Oliver Freyermuth
Modified: 2022-09-02 06:30 UTC (History)

Fixed In Version: libvirt-4.5.0-34.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-29 20:29:01 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1744281 | 1 | None | None | None | 2023-09-07 20:26:27 UTC
Red Hat Bugzilla | 1745181 | 0 | urgent | CLOSED | virt-ssbd not included on CPU mode='host-model' | 2023-09-07 20:27:44 UTC
Red Hat Product Errata | RHSA-2020:4000 | 0 | None | None | None | 2020-09-29 20:29:40 UTC

Internal Links: 1897948

Description Oliver Freyermuth 2020-03-20 15:39:08 UTC
Description of problem:

Live migration using "virsh" fails for VMs with "host-model" CPUs:

virsh migrate vm.example.com qemu+ssh://kvm010.example.com/system --live --p2p --tunnelled --auto-converge --compressed --verbose --unsafe


Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-167.el7_7.4.x86_64
libvirt-4.5.0-23.el7_7.5.x86_64

How reproducible:
Always. 

Steps to Reproduce:
1. Create a VM with <cpu mode='host-model'/> on a host supporting SSBD. 
2. Start the VM and observe "<feature policy='require' name='virt-ssbd'/>" when running "virsh dumpxml vm.example.com". 
3. Try to live-migrate it using:
   virsh migrate vm.example.com qemu+ssh://kvm010.example.com/system --live --p2p --tunnelled --auto-converge --compressed --verbose --unsafe

Actual results:
Error:
the CPU is incompatible with host CPU: Host CPU does not provide required features: virt-ssbd

Expected results:
It works. 

Additional info:
The host can never offer "virt-ssbd". This is probably a side-effect of the fix done in:
https://bugzilla.redhat.com/show_bug.cgi?id=1745181
which added virt-ssbd to the guest flags. 
I am not sure if:
https://bugzilla.redhat.com/show_bug.cgi?id=1744281
will also fix this side-effect.

Comment 4 Jiri Denemark 2020-03-23 14:40:54 UTC
Since this is a migration between two hosts, could you please tell us the
version of kernel, qemu-kvm, and libvirt from both the source and the
destination host?

What happens if you create a new VM with <cpu mode='host-model'/> on
kvm010.example.com, is "<feature policy='require' name='virt-ssbd'/>" also
present in the output of virsh dumpxml THE_NEW_DOMAIN?

Comment 5 Oliver Freyermuth 2020-03-23 14:54:13 UTC
Of course:
kvm010 (source): 
Kernel: 3.10.0-1062.18.1.el7.x86_64
qemu-kvm-1.5.3-167.el7_7.4.x86_64
libvirt-4.5.0-23.el7_7.6.x86_64

kvm009 (target):
Kernel: 3.10.0-1062.18.1.el7.x86_64
qemu-kvm-1.5.3-167.el7_7.4.x86_64
libvirt-4.5.0-23.el7_7.6.x86_64

Both nodes were rebooted a few days ago, so they are running the installed versions.

After creating a new VM on kvm010, the issue persists:

[root@kvm010 ~]# grep host-model /etc/libvirt/qemu/broken-vm.example.com.xml -A2
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

[root@kvm010 ~]# virsh dumpxml broken-vm.example.com | grep ssbd
    <feature policy='require' name='virt-ssbd'/>

Comment 6 Jiri Denemark 2020-03-23 16:26:01 UTC
OK, so both hosts correctly enable virt-ssbd (it's emulated by the
virtualization stack, hence the "virt-" prefix), but something's wrong with the
host CPU compatibility check.

Could you please provide complete domain XML of the running domain (or at
least its <cpu> element including children)?

Comment 7 Oliver Freyermuth 2020-03-23 16:58:15 UTC
Here's the CPU element of the running domain:
----------------
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Opteron_G4</model>
    <vendor>AMD</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='mmxext'/>
    <feature policy='require' name='fxsr_opt'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='disable' name='extapic'/>
    <feature policy='require' name='cr8legacy'/>
    <feature policy='require' name='osvw'/>
    <feature policy='disable' name='ibs'/>
    <feature policy='disable' name='skinit'/>
    <feature policy='disable' name='wdt'/>
    <feature policy='disable' name='nodeid_msr'/>
    <feature policy='disable' name='topoext'/>
    <feature policy='disable' name='perfctr_core'/>
    <feature policy='disable' name='perfctr_nb'/>
    <feature policy='require' name='ibpb'/>
    <feature policy='require' name='virt-ssbd'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='rdtscp'/>
    <feature policy='disable' name='svm'/>
  </cpu>
----------------

In case it is also of interest, here are the <cpu> elements from the capabilities commands:
---------------
# virsh capabilities
...
    <cpu>
      <arch>x86_64</arch>
      <model>Opteron_G4</model>
      <vendor>AMD</vendor>
      <microcode version='100664894'/>
      <counter name='tsc' frequency='2299999000'/>
      <topology sockets='1' cores='64' threads='1'/>
      <feature name='vme'/>
      <feature name='ht'/>
      <feature name='monitor'/>
      <feature name='osxsave'/>
      <feature name='mmxext'/>
      <feature name='fxsr_opt'/>
      <feature name='cmp_legacy'/>
      <feature name='extapic'/>
      <feature name='cr8legacy'/>
      <feature name='osvw'/>
      <feature name='ibs'/>
      <feature name='skinit'/>
      <feature name='wdt'/>
      <feature name='lwp'/>
      <feature name='nodeid_msr'/>
      <feature name='topoext'/>
      <feature name='perfctr_core'/>
      <feature name='perfctr_nb'/>
      <feature name='invtsc'/>
      <feature name='ibpb'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
      <pages unit='KiB' size='1048576'/>
    </cpu>
...
---------------
# virsh domcapabilities
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='allow'>Opteron_G4</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ht'/>
      <feature policy='require' name='monitor'/>
      <feature policy='require' name='osxsave'/>
      <feature policy='require' name='mmxext'/>
      <feature policy='require' name='fxsr_opt'/>
      <feature policy='require' name='cmp_legacy'/>
      <feature policy='require' name='extapic'/>
      <feature policy='require' name='cr8legacy'/>
      <feature policy='require' name='osvw'/>
      <feature policy='require' name='ibs'/>
      <feature policy='require' name='skinit'/>
      <feature policy='require' name='wdt'/>
      <feature policy='require' name='nodeid_msr'/>
      <feature policy='require' name='topoext'/>
      <feature policy='require' name='perfctr_core'/>
      <feature policy='require' name='perfctr_nb'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='require' name='ibpb'/>
    </mode>
    <mode name='custom' supported='yes'>
      <model usable='unknown'>EPYC-IBPB</model>
      ...
...
---------------
And the actual flags the CPU reports:
---------------
# cat /proc/cpuinfo  | grep flags | head -n1
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt fma4 nodeid_msr topoext perfctr_core perfctr_nb cpb hw_pstate retpoline_amd ssbd ibpb vmmcall arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
---------------

Let me know if there's anything else of interest.

Comment 9 Jiri Denemark 2020-04-23 13:44:11 UTC
This bug can be easily reproduced even on a single host using save/restore:

1. define a domain with <cpu mode='host-model'/>
2. virsh start $DOM
3. virsh managedsave $DOM
4. virsh start $DOM
error: Failed to start domain $DOM
error: the CPU is incompatible with host CPU: Host CPU does not provide required features: virt-ssbd

The bug is triggered by a RHEL-7 only hack for bug 1745181, which adds
virt-ssbd feature to all host-model CPUs on AMD hosts. Depending on the state
of virt-ssbd in the freshly started VM, it may or may not appear in the domain
definition. For compatibility with old QEMU the domain XML used for migration
and save/restore contains the original CPU definition (used when starting a
domain) with check='partial' and the real CPU definition (modified to match
the actual CPU created by QEMU) with check='full' is stored separately in a
cookie.

With QEMU 1.5.3 domains are started with the CPU definition from domain XML
and the CPU def in cookie is ignored. Since the original CPU definition is
stored after we add virt-ssbd (for bug 1745181), the domain XML will contain
this feature. However, neither host CPU capabilities nor domain capabilities
contain virt-ssbd (as QEMU 1.5.3 is too old to report its support) and libvirt
complains the host CPU does not support virt-ssbd when checking compatibility
of the CPU definition with virt-ssbd.
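
The mismatch described above can be illustrated with a minimal Python sketch. It is not libvirt's actual compatibility check (which also expands the CPU model into its features); it only takes a naive set difference over explicitly listed features. The XML snippets are trimmed-down copies of the <cpu> elements quoted earlier in this report, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down domain <cpu> element (only a few features kept for brevity).
DOMAIN_CPU = """
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>Opteron_G4</model>
  <feature policy='require' name='vme'/>
  <feature policy='disable' name='ht'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='virt-ssbd'/>
</cpu>
"""

# Trimmed-down host-model entry from "virsh domcapabilities": no virt-ssbd.
HOST_MODEL_CPU = """
<mode name='host-model' supported='yes'>
  <model fallback='allow'>Opteron_G4</model>
  <feature policy='require' name='vme'/>
  <feature policy='require' name='ht'/>
  <feature policy='require' name='ibpb'/>
</mode>
"""


def required_features(cpu_xml):
    """Return the names of all features with policy='require'."""
    root = ET.fromstring(cpu_xml)
    return {
        feature.get("name")
        for feature in root.findall("feature")
        if feature.get("policy") == "require"
    }


def missing_on_host(domain_xml, host_xml):
    """Features the domain requires that the host-model entry does not list."""
    return sorted(required_features(domain_xml) - required_features(host_xml))


print(missing_on_host(DOMAIN_CPU, HOST_MODEL_CPU))
```

With the trimmed data above, the only feature required by the domain but absent from the host-model entry is virt-ssbd, which matches the feature named in the error message.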

Comment 13 jiyan 2020-05-25 08:41:18 UTC
Reproduced this issue with libvirt-4.5.0-33.el7.x86_64.

Version:
libvirt-4.5.0-33.el7.x86_64
qemu-kvm-1.5.3-174.el7.x86_64
kernel-3.10.0-1144.el7.x86_64

Steps:
1. Prepare a shut-off VM with the following configuration
# virsh domstate test79
shut off

# virsh dumpxml test79 --inactive | grep "<cpu" -A2
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

2. Start VM and check active dumpxml 
# virsh start test79
Domain test79 started

# virsh dumpxml test79 | grep "<cpu" -A23
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <vendor>AMD</vendor>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='disable' name='extapic'/>
    <feature policy='disable' name='skinit'/>
    <feature policy='disable' name='wdt'/>
    <feature policy='disable' name='tce'/>
    <feature policy='disable' name='topoext'/>
    <feature policy='disable' name='perfctr_core'/>
    <feature policy='disable' name='perfctr_nb'/>
    <feature policy='require' name='virt-ssbd'/>   ******
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='arat'/>
    <feature policy='disable' name='svm'/>
  </cpu>

3. Managedsave VM and then start VM
# virsh managedsave test79
Domain test79 state saved by libvirt

# virsh start test79
error: Failed to start domain test79
error: the CPU is incompatible with host CPU: Host CPU does not provide required features: virt-ssbd
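
The start / managedsave / start sequence above can be scripted. The following is a minimal Python sketch, assuming virsh is on the PATH and the domain is already defined with <cpu mode='host-model'/>; the helper names are hypothetical, and only the three virsh subcommands come from this report.

```python
import subprocess


def reproducer_commands(domain):
    """Build the three virsh invocations used in the steps above."""
    return [
        ["virsh", "start", domain],
        ["virsh", "managedsave", domain],
        ["virsh", "start", domain],
    ]


def run_reproducer(domain):
    """Run the steps in order.

    Returns the 1-based number of the first failing step,
    or None if every command exited with status 0.
    """
    for step, argv in enumerate(reproducer_commands(domain), start=1):
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"step {step} ({' '.join(argv)}) failed: {result.stderr.strip()}")
            return step
    return None
```

Calling run_reproducer("test79") on a host with an affected libvirt build would be expected to fail at the last step, the second "virsh start"; nothing is executed until the function is called.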

Comment 14 jiyan 2020-05-25 08:44:17 UTC
Verified this issue with libvirt-4.5.0-36.el7.x86_64.
1. Upgrade libvirt and restart libvirtd service based on the previous comment
# yum update libvirt* -y

# rpm -qa libvirt
libvirt-4.5.0-36.el7.x86_64

# systemctl restart libvirtd

2. Prepare a shut-off VM with the following configuration
# virsh domstate test79-new
shut off

# virsh dumpxml test79-new --inactive | grep "<cpu" -A2
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

3. Start VM, then check active dumpxml and qemu cmd line and guest cpu flag
# virsh start test79-new
Domain test79-new started

# virsh dumpxml test79-new  | grep "<cpu" -A18
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <vendor>AMD</vendor>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='disable' name='extapic'/>
    <feature policy='disable' name='skinit'/>
    <feature policy='disable' name='wdt'/>
    <feature policy='disable' name='tce'/>
    <feature policy='disable' name='topoext'/>
    <feature policy='disable' name='perfctr_core'/>
    <feature policy='disable' name='perfctr_nb'/>
**    <feature policy='require' name='virt-ssbd'/> **
    <feature policy='disable' name='monitor'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='disable' name='arat'/>
    <feature policy='disable' name='svm'/>
  </cpu>

# ps -ef | grep test79-new
-cpu EPYC-IBPB,+ht,+osxsave,+cmp_legacy,+extapic,+skinit,+wdt,+tce,+topoext,+perfctr_core,+perfctr_nb,** +virt-ssbd **

# virsh console test79-new
Connected to domain test79-new
Escape character is ^]
Red Hat Enterprise Linux Server 7.9 Beta (Maipo)
Kernel 3.10.0-1136.el7.x86_64 on an x86_64
localhost login: root
Password: 
[root@localhost ~]# lscpu | grep ssbd
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw retpoline_amd ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 ** virt_ssbd  ** arat

4. Managedsave VM and then start VM
# virsh managedsave test79-new

Domain test79-new state saved by libvirt

# virsh start test79-new
Domain test79-new started

# virsh console test79-new
Connected to domain test79-new
Escape character is ^]

[root@localhost ~]# lscpu | grep ssbd
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw retpoline_amd ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 ** virt_ssbd ** arat

All test results are as expected; moving this bug to VERIFIED.

Comment 17 errata-xmlrpc 2020-09-29 20:29:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: libvirt security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4000

Comment 18 Oliver Freyermuth 2020-11-15 19:21:44 UTC
Sadly, the originally reported issue does not seem to be fixed; it now shows up in a different way.
While "managedsave" is fixed, live migration still fails!

A freshly started VM shows this in /var/log/libvirt/qemu/FQDN.log:

-cpu Opteron_G4,+vme,+ht,+monitor,+osxsave,+mmxext,+fxsr_opt,+cmp_legacy,+extapic,+cr8legacy,+osvw,+ibs,+skinit,+wdt,+nodeid_msr,+topoext,+perfctr_core,+perfctr_nb,+ibpb,+virt-ssbd \

However, migrating it to another hypervisor reveals this when qemu is started during migration:

-cpu Opteron_G4,+vme,+ht,+monitor,+osxsave,+mmxext,+fxsr_opt,+cmp_legacy,+extapic,+cr8legacy,+osvw,+ibs,+skinit,+wdt,+nodeid_msr,+topoext,+perfctr_core,+perfctr_nb,+ibpb \

Both nodes are running 7.9 (i.e. the updated packages). The migrated VM freezes a few seconds after migration. 
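
The two -cpu lines above can be compared mechanically. This is a minimal Python sketch, assuming the -cpu value has the form "MODEL,+feature,+feature,..." as in the logs quoted here (trailing line-continuation backslashes removed); cpu_flags is a hypothetical helper, not part of any libvirt or QEMU tooling.

```python
def cpu_flags(cpu_arg):
    """Split a QEMU -cpu value into (model, set of enabled feature names)."""
    model, *flags = cpu_arg.strip().split(",")
    return model, {flag[1:] for flag in flags if flag.startswith("+")}


# -cpu value logged when the VM is started freshly on the source host.
SOURCE = (
    "Opteron_G4,+vme,+ht,+monitor,+osxsave,+mmxext,+fxsr_opt,+cmp_legacy,"
    "+extapic,+cr8legacy,+osvw,+ibs,+skinit,+wdt,+nodeid_msr,+topoext,"
    "+perfctr_core,+perfctr_nb,+ibpb,+virt-ssbd"
)

# -cpu value logged when qemu is started on the destination during migration.
DESTINATION = (
    "Opteron_G4,+vme,+ht,+monitor,+osxsave,+mmxext,+fxsr_opt,+cmp_legacy,"
    "+extapic,+cr8legacy,+osvw,+ibs,+skinit,+wdt,+nodeid_msr,+topoext,"
    "+perfctr_core,+perfctr_nb,+ibpb"
)

src_model, src_flags = cpu_flags(SOURCE)
dst_model, dst_flags = cpu_flags(DESTINATION)

# Features enabled on the source but dropped on the destination.
dropped = sorted(src_flags - dst_flags)
print(src_model == dst_model, dropped)
```

For the two lines quoted in this comment, the model is the same on both sides and the only dropped feature is virt-ssbd.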

Should I report this in a new issue, since the symptoms are different?

Comment 19 Oliver Freyermuth 2020-11-15 19:34:41 UTC
I have reported this as a new issue here:
https://bugzilla.redhat.com/show_bug.cgi?id=1897948
following the statement "If the solution does not work for you, open a new bug report.".

