A crucial feature of the OpenStack 13->16 FFWD upgrade stopped working on the latest RHEL 8.3.

Nova logs:

2021-01-26 23:30:25.169 7 ERROR nova.virt.libvirt.driver [req-774be110-7fb6-4865-a177-d624a821cf9e 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] CPU doesn't have compatibility. Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server [req-774be110-7fb6-4865-a177-d624a821cf9e 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] Exception during message handling: nova.exception.InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.

In the nested virt scenario we have a hypervisor hosting 2 VMs that hold the OpenStack computes. Live migration of VMs running on these computes fails with the Nova error above.

Comparison of the computes, virsh domcapabilities diff (left-only lines are marked '<', right-only lines '>', changed lines '|'):

OSP13 RHEL7.9 libvirt-daemon-4.5.0-36.el7_9.3.x86_64  | RHEL8.3 libvirt-daemon-6.6.0-7.1.module+el8.3.0+8852+b44fca9f.x86_64
<cpu>                                                 <
<mode name='host-passthrough' supported='yes'/>       <
<mode name='host-model' supported='yes'>                <mode name='host-model' supported='yes'>
<model fallback='forbid'>Skylake-Server-IBRS</model>  | <model fallback='forbid'>Cascadelake-Server</model>
<vendor>Intel</vendor>                                  <vendor>Intel</vendor>
<feature policy='require' name='ss'/>                   <feature policy='require' name='ss'/>
                                                      > <feature policy='require' name='vmx'/>
<feature policy='require' name='hypervisor'/>           <feature policy='require' name='hypervisor'/>
<feature policy='require' name='tsc_adjust'/>           <feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='clflushopt'/>         <
<feature policy='require' name='umip'/>                 <feature policy='require' name='umip'/>
<feature policy='require' name='pku'/>                  <feature policy='require' name='pku'/>
<feature policy='require' name='avx512vnni'/>         <
<feature policy='require' name='md-clear'/>             <feature policy='require' name='md-clear'/>
<feature policy='require' name='stibp'/>                <feature policy='require' name='stibp'/>
<feature policy='require' name='ssbd'/>               | <feature policy='require' name='arch-capabilities'/>
<feature policy='require' name='ibpb'/>                 <feature policy='require' name='ibpb'/>
                                                      > <feature policy='require' name='amd-stibp'/>
                                                      > <feature policy='require' name='amd-ssbd'/>
                                                      > <feature policy='require' name='rdctl-no'/>
                                                      > <feature policy='require' name='ibrs-all'/>
                                                      > <feature policy='require' name='skip-l1dfl-vmentry'/>
                                                      > <feature policy='require' name='mds-no'/>
                                                      > <feature policy='require' name='pschange-mc-no'/>
                                                      > <feature policy='disable' name='hle'/>
                                                      > <feature policy='disable' name='rtm'/>
<feature policy='disable' name='mpx'/>                  <feature policy='disable' name='mpx'/>
</mode>                                                 </mode>

virsh capabilities diff:

<cpu>                                                   <cpu>
<arch>x86_64</arch>                                     <arch>x86_64</arch>
<model>Skylake-Server-IBRS</model>                    | <model>Cascadelake-Server-noTSX</model>
<vendor>Intel</vendor>                                  <vendor>Intel</vendor>
<microcode version='1'/>                                <microcode version='1'/>
<topology sockets='8' cores='1' threads='1'/>         | <topology sockets='8' dies='1' cores='1' threads='1'/>
<feature name='ss'/>                                    <feature name='ss'/>
<feature name='vmx'/>                                   <feature name='vmx'/>
<feature name='osxsave'/>                               <feature name='osxsave'/>
<feature name='hypervisor'/>                            <feature name='hypervisor'/>
<feature name='tsc_adjust'/>                            <feature name='tsc_adjust'/>
<feature name='clflushopt'/>                          <
<feature name='umip'/>                                  <feature name='umip'/>
<feature name='pku'/>                                   <feature name='pku'/>
<feature name='ospke'/>                                 <feature name='ospke'/>
<feature name='avx512vnni'/>                          <
<feature name='md-clear'/>                              <feature name='md-clear'/>
<feature name='stibp'/>                                 <feature name='stibp'/>
<feature name='arch-facilities'/>                     | <feature name='arch-capabilities'/>
<feature name='ssbd'/>                                <
<feature name='xsaves'/>                                <feature name='xsaves'/>
<feature name='ibpb'/>                                  <feature name='ibpb'/>
                                                      > <feature name='amd-ssbd'/>
                                                      > <feature name='rdctl-no'/>
                                                      > <feature name='ibrs-all'/>
                                                      > <feature name='skip-l1dfl-vmentry'/>
                                                      > <feature name='mds-no'/>
                                                      > <feature name='pschange-mc-no'/>
                                                      > <feature name='tsx-ctrl'/>
<pages unit='KiB' size='4'/>                            <pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>                         <pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>                      <pages unit='KiB' size='1048576'/>
</cpu>                                                  </cpu>

Hypervisor: RHEL8.2 libvirt-client-6.0.0-25.5.module+el8.2.1+8680+ea98947b.x86_64

virsh domcapabilities:

<mode name='host-passthrough' supported='yes'/>
<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='tsc_adjust'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='pku'/>
  <feature policy='require' name='md-clear'/>
  <feature policy='require' name='stibp'/>
  <feature policy='require' name='arch-capabilities'/>
  <feature policy='require' name='xsaves'/>
  <feature policy='require' name='invtsc'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='amd-ssbd'/>
  <feature policy='require' name='rdctl-no'/>
  <feature policy='require' name='ibrs-all'/>
  <feature policy='require' name='skip-l1dfl-vmentry'/>
  <feature policy='require' name='mds-no'/>
  <feature policy='require' name='pschange-mc-no'/>
  <feature policy='require' name='tsx-ctrl'/>
</mode>

virsh capabilities:

<cpu>
  <arch>x86_64</arch>
  <model>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <microcode version='83898371'/>
  <counter name='tsc' frequency='2095077000' scaling='yes'/>
  <topology sockets='1' dies='1' cores='20' threads='2'/>
  <feature name='ds'/>
  <feature name='acpi'/>
  <feature name='ss'/>
  <feature name='ht'/>
  <feature name='tm'/>
  <feature name='pbe'/>
  <feature name='dtes64'/>
  <feature name='monitor'/>
  <feature name='ds_cpl'/>
  <feature name='vmx'/>
  <feature name='smx'/>
  <feature name='est'/>
  <feature name='tm2'/>
  <feature name='xtpr'/>
  <feature name='pdcm'/>
  <feature name='dca'/>
  <feature name='osxsave'/>
  <feature name='tsc_adjust'/>
  <feature name='cmt'/>
  <feature name='intel-pt'/>
  <feature name='pku'/>
  <feature name='ospke'/>
  <feature name='md-clear'/>
  <feature name='stibp'/>
  <feature name='arch-capabilities'/>
  <feature name='xsaves'/>
  <feature name='mbm_total'/>
  <feature name='mbm_local'/>
  <feature name='invtsc'/>
  <feature name='rdctl-no'/>
  <feature name='ibrs-all'/>
  <feature name='skip-l1dfl-vmentry'/>
  <feature name='mds-no'/>
  <feature name='tsx-ctrl'/>
  <pages unit='KiB' size='4'/>
  <pages unit='KiB' size='2048'/>
  <pages unit='KiB' size='1048576'/>
</cpu>
(I'm just looking at this bug.) IIUC, this is not the 'arch-facilities' (a RHEL-7-only thing) vs. 'arch-capabilities' (RHEL-8) issue from last year, which OSP fixed by not advertising the 'arch-facilities' CPU feature on the source host (RHEL-7): https://bugzilla.redhat.com/show_bug.cgi?id=1867128 : "[OSP-16] [Downstream-Only] Don't provide 'arch-facilities' CPU feature to migration XML, to avoid live migration breakage from EL7 to EL8"
Version details: (NB: This is a nested KVM environment.)

Source
------
On the "host" (a level-1 guest running RHEL-7):

  kernel-3.10.0-1160.6.1.el7.x86_64
  microcode_ctl-2.1-73.2.el7_9.x86_64
  libvirt-daemon-kvm-4.5.0-36.el7_9.3.x86_64
  qemu-kvm-rhev-2.12.0-48.el7_9.1.x86_64

Destination
------------
On the "host" (a level-1 guest running RHEL-8):

  kernel-4.18.0-240.el8.x86_64
  microcode_ctl-20200609-2.el8.x86_64
  qemu-kvm-5.1.0-14.module+el8.3.0+8790+80f9c6d8.1.x86_64
  libvirt-daemon-kvm-6.6.0-7.1.module+el8.3.0+8852+b44fca9f.x86_64

- - -

(Note: on both source and destination, QEMU and libvirt are running within the 'nova_libvirt' container on the respective "hosts".)
Created attachment 1751325 [details] Nova Compute log from the destination host that has the XML being passed to compareCPU(), the libvirt API
In the compute log file for the target host we can find the guest CPU XML that is being checked at:

2021-01-26 22:37:31.346 7 DEBUG nova.virt.libvirt.driver [req-a141dd51-ffb9-46c8-a8e7-e37d2bf68422 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] [instance: 462432e3-cd25-4c52-9c61-34aca9174bf4] cpu compare xml: <cpu>

Save that to guest.xml.

Now take the CPU from virsh capabilities and virsh domcapabilities, saving to hostcaps.xml and domcaps.xml respectively.

$ virsh cpu-baseline --features guest.xml > guest-full.xml
$ virsh cpu-baseline --features hostcaps.xml > hostcaps-full.xml
$ virsh cpu-baseline --features domcaps.xml > domcaps-full.xml

When I now compare them:

$ diff -u guest-full.xml hostcaps-full.xml | grep -E '^(-|\+)'
--- guest-full.xml 2021-01-27 17:22:20.655831989 +0000
+++ hostcaps-full.xml 2021-01-27 17:22:27.262779417 +0000
- <model fallback='forbid'>Skylake-Server-IBRS</model>
+ <model fallback='forbid'>Cascadelake-Server</model>
+ <feature policy='require' name='acpi'/>
+ <feature policy='require' name='arch-capabilities'/>
+ <feature policy='require' name='dca'/>
+ <feature policy='require' name='ds'/>
+ <feature policy='require' name='ds_cpl'/>
+ <feature policy='require' name='dtes64'/>
+ <feature policy='require' name='est'/>
- <feature policy='require' name='hypervisor'/>
- <feature policy='require' name='ibpb'/>
+ <feature policy='require' name='ht'/>
+ <feature policy='require' name='ibrs-all'/>
+ <feature policy='require' name='intel-pt'/>
+ <feature policy='require' name='invtsc'/>
+ <feature policy='require' name='mds-no'/>
+ <feature policy='require' name='monitor'/>
+ <feature policy='require' name='pbe'/>
+ <feature policy='require' name='pdcm'/>
+ <feature policy='require' name='rdctl-no'/>
+ <feature policy='require' name='skip-l1dfl-vmentry'/>
+ <feature policy='require' name='smx'/>
+ <feature policy='require' name='tm'/>
+ <feature policy='require' name='tm2'/>
- <feature policy='require' name='umip'/>
+ <feature policy='require' name='tsx-ctrl'/>
+ <feature policy='require' name='xtpr'/>

The three features marked '-' (hypervisor, ibpb, umip) are required by the guest but missing from the host capabilities; they are what will cause the virConnectCompareCPU method to return failure.

If we meanwhile compare against the domcaps:

$ diff -u guest-full.xml domcaps-full.xml | grep -E '^(-|\+)'
--- guest-full.xml 2021-01-27 17:22:20.655831989 +0000
+++ domcaps-full.xml 2021-01-27 17:22:32.321739161 +0000
- <model fallback='forbid'>Skylake-Server-IBRS</model>
+ <model fallback='forbid'>Cascadelake-Server</model>
+ <feature policy='require' name='amd-ssbd'/>
+ <feature policy='require' name='arch-capabilities'/>
+ <feature policy='require' name='ibrs-all'/>
+ <feature policy='require' name='invtsc'/>
+ <feature policy='require' name='mds-no'/>
+ <feature policy='require' name='pschange-mc-no'/>
+ <feature policy='require' name='rdctl-no'/>
+ <feature policy='require' name='skip-l1dfl-vmentry'/>
+ <feature policy='require' name='tsx-ctrl'/>

we see full compatibility: no guest feature is missing.

This difference reflects the design limitations of the original virConnectCompareCPU() API that Nova is using. This API compares against the host physical CPUID. There are features in this CPUID that KVM doesn't expose, and there are also features KVM emulates which are not in the host CPUID. The latter is what's causing the problem.

If Nova simply didn't call virConnectCompareCPU at all, then libvirt would do the CPU comparison itself during migration and "do the right thing".

If Nova absolutely must do a CPU comparison itself, then it needs to change to use virConnectCompareHypervisorCPU instead, which reflects the CPUID that KVM is actually able to expose.
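The comparison above boils down to a set difference between the features the guest requires and the features each host description offers. The following is only a minimal sketch of that idea, not what libvirt does internally: it assumes the XML has already been expanded with `virsh cpu-baseline --features` (so model-implied features are explicit), and the three sample documents are abbreviated, hand-written stand-ins rather than the attached files. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def required_features(cpu_xml):
    """Return the names of all features with policy 'require' in a <cpu> document.

    Assumption: a missing policy attribute is treated as 'require', which
    matches the plain <feature name='...'/> form used by virsh capabilities.
    """
    root = ET.fromstring(cpu_xml)
    return {
        feature.get("name")
        for feature in root.iter("feature")
        if feature.get("policy", "require") == "require"
    }


def missing_on_host(guest_xml, host_xml):
    """Features the guest requires that the host description does not list."""
    return required_features(guest_xml) - required_features(host_xml)


# Abbreviated illustrative samples (NOT the real attachments).
GUEST = """<cpu>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='umip'/>
</cpu>"""

# Host CPUID view: lacks the features that KVM emulates or synthesizes.
HOSTCAPS = """<cpu>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='acpi'/>
  <feature policy='require' name='invtsc'/>
</cpu>"""

# Hypervisor view: what KVM/QEMU can actually expose to a guest.
DOMCAPS = """<cpu>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='ibpb'/>
  <feature policy='require' name='umip'/>
  <feature policy='require' name='invtsc'/>
</cpu>"""

print(sorted(missing_on_host(GUEST, HOSTCAPS)))
print(sorted(missing_on_host(GUEST, DOMCAPS)))
```

With these samples the first call reports the three features that only the hypervisor view provides, and the second reports nothing missing, which mirrors the two diffs above.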
Thanks for the detailed comment, Dan.

I agree: Nova's current usage of compareCPU() and baselineCPU() is outdated, and it should switch to the compareHypervisorCPU() and baselineHypervisorCPU() APIs.

FWIW, that is what I've outlined in the design of this Nova spec[1], where the commit message does recognize the problem:

    Make Nova's guest CPU selection approach more effective and reliable by introducing two new QEMU- and libvirt-based CPU configuration APIs: baselineHypervisorCPU() and compareHypervisorCPU(). These new APIs are more "hypervisor-literate" compared to the existing libvirt APIs that Nova uses. As in, the new APIs take into account what the "host hypervisor" (meaning: KVM, QEMU, and what libvirt knows about the host) is capable of. Taking advantage of these newer APIs will allow Nova to make more well-informed decisions when determining CPU models that are compatible across different hosts.

And there's WIP to that end[2]. I'll work with upstream Nova to accelerate it.

[1] Add "CPU selection with hypervisor consideration" spec: https://opendev.org/openstack/nova-specs/commit/70811da221035044e27
[2] CPU selection with hypervisor consideration: https://review.opendev.org/c/openstack/nova/+/762330/
Created attachment 1751339 [details] guest.xml (generated by putting the <cpu> </cpu> elements from the nova-compute.log.1 at the timestamp 2021-01-26 22:37:31.346)
Created attachment 1751342 [details] domcaps.xml from L0 The domcaps.xml is generated by taking the 'arch', 'model', 'vendor' and 'feature' (only from the CPU) from the `virsh domcapabilities` on the baremetal host (L0), and putting it all under a <cpu> element in domcaps.xml
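For anyone who prefers to script the extraction described for domcaps.xml, here is a rough sketch. It assumes the usual `virsh domcapabilities` layout (a top-level `<arch>` element and a `<cpu>` element holding one `<mode name='host-model'>` child); the function name and the sample document are made up for illustration, and the attachment itself was assembled by hand.

```python
import xml.etree.ElementTree as ET


def host_model_cpu(domcaps_xml):
    """Build a standalone <cpu> document from `virsh domcapabilities` output.

    Copies the top-level <arch> plus the <model>, <vendor> and <feature>
    children of the host-model mode, as described in the comment above.
    """
    root = ET.fromstring(domcaps_xml)
    mode = root.find("./cpu/mode[@name='host-model']")
    if mode is None or mode.get("supported") != "yes":
        raise ValueError("host-model is not reported as supported")

    cpu = ET.Element("cpu")
    arch = root.find("arch")
    if arch is not None:
        ET.SubElement(cpu, "arch").text = arch.text
    for child in mode:
        if child.tag in ("model", "vendor", "feature"):
            cpu.append(child)
    return ET.tostring(cpu, encoding="unicode")


# Hypothetical, heavily shortened domcapabilities output.
SAMPLE = """<domainCapabilities>
  <arch>x86_64</arch>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
    </mode>
  </cpu>
</domainCapabilities>"""

print(host_model_cpu(SAMPLE))
```

The resulting string can be written to domcaps.xml and passed to `virsh cpu-baseline --features` as in the earlier comment.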
Created attachment 1751345 [details] hostcaps.xml from L0 The hostcaps.xml is generated by running `virsh capabilities` on the L0 host. - - - (With these three files, guest.xml, domcaps.xml, and hostcaps.xml, you can now reconstruct the `virsh cpu-baseline` results and the `diff`s on any machine based on what Dan describes in comment#4.)
Oops, I forgot to send my comment and Daniel explained it already. The only thing I have to add is a link to bug 1611845 in which we were discussing the same issue and switching to the *HypervisorCPU APIs was suggested.
For what it's worth, I've got a patch here that removes the compareCPU() check on the destination and lets libvirt do the right thing: https://review.opendev.org/c/openstack/nova/+/772917 : [WIP] libvirt: Remove compareCPU() check on the destination
With the patch I have failure on source: 2021-01-29 22:00:26.661 9 ERROR nova.virt.libvirt.driver [-] [instance: d84c2601-201e-4e06-9cb7-debf06c66ed7] Live Migration failure: operation failed: guest CPU doesn't match specification: missing features: hle,rtm: libvirt.libvirtError: operation failed: guest CPU doesn't match specification: missing features: hle,rtm
(In reply to Lukas Bezdicka from comment #12)
> libvirt.libvirtError: operation failed: guest CPU doesn't match
> specification: missing features: hle,rtm

This is most likely caused by trying to migrate a domain from a host with TSX enabled to a host with TSX disabled.
(In reply to Lukas Bezdicka from comment #12)
> With the patch I have failure on source:
>
> 2021-01-29 22:00:26.661 9 ERROR nova.virt.libvirt.driver [-] [instance:
> d84c2601-201e-4e06-9cb7-debf06c66ed7] Live Migration failure: operation
> failed: guest CPU doesn't match specification: missing features: hle,rtm:
> libvirt.libvirtError: operation failed: guest CPU doesn't match
> specification: missing features: hle,rtm

Okay, that error is unrelated to the original problem.

As for the error you're seeing, I just learnt from KVM developers that what you're hitting is due to a different reason -- RHEL-8.3 has disabled TSX (which disables those two CPU features: 'hle' and 'rtm').

https://bugzilla.redhat.com/show_bug.cgi?id=1828642 : kernel: Disable Intel TSX by default on newer CPUs

The only workaround in this case (where you're using 'host-model', which Nova defaults to) is to temporarily turn on TSX on the RHEL-8.3 kernel command-line, in /etc/default/grub:

    GRUB_CMDLINE_LINUX_DEFAULT="[...] tsx=on"
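To check whether a given host currently has TSX enabled, one can look at the CPU flags and the kernel command line. The helper below is a small sketch of that check; the function name is hypothetical, and it takes the file contents as strings so it can be tried without a RHEL host. It only reports what it finds and does not decide whether migration will succeed.

```python
def tsx_state(cpuinfo_text, cmdline_text):
    """Summarize TSX-related state from /proc/cpuinfo and /proc/cmdline text.

    Returns a dict saying whether the 'hle' and 'rtm' flags are present on the
    first CPU listed, and which value (if any) the 'tsx=' kernel parameter has.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            break  # the first CPU is enough for this check

    tsx_param = None
    for token in cmdline_text.split():
        # Match 'tsx=' exactly so that 'tsx_async_abort=' is not picked up.
        if token.startswith("tsx="):
            tsx_param = token.split("=", 1)[1]

    return {"hle": "hle" in flags, "rtm": "rtm" in flags, "tsx_param": tsx_param}


# Made-up sample inputs for illustration.
sample_cpuinfo = "processor\t: 0\nflags\t\t: fpu vmx hle rtm\n"
sample_cmdline = "BOOT_IMAGE=/vmlinuz ro tsx=on quiet"
print(tsx_state(sample_cpuinfo, sample_cmdline))
```

On a real host the two arguments would come from reading /proc/cpuinfo and /proc/cmdline; in this nested setup the check is relevant on the RHEL-8.3 compute, since that is where the kernel default changed.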
Note that host-model only provides *safety* for live migration: if migration succeeds, the guest will run correctly and will have no ABI change. Host-passthrough does not provide safety; that is, it's up to the administrator to ensure that the source and destination hosts are identical in hardware, kernel/QEMU version, microcode version and configuration.

However, even for host-model there is no guarantee that migration succeeds if there are differences between source and destination hosts (again, for any of hardware, kernel version, microcode version and configuration). _Usually_ old->new works, but not always. In the past Intel has disabled features in microcode updates (which you could get just by updating your destination host to a more recent RHEL minor release). More rarely, features were disabled in newer processor generations. Regarding software and configuration, migrating to a newer QEMU should always be safe, but bug 1828642 is one case where a newer kernel version changes the defaults and makes it impossible to migrate to a newer kernel (without manually reverting the configuration changes).
(In reply to Paolo Bonzini from comment #15)
> Note that host-model only provides *safety* for live migration: if migration
> succeeds, the guest will run correctly and will have no ABI change.
> Host-passthrough does not provide safety, that is it's up to the
> administrator to ensure that the source and destination hosts are
> identical in hardware, kernel/QEMU version, microcode version and
> configuration.
>
> However, even for host-model there is no guarantee that migration succeeds
> if there are differences between source and destination hosts (again, for
> any of hardware, kernel version, microcode version and configuration).

Note host-model is just syntactic sugar around a named CPU model. So essentially this is saying that there is no guarantee of forwards migration for any CPU model. This rather compromises the main point of using a named CPU model / host-model.

Obviously if the hardware has disabled a feature (due to a microcode update) then there's usually nothing the software stack can do to fix that.

If the hardware still supports the feature (because the user intentionally didn't install the microcode which breaks compat), then IMHO it is unreasonable for the kernel to then intentionally break forwards compatibility out of the box within a minor y-stream update.
For what it's worth, I've filed this: https://bugzilla.redhat.com/show_bug.cgi?id=1923118 — [kernel] "redhat/configs: Change Intel TSX default to off" breaks live migration of KVM guests I'm not expecting a revert here; but I filed it for the sake of discussion.
I completely forgot: last year, we _had_ a similar report in upstream Nova of failing live migration with a Cascade Lake CPU as the destination, and we went with this band-aid:

https://review.opendev.org/c/openstack/nova/+/757577 : Handle disabled CPU features to fix live migration failures

    commit eeeca4ceff576beaa8558360c8a6a165d716f996
    Author: Andrew Bonney <andrew.bonney.uk>
    Date: Tue Oct 6 14:42:38 2020 +0100

    Handle disabled CPU features to fix live migration failures

    When performing a live migration between hypervisors running libvirt, where one or more CPU features are disabled, nova does not take account of these. This results in migration failures as none of the available hypervisor targets appear compatible.

    This patch ensures that the libvirt 'disable' policy is taken account of, at least in a basic sense, by explicitly ignoring items flagged in this way when enumerating CPU features.

    Closes-Bug: #1898715
    Change-Id: Iaf14ca97cfac99dd280d1114123f2d4bb6292b63
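The idea in that commit message, skipping features flagged with policy 'disable' when enumerating CPU features, can be illustrated as follows. This is a simplified sketch and not the code from the Nova patch; the function name is hypothetical and the sample is a shortened version of the RHEL 8.3 host-model output quoted at the top of this bug.

```python
import xml.etree.ElementTree as ET


def enabled_features(cpu_xml):
    """Enumerate CPU feature names, ignoring those with policy='disable'."""
    root = ET.fromstring(cpu_xml)
    return {
        feature.get("name")
        for feature in root.iter("feature")
        if feature.get("policy") != "disable"
    }


# Shortened from the RHEL 8.3 domcapabilities output in the description.
RHEL83_HOST_MODEL = """<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <feature policy='require' name='ss'/>
  <feature policy='require' name='vmx'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
  <feature policy='disable' name='mpx'/>
</mode>"""

print(sorted(enabled_features(RHEL83_HOST_MODEL)))
```

Without the policy check, 'hle', 'rtm' and 'mpx' would be counted as features the host offers, even though libvirt reports them as disabled.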
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2021:3483