Caching of qemu capabilities is problematic when the host changes between reboots. That's a frequent case with nesting, when people try to "emulate" different capabilities for testing purposes. Since nesting is now supported, it would be great to clear out the cache on every boot. It shouldn't be that expensive.
(In reply to Michal Skrivanek from comment #0)
> caching of qemu capabilities is problematic when host changes between
> reboots. That's a frequent case with nesting, when people try to "emulate"
> different capabilities for testing purposes
> Since nesting is now supported it would be great to clear out the cache on
> every boot. It shouldn't be that expensive

It really is expensive & avoiding this cost on reboot is a critical reason why we have the caching.

Can you give an example of things you are changing in the L1 virtual host that cause the L2 capabilities cache to become invalid?
(In reply to Daniel Berrangé from comment #1)
> It really is expensive & avoiding this cost on reboot is a critical reason
> why we have the caching.
>
> Can you give an example of things you are changing in the L1 virtual host,
> that cause the L2 capabilities cache to become invalid ?

The guest CPU model. This happened in QE; I believe they use a single template and deploy several L1 hosts, each "emulating" a different CPU for L2 guests.
(In reply to Michal Skrivanek from comment #2)
> they use a single template and deploy several L1 hosts, each "emulating"
> different CPU for L2 guests

Let me rephrase: they deploy several L1 VMs, each with a different CPU, serving as hosts for L2 guests. The problem is that libvirt caches the capabilities in that L1 VM template on the first run, and when the VMs are cloned and reconfigured to use a different CPU, they preserve the (now wrong) cache. The same would happen simply by changing the L1 VM's CPU type for whatever purpose. Since we support nested virt now, we should allow easy changes of the nested host's CPU.

Could we perhaps do some basic sanity check on boot, that the flags (or something else?) are the same as on the last boot?
(In reply to Michal Skrivanek from comment #3)
> Could we perhaps do some basic sanity check on boot, that the flags (or
> something else?) is the same as last boot?

That's why I changed the subject to refer to invalidation when the host CPUID changes.
Comparing complete CPUID data would be impractical and quite complicated, as we'd have to ignore some pieces which differ depending on the CPU core or thread they were gathered from. I think it should be sufficient to compare the CPU vendor, model name, and signature (family, model, and stepping) in addition to the microcode version, which we check now. Theoretically, vendor and signature should be enough, but we're mostly targeting the nested scenario with this issue, and QEMU may use the same signature for multiple CPU models. Specifically, Opteron_G1 and Opteron_G2 both report family=15, model=6, stepping=1.
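As a rough illustration of the idea (not libvirt's actual C code; the `parse_signature` helper and the sample text are hypothetical), here is a Python sketch that pulls the vendor, model name, and family/model/stepping out of /proc/cpuinfo-style text and combines them into one short signature string. The microcode field is intentionally left out, since it is already checked separately.

```python
# Hypothetical sketch: build a short host CPU "signature" from
# /proc/cpuinfo-style text, combining vendor + model name +
# family/model/stepping as proposed above. Only the first CPU
# block is parsed, since per-core differences should be ignored.
def parse_signature(cpuinfo_text):
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            break  # stop at the end of the first CPU block
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return "%s, %s, family: %s, model: %s, stepping: %s" % (
        fields["vendor_id"],
        fields["model name"],
        fields["cpu family"],
        fields["model"],
        fields["stepping"],
    )

# Sample data mimicking the first block of /proc/cpuinfo.
sample = (
    "vendor_id\t: GenuineIntel\n"
    "cpu family\t: 6\n"
    "model\t\t: 85\n"
    "model name\t: Intel Xeon Processor (Cascadelake)\n"
    "stepping\t: 6\n"
    "microcode\t: 0x1\n"
)

print(parse_signature(sample))
# GenuineIntel, Intel Xeon Processor (Cascadelake), family: 6, model: 85, stepping: 6
```

A signature like this changes whenever the (virtual) host CPU model is swapped, while staying stable across cores and reboots of the same CPU.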
Yeah, vendor+name+signature sounds reasonable to me.
Patches sent upstream for review: https://www.redhat.com/archives/libvir-list/2020-May/msg00848.html
Pushed upstream as

commit a551dd5fdf71b252949e258eb49403df4d8db82d
Refs: v6.3.0-164-ga551dd5fdf
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Apr 1 00:44:00 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon May 25 16:09:41 2020 +0200

    hostcpu: Introduce virHostCPUGetSignature

    The purpose of this function is to give a short description that would
    change when a host CPU is replaced with a different model. This is
    currently implemented by reading /proc/cpuinfo. It should be
    implemented for all architectures for which the QEMU driver stores
    host CPU data in the capabilities cache. In other words, for archs
    that support host-model CPUs.

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>

commit 44f826e4a0a865fce0059cdd826432b8144f6e3e
Refs: v6.3.0-165-g44f826e4a0
Author:     Jiri Denemark <jdenemar>
AuthorDate: Wed Apr 1 19:55:27 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon May 25 16:09:41 2020 +0200

    hostcpu: Implement virHostCPUGetSignature for x86

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>

commit 2a68ceaa6e2e45bbb05ffa15b4cdf45cba38958f
Refs: v6.3.0-166-g2a68ceaa6e
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Apr 2 22:35:30 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon May 25 16:09:41 2020 +0200

    hostcpu: Implement virHostCPUGetSignature for ppc64

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>

commit d3d87e0cefd7144c559dd23fef789e7e37f74e76
Refs: v6.3.0-167-gd3d87e0cef
Author:     Jiri Denemark <jdenemar>
AuthorDate: Mon Apr 20 15:48:13 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon May 25 16:09:58 2020 +0200

    hostcpu: Implement virHostCPUGetSignature for s390

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>
    Reviewed-by: Boris Fiuczynski <fiuczy.com>

commit 004804a7d77d0b63ce2f5fcb8499c94b77a5ef5c
Refs: v6.3.0-168-g004804a7d7
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri May 15 22:00:29 2020 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Mon May 25 16:10:04 2020 +0200

    qemu: Invalidate capabilities when host CPU changes

    The host CPU related info stored in the capabilities cache is no
    longer valid after the host CPU changes. This is not a frequent
    situation in real world, but it can easily happen in nested scenarios
    when a disk image is started with various CPUs.

    https://bugzilla.redhat.com/show_bug.cgi?id=1778819

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>
Verified this bug with libvirt-daemon-6.6.0-7.module+el8.3.0+8424+5ea525c5.x86_64 and qemu-kvm-5.1.0-14.module+el8.3.0+8438+644aff69.x86_64:

1. Enable nested virt:

# modprobe -r kvm_intel
# modprobe kvm_intel nested=1
# cat /sys/module/kvm_intel/parameters/nested
1

2. Prepare a guest that uses host-model:

# virsh dumpxml vm1
...
  <cpu mode='host-model' check='partial'>
...

3. Start the guest and check which CPU model libvirt used:

# virsh start vm1
Domain vm1 started

# virsh dumpxml vm1
...
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Cascadelake-Server</model>
    <vendor>Intel</vendor>
...

4. Log in to the guest and check the guest CPU information:

IN GUEST:
# lscpu
...
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel Xeon Processor (Cascadelake)
Stepping:            6
...

5. Install libvirt and qemu-kvm in the guest and check domcapabilities:

# virsh domcapabilities
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='umip'/>
      <feature policy='require' name='pku'/>
      <feature policy='require' name='md-clear'/>
      <feature policy='require' name='stibp'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='ibpb'/>
      <feature policy='require' name='amd-stibp'/>
      <feature policy='require' name='amd-ssbd'/>
      <feature policy='require' name='rdctl-no'/>
      <feature policy='require' name='ibrs-all'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='require' name='mds-no'/>
      <feature policy='require' name='pschange-mc-no'/>
      <feature policy='disable' name='hle'/>
      <feature policy='disable' name='rtm'/>
      <feature policy='disable' name='mpx'/>
    </mode>

6.
Check the timestamp of the libvirt QEMU capabilities cache file:

# stat /var/cache/libvirt/qemu/capabilities/3c76bc41d59c0c7314b1ae8e63f4f765d2cf16abaeea081b3ca1f5d8732f7bb1.xml
  File: /var/cache/libvirt/qemu/capabilities/3c76bc41d59c0c7314b1ae8e63f4f765d2cf16abaeea081b3ca1f5d8732f7bb1.xml
  Size: 120418      Blocks: 240        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 13389701    Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:virt_cache_t:s0
Access: 2020-10-14 17:59:21.787660194 +0800
Modify: 2020-10-14 16:59:20.501990280 +0800
Change: 2020-10-14 16:59:20.501990280 +0800

7. Destroy the guest and update it to use a different CPU model:

# virsh destroy vm1
# virsh dumpxml vm1
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='forbid'>Skylake-Server-noTSX-IBRS</model>
    <feature policy='require' name='vmx'/>
    <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
    </numa>
  </cpu>

# virsh start vm1
Domain vm1 started

8. Log in to the guest and check the guest CPU:

# lscpu
...
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel Xeon Processor (Skylake, IBRS, no TSX)
Stepping:            4
...

9. Check the libvirt QEMU cache file timestamp:

# stat /var/cache/libvirt/qemu/capabilities/3c76bc41d59c0c7314b1ae8e63f4f765d2cf16abaeea081b3ca1f5d8732f7bb1.xml
  File: /var/cache/libvirt/qemu/capabilities/3c76bc41d59c0c7314b1ae8e63f4f765d2cf16abaeea081b3ca1f5d8732f7bb1.xml
  Size: 123731      Blocks: 248        IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 13389701    Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:virt_cache_t:s0
Access: 2020-10-14 18:13:37.881875279 +0800
Modify: 2020-10-14 18:13:37.881875279 +0800
Change: 2020-10-14 18:13:37.881875279 +0800

10.
Check the libvirtd debug log and find:

2020-10-14 10:13:37.169+0000: 1120: debug : virQEMUCapsIsValid:4947 : Outdated
capabilities for '/usr/libexec/qemu-kvm': host CPU changed ('GenuineIntel,
Intel Xeon Processor (Skylake, IBRS, no TSX), family: 6, model: 85, stepping: 4'
vs 'GenuineIntel, Intel Xeon Processor (Cascadelake), family: 6, model: 85,
stepping: 6')
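The validity rule behind that debug message boils down to comparing the signature stored in the cache against the one computed from the current host. A minimal Python sketch of the rule (the function name and message format are illustrative; the real check lives in virQEMUCapsIsValid in libvirt's C code):

```python
# Hypothetical sketch of the cache-validity rule: if the signature
# stored with the cached capabilities no longer matches the live host
# signature, the cache is considered outdated and must be rebuilt.
def caps_cache_valid(cached_signature, host_signature):
    if cached_signature != host_signature:
        print("Outdated capabilities: host CPU changed (%r vs %r)"
              % (host_signature, cached_signature))
        return False
    return True

# The two signatures seen in the verification scenario above.
old = ("GenuineIntel, Intel Xeon Processor (Cascadelake), "
       "family: 6, model: 85, stepping: 6")
new = ("GenuineIntel, Intel Xeon Processor (Skylake, IBRS, no TSX), "
       "family: 6, model: 85, stepping: 4")

caps_cache_valid(old, new)  # returns False: signatures differ, cache is rebuilt
caps_cache_valid(old, old)  # returns True: same CPU, cache is reused
```

Note that the two CPUs differ in model name and stepping while sharing family=6 and model=85, which is exactly why the comparison includes the model name rather than the family/model/stepping signature alone.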
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5137