+++ This bug was initially created as a clone of Bug #1761678 +++

Description of problem:
Wrongly support "Cascadelake-Server" on physical host without avx512_vnni cpu flag

Version-Release number of selected component (if applicable):
libvirt-4.5.0-35.module+el8.1.0+4227+b2722cb3.x86_64
qemu-kvm-2.12.0-88.module+el8.1.0+4233+bc44be3f.x86_64
kernel-4.18.0-147.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Check physical host cpu info
# lscpu
...
Model name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
...
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d

# lscpu |grep avx512_vnni
No output

2. Check "virsh capabilities" and "virsh domcapabilities"
# virsh capabilities
<capabilities>
  <host>
    <uuid>4c4c4544-0044-3210-8034-cac04f4e5232</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Skylake-Server-IBRS</model>
      <vendor>Intel</vendor>
      <microcode version='33554526'/>
      <topology sockets='1' cores='16' threads='2'/>
...

# virsh domcapabilities
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='umip'/>
      <feature policy='require' name='pku'/>
      <feature policy='require' name='md-clear'/>
      <feature policy='require' name='stibp'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='xsaves'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='disable' name='avx512vnni'/>
    </mode>
    <mode name='custom' supported='yes'>
      ...
      <model usable='yes'>Skylake-Server</model>
      <model usable='yes'>Skylake-Server-IBRS</model>
      <model usable='yes'>Skylake-Client</model>
      <model usable='yes'>Skylake-Client-IBRS</model>
      ...
      <model usable='no'>Cascadelake-Server</model>
      ...
    </mode>
  </cpu>
...

3. Start a shut-off VM with the following configuration
# virsh domstate avocado-vt-vm1
shut off

# virsh dumpxml avocado-vt-vm1 |grep "<cpu" -A2
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

# virsh start avocado-vt-vm1
Domain avocado-vt-vm1 started

# virsh dumpxml avocado-vt-vm1 |grep "<cpu" -A20
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Cascadelake-Server</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='pku'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='xsaves'/>
    <feature policy='require' name='skip-l1dfl-vmentry'/>
    <feature policy='disable' name='avx512vnni'/>
    <feature policy='disable' name='mpx'/>
  </cpu>

# ps -ef |grep avocado-vt-vm1
-cpu Cascadelake-Server,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,skip-l1dfl-vmentry=on,avx512vnni=off

# virsh console avocado-vt-vm1
Connected to domain avocado-vt-vm1
Escape character is ^]

Red Hat Enterprise Linux 8.1 (Ootpa)
Kernel 4.18.0-147.el8.x86_64 on an x86_64

localhost login: root
Password:
Last login: Tue Oct 15 02:45:16 from 192.168.122.1
[root@localhost ~]# lscpu
...
Model name: Intel Xeon Processor (Cascadelake)
...
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat umip pku ospke md_clear arch_capabilities

[root@localhost ~]# lscpu |grep avx512_vnni
No output

Actual results:
As step-2 and step-3 show.

Expected results:
The physical host does not support the Cascadelake-Server cpu model, and "virsh domcapabilities" also reports this model as not usable, so host-model should not resolve to it. However, the default cpu conf for host-model is Cascadelake-Server, and the VM starts successfully with this conf, although without avx512_vnni enabled.

Additional info:

--- Additional comment from Jiri Denemark on 2019-10-16 07:27:37 UTC ---

The strange behavior is caused by insufficient CPU signature checks in libvirt. We only consider the family and model parts of the signature, which are the same for both Skylake-Server and Cascadelake-Server CPUs (family 6, model 85). The two CPUs differ only in stepping, which is ignored by libvirt.

--- Additional comment from Jiri Denemark on 2020-04-14 12:23:05 UTC ---

This was fixed in a series of commits which ends with

commit 5d6059f8ec16d64f240dc5e6413ca55a3b46b3f7
Refs: v6.2.0-111-g5d6059f8ec
Author:     Jiri Denemark <jdenemar>
AuthorDate: Thu Mar 26 21:55:14 2020 +0100
Commit:     Jiri Denemark <jdenemar>
CommitDate: Wed Apr 8 17:52:50 2020 +0200

    cpu_map: Distinguish Cascadelake-Server from Skylake-Server

    The signatures of these two CPU model differ only in stepping as both
    report family 6 and model 85. Skylake-Server uses stepping 4 or less and
    Cascadelake-Server uses stepping 5..7.

    https://bugzilla.redhat.com/show_bug.cgi?id=1761678

    Signed-off-by: Jiri Denemark <jdenemar>
    Reviewed-by: Ján Tomko <jtomko>
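
The stepping rule from the commit message can be illustrated with a small Python sketch. This is not libvirt's code (libvirt is written in C and keeps signatures in its CPU map); the `Signature` class and the `candidate_models` function are hypothetical names used only for illustration. Only the family, model, and stepping values come from the comments above, and the lower stepping bound of 0 for Skylake-Server is an assumption standing in for "4 or less".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature entry: family, model, inclusive stepping range."""
    name: str
    family: int
    model: int
    stepping_min: int
    stepping_max: int


# Values from the commit message: both models report family 6, model 85.
# Skylake-Server uses stepping 4 or less, Cascadelake-Server uses 5..7.
SIGNATURES = [
    Signature("Skylake-Server", 6, 85, 0, 4),
    Signature("Cascadelake-Server", 6, 85, 5, 7),
]


def candidate_models(family: int, model: int,
                     stepping: Optional[int] = None) -> List[str]:
    """Return the names of all models whose signature matches.

    With stepping=None the stepping is ignored, which mimics the
    pre-fix behavior described in the first comment.
    """
    matches = []
    for sig in SIGNATURES:
        if (sig.family, sig.model) != (family, model):
            continue
        if stepping is not None and not (
                sig.stepping_min <= stepping <= sig.stepping_max):
            continue
        matches.append(sig.name)
    return matches


if __name__ == "__main__":
    # Host from this report: family 6, model 85, stepping 4.
    print(candidate_models(6, 85))      # stepping ignored: ambiguous
    print(candidate_models(6, 85, 4))   # stepping checked: one candidate
```

When the stepping is ignored, both models remain candidates for the Xeon Gold 6130, so the choice between them has to be made on other grounds; once the stepping is checked, only Skylake-Server matches.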
Reproduced this bug with the packages below:
# rpm -q libvirt-libs qemu-kvm
libvirt-libs-4.5.0-42.module+el8.2.0+6024+15a2423f.x86_64
qemu-kvm-2.12.0-99.module+el8.2.0+6870+55b789b4.2.x86_64

Tested with the packages below; all results are as expected.
# rpm -q libvirt-libs qemu-kvm
libvirt-libs-6.0.0-22.module+el8.2.1+6815+1c792dc8.x86_64
qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64

1. Check the cpu info on the host:
# lscpu
...
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Stepping: 4
...
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d

# lscpu | grep avx512_vnni
#(no outputs)

2. Check the outputs of virsh capabilities and domcapabilities, the information is correct
# virsh capabilities
<capabilities>
  <host>
    <uuid>4c4c4544-0044-3210-8035-cac04f305332</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Skylake-Server-IBRS</model>
...

# virsh capabilities | grep avx512_vnni
#(no outputs)

# virsh domcapabilities
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Skylake-Server-IBRS</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='clflushopt'/>
      <feature policy='require' name='umip'/>
      <feature policy='require' name='pku'/>
      <feature policy='require' name='md-clear'/>
      <feature policy='require' name='stibp'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='ssbd'/>
      <feature policy='require' name='xsaves'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='require' name='ibpb'/>
      <feature policy='require' name='amd-ssbd'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='require' name='pschange-mc-no'/>
    </mode>
...

3. Start vm with cpu as "host-model", the cpu model for the live vm is correct.
# virsh dumpxml rhel | grep cpu
  <vcpu placement='static'>1</vcpu>
  <cpu mode='host-model' check='partial'/>

# virsh start rhel
Domain rhel started

# virsh dumpxml rhel | grep '<cpu mode' -A20
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Server-IBRS</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='clflushopt'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='pku'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='xsaves'/>
    <feature policy='require' name='ibpb'/>
    <feature policy='require' name='amd-ssbd'/>
    <feature policy='require' name='skip-l1dfl-vmentry'/>
    <feature policy='require' name='pschange-mc-no'/>
  </cpu>

# ps aux | grep qemu
... -cpu Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on ...

Log in to the vm to check:
[root@localhost ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel Xeon Processor (Skylake, IBRS)
Stepping: 4
CPU MHz: 2095.076
BogoMIPS: 4190.15
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
L3 cache: 16384K
NUMA node0 CPU(s): 0
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat umip pku ospke md_clear arch_capabilities
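
The manual comparison done in step 2 (is the host-model CPU also listed as usable under the custom mode?) could be scripted roughly as below. This is only a sketch: `host_model_usability` is a hypothetical helper, the `SAMPLE` document is a trimmed version of the pre-fix `virsh domcapabilities` output quoted in this bug wrapped in a `domainCapabilities` root element, and the check is the heuristic used in this report rather than an official libvirt consistency rule.

```python
import xml.etree.ElementTree as ET

# Trimmed from the pre-fix "virsh domcapabilities" output in this report.
SAMPLE = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='umip'/>
      <feature policy='disable' name='avx512vnni'/>
    </mode>
    <mode name='custom' supported='yes'>
      <model usable='yes'>Skylake-Server-IBRS</model>
      <model usable='no'>Cascadelake-Server</model>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_model_usability(domcaps_xml):
    """Return (host_model_name, usable) for a domcapabilities document.

    usable is True or False when the host-model CPU also appears in the
    custom mode list, and None when it is not listed there (or when no
    host-model entry exists, in which case the name is None as well).
    """
    root = ET.fromstring(domcaps_xml)
    host = root.find("./cpu/mode[@name='host-model']/model")
    if host is None:
        return None, None
    for model in root.findall("./cpu/mode[@name='custom']/model"):
        if model.text == host.text:
            return host.text, model.get("usable") == "yes"
    return host.text, None


if __name__ == "__main__":
    name, usable = host_model_usability(SAMPLE)
    print(name, usable)
```

On a live host the same function could be fed the text captured from `virsh domcapabilities`; with the fixed packages above, the host-model entry is Skylake-Server-IBRS.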
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:3172