Description of problem:
A host with an Intel Xeon Ice Lake CPU goes to NonOperational after the cluster CPU type is updated to "Secure Intel Icelake Server".

Version-Release number of selected component (if applicable):
RHVM: 4.5.3.7

How reproducible:
RHVM: 4.5.3.7
Cluster compatibility version: 4.7
RHVH Host: 4.5.3
Upgrade the CPU type from Intel Icelake Server Family -> Secure Intel Icelake Server Family.
RHV-M puts the host into "NonOperational" because "host does not meet the cluster's minimum CPU level. Missing CPU features : taa-no"

Actual results:
The host is put into "NonOperational".

Expected results:
The Secure Ice Lake CPU type is enabled.

Additional info:
In RHV Manager, it is detected that the host is not affected by TAA, but as a kernel feature, not a CPU feature.
Additional information: https://www.qemu.org/docs/master/system/qemu-cpu-models.html?highlight=taa-no Recommended to inform the guest that the host is not vulnerable to CVE-2019-11135, TSX Asynchronous Abort (TAA). This too is an MSR feature, so it does not show up in the Linux /proc/cpuinfo in the host or guest. It should only be enabled for VMs if the host reports "Not affected" in the /sys/devices/system/cpu/vulnerabilities/tsx_async_abort file. It seems that RHV does not understand that this is not a CPU flag.
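The host-side check described in the QEMU documentation can be sketched as follows. This is a minimal illustration, not part of RHV or QEMU; the function name `host_not_affected_by_taa` is hypothetical, and only the sysfs path comes from the text above.

```python
from pathlib import Path

# Path named in the QEMU documentation quoted above.
TAA_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/tsx_async_abort")


def host_not_affected_by_taa(path=TAA_SYSFS):
    """Return True only if the kernel reports 'Not affected' for TAA.

    Hypothetical helper: any other content (for example a mitigation
    status) or an unreadable file is treated as "do not enable taa-no".
    """
    try:
        status = Path(path).read_text().strip()
    except OSError:
        # File missing or unreadable: the state is unknown, so be conservative.
        return False
    return status == "Not affected"


if __name__ == "__main__":
    print(host_not_affected_by_taa())
```

Treating an unreadable file as "affected" is a deliberate conservative choice in this sketch; a real tool might want to distinguish "unknown" from "affected".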
RHV actually calls "cpu flags" a combination (set) of two things reported by the host: 1. the flags from /proc/cpuinfo 2. the features from virsh domcapabilities The "taa-no" should be listed as a feature. Could you please provide us with the output of "virsh domcapabilities" command executed on the host?
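To make the "set of two things" concrete, here is a rough sketch that builds the union of the /proc/cpuinfo flags and the host-model features from the domcapabilities XML, then lists which required features are missing. It is not the RHV or VDSM implementation: the function names are hypothetical, and counting only features with policy='require' is an assumption.

```python
import xml.etree.ElementTree as ET


def cpuinfo_flags(cpuinfo_text):
    """Return the flags of the first 'flags' line in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def domcaps_features(domcaps_xml):
    """Return host-model feature names from 'virsh domcapabilities' XML.

    Assumption: only features with policy='require' count as present.
    """
    root = ET.fromstring(domcaps_xml)
    return {
        feature.get("name")
        for feature in root.findall("./cpu/mode[@name='host-model']/feature")
        if feature.get("policy") == "require"
    }


def missing_features(required, cpuinfo_text, domcaps_xml):
    """Return the required features found in neither source, sorted."""
    reported = cpuinfo_flags(cpuinfo_text) | domcaps_features(domcaps_xml)
    return sorted(set(required) - reported)
```

With the output posted for this host, a required list containing taa-no would come back with taa-no in the result, since it appears neither in the cpuinfo flags nor among the host-model features.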
System A: # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf domcapabilities <domainCapabilities> <path>/usr/libexec/qemu-kvm</path> <domain>kvm</domain> <machine>pc-i440fx-rhel7.6.0</machine> <arch>x86_64</arch> <vcpu max='240'/> <iothreads supported='yes'/> <os supported='yes'> <enum name='firmware'/> <loader supported='yes'> <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value> <enum name='type'> <value>rom</value> <value>pflash</value> </enum> <enum name='readonly'> <value>yes</value> <value>no</value> </enum> <enum name='secure'> <value>no</value> </enum> </loader> </os> <cpu> <mode name='host-passthrough' supported='yes'> <enum name='hostPassthroughMigratable'> <value>on</value> <value>off</value> </enum> </mode> <mode name='maximum' supported='yes'> <enum name='maximumMigratable'> <value>on</value> <value>off</value> </enum> </mode> <mode name='host-model' supported='yes'> <model fallback='forbid'>Icelake-Server</model> <vendor>Intel</vendor> <feature policy='require' name='ss'/> <feature policy='require' name='vmx'/> <feature policy='require' name='pdcm'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='tsc_adjust'/> <feature policy='require' name='avx512ifma'/> <feature policy='require' name='sha-ni'/> <feature policy='require' name='rdpid'/> <feature policy='require' name='fsrm'/> <feature policy='require' name='md-clear'/> <feature policy='require' name='stibp'/> <feature policy='require' name='arch-capabilities'/> <feature policy='require' name='xsaves'/> <feature policy='require' name='invtsc'/> <feature policy='require' name='ibpb'/> <feature policy='require' name='ibrs'/> <feature policy='require' name='amd-stibp'/> <feature policy='require' name='amd-ssbd'/> <feature policy='require' name='rdctl-no'/> <feature policy='require' name='ibrs-all'/> <feature policy='require' name='skip-l1dfl-vmentry'/> <feature policy='require' name='mds-no'/> <feature policy='require' name='pschange-mc-no'/> <feature 
policy='require' name='tsx-ctrl'/> <feature policy='disable' name='hle'/> <feature policy='disable' name='rtm'/> <feature policy='disable' name='mpx'/> <feature policy='disable' name='intel-pt'/> </mode> <mode name='custom' supported='yes'> <model usable='yes'>qemu64</model> <model usable='yes'>qemu32</model> <model usable='no'>phenom</model> <model usable='yes'>pentium3</model> <model usable='yes'>pentium2</model> <model usable='yes'>pentium</model> <model usable='yes'>n270</model> <model usable='yes'>kvm64</model> <model usable='yes'>kvm32</model> <model usable='yes'>coreduo</model> <model usable='yes'>core2duo</model> <model usable='no'>athlon</model> <model usable='yes'>Westmere-IBRS</model> <model usable='yes'>Westmere</model> <model usable='no'>Snowridge</model> <model usable='yes'>Skylake-Server-noTSX-IBRS</model> <model usable='no'>Skylake-Server-IBRS</model> <model usable='no'>Skylake-Server</model> <model usable='yes'>Skylake-Client-noTSX-IBRS</model> <model usable='no'>Skylake-Client-IBRS</model> <model usable='no'>Skylake-Client</model> <model usable='yes'>SandyBridge-IBRS</model> <model usable='yes'>SandyBridge</model> <model usable='yes'>Penryn</model> <model usable='no'>Opteron_G5</model> <model usable='no'>Opteron_G4</model> <model usable='no'>Opteron_G3</model> <model usable='yes'>Opteron_G2</model> <model usable='yes'>Opteron_G1</model> <model usable='yes'>Nehalem-IBRS</model> <model usable='yes'>Nehalem</model> <model usable='yes'>IvyBridge-IBRS</model> <model usable='yes'>IvyBridge</model> <model usable='yes'>Icelake-Server-noTSX</model> <model usable='no'>Icelake-Server</model> <model usable='yes'>Icelake-Client-noTSX</model> <model usable='no' deprecated='yes'>Icelake-Client</model> <model usable='yes'>Haswell-noTSX-IBRS</model> <model usable='yes'>Haswell-noTSX</model> <model usable='no'>Haswell-IBRS</model> <model usable='no'>Haswell</model> <model usable='no'>EPYC-Rome</model> <model usable='no'>EPYC-Milan</model> <model 
usable='no'>EPYC-IBPB</model> <model usable='no'>EPYC</model> <model usable='no'>Dhyana</model> <model usable='no'>Cooperlake</model> <model usable='yes'>Conroe</model> <model usable='yes'>Cascadelake-Server-noTSX</model> <model usable='no'>Cascadelake-Server</model> <model usable='yes'>Broadwell-noTSX-IBRS</model> <model usable='yes'>Broadwell-noTSX</model> <model usable='no'>Broadwell-IBRS</model> <model usable='no'>Broadwell</model> <model usable='yes'>486</model> </mode> </cpu> <memoryBacking supported='yes'> <enum name='sourceType'> <value>file</value> <value>anonymous</value> <value>memfd</value> </enum> </memoryBacking> <devices> <disk supported='yes'> <enum name='diskDevice'> <value>disk</value> <value>cdrom</value> <value>floppy</value> <value>lun</value> </enum> <enum name='bus'> <value>ide</value> <value>fdc</value> <value>scsi</value> <value>virtio</value> <value>usb</value> <value>sata</value> </enum> <enum name='model'> <value>virtio</value> <value>virtio-transitional</value> <value>virtio-non-transitional</value> </enum> </disk> <graphics supported='yes'> <enum name='type'> <value>vnc</value> <value>spice</value> <value>egl-headless</value> </enum> </graphics> <video supported='yes'> <enum name='modelType'> <value>vga</value> <value>cirrus</value> <value>qxl</value> <value>virtio</value> <value>none</value> <value>bochs</value> <value>ramfb</value> </enum> </video> <hostdev supported='yes'> <enum name='mode'> <value>subsystem</value> </enum> <enum name='startupPolicy'> <value>default</value> <value>mandatory</value> <value>requisite</value> <value>optional</value> </enum> <enum name='subsysType'> <value>usb</value> <value>pci</value> <value>scsi</value> </enum> <enum name='capsType'/> <enum name='pciBackend'/> </hostdev> <rng supported='yes'> <enum name='model'> <value>virtio</value> <value>virtio-transitional</value> <value>virtio-non-transitional</value> </enum> <enum name='backendModel'> <value>random</value> <value>egd</value> 
<value>builtin</value> </enum> </rng> <filesystem supported='yes'> <enum name='driverType'> <value>path</value> <value>handle</value> <value>virtiofs</value> </enum> </filesystem> <tpm supported='yes'> <enum name='model'> <value>tpm-tis</value> <value>tpm-crb</value> </enum> <enum name='backendModel'> <value>passthrough</value> <value>emulator</value> </enum> </tpm> </devices> <features> <gic supported='no'/> <vmcoreinfo supported='yes'/> <genid supported='yes'/> <backingStoreInput supported='yes'/> <backup supported='yes'/> <sev supported='no'/> </features> </domainCapabilities> # cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 106 model name : Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz stepping : 6 microcode : 0xd000389 cpu MHz : 3300.000 cache size : 18432 KB physical id : 0 siblings : 24 core id : 0 cpu cores : 12 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 27 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities bugs : 
spectre_v1 spectre_v2 spec_store_bypass swapgs mmio_stale_data eibrs_pbrsb bogomips : 4200.00 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 57 bits virtual power management: [...] # cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort Not affected System B: # virsh -c qemu:///system?authfile=/etc/ovirt-hosted-engine/virsh_auth.conf domcapabilities <domainCapabilities> <path>/usr/libexec/qemu-kvm</path> <domain>kvm</domain> <machine>pc-i440fx-rhel7.6.0</machine> <arch>x86_64</arch> <vcpu max='240'/> <iothreads supported='yes'/> <os supported='yes'> <enum name='firmware'/> <loader supported='yes'> <value>/usr/share/OVMF/OVMF_CODE.secboot.fd</value> <enum name='type'> <value>rom</value> <value>pflash</value> </enum> <enum name='readonly'> <value>yes</value> <value>no</value> </enum> <enum name='secure'> <value>no</value> </enum> </loader> </os> <cpu> <mode name='host-passthrough' supported='yes'> <enum name='hostPassthroughMigratable'> <value>on</value> <value>off</value> </enum> </mode> <mode name='maximum' supported='yes'> <enum name='maximumMigratable'> <value>on</value> <value>off</value> </enum> </mode> <mode name='host-model' supported='yes'> <model fallback='forbid'>Icelake-Server</model> <vendor>Intel</vendor> <feature policy='require' name='ss'/> <feature policy='require' name='vmx'/> <feature policy='require' name='pdcm'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='tsc_adjust'/> <feature policy='require' name='avx512ifma'/> <feature policy='require' name='sha-ni'/> <feature policy='require' name='rdpid'/> <feature policy='require' name='fsrm'/> <feature policy='require' name='md-clear'/> <feature policy='require' name='stibp'/> <feature policy='require' name='arch-capabilities'/> <feature policy='require' name='xsaves'/> <feature policy='require' name='invtsc'/> <feature policy='require' name='ibpb'/> <feature policy='require' name='ibrs'/> <feature policy='require' name='amd-stibp'/> 
<feature policy='require' name='amd-ssbd'/> <feature policy='require' name='rdctl-no'/> <feature policy='require' name='ibrs-all'/> <feature policy='require' name='skip-l1dfl-vmentry'/> <feature policy='require' name='mds-no'/> <feature policy='require' name='pschange-mc-no'/> <feature policy='require' name='tsx-ctrl'/> <feature policy='disable' name='hle'/> <feature policy='disable' name='rtm'/> <feature policy='disable' name='mpx'/> <feature policy='disable' name='intel-pt'/> </mode> <mode name='custom' supported='yes'> <model usable='yes'>qemu64</model> <model usable='yes'>qemu32</model> <model usable='no'>phenom</model> <model usable='yes'>pentium3</model> <model usable='yes'>pentium2</model> <model usable='yes'>pentium</model> <model usable='yes'>n270</model> <model usable='yes'>kvm64</model> <model usable='yes'>kvm32</model> <model usable='yes'>coreduo</model> <model usable='yes'>core2duo</model> <model usable='no'>athlon</model> <model usable='yes'>Westmere-IBRS</model> <model usable='yes'>Westmere</model> <model usable='no'>Snowridge</model> <model usable='yes'>Skylake-Server-noTSX-IBRS</model> <model usable='no'>Skylake-Server-IBRS</model> <model usable='no'>Skylake-Server</model> <model usable='yes'>Skylake-Client-noTSX-IBRS</model> <model usable='no'>Skylake-Client-IBRS</model> <model usable='no'>Skylake-Client</model> <model usable='yes'>SandyBridge-IBRS</model> <model usable='yes'>SandyBridge</model> <model usable='yes'>Penryn</model> <model usable='no'>Opteron_G5</model> <model usable='no'>Opteron_G4</model> <model usable='no'>Opteron_G3</model> <model usable='yes'>Opteron_G2</model> <model usable='yes'>Opteron_G1</model> <model usable='yes'>Nehalem-IBRS</model> <model usable='yes'>Nehalem</model> <model usable='yes'>IvyBridge-IBRS</model> <model usable='yes'>IvyBridge</model> <model usable='yes'>Icelake-Server-noTSX</model> <model usable='no'>Icelake-Server</model> <model usable='yes'>Icelake-Client-noTSX</model> <model usable='no' 
deprecated='yes'>Icelake-Client</model> <model usable='yes'>Haswell-noTSX-IBRS</model> <model usable='yes'>Haswell-noTSX</model> <model usable='no'>Haswell-IBRS</model> <model usable='no'>Haswell</model> <model usable='no'>EPYC-Rome</model> <model usable='no'>EPYC-Milan</model> <model usable='no'>EPYC-IBPB</model> <model usable='no'>EPYC</model> <model usable='no'>Dhyana</model> <model usable='no'>Cooperlake</model> <model usable='yes'>Conroe</model> <model usable='yes'>Cascadelake-Server-noTSX</model> <model usable='no'>Cascadelake-Server</model> <model usable='yes'>Broadwell-noTSX-IBRS</model> <model usable='yes'>Broadwell-noTSX</model> <model usable='no'>Broadwell-IBRS</model> <model usable='no'>Broadwell</model> <model usable='yes'>486</model> </mode> </cpu> <memoryBacking supported='yes'> <enum name='sourceType'> <value>file</value> <value>anonymous</value> <value>memfd</value> </enum> </memoryBacking> <devices> <disk supported='yes'> <enum name='diskDevice'> <value>disk</value> <value>cdrom</value> <value>floppy</value> <value>lun</value> </enum> <enum name='bus'> <value>ide</value> <value>fdc</value> <value>scsi</value> <value>virtio</value> <value>usb</value> <value>sata</value> </enum> <enum name='model'> <value>virtio</value> <value>virtio-transitional</value> <value>virtio-non-transitional</value> </enum> </disk> <graphics supported='yes'> <enum name='type'> <value>vnc</value> <value>spice</value> <value>egl-headless</value> </enum> </graphics> <video supported='yes'> <enum name='modelType'> <value>vga</value> <value>cirrus</value> <value>qxl</value> <value>virtio</value> <value>none</value> <value>bochs</value> <value>ramfb</value> </enum> </video> <hostdev supported='yes'> <enum name='mode'> <value>subsystem</value> </enum> <enum name='startupPolicy'> <value>default</value> <value>mandatory</value> <value>requisite</value> <value>optional</value> </enum> <enum name='subsysType'> <value>usb</value> <value>pci</value> <value>scsi</value> </enum> <enum 
name='capsType'/> <enum name='pciBackend'/> </hostdev> <rng supported='yes'> <enum name='model'> <value>virtio</value> <value>virtio-transitional</value> <value>virtio-non-transitional</value> </enum> <enum name='backendModel'> <value>random</value> <value>egd</value> <value>builtin</value> </enum> </rng> <filesystem supported='yes'> <enum name='driverType'> <value>path</value> <value>handle</value> <value>virtiofs</value> </enum> </filesystem> <tpm supported='yes'> <enum name='model'> <value>tpm-tis</value> <value>tpm-crb</value> </enum> <enum name='backendModel'> <value>passthrough</value> <value>emulator</value> </enum> </tpm> </devices> <features> <gic supported='no'/> <vmcoreinfo supported='yes'/> <genid supported='yes'/> <backingStoreInput supported='yes'/> <backup supported='yes'/> <sev supported='no'/> </features> </domainCapabilities> # cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 106 model name : Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz stepping : 6 microcode : 0xd000389 cpu MHz : 3400.000 cache size : 49152 KB physical id : 0 siblings : 64 core id : 0 cpu cores : 32 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 27 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves 
cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pts avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_capabilities bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs mmio_stale_data eibrs_pbrsb bogomips : 5200.00 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 57 bits virtual power management: [...] # cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort Not affected
It seems that taa-no is not reported at all. However, RHV should not update the CPU type if there is a host that is not compatible with the new type. Was the host perhaps added after the update? Or was, for example, a new version of microcode, libvirt, or qemu installed in the meantime? Are all hosts in the cluster non-operational or just some of them?
(In reply to Lucia Jelinkova from comment #4)
> It seems the taa-no is not reported at all. However, the RHV should not
> update the CPU type if there was a host not compatible with the new type.
> Was the host perhaps added after the update? Or was there e.g. a new version
> of microcode, libvirt, or qemu installed meanwhile?
>
> Are all hosts in the cluster non operational or just some of them?

Hi, no, this is about me wanting to set the secure version of the CPU.

Greetings
Klaas
It would still be beneficial if we could gain more information about what/how happened. That would help us to pinpoint the component that causes this - it could be microcode, qemu, libvirt or RHV itself.
Microcode has recently been updated for this CPU - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20230214 It could be that it changed the reported capabilities. Do you have any hosts that are not updated yet, i.e. with a version older than d000389? Maybe we don't need to require taa-no when we disable TSX. I can't recall why we do that, possibly for the cases where you want to enable TSX anyway?
We copied that configuration from QEMU's latest definition for Icelake Server (version 3 at that time) to stay aligned with it. https://github.com/qemu/qemu/blob/6c938efc27c2c9c9b02d574d0522a83dc06c72c8/target/i386/cpu.c#L3602
(In reply to Lucia Jelinkova from comment #6)
> It would still be beneficial if we could gain more information about
> what/how happened. That would help us to pinpoint the component that causes
> this - it could be microcode, qemu, libvirt or RHV itself.

It's a new RHV setup on new hardware because we have to move datacenters, so there is no single component whose change led to this case. Also, those are my first Ice Lake CPUs in RHV :)

So I can only say that, from a kernel point of view, those machines should be taa-no because:

# cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
Not affected

I would say the kernel recognizes it correctly.

(In reply to Michal Skrivanek from comment #7)
> microcode has recently been updated for this CPU - https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20230214
> could be that it changed the reported capabilities, do you have any that are not updated yet, i.e. with older version than d000389?

Same here. I think those servers already came with that; there is no microcode update during boot: "kernel: microcode: sig=0x606a6, pf=0x1, revision=0xd000389" is supplied by the Dell BIOS. I could try to downgrade it, but since the kernel recognizes it correctly, I don't think that's needed.

It seems like libvirt/qemu does not correctly get the taa-no information from the kernel.
It's likely the v4, and probably the new microcode stopped reporting taa-no altogether, since it doesn't really make sense with TSX disabled. I'm not sure about all the other changes in v4, but possibly we can just drop the taa-no requirement and be done with it. Out of the box it should always be fine since we use the -noTSX model anyway.
(In reply to Michal Skrivanek from comment #10)
> it's likely the v4 and probably the new microcode stopped reporting taa-no

But shouldn't the kernel then also not know about the state of it? I mean, qemu and the kernel should use the same way of detecting taa-no, right?
TAA reporting in the kernel is done by https://github.com/torvalds/linux/blob/865fdb08197e657c59e74a35fa32362b12397f58/arch/x86/kernel/cpu/common.c#L1374; it will show "Not affected" because both rtm and tsx-ctrl are missing. qemu just reports the feature individually: if the MSR is not there, it does not report taa-no, and the host then fails oVirt's requirements for the CPU model. It could also be that it never really worked: it was added at a time when the mitigation didn't even exist, and it was only supposed to be reported in the future.
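The difference between the two views can be sketched roughly as below. This is a paraphrase of the linked kernel check as I understand it, not a copy of the kernel or QEMU code; the function names are hypothetical, and the bit positions are those I believe the kernel's msr-index.h uses for IA32_ARCH_CAPABILITIES, so treat them as assumptions.

```python
# Assumed bit positions in the IA32_ARCH_CAPABILITIES MSR.
ARCH_CAP_TSX_CTRL_MSR = 1 << 7  # TSX_CTRL MSR is available
ARCH_CAP_TAA_NO = 1 << 8        # CPU declares itself not vulnerable to TAA


def kernel_sets_taa_bug(arch_cap, has_rtm):
    """Kernel-style view: the TAA bug is set only if TAA_NO is clear
    and TSX is usable (RTM) or controllable (TSX_CTRL)."""
    if arch_cap & ARCH_CAP_TAA_NO:
        return False
    return has_rtm or bool(arch_cap & ARCH_CAP_TSX_CTRL_MSR)


def reports_taa_no(arch_cap):
    """Per-feature view: taa-no is reported only if the MSR bit is set."""
    return bool(arch_cap & ARCH_CAP_TAA_NO)


if __name__ == "__main__":
    # Case described in the comment above: no TAA_NO bit, no RTM, no TSX_CTRL.
    arch_cap, has_rtm = 0, False
    print("kernel sets TAA bug:", kernel_sets_taa_bug(arch_cap, has_rtm))
    print("taa-no reported:", reports_taa_no(arch_cap))
```

In that case the kernel-style check does not flag the bug, so sysfs shows "Not affected", while the per-feature check still does not report taa-no. That is the mismatch a CPU-model requirement on taa-no would trip over.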
Verified with:
ovirt-engine-4.5.3.8-2.el8ev.noarch

Steps:
1. Create a cluster with the Intel Icelake Server Family CPU type
2. Add an Ice Lake host to the cluster
3. Upgrade the cluster CPU type to Secure Intel Icelake Server Family

Result:
The Ice Lake host status is Up after the cluster CPU type is upgraded to Secure Intel Icelake Server Family.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Virtualization security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:3771