Description of problem:
VM kernel calltrace with unchecked MSR access error

Version-Release number of selected component (if applicable):
4.18.0-180.el8.x86_64 (host & VM)
qemu-kvm-4.2.0-12.module+el8.2.0+5858+afd073bc.x86_64

Host used:
AMD EPYC 7251 8-Core Processor

How reproducible:
100%

Steps to Reproduce:
1. Boot rhel 8.2 VM with libvirt xml:
...
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <vendor>AMD</vendor>
    <topology sockets='1' dies='1' cores='8' threads='1'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='require' name='perfctr_core'/>
    <feature policy='require' name='clzero'/>
    <feature policy='require' name='amd-ssbd'/>
    <feature policy='require' name='virt-ssbd'/>
    <feature policy='require' name='rdctl-no'/>
    <feature policy='require' name='skip-l1dfl-vmentry'/>
    <feature policy='require' name='mds-no'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='svm'/>
    <feature policy='require' name='topoext'/>
  </cpu>
...
2. Check kernel dmesg within VM
Actual results: Found kernel calltrace: [ 2.150980] unchecked MSR access error: WRMSR to 0x48 (tried to write 0x0000000000000004) at rIP: 0xffffffffb4e61f14 (native_write_msr+0x4/0x20) [ 2.152832] Call Trace: [ 2.153188] speculation_ctrl_update+0x78/0x1f0 [ 2.153862] speculation_ctrl_update_current+0x1b/0x20 [ 2.154579] ssb_prctl_set+0xb2/0xd0 [ 2.155096] arch_seccomp_spec_mitigate+0x27/0x40 [ 2.155754] do_seccomp+0x691/0x6e0 [ 2.156245] do_syscall_64+0x5b/0x1a0 [ 2.156769] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 2.156773] RIP: 0033:0x7f8cfa8eb6ed [ 2.156775] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6b 57 2c 00 f7 d8 64 89 01 48 [ 2.156776] RSP: 002b:00007ffedac391f8 EFLAGS: 00000246 ORIG_RAX: 000000000000013d [ 2.156777] RAX: ffffffffffffffda RBX: 00005582071d08e0 RCX: 00007f8cfa8eb6ed [ 2.156778] RDX: 00005582071e14e0 RSI: 0000000000000000 RDI: 0000000000000001 [ 2.156779] RBP: 00005582071e14e0 R08: 00005582071d08e0 R09: 000000004000003e [ 2.156779] R10: 0000000000000007 R11: 0000000000000246 R12: 00005582071a9838 [ 2.156780] R13: 00007ffedac39248 R14: 00007ffedac39250 R15: 00007f8cfc0abb74 Expected results: No calltrace found Additional info: Host cpuinfo: Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 32 On-line CPU(s) list: 0-31 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 8 Vendor ID: AuthenticAMD CPU family: 23 Model: 1 Model name: AMD EPYC 7251 8-Core Processor Stepping: 2 CPU MHz: 2898.083 BogoMIPS: 4192.04 Virtualization: AMD-V L1d cache: 32K L1i cache: 64K L2 cache: 512K L3 cache: 4096K NUMA node0 CPU(s): 0,8,16,24 NUMA node1 CPU(s): 2,10,18,26 NUMA node2 CPU(s): 4,12,20,28 NUMA node3 CPU(s): 6,14,22,30 NUMA node4 CPU(s): 1,9,17,25 NUMA node5 CPU(s): 3,11,19,27 NUMA node6 CPU(s): 5,13,21,29 NUMA node7 CPU(s): 7,15,23,31 Flags: fpu vme de pse tsc msr pae mce cx8 
apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca VM cpuinfo: Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Thread(s) per core: 1 Core(s) per socket: 8 Socket(s): 1 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 1 Model name: AMD EPYC Processor (with IBPB) Stepping: 2 CPU MHz: 2096.058 BogoMIPS: 4192.11 Hypervisor vendor: KVM Virtualization type: full L1d cache: 32K L1i cache: 64K L2 cache: 512K L3 cache: 8192K NUMA node0 CPU(s): 0-7 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibpb vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat arch_capabilities Mitigation status on host: itlb_multihit:Not affected l1tf:Not affected mds:Not affected meltdown:Not affected spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp spectre_v1:Mitigation: 
usercopy/swapgs barriers and __user pointer sanitization spectre_v2:Mitigation: Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling tsx_async_abort:Not affected Mitigation status on VM: itlb_multihit:Not affected l1tf:Not affected mds:Not affected meltdown:Not affected spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp spectre_v1:Mitigation: usercopy/swapgs barriers and __user pointer sanitization spectre_v2:Mitigation: Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling tsx_async_abort:Not affected
I can confirm that as well; it happens on my machine. It happens on both a RHEL 8 guest and an upstream guest when +topoext is enabled. It also seems to crash the guest sometimes, and sometimes the guest survives.
[ 2.509605] unchecked MSR access error: WRMSR to 0x48 (tried to write 0x0000000000000006) at rIP: 0xffffffff8506b0f4 (native_write_msr+0x4/0x20) [ 2.510953] Call Trace: [ 2.511226] __switch_to_xtra+0x116/0x4f0 [ 2.511672] ? __switch_to_asm+0x34/0x70 [ 2.512096] __switch_to+0x37b/0x420 [ 2.512487] ? __switch_to_asm+0x34/0x70 [ 2.512923] __schedule+0x2b8/0x710 [ 2.513299] schedule+0x4a/0xb0 [ 2.513637] exit_to_usermode_loop+0x76/0x130 [ 2.514109] prepare_exit_to_usermode+0xa8/0xc0 [ 2.514589] ret_from_intr+0x25/0x25 [ 2.514978] RIP: 0033:0x7f18c41f106e [ 2.515361] Code: b6 04 17 0f b6 0c 16 85 c0 75 e1 29 c8 c5 f8 77 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 01 d7 48 01 d6 f3 0f bc d1 0f b6 04 17 <0f> b6 14 16 29 d0 c5 f8 77 c3 0f 1f 84 00 00 00 00 00 3d e0 0f 00 [ 2.517343] RSP: 002b:00007ffe00443178 EFLAGS: 00000242 ORIG_RAX: ffffffffffffff13 [ 2.518144] RAX: 0000000000000074 RBX: 00007f18c3d9ac30 RCX: 000000000000ffff [ 2.518898] RDX: 0000000000000000 RSI: 00007f18c3d95eeb RDI: 00007f18c3d95989 [ 2.519647] RBP: 00007f18c3d95989 R08: 000055972509efe0 R09: 0000000000000010 [ 2.520404] R10: 00007f18c42569e0 R11: 0000000000000007 R12: 00007f18c3d99c60 [ 2.521155] R13: 00000000000000fd R14: 000055972509e000 R15: 000055972509eab0 [ 2.522135] fuse: init (API version 7.31) [mlevitsk@starship-f31vm ~]$ uname -r 5.6.19-200.fc31.x86_64 Qemu command line: QEMU command line is: /home/mlevitsk/UPSTREAM/qemu/build-prod/output/bin/qemu-system-x86_64 -smp 8 -name debug-threads=on -accel kvm -nodefaults -display none -machine kernel-irqchip=on -name guest=Fedora31,debug-threads=on -uuid b24adc30-b88d-11ea-b6a0-b42e99a86b5a -machine q35,vmport=off,sata=off,usb=off,smbus=off -rtc base=localtime,clock=host -global mc146818rtc.lost_tick_policy=discard -global kvm-pit.lost_tick_policy=discard -no-hpet -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -device pcie-root-port,slot=0,id=nport.0,bus=pcie.0,addr=0x1.0x0,multifunction=on -device 
pcie-root-port,slot=1,id=nport.1,bus=pcie.0,addr=0x1.0x1, -device pcie-root-port,slot=2,id=rport.0,bus=pcie.0,addr=0x1c.0x0,multifunction=on -device pcie-root-port,slot=3,id=rport.1,bus=pcie.0,addr=0x1c.0x1 -device pcie-root-port,slot=4,id=rport.2,bus=pcie.0,addr=0x1c.0x2 -device pcie-root-port,slot=5,id=rport.3,bus=pcie.0,addr=0x1c.0x3 -device pcie-root-port,slot=6,id=rport.4,bus=pcie.0,addr=0x1c.0x4 -device pcie-root-port,slot=7,id=rport.5,bus=pcie.0,addr=0x1c.0x5 -device pcie-root-port,slot=8,id=rport.6,bus=pcie.0,addr=0x1c.0x6 -device pcie-root-port,slot=9,id=rport.7,bus=pcie.0,addr=0x1c.0x7 -device ich9-intel-hda,id=sound0,msi=on,bus=pcie.0,addr=0x1f.0x4 -audiodev pa,id=snd0,server=/run/user/103992/pulse/native -device hda-micro,id=sound0-codec0,bus=sound0.0,cad=0,audiodev=snd0 -boot menu=on,strict=on,splash-time=1000 -L .bios -machine pflash0=flash0,pflash1=flash1 -blockdev node-name=flash0,driver=file,filename=./.bios/OVMF_CODE.fd,read-only=on -blockdev node-name=flash1,driver=file,filename=./.bios/OVMF_VARS.fd -smp maxcpus=64,cores=32,threads=2,sockets=1 -cpu host,invtsc,+topoext -m 8G -device virtio-vga,max_outputs=1,id=gpu1,bus=nport.0 -display gtk,gl=on,window-close=off,zoom-to-fit=on -device virtio-scsi-pci,id=scsi-ctrl,bus=rport.1 -blockdev node-name=disk_root_,driver=file,discard=unmap,aio=native,cache.direct=on,filename=./disk_s1.qcow2 -blockdev node-name=disk_root,driver=qcow2,file=disk_root_,discard=unmap -device scsi-hd,drive=disk_root,bus=scsi-ctrl.0,bootindex=1 -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0_Fedora31,vhost=on -device virtio-net-pci,netdev=tap0,mac=52:50:00:1a:32:3e,bus=rport.2,id=virt_networking_nic0 -device virtio-keyboard-pci,bus=rport.3,id=virt_input_devices_virtio-keyboard-pci -device virtio-tablet-pci,bus=rport.5,id=virt_input_devices_virtio-tablet-pci -chardev socket,path=/run/vmspawn/fedora31_21l9kuxz/hmp_monitor.socket,id=internal_hmp_monitor_socket_chardev,server=on,wait=off -mon 
chardev=internal_hmp_monitor_socket_chardev,mode=readline -chardev socket,path=/run/vmspawn/fedora31_21l9kuxz/qmp_monitor.socket,id=internal_qmp_monitor_socket_chardev,server=on,wait=off -mon chardev=internal_qmp_monitor_socket_chardev,mode=control -chardev socket,path=/run/vmspawn/fedora31_21l9kuxz/serial.socket,id=internal_serial0_chardev,server=on,wait=off -device isa-serial,chardev=internal_serial0_chardev Qemu on host is upstream qemu, git master of today.
And host is my 3970X, which is similar to an EPYC ROME.
Hang on: it's writing 0x6 for you, but it was writing 0x4 for the original reporter. I reckon 0x6 is STIBP+SSBD.
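To make the 0x4 versus 0x6 difference concrete, here is a minimal sketch that decodes the value written to MSR 0x48 (IA32_SPEC_CTRL). The bit positions (IBRS = bit 0, STIBP = bit 1, SSBD = bit 2) are the architectural layout; the function name `decode_spec_ctrl` is only an illustrative helper, not something from the kernel or QEMU.

```python
from typing import List

# Low bits of IA32_SPEC_CTRL (MSR 0x48).
SPEC_CTRL_BITS = {
    0: "IBRS",
    1: "STIBP",
    2: "SSBD",
}


def decode_spec_ctrl(value: int) -> List[str]:
    """Return the names of the known bits set in a SPEC_CTRL value."""
    names = [name for bit, name in sorted(SPEC_CTRL_BITS.items())
             if value & (1 << bit)]
    # Anything above bit 2 is not decoded by this sketch.
    unknown = value & ~0b111
    if unknown:
        names.append(f"unknown(0x{unknown:x})")
    return names


if __name__ == "__main__":
    # The two values seen in the call traces in this bug.
    for written in (0x4, 0x6):
        print(f"0x{written:x} -> {'+'.join(decode_spec_ctrl(written))}")
```

So the original reporter's guest tried to set SSBD alone, while the second trace shows a guest setting STIBP and SSBD together.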
I found most of the root cause of this on my machine, but it is different from the root cause on the bug reporter's machine.

On my machine, strangely, I don't have support for IBRS (Indirect Branch Restricted Speculation), but I do have support for the rest of the spectre_v2 mitigations, namely STIBP and IBPB.

However, the code in kvm_spec_ctrl_valid_bits raises #GP on both the IBRS and STIBP bits when IBRS is not supported, despite the fact that AMD has separate bits for each. So I guess I'll open a new bug for my case.

For the case described in this bug: Guo, Zhiyi, could you provide the cpuid output of the host machine? Something like 'taskset -c 0 cpuid -r -1'.

On your machine, KVM apparently doesn't like the guest setting the SSBD bit in IA32_SPEC_CTRL. This should only happen when kvm_spec_ctrl_valid_bits (in the kernel) detects that either the guest or the host doesn't support it.

Best regards,
Maxim Levitsky
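The behaviour described in this comment can be sketched as a toy model. This is not the kernel's kvm_spec_ctrl_valid_bits source; it only encodes what the comment says: STIBP is rejected together with IBRS when IBRS is missing, and SSBD is rejected when the guest or the host lacks it. The function names and the two boolean parameters are assumptions made for illustration.

```python
IBRS = 1 << 0
STIBP = 1 << 1
SSBD = 1 << 2


def valid_bits_as_described(has_ibrs: bool, has_ssbd: bool) -> int:
    """Toy model of the check described above (not kernel code).

    has_ibrs / has_ssbd are hypothetical flags meaning "both the guest
    CPUID and the host advertise the feature".
    """
    bits = IBRS | STIBP | SSBD
    if not has_ibrs:
        # STIBP is dropped along with IBRS, which is the complaint here.
        bits &= ~(IBRS | STIBP)
    if not has_ssbd:
        bits &= ~SSBD
    return bits


def write_faults(value: int, has_ibrs: bool, has_ssbd: bool) -> bool:
    """True if a guest WRMSR of `value` would be rejected in this model."""
    return bool(value & ~valid_bits_as_described(has_ibrs, has_ssbd))


if __name__ == "__main__":
    # No IBRS: a write of STIBP+SSBD (0x6) is rejected because of STIBP.
    print(write_faults(0x6, has_ibrs=False, has_ssbd=True))
    # No SSBD: a write of SSBD alone (0x4) is rejected.
    print(write_faults(0x4, has_ibrs=True, has_ssbd=False))
```

In this model the two traces in the thread map to two different rejected bits, which matches the conclusion that they are separate bugs.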
(In reply to Maxim Levitsky from comment #5) > I found mostly the root cause of this on my machine, but it is different > from the root cause on the bug reporter machine. > > On my machine, strangely I don't have support for 'IBRS', so called > 'Indirect Branch Restricted Speculation' > But I do have support for the rest of spectre_v2 mitigation and namely I do > have support for STIBP and IBPB > > > However code in kvm_spec_ctrl_valid_bits does #GP on both IBRS and STIBP > bits when IBRS is not supported, > despite the fact that AMD has separate bits for both. > > So I guess I'll open a new bug for my case. > > > For the case described in the bug, > > Guo, Zhiyi, could you provide cpuid of the host machine? > Something like 'taskset -c 0 cpuid -r -1' > > > On your machine apparently kvm doesn't like guest setting the SSBD bit on > IA32_SPEC_CTRL > This should only happen when kvm_spec_ctrl_valid_bits (in the kernel) > detects that either guest > or the host don't support it. I think the stuff I'm remembering is: https://lkml.org/lkml/2019/8/21/158 from last year which was similar, then I see the recent: 'kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD'
I opened a new bug for my case: https://bugzilla.redhat.com/show_bug.cgi?id=1853447

For _this_ bug, after talking with Paulo today, we have a theory. It is possible that this 1st generation EPYC supports SSBD in a non-standard way, as indicated in the AMD description here, under 'NON-ARCHITECTURAL MSRS': https://bugzilla.kernel.org/show_bug.cgi?id=199889

In this case, the SSBD bit in IA32_SPEC_CTRL shouldn't be advertised to the guest; only support for this bit in the so-called 'VIRT_SPEC_CTRL' should be. I don't know if qemu enforces this, but the feature was forced in the configuration:

> <feature policy='require' name='amd-ssbd'/>

It is a pity that RHEL8 doesn't have the 'cpuid' tool anymore: https://bugzilla.redhat.com/show_bug.cgi?id=1568562

Next week I'll look at how to manually dump cpuid on the host to check whether these 'NON-ARCHITECTURAL MSRS' are indeed involved in this case.
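One possible way to dump a CPUID leaf without the 'cpuid' tool is the Linux cpuid character device. The sketch below assumes the kernel's cpuid driver is loaded (for example via 'modprobe cpuid'), that /dev/cpu/0/cpuid exists, and that the script runs with enough privileges to read it. The helper names `read_cpuid` and `unpack_cpuid_regs` are hypothetical.

```python
import os
import struct


def unpack_cpuid_regs(raw: bytes) -> dict:
    """Split the 16 bytes returned by the cpuid device into four registers."""
    eax, ebx, ecx, edx = struct.unpack("<4I", raw)
    return {"eax": eax, "ebx": ebx, "ecx": ecx, "edx": edx}


def read_cpuid(leaf: int, subleaf: int = 0, cpu: int = 0) -> dict:
    """Read one CPUID leaf from /dev/cpu/<cpu>/cpuid.

    The file offset selects the leaf (low 32 bits) and the subleaf
    (high 32 bits); each read returns eax, ebx, ecx, edx.
    """
    fd = os.open(f"/dev/cpu/{cpu}/cpuid", os.O_RDONLY)
    try:
        raw = os.pread(fd, 16, (subleaf << 32) | leaf)
    finally:
        os.close(fd)
    return unpack_cpuid_regs(raw)


if __name__ == "__main__":
    try:
        for leaf in (0x00000007, 0x80000008):
            regs = read_cpuid(leaf)
            print(f"0x{leaf:08x}: " +
                  " ".join(f"{r}=0x{v:08x}" for r, v in regs.items()))
    except (OSError, struct.error) as exc:
        # Device missing, driver not loaded, or insufficient privileges.
        print(f"could not read cpuid device: {exc}")
```

Leaves 0x00000007 and 0x80000008 are the ones that carry the speculation-control feature bits discussed in this bug.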
I can't reproduce the issue on rhel8.3-av, guest works well without any call trace. qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420 host kernel: 4.18.0-221.el8.x86_64 guest kernel: 4.18.0-214.el8.x86_64 libvirt-client-6.0.0-25.module+el8.3.0+7176+57f10f42.x86_64 Host: AMD EPYC 7251 8-Core Processor <cpu mode='host-model'/> QEMU cli generated: -cpu EPYC-IBPB,x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,arch-capabilities=on,xsaves=on,cmp-legacy=on,perfctr-core=on,clzero=on,virt-ssbd=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,monitor=off,svm=off,hv-time,hv-vapic,hv-spinlocks=0x1000
You are missing +topoext, and the guest needs more than one vcpu. Otherwise the guest thinks that SMT is disabled, and thus that this mitigation is not needed.
(In reply to Maxim Levitsky from comment #10) > You are missing +topoext, and the guest needs more that one vcpu. > Otherwise the guest thinks that SMT is disabled, and thus this mitigation is > not needed. Seems topoext is removed from libvirt cpu mode=host-model. However, I retried with qemu directly with adding +topoext, still didn't hit the call trace. QEMU cli: -smp 16,maxcpus=16,cores=4,threads=2,dies=1,sockets=2 \ -cpu EPYC-IBPB,x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,arch-capabilities=on,xsaves=on,cmp-legacy=on,perfctr-core=on,clzero=on,virt-ssbd=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,monitor=off,svm=off,hv-time,hv-vapic,hv-spinlocks=0x1000,+topoext Guest cpu vulnerabilities are as below: # grep . /sys/devices/system/cpu/vulnerabilities/* /sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected /sys/devices/system/cpu/vulnerabilities/l1tf:Not affected /sys/devices/system/cpu/vulnerabilities/mds:Not affected /sys/devices/system/cpu/vulnerabilities/meltdown:Not affected /sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: Speculative Store Bypass disabled via prctl and seccomp /sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: usercopy/swapgs barriers and __user pointer sanitization /sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling /sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
@Yumei Huang, please disregard my comment #10. I mixed this bug up with the other one, for which I opened a separate bug report. Sorry, my mistake.

It now looks like libvirt only enables 'virt-ssbd', which more or less confirms the theory that we had. Could you install and run 'cpuid' on the host and in the guest, though, to be sure?
(In reply to Maxim Levitsky from comment #12) > @Yumei Huang, please disregard my comment #10, I mixed up this with the > other bug which I opened a separate bugreport about, sorry my mistake. > > It looks now that libvirt only enables 'virt-ssbd' and this kind of confirms > the theory that we had. > Could you install and run 'cpuid' in the host and guest though to be sure? Sure. On host: # cpuid -r -1 CPU: 0x00000000 0x00: eax=0x0000000d ebx=0x68747541 ecx=0x444d4163 edx=0x69746e65 0x00000001 0x00: eax=0x00800f12 ebx=0x01100800 ecx=0x7ed8320b edx=0x178bfbff 0x00000002 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000005 0x00: eax=0x00000040 ebx=0x00000040 ecx=0x00000003 edx=0x00000011 0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000001 edx=0x00000000 0x00000007 0x00: eax=0x00000000 ebx=0x209c01a9 ecx=0x00000000 edx=0x00000000 0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000d 0x00: eax=0x00000007 ebx=0x00000340 ecx=0x00000340 edx=0x00000000 0x0000000d 0x01: eax=0x0000000f ebx=0x00000340 ecx=0x00000000 edx=0x00000000 0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000 0x80000000 0x00: eax=0x8000001f ebx=0x68747541 ecx=0x444d4163 edx=0x69746e65 0x80000001 0x00: eax=0x00800f12 ebx=0x40000000 ecx=0x35c233ff edx=0x2fd3fbff 0x80000002 0x00: eax=0x20444d41 ebx=0x43595045 ecx=0x35323720 edx=0x2d382031 0x80000003 0x00: eax=0x65726f43 ebx=0x6f725020 ecx=0x73736563 edx=0x2020726f 0x80000004 0x00: eax=0x20202020 ebx=0x20202020 ecx=0x20202020 edx=0x00202020 0x80000005 0x00: eax=0xff40ff40 ebx=0xff40ff40 ecx=0x20080140 edx=0x40040140 0x80000006 0x00: eax=0x36006400 ebx=0x56006400 
ecx=0x02006140 edx=0x0100c140 0x80000007 0x00: eax=0x00000000 ebx=0x0000001b ecx=0x00000000 edx=0x00006799 0x80000008 0x00: eax=0x00003030 ebx=0x00001007 ecx=0x0000600f edx=0x00000000 0x80000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000a 0x00: eax=0x00000001 ebx=0x00008000 ecx=0x00000000 edx=0x0001bcff 0x8000000b 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000d 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000e 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000f 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000010 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000012 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000014 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000015 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000016 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000017 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000018 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000019 0x00: eax=0xf040f040 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001a 0x00: eax=0x00000003 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001b 0x00: eax=0x000003ff ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001d 0x00: eax=0x00004121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000 0x8000001d 0x01: eax=0x00004122 ebx=0x00c0003f ecx=0x000000ff edx=0x00000000 0x8000001d 0x02: eax=0x00004143 ebx=0x01c0003f ecx=0x000003ff edx=0x00000002 0x8000001d 0x03: eax=0x00004163 
ebx=0x03c0003f ecx=0x00000fff edx=0x00000001 0x8000001e 0x00: eax=0x00000001 ebx=0x00000100 ecx=0x00000300 edx=0x00000000 0x8000001f 0x00: eax=0x0000000f ebx=0x0000016f ecx=0x0000000f edx=0x00000008 0x80860000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0xc0000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 In guest: # cpuid -r -1 CPU: 0x00000000 0x00: eax=0x0000000d ebx=0x68747541 ecx=0x444d4163 edx=0x69746e65 0x00000001 0x00: eax=0x00800f12 ebx=0x00080800 ecx=0xfff83203 edx=0x178bfbff 0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004b edx=0x002cff80 0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000004 0x00: eax=0x0c000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001 0x00000004 0x01: eax=0x0c000122 ebx=0x00c0003f ecx=0x000000ff edx=0x00000001 0x00000004 0x02: eax=0x0c004043 ebx=0x01c0003f ecx=0x000003ff edx=0x00000000 0x00000004 0x03: eax=0x0c01c163 ebx=0x03c0003f ecx=0x00001fff edx=0x00000006 0x00000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000000 0x00000006 0x00: eax=0x00000004 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000007 0x00: eax=0x00000000 ebx=0x209c01ab ecx=0x00000000 edx=0x20000000 0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000000 0x0000000b 0x01: eax=0x00000003 ebx=0x00000008 ecx=0x00000201 edx=0x00000000 0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x0000000d 0x00: eax=0x00000007 ebx=0x00000340 ecx=0x00000340 edx=0x00000000 0x0000000d 0x01: eax=0x0000000f ebx=0x00000340 ecx=0x00000000 edx=0x00000000 0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000 0x40000000 0x00: eax=0x40000005 ebx=0x7263694d ecx=0x666f736f edx=0x76482074 
0x40000001 0x00: eax=0x31237648 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x40000002 0x00: eax=0x00001bbc ebx=0x00060001 ecx=0x00000000 edx=0x00000000 0x40000003 0x00: eax=0x00000232 ebx=0x00000000 ecx=0x00000000 edx=0x00000008 0x40000004 0x00: eax=0x00000008 ebx=0x00001000 ecx=0x00000000 edx=0x00000000 0x40000005 0x00: eax=0xffffffff ebx=0x00000040 ecx=0x00000000 edx=0x00000000 0x80000000 0x00: eax=0x8000001e ebx=0x68747541 ecx=0x444d4163 edx=0x69746e65 0x80000001 0x00: eax=0x00800f12 ebx=0x00000000 ecx=0x00c003f3 edx=0x2fd3fbff 0x80000002 0x00: eax=0x20444d41 ebx=0x43595045 ecx=0x6f725020 edx=0x73736563 0x80000003 0x00: eax=0x2820726f ebx=0x68746977 ecx=0x50424920 edx=0x00002942 0x80000004 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000005 0x00: eax=0x01ff01ff ebx=0x01ff01ff ecx=0x20080140 edx=0x40040140 0x80000006 0x00: eax=0x00000000 ebx=0x42004200 ecx=0x02006140 edx=0x00408140 0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000008 0x00: eax=0x00003030 ebx=0x02001001 ecx=0x00000007 edx=0x00000000 0x80000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000b 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000d 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000e 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000000f 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000010 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000011 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000012 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000013 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000014 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 
edx=0x00000000 0x80000015 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000016 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000017 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000018 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x80000019 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001b 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x8000001d 0x00: eax=0x00004121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001 0x8000001d 0x01: eax=0x00004122 ebx=0x00c0003f ecx=0x000000ff edx=0x00000001 0x8000001d 0x02: eax=0x00004043 ebx=0x01c0003f ecx=0x000003ff edx=0x00000000 0x8000001d 0x03: eax=0x0001c163 ebx=0x03c0003f ecx=0x00001fff edx=0x00000006 0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000000 edx=0x00000000 0x80860000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0xc0000000 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
Host:

0x80000008 0x00: ebx=0x00001007

This means that the host only supports IBPB, thus SSBD has to be supported via MSR_AMD64_LS_CFG, which I assume is true since the host kernel did report support for it. So this confirms our theory.

Now for the guest:

0x80000008 0x00: ebx=0x02001001

Here we see that the guest nicely has IBPB and SSBD in VIRT_SPEC_CTRL, just like it should (writes to VIRT_SPEC_CTRL are trapped to the host and forwarded to the non-architectural MSR_AMD64_LS_CFG).

In regard to the 'fake' Intel CPU bug mitigation bits, in the guest we have:

0x00000007 0x00: edx=0x20000000

Here we just have support for virtualized ARCH_CAPABILITIES, so no Intel CPU bug mitigation bits are on at all.

So it all looks right, and I guess the bug can be closed now. I guess that the root cause of this bug was these forced features:

<feature policy='require' name='ssbd'/>
<feature policy='require' name='amd-ssbd'/>

These forced the guest to think that IA32_SPEC_CTRL is supported and has the SSBD bit in it.
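The register values quoted in this comment can be decoded with a small sketch like the one below. The bit positions for amd-ssbd (bit 24 of CPUID.80000008H:EBX) and ssbd (bit 31 of CPUID.07H:EDX) match the QEMU warnings quoted later in this thread; the remaining positions are filled in from the AMD and Intel CPUID layouts as I understand them, and the short names are chosen for illustration only.

```python
from typing import Dict, List

# Selected bits of CPUID.80000008H:EBX (AMD extended feature leaf).
EXT_80000008_EBX = {
    0: "clzero",
    1: "irperf",
    2: "xsaveerptr",
    12: "ibpb",
    14: "ibrs",
    15: "stibp",
    24: "amd-ssbd",
    25: "virt-ssbd",
    26: "amd-no-ssb",
}

# Selected bits of CPUID.07H:EDX (the Intel-style mitigation bits).
LEAF_7_EDX = {
    26: "spec-ctrl",
    27: "stibp",
    29: "arch-capabilities",
    31: "ssbd",
}


def decode(value: int, table: Dict[int, str]) -> List[str]:
    """Return the names of the known bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & (1 << bit)]


if __name__ == "__main__":
    print("host  80000008 ebx:", decode(0x00001007, EXT_80000008_EBX))
    print("guest 80000008 ebx:", decode(0x02001001, EXT_80000008_EBX))
    print("guest 00000007 edx:", decode(0x20000000, LEAF_7_EDX))
```

With these tables, the host value contains ibpb but neither amd-ssbd nor virt-ssbd, the guest value contains ibpb and virt-ssbd, and the guest's leaf 7 EDX contains only arch-capabilities, which agrees with the reading given in the comment.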
(In reply to Maxim Levitsky from comment #14) > Host: > > 0x80000008 0x00: ebx=0x00001007 > > This means that host only supports IBPB, thus SSBD has to be supported via > MSR_AMD64_LS_CFG, which I assume is true since host kernel did report > support for > it. > > So this confirms ours theory > > Now for guest: > > 0x80000008 0x00: ebx=0x02001001 > > Here we see that guest nicely have IBPB and SSBD in virt_spec_ctrl, just > like it should be > (writes to VIRT_SPEC_CTRL are trapped to the host and forwared to the non > arch MSR_AMD64_LS_CFG) > > In regard to 'fake' intel cpu bug mitigations bits we have in guest: > 0x00000007 0x00: edx=0x20000000 > Here we just have support for ARCH_CAPABILITIES virtualized, so no intel cpu > bug mitigations bit are on at all. > > So it looks all right, and I guess the bug can be closed now. > > I guess that the root cause of this bug was this forced features: > > <feature policy='require' name='ssbd'/> > <feature policy='require' name='amd-ssbd'/> > > Which forced the guest to think that IA32_SPEC_CTRL is supported and has > SSBD bit in it. I see. In my test, these two flags are no longer added. Is the fix on qemu or libvirt? BTW, I had a try with adding the flags in qemu cli, got warnings about host not supporting them. (qemu) qemu-kvm: warning: host doesn't support requested feature: CPUID.07H:EDX.ssbd [bit 31] qemu-kvm: warning: host doesn't support requested feature: CPUID.80000008H:EBX.amd-ssbd [bit 24]
> BTW, I had a try with adding the flags in qemu cli, got warnings about host not supporting them.

Which is exactly how it should be.

The only question that remains, which is mostly of academic interest, is whether these flags were wrongly reported as 'supported' with the original kernel/qemu versions when the bug was reported, and thus set by default by libvirt, or whether these two features were simply forced. In the latter case it would be a libvirt bug, or even user error if the user added them explicitly like you tried.
(In reply to Maxim Levitsky from comment #16) > > BTW, I had a try with adding the flags in qemu cli, got warnings about host not supporting them. > > Which is exactly how it should be. > > The only question that remains, which is mostly of academic interest, > is if these flags weren't wrongly 'supported' with original kernel/qemu > versions when the bug was reported, > and thus set by default by libvirt, > or if these two features were just forced and then it would be libvirt bug > or even user error if the user > added them explicitly like you tried. Thanks for the explanation. As the issue is gone, I think we can put it aside and close this bug. What do you think, Zhiyi?
(In reply to Yumei Huang from comment #17) > (In reply to Maxim Levitsky from comment #16) > > > BTW, I had a try with adding the flags in qemu cli, got warnings about host not supporting them. > > > > Which is exactly how it should be. > > > > The only question that remains, which is mostly of academic interest, > > is if these flags weren't wrongly 'supported' with original kernel/qemu > > versions when the bug was reported, > > and thus set by default by libvirt, > > or if these two features were just forced and then it would be libvirt bug > > or even user error if the user > > added them explicitly like you tried. > > Thanks for the explanation. As the issue is gone, I think we can put it > aside and close this bug. What do you think, Zhiyi?

Agreed. I cannot reproduce this issue either, so I am closing this as CLOSED CURRENTRELEASE.
Hi Maxim, I hit the issue on 8.2.1-av. Do we plan to fix it in 8.2.1 zstream? qemu-kvm-4.2.0-29.module+el8.2.1+8442+7a3eadf7.5 host kernel: 4.18.0-193.28.1.el8_2.x86_64 guest kernel: 4.18.0-240.el8.x86_64 Host: AMD EPYC 7251 8-Core Processor QEMU cli: # /usr/libexec/qemu-kvm \ -name "mouse-vm" \ -sandbox on \ -machine pc-q35-rhel8.2.0 \ -cpu host \ -nodefaults \ -vga std \ -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1,server,nowait \ -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor,server,nowait \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -mon chardev=qmp_id_catch_monitor,mode=control \ -device pcie-root-port,port=0x10,chassis=1,id=root0,bus=pcie.0,multifunction=on,addr=0x2 \ -device pcie-root-port,port=0x11,chassis=2,id=root1,bus=pcie.0,addr=0x2.0x1 \ -device pcie-root-port,port=0x12,chassis=3,id=root2,bus=pcie.0,addr=0x2.0x2 \ -device pcie-root-port,port=0x13,chassis=4,id=root3,bus=pcie.0,addr=0x2.0x3 \ -device pcie-root-port,port=0x14,chassis=5,id=root4,bus=pcie.0,addr=0x2.0x4 \ -device pcie-root-port,port=0x15,chassis=6,id=root5,bus=pcie.0,addr=0x2.0x5 \ -device pcie-root-port,port=0x16,chassis=7,id=root6,bus=pcie.0,addr=0x2.0x6 \ -device pcie-root-port,port=0x17,chassis=8,id=root7,bus=pcie.0,addr=0x2.0x7 \ -device nec-usb-xhci,id=usb1,bus=root0 \ -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=root1 \ -device scsi-hd,id=image1,drive=drive_image1,bus=virtio_scsi_pci0.0,channel=0,scsi-id=0,lun=0,bootindex=0 \ -device virtio-net-pci,mac=9a:8a:8b:8c:8d:8e,id=net0,vectors=4,netdev=tap0,bus=root2 \ -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \ -device virtio-balloon-pci,id=balloon0,bus=root3 \ -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/yuhuang/rhel830-64-virtio-scsi.qcow2 \ -netdev tap,id=tap0,vhost=on \ -m 4096,slots=256,maxmem=128G \ -object memory-backend-ram,id=mem-1,size=2048M,prealloc=yes -numa node,memdev=mem-1 \ -object 
memory-backend-file,id=mem-2,size=2048M,prealloc=yes,mem-path=/dev/hugepages -numa node,memdev=mem-2 \ -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \ -vnc :10 \ -rtc base=utc,clock=host \ -boot menu=off,strict=off,order=cdn,once=c \ -enable-kvm \ -qmp tcp:0:3333,server,nowait \ -serial tcp:0:4444,server,nowait \ -monitor stdio Guest call trace: 2020-10-26-02:19:38: [ 2.143166] unchecked MSR access error: WRMSR to 0x48 (tried to write 0x0000000000000004) at rIP: 0xffffffffa8a64f74 (native_write_msr+0x4/0x20) 2020-10-26-02:19:38: [ 2.143168] Call Trace: 2020-10-26-02:19:38: [ 2.143193] speculation_ctrl_update+0x78/0x1f0 2020-10-26-02:19:38: [ 2.143201] speculation_ctrl_update_current+0x1b/0x20 2020-10-26-02:19:38: [ 2.143204] ssb_prctl_set+0xb2/0xd0 2020-10-26-02:19:38: [ 2.143207] arch_seccomp_spec_mitigate+0x3e/0x40 2020-10-26-02:19:38: [ 2.143210] do_seccomp+0x691/0x6e0 2020-10-26-02:19:38: [ 2.143215] do_syscall_64+0x5b/0x1a0 2020-10-26-02:19:38: [ 2.143220] entry_SYSCALL_64_after_hwframe+0x65/0xca 2020-10-26-02:19:38: [ 2.143224] RIP: 0033:0x7f65ee30d78d 2020-10-26-02:19:38: [ 2.143227] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cb 56 2c 00 f7 d8 64 89 01 48 2020-10-26-02:19:38: [ 2.143229] RSP: 002b:00007ffff5305fc8 EFLAGS: 00000246 ORIG_RAX: 000000000000013d 2020-10-26-02:19:38: [ 2.143232] RAX: ffffffffffffffda RBX: 00005608499fbbf0 RCX: 00007f65ee30d78d 2020-10-26-02:19:38: [ 2.143233] RDX: 0000560849a01de0 RSI: 0000000000000000 RDI: 0000000000000001 2020-10-26-02:19:38: [ 2.143234] RBP: 0000560849a01de0 R08: 00005608499fbbf0 R09: 000000004000003e 2020-10-26-02:19:38: [ 2.143235] R10: 0000000000000007 R11: 0000000000000246 R12: 00005608499a0cd8 2020-10-26-02:19:38: [ 2.143235] R13: 00007ffff5306018 R14: 00007ffff5306020 R15: 00007f65efad4674
I personally don't know. Note that the issue is about the SSBD bit, which is not something I have investigated much. I think the issue is that the host only supports SSBD via a non-architectural MSR tweak, while we expose it via the architectural IA32_SPEC_CTRL bit. Because of that and comment #6, I think that we probably need to backport 'kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD'.
It looks like we did do that, and that has now triggered https://bugzilla.redhat.com/show_bug.cgi?id=1915229