Created attachment 1196157 [details]
BSOD snapshot

Description of problem:
When the CPU flags are set to "-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep,-avx2,-bmi1,+fsgsbase,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme", the guest hits a BSOD. After changing the CPU flags to "-cpu SandyBridge", the guest boots into the OS normally.

Version-Release number of selected component (if applicable):
kernel-2.6.32-642.el6.x86_64
qemu version: qemu-kvm-rhev-0.12.1.2-2.415.el6.15
Virtio driver version: 62.70.104.81

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest:
/usr/libexec/qemu-kvm \
-name instance-f37065fc-f948-4d3d-a817-800a376ef5f9 \
-M rhel6.5.0 \
-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep,-avx2,-bmi1,+fsgsbase,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme \
-enable-kvm -m 3G -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 \
-uuid f37065fc-f948-4d3d-a817-800a376ef5f9 \
-smbios type=1,manufacturer="Red Hat",product="RHEV Hypervisor",version=4.0-0.7.el7,serial=4C4C4545-0056-4210-8032-C3C04F463358,uuid=8c2c5ee5-9775-4ae4-a3f9-5483fd3457f2 \
-nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/test,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-drive file=2012r2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:d7:09:25,bus=pci.0 \
-chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 \
-vga cirrus -vnc 0.0.0.0:12 -k en-us -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-drive file=en_windows_server_2012_r2_x64_dvd_2707946.iso,media=cdrom,id=cdrom,if=none -device ide-drive,drive=cdrom \
-cdrom virtio81.iso
2. Check the guest status.

Actual results:
BSOD

Expected results:
Boot normally

Additional info:
1. Tried to boot after removing all virtio devices; the BSOD still occurred.
Additional info:
1. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,-avx2,-bmi1,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme" (the original list without +smep and +fsgsbase), the system boots normally.
2. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep", the system hits a BSOD.
3. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,+fsgsbase", the system hits a BSOD.

Best Regards~
Peixiu Hou
Hi Peixiu Hou, I haven't been able to reproduce this on my test host. Would it be possible to share a memory dump or provide access to a repro machine? Thanks, Ladi
(In reply to Ladi Prosek from comment #4)
> Hi Peixiu Hou,
>
> I haven't been able to reproduce this on my test host. Would it be possible
> to share a memory dump or provide access to a repro machine?
>
> Thanks,
> Ladi

Hi Ladi,

You can use the following machine to reproduce it. If the CPU flags include +smep or +fsgsbase, the bug reproduces.
Host ip: 10.66.6.137
Password: Assentor01

The memory dump file has been uploaded to the following location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/bug1371765/cpu_flag_BSOD.DMP.zip

Best Regards~
Peixiu Hou
(In reply to Peixiu Hou from comment #5)
> Hi Ladi,
>
> You can use the follow machine to reproduce it. If point cpu flag with +smep
> or +fsgsbase, bug will be reproduced.
> Host ip: 10.66.6.137
> Password: Assentor01
>
> The memory dump file has been uploaded to follow location:
> http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/bug1371765/cpu_flag_BSOD.DMP.zip

Thank you! The dump doesn't look related but having the machine is awesome. Ok if I keep it for another day or two?

Radim has been helping me look into it and this is what we know so far (the fsgsbase case):

* The host CPU does not support the feature; you'll get an error with '-cpu ..,enforce'.
* cpuid in the guest returns ebx=1 for eax=7,ecx=0, which indicates to the guest OS that fsgsbase is supported.
* The guest OS tries to set bit 16 in cr4 to enable the fsgsbase functionality.
* This triggers #GP / PRIV_INSTRUCTION (c0000096) and Windows crashes.

At this point it looks like a discrepancy between what KVM claims to support via cpuid and what it really supports when it comes to the mov cr4, rax part.
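The CPUID and CR4 bits mentioned in the list above can be sketched as follows. This is only an illustration, not KVM or Windows code: the macro and function names are made up for this example, and the bit positions (FSGSBASE = EBX bit 0 and SMEP = EBX bit 7 of CPUID leaf 7, subleaf 0; CR4.FSGSBASE = bit 16 and CR4.SMEP = bit 20) are taken from the Intel SDM rather than from this bug.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits (Intel SDM). Names are illustrative. */
#define LEAF7_EBX_FSGSBASE (UINT32_C(1) << 0)
#define LEAF7_EBX_SMEP     (UINT32_C(1) << 7)

/* CR4 enable bits a guest OS sets after it sees the feature in CPUID. */
#define CR4_FSGSBASE (UINT64_C(1) << 16)
#define CR4_SMEP     (UINT64_C(1) << 20)

/* True if the leaf-7 EBX value advertises FSGSBASE. */
bool leaf7_has_fsgsbase(uint32_t ebx)
{
    return (ebx & LEAF7_EBX_FSGSBASE) != 0;
}

/* True if the leaf-7 EBX value advertises SMEP. */
bool leaf7_has_smep(uint32_t ebx)
{
    return (ebx & LEAF7_EBX_SMEP) != 0;
}

/* CR4 bits a guest would try to enable for the features advertised in EBX. */
uint64_t cr4_bits_for_leaf7(uint32_t ebx)
{
    uint64_t cr4 = 0;

    if (leaf7_has_fsgsbase(ebx))
        cr4 |= CR4_FSGSBASE;
    if (leaf7_has_smep(ebx))
        cr4 |= CR4_SMEP;
    return cr4;
}
```

With ebx=1, as observed in the guest, `cr4_bits_for_leaf7` yields only bit 16, which matches the cr4 write that Windows attempts before it crashes.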
(In reply to Ladi Prosek from comment #6)
> Thank you! The dump doesn't look related but having the machine is awesome.
> Ok if I keep it for another day or two?

Yes, you can~
I think I understand it now. tl;dr CR4 accesses are not properly intercepted in RHEL 6.

RHEL 6:

vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);

where

#define KVM_GUEST_CR4_MASK \
        (X86_CR4_VME | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_PGE | X86_CR4_VMXE \
         | X86_CR4_OSXSAVE | X86_CR4_PCIDE)

RHEL 7:

vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
...
vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);

where

#define KVM_CR4_GUEST_OWNED_BITS \
        (X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR \
         | X86_CR4_OSXMMEXCPT | X86_CR4_TSD)

CR4_GUEST_HOST_MASK is supposed to be 1 where the bit is host-owned and VMEXIT on write is desired. In RHEL 6, X86_CR4_RDWRGSFS and X86_CR4_SMEP are *not* set in the mask, so the write goes directly to the CPU and fails because the CPU doesn't support the feature. This is in line with the lack of "cr_{read,write} 4" trace output we observed on the test host.

Radim, can you please double-check this?
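The two mask definitions quoted above can be compared bit by bit. The following is a minimal standalone sketch, not kernel code: the macro names mirror the ones in this comment, the bit positions are the architectural CR4 layout from the Intel SDM, and the three helper functions are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions (Intel SDM). */
#define X86_CR4_VME        (UINT64_C(1) << 0)
#define X86_CR4_PVI        (UINT64_C(1) << 1)
#define X86_CR4_TSD        (UINT64_C(1) << 2)
#define X86_CR4_DE         (UINT64_C(1) << 3)
#define X86_CR4_PSE        (UINT64_C(1) << 4)
#define X86_CR4_PAE        (UINT64_C(1) << 5)
#define X86_CR4_PGE        (UINT64_C(1) << 7)
#define X86_CR4_PCE        (UINT64_C(1) << 8)
#define X86_CR4_OSFXSR     (UINT64_C(1) << 9)
#define X86_CR4_OSXMMEXCPT (UINT64_C(1) << 10)
#define X86_CR4_VMXE       (UINT64_C(1) << 13)
#define X86_CR4_RDWRGSFS   (UINT64_C(1) << 16)
#define X86_CR4_PCIDE      (UINT64_C(1) << 17)
#define X86_CR4_OSXSAVE    (UINT64_C(1) << 18)
#define X86_CR4_SMEP       (UINT64_C(1) << 20)

/* RHEL 6 style: an explicit list of host-owned bits. */
#define KVM_GUEST_CR4_MASK \
    (X86_CR4_VME | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_PGE | X86_CR4_VMXE \
     | X86_CR4_OSXSAVE | X86_CR4_PCIDE)

/* RHEL 7 style: an explicit list of guest-owned bits; the rest is host-owned. */
#define KVM_CR4_GUEST_OWNED_BITS \
    (X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR \
     | X86_CR4_OSXMMEXCPT | X86_CR4_TSD)

/* Value written to CR4_GUEST_HOST_MASK in the RHEL 6 snippet. */
uint64_t rhel6_cr4_guest_host_mask(void)
{
    return KVM_GUEST_CR4_MASK;
}

/* Value written to CR4_GUEST_HOST_MASK in the RHEL 7 snippet. */
uint64_t rhel7_cr4_guest_host_mask(void)
{
    return ~KVM_CR4_GUEST_OWNED_BITS;
}

/* A guest write to a CR4 bit causes a VMEXIT only if the mask bit is 1. */
bool cr4_write_intercepted(uint64_t guest_host_mask, uint64_t cr4_bit)
{
    return (guest_host_mask & cr4_bit) != 0;
}
```

Under these assumptions, a write to X86_CR4_RDWRGSFS or X86_CR4_SMEP is not intercepted with the RHEL 6 mask but is intercepted with the RHEL 7 mask, because the inverted guest-owned list treats every unlisted bit as host-owned.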
Your analysis seems correct. The guest would get a #GP when setting the bit.

We ought to have backported 66865b5304ecaa7a817b63b625847b3f53a5de00 ("KVM: VMX: Make guest cr4 mask more conservative") when we first hit this problem, but we added two RHEL6-only workarounds instead ...

I have started a brew build with the fix (please ignore the brew name -- I noticed it just now and it would take another ~25 minutes to start a new build ...):
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11902969

If you run it, the guest should fail with an internal emulation error when executing an instruction that needs fsgsbase, because KVM doesn't emulate them (not even upstream).

This bug looks like WONTFIX to me -- our software stack just didn't complain early about a user error and RHEL6 is too mature to fix that.

Btw. can this happen with RHEL6 libvirt in the middle?
(In reply to Radim Krčmář from comment #9)
> If you run it, the guest should fail with an internal emulation error when
> executing an instruction the needs fsgsbase, because KVM doesn't emulate
> them (not even upstream).

Here's what I get with the test kernel:

(qemu) KVM: entry failed, hardware error 0x80000021
kvm_run returned -22
rax 00000000000106b8 rbx 00000000b1193dfe rcx 0000000000000001 rdx 0000000000000000
rsi 0000000000000000 rdi 0000000000000000 rsp ffffd00020445b10 rbp ffffd00020445c10
r8 000000000000030a r9 ffffe000000e2ce0 r10 00000000c0010010 r11 fffff803071dcc01
r12 fffff80306293e10 r13 0000000000000002 r14 fffff803062a8f80 r15 0000000000000001
rip fffff8030776f2f3 rflags 00000246
cs 0010 (00000000/00000000 p 1 dpl 0 db 0 s 1 type b l 1 g 0 avl 0)
ds 002b (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
es 002b (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
ss 0018 (00000000/ffffffff p 1 dpl 0 db 1 s 1 type 3 l 0 g 1 avl 0)
fs 0053 (00000000/00003c00 p 1 dpl 3 db 1 s 1 type 3 l 0 g 0 avl 0)
gs 002b (fffff80307371000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
tr 0040 (fffff8030a759080/00000067 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/ffffffff p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt fffff8030a758000/7f
idt fffff8030a758080/fff
cr0 80050031 cr2 ffffc00000010000 cr3 210000 cr4 106b8 cr8 0 efer d01

> This bug looks like WONTFIX to me -- our software stack just didn't complain
> early about an user error and RHEL6 is too mature to fix that.
>
> Btw. can this happen with RHEL6 libvirt in the middle?

Peixiu Hou, do the problematic cpu flags come from a libvirt-based deployment or is it just command line fuzzing?
> (In reply to Radim Krčmář from comment #9)
>> If you run it, the guest should fail with an internal emulation error when
>> executing an instruction the needs fsgsbase, because KVM doesn't emulate
>> them (not even upstream).
>
> Here's what I get with the test kernel:
>
> (qemu) KVM: entry failed, hardware error 0x80000021
> kvm_run returned -22

Ah, it results in a more interesting situation: VMX tries to set the guest cr4 in the hardware cr4 during guest entry, which fails because bit 16 is reserved.

Upstream KVM still accepts any CPUID from userspace and we have only stopped QEMU from being silly with CPUID in upstream/RHEL7, so we can only backport from QEMU and/or libvirt.
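The cr4 value from the register dump (106b8) can be checked against the bit that the host treats as reserved. The sketch below is only an illustration: the helper name and the idea of passing a "host valid bits" mask are assumptions made for this example, not how KVM or the CPU performs the check, and the bit 16 position comes from the Intel SDM.

```c
#include <stdint.h>

/* CR4.FSGSBASE is bit 16 (Intel SDM); reserved on CPUs without the feature. */
#define CR4_FSGSBASE (UINT64_C(1) << 16)

/*
 * Hypothetical helper: return the bits of guest_cr4 that fall outside
 * host_valid_mask. A non-zero result means the value cannot be loaded
 * into the hardware CR4 on that host.
 */
uint64_t cr4_reserved_bits_set(uint64_t guest_cr4, uint64_t host_valid_mask)
{
    return guest_cr4 & ~host_valid_mask;
}
```

For a host without fsgsbase, modelled here as `~CR4_FSGSBASE`, the dumped value 0x106b8 leaves exactly bit 16 (0x10000) as the offending bit, while the same value without that bit (0x6b8) passes.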
(In reply to Ladi Prosek from comment #10)
> Peixiu Hou, do the problematic cpu flags come from a libvirt-based
> deployment or is it just command line fuzzing?

Hi Ladi,

I found this issue when I tried to reproduce a customer's issue (https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/77). The problematic CPU flags are the ones the customer used. I'm not sure whether they come from a libvirt-based deployment, but when I used virt-manager to boot a guest, the CPU flag was just "-cpu SandyBridge".

Best Regards~
Peixiu Hou
(In reply to Peixiu Hou from comment #12)
> Hi Ladi,
>
> I found this issue when I tried to reproduce a customer's
> issue(https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/77).
> The problematic cpu flags come from customer used. I'm not sure if it come
> from libvirt-based deployment, but I tried use virt-manager to boot a guest,
> the cpu flag just as "-cpu SandyBridge".

Thanks! So based on this and Radim's assessment I'm closing this as WONTFIX. The workaround is simply to not use these flags with old host CPUs.
Created attachment 1212326 [details] system report kernel panic when boot a rhel7.3 guest with cpu flag as "-cpu host,+smep"
Created attachment 1212327 [details] rhel7.3 guest normal booted with cpu flag as "-cpu host,+fsgsbase"
Hi Ladi, Jiri,

I tried this issue with a Linux guest. Details as follows:

1. Booted a rhel7.3 guest with the CPU flags "-cpu host,+smep": it failed, and the system reported a kernel panic. See the attached picture linux_cpu_flags_smep.png.
2. Booted a rhel7.3 guest with the CPU flags "-cpu host,+fsgsbase": it succeeded. The guest CPU info is in the attached picture limux_cpu_flag_fsgsbase.png.

It seems that we need to differentiate between Linux and Windows guests in the Doc Text.

Thanks~
Peixiu Hou