Bug 1371765 - Guest occurred BSOD when added cpu flags as "-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep,..." -win2012R2
Summary: Guest occurred BSOD when added cpu flags as "-cpu SandyBridge,-invpcid,+erms...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ladi Prosek
QA Contact: Virtualization Bugs
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-31 03:36 UTC by Peixiu Hou
Modified: 2016-10-31 07:59 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Booting virtual machines with the "fsgsbase" and "smep" flags on older host CPUs fails.
The "fsgsbase" and "smep" CPU flags are not properly emulated on certain older CPU models, such as the early Intel Xeon E processors. As a consequence, using "fsgsbase" or "smep" when booting a guest virtual machine on a host with such a CPU causes the boot to fail. To work around this problem, do not use "fsgsbase" and "smep" if the CPU does not support them.
Clone Of:
Environment:
Last Closed: 2016-10-14 06:53:44 UTC
Target Upstream Version:


Attachments
BSOD snapshot (20.25 KB, image/png)
2016-08-31 03:36 UTC, Peixiu Hou
system report kernel panic when boot a rhel7.3 guest with cpu flag as "-cpu host,+smep" (14.55 KB, image/png)
2016-10-20 05:22 UTC, Peixiu Hou
rhel7.3 guest normal booted with cpu flag as "-cpu host,+fsgsbase" (272.82 KB, image/png)
2016-10-20 05:24 UTC, Peixiu Hou

Description Peixiu Hou 2016-08-31 03:36:49 UTC
Created attachment 1196157 [details]
BSOD snapshot

Description of problem:
When the guest is started with the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep,-avx2,-bmi1,+fsgsbase,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme", it crashes with a BSOD.
After changing the CPU option to "-cpu SandyBridge", the guest boots into the OS normally.

Version-Release number of selected component (if applicable):
kernel-2.6.32-642.el6.x86_64
qemu version: qemu-kvm-rhev-0.12.1.2-2.415.el6.15
Virtio driver version: 62.70.104.81

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest:
/usr/libexec/qemu-kvm \
-name instance-f37065fc-f948-4d3d-a817-800a376ef5f9 \
-M rhel6.5.0 \
-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep,-avx2,-bmi1,+fsgsbase,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme \
-enable-kvm -m 3G -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 \
-uuid f37065fc-f948-4d3d-a817-800a376ef5f9 \
-smbios type=1,manufacturer="Red Hat",product="RHEV Hypervisor",version=4.0-0.7.el7,serial=4C4C4545-0056-4210-8032-C3C04F463358,uuid=8c2c5ee5-9775-4ae4-a3f9-5483fd3457f2 \
-nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/test,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -no-kvm-pit-reinjection -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-drive file=2012r2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:d7:09:25,bus=pci.0 \
-chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 \
-vga cirrus -vnc 0.0.0.0:12 -k en-us -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
-drive file=en_windows_server_2012_r2_x64_dvd_2707946.iso,media=cdrom,id=cdrom,if=none -device ide-drive,drive=cdrom \
-cdrom virtio81.iso

2. Check the guest status.


Actual results:
BSOD

Expected results:
Boot normally

Additional info:
1. After removing all virtio devices from the command line, the guest still hit the BSOD on boot.

Comment 3 Peixiu Hou 2016-09-28 07:30:18 UTC
Additional info:
1. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,-avx2,-bmi1,-abm,-pdpe1gb,+rdrand,+f16c,+osxsave,-movbe,+dca,+pcid,+pdcm,+xtpr,-fma,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme" (the original set without +smep and +fsgsbase), the guest boots normally.
2. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,+smep", the guest hits the BSOD.
3. With the CPU flags "-cpu SandyBridge,-invpcid,+erms,-bmi2,+fsgsbase", the guest hits the BSOD.


Best Regards~
Peixiu Hou

Comment 4 Ladi Prosek 2016-10-10 17:32:49 UTC
Hi Peixiu Hou,

I haven't been able to reproduce this on my test host. Would it be possible to share a memory dump or provide access to a repro machine?

Thanks,
Ladi

Comment 5 Peixiu Hou 2016-10-12 03:14:53 UTC
(In reply to Ladi Prosek from comment #4)
> Hi Peixiu Hou,
> 
> I haven't been able to reproduce this on my test host. Would it be possible
> to share a memory dump or provide access to a repro machine?
> 
> Thanks,
> Ladi

Hi Ladi,

You can use the following machine to reproduce it. If the cpu flags include +smep or +fsgsbase, the bug is reproduced.
Host ip: 10.66.6.137 
Password: Assentor01

The memory dump file has been uploaded to the following location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/bug1371765/cpu_flag_BSOD.DMP.zip


Best Regards~
Peixiu Hou

Comment 6 Ladi Prosek 2016-10-12 15:23:43 UTC
(In reply to Peixiu Hou from comment #5)

Thank you! The dump doesn't look related but having the machine is awesome. Ok if I keep it for another day or two?

Radim has been helping me look into it and this is what we know so far (the fsgsbase case):

* host CPU does not support the feature, you'll get an error with '-cpu ..,enforce'
* cpuid in the guest returns ebx=1 for eax=7,ecx=0 so this indicates to the guest OS that fsgsbase is supported
* guest OS tries to set bit 16 in cr4 to enable the fsgsbase functionality
* this triggers GP / PRIV_INSTRUCTION (c0000096) and Windows crashes

At this point it looks like a discrepancy between what KVM claims to support via cpuid and what it really supports when it comes to the mov cr4, rax part.
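The chain described above (CPUID advertises the feature, the guest then sets the matching CR4 bit) can be illustrated with a small sketch. The bit positions are the architectural ones for CPUID leaf 7 and CR4; the helper name is made up for illustration and is not part of KVM or QEMU.

```python
# Illustrative sketch only: which CR4 bits a guest OS is expected to set,
# given the EBX value returned by CPUID with eax=7, ecx=0.
CPUID7_EBX_FSGSBASE = 1 << 0   # CPUID.(EAX=7,ECX=0):EBX bit 0
CPUID7_EBX_SMEP = 1 << 7       # CPUID.(EAX=7,ECX=0):EBX bit 7
CR4_FSGSBASE = 1 << 16         # CR4 bit 16 enables RDFSBASE/WRFSBASE and friends
CR4_SMEP = 1 << 20             # CR4 bit 20 enables SMEP

FEATURE_TO_CR4 = {
    "fsgsbase": (CPUID7_EBX_FSGSBASE, CR4_FSGSBASE),
    "smep": (CPUID7_EBX_SMEP, CR4_SMEP),
}


def cr4_bits_guest_will_try(ebx: int) -> dict:
    """Hypothetical helper: map advertised leaf-7 features to their CR4 enable bit."""
    return {
        name: cr4_bit
        for name, (cpuid_bit, cr4_bit) in FEATURE_TO_CR4.items()
        if ebx & cpuid_bit
    }


# ebx=1 is the value observed in the guest for the fsgsbase case.
print({name: hex(bit) for name, bit in cr4_bits_guest_will_try(0x1).items()})
# prints {'fsgsbase': '0x10000'}
```

If the host CPU treats CR4 bit 16 as reserved, a direct write of that bit raises #GP, which matches the c0000096 crash seen in Windows.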

Comment 7 Peixiu Hou 2016-10-13 02:54:35 UTC
(In reply to Ladi Prosek from comment #6)
> Thank you! The dump doesn't look related but having the machine is awesome.
> Ok if I keep it for another day or two?

Yes, you can~

Comment 8 Ladi Prosek 2016-10-13 12:45:52 UTC
I think I understand it now. tl;dr CR4 accesses are not properly intercepted in RHEL 6.


RHEL 6:
  vmcs_writel(CR4_GUEST_HOST_MASK, KVM_GUEST_CR4_MASK);
where
  #define KVM_GUEST_CR4_MASK						\
	(X86_CR4_VME | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_PGE | X86_CR4_VMXE \
		| X86_CR4_OSXSAVE | X86_CR4_PCIDE)

RHEL 7:
  vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
  ...
  vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);
where
  #define KVM_CR4_GUEST_OWNED_BITS				      \
	(X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR      \
	 | X86_CR4_OSXMMEXCPT | X86_CR4_TSD)


CR4_GUEST_HOST_MASK is supposed to be 1 where the bit is host-owned and VMEXIT on write is desired. In RHEL 6 X86_CR4_RDWRGSFS and X86_CR4_SMEP are *not* set so the write will go directly to the CPU and fail because it doesn't support the feature. This is in line with the lack of "cr_{read,write} 4" trace output we observed on the test host.
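To make the difference between the two masks concrete, here is a rough sketch that rebuilds both from the macros quoted above and checks which bits are host-owned. It is a simplified model (it ignores the CR4 read shadow), the bit positions are the architectural CR4 layout, and FSGSBASE stands for the bit the kernel source calls X86_CR4_RDWRGSFS.

```python
# Simplified model of CR4_GUEST_HOST_MASK on RHEL 6 vs RHEL 7.
# A bit set in the mask is host-owned, so a guest write that changes it
# is expected to cause a VM exit instead of reaching the hardware CR4.
CR4 = {name: 1 << bit for name, bit in {
    "VME": 0, "PVI": 1, "TSD": 2, "DE": 3, "PSE": 4, "PAE": 5, "MCE": 6,
    "PGE": 7, "PCE": 8, "OSFXSR": 9, "OSXMMEXCPT": 10, "VMXE": 13,
    "SMXE": 14, "FSGSBASE": 16, "PCIDE": 17, "OSXSAVE": 18, "SMEP": 20,
}.items()}
WIDTH_MASK = (1 << 64) - 1  # vmcs_writel writes a natural-width (64-bit) field


def bits(*names: str) -> int:
    """OR together the named CR4 bits."""
    value = 0
    for name in names:
        value |= CR4[name]
    return value


# RHEL 6: an explicit allow-list of host-owned bits (KVM_GUEST_CR4_MASK).
RHEL6_MASK = bits("VME", "PSE", "PAE", "PGE", "VMXE", "OSXSAVE", "PCIDE")
# RHEL 7: everything is host-owned except KVM_CR4_GUEST_OWNED_BITS.
RHEL7_MASK = ~bits("PVI", "DE", "PCE", "OSFXSR", "OSXMMEXCPT", "TSD") & WIDTH_MASK


def is_host_owned(mask: int, bit_name: str) -> bool:
    return bool(mask & CR4[bit_name])


for bit_name in ("FSGSBASE", "SMEP"):
    print(bit_name,
          "RHEL6 host-owned:", is_host_owned(RHEL6_MASK, bit_name),
          "RHEL7 host-owned:", is_host_owned(RHEL7_MASK, bit_name))
```

Under this model both bits are guest-owned on RHEL 6 and host-owned on RHEL 7, which is why only RHEL 7 gets a chance to reject the write before it reaches the CPU.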

Radim, can you please double-check this?

Comment 9 Radim Krčmář 2016-10-13 13:31:46 UTC
Your analysis seems correct.  The guest would get a #GP when setting the bit.
We ought to have backported 66865b5304ecaa7a817b63b625847b3f53a5de00 ("KVM: VMX: Make guest cr4 mask more conservative") when we first hit this problem, but we added two RHEL6-only workarounds instead ...

I have started a brew build with the fix (please ignore the brew name -- I noticed it now and it would take another ~25 minutes to start a build ...):
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=11902969

If you run it, the guest should fail with an internal emulation error when executing an instruction that needs fsgsbase, because KVM doesn't emulate them (not even upstream).

This bug looks like WONTFIX to me -- our software stack just didn't complain early about a user error and RHEL6 is too mature to fix that.

Btw. can this happen with RHEL6 libvirt in the middle?

Comment 10 Ladi Prosek 2016-10-13 17:21:52 UTC
(In reply to Radim Krčmář from comment #9)
> If you run it, the guest should fail with an internal emulation error when
> executing an instruction the needs fsgsbase, because KVM doesn't emulate
> them (not even upstream).

Here's what I get with the test kernel:

(qemu) KVM: entry failed, hardware error 0x80000021
kvm_run returned -22
rax 00000000000106b8 rbx 00000000b1193dfe rcx 0000000000000001 rdx 0000000000000000
rsi 0000000000000000 rdi 0000000000000000 rsp ffffd00020445b10 rbp ffffd00020445c10
r8  000000000000030a r9  ffffe000000e2ce0 r10 00000000c0010010 r11 fffff803071dcc01
r12 fffff80306293e10 r13 0000000000000002 r14 fffff803062a8f80 r15 0000000000000001
rip fffff8030776f2f3 rflags 00000246
cs 0010 (00000000/00000000 p 1 dpl 0 db 0 s 1 type b l 1 g 0 avl 0)
ds 002b (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
es 002b (00000000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
ss 0018 (00000000/ffffffff p 1 dpl 0 db 1 s 1 type 3 l 0 g 1 avl 0)
fs 0053 (00000000/00003c00 p 1 dpl 3 db 1 s 1 type 3 l 0 g 0 avl 0)
gs 002b (fffff80307371000/ffffffff p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
tr 0040 (fffff8030a759080/00000067 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/ffffffff p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt fffff8030a758000/7f
idt fffff8030a758080/fff
cr0 80050031 cr2 ffffc00000010000 cr3 210000 cr4 106b8 cr8 0 efer d01


> This bug looks like WONTFIX to me -- our software stack just didn't complain
> early about an user error and RHEL6 is too mature to fix that.
> 
> Btw. can this happen with RHEL6 libvirt in the middle?

Peixiu Hou, do the problematic cpu flags come from a libvirt-based deployment or is it just command line fuzzing?

Comment 11 Radim Krčmář 2016-10-13 18:16:24 UTC
> (In reply to Radim Krčmář from comment #9)
>> If you run it, the guest should fail with an internal emulation error when
>> executing an instruction the needs fsgsbase, because KVM doesn't emulate
>> them (not even upstream).
> 
> Here's what I get with the test kernel:
> 
> (qemu) KVM: entry failed, hardware error 0x80000021
> kvm_run returned -22

Ah, it results in a more interesting situation: VMX tries to set the guest cr4 in the hardware cr4 during guest entry, which fails because bit 16 is reserved.
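As a cross-check of the dump from the test kernel, the following sketch decodes the reported cr4 value and the hardware error code. The bit names follow the architectural CR4 layout, and the split of the error code into a failure flag (bit 31) and a basic exit reason (low 16 bits) follows the VMX exit-reason format; the function names are made up for illustration.

```python
# Decode "cr4 106b8" and "hardware error 0x80000021" from the QEMU output.
CR4_BIT_NAMES = {
    0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE", 6: "MCE",
    7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT", 13: "VMXE",
    14: "SMXE", 16: "FSGSBASE", 17: "PCIDE", 18: "OSXSAVE", 20: "SMEP",
}


def decode_cr4(value: int) -> list:
    """Return the names of the CR4 bits set in value, lowest bit first."""
    return [name for bit, name in sorted(CR4_BIT_NAMES.items()) if value & (1 << bit)]


def decode_entry_failure(code: int) -> tuple:
    """Split a VMX exit reason into (is VM-entry failure, basic exit reason)."""
    return bool(code & (1 << 31)), code & 0xFFFF


print(decode_cr4(0x106B8))
# prints ['DE', 'PSE', 'PAE', 'PGE', 'OSFXSR', 'OSXMMEXCPT', 'FSGSBASE']
print(decode_entry_failure(0x80000021))
# prints (True, 33)
```

Basic exit reason 33 is "VM-entry failure due to invalid guest state", and FSGSBASE (bit 16) is the only bit in the dumped cr4 that the host CPU does not implement.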

Upstream KVM still accepts any CPUID from userspace and we have only stopped QEMU from being silly with CPUID in upstream/RHEL7, so we can only backport from QEMU and/or libvirt.

Comment 12 Peixiu Hou 2016-10-14 02:53:51 UTC
(In reply to Ladi Prosek from comment #10)
> Peixiu Hou, do the problematic cpu flags come from a libvirt-based
> deployment or is it just command line fuzzing?

Hi Ladi,

I found this issue while trying to reproduce a customer's issue (https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/77). The problematic cpu flags are the ones the customer used. I'm not sure whether they come from a libvirt-based deployment, but when I used virt-manager to boot a guest, the cpu flag was just "-cpu SandyBridge".


Best Regards~
Peixiu Hou

Comment 13 Ladi Prosek 2016-10-14 06:53:44 UTC
(In reply to Peixiu Hou from comment #12)
> Hi Ladi,
> 
> I found this issue when I tried to reproduce a customer's
> issue(https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/77).
> The problematic cpu flags come from customer used. I'm not sure if it come
> from libvirt-based deployment, but I tried use virt-manager to boot a guest,
> the cpu flag just as "-cpu SandyBridge". 

Thanks! So based on this and Radim's assessment I'm closing this as WONTFIX. The workaround is simply to not use these flags with old host CPUs.
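One way to apply the workaround is to compare the "+flag" entries of a -cpu argument against the flags the host advertises in /proc/cpuinfo before starting the guest. The sketch below is only a rough pre-check under assumptions: the helper names and the sample cpuinfo text are made up, it ignores features implied by the CPU model name, and QEMU flag names do not always match the /proc/cpuinfo spelling (smep and fsgsbase do).

```python
# Rough pre-check: which explicitly added "+flag" entries are missing on the host?
def requested_additions(cpu_arg: str) -> list:
    """Return the flags added with '+' in a '-cpu MODEL,+a,-b,...' argument."""
    return [part[1:] for part in cpu_arg.split(",")[1:] if part.startswith("+")]


def host_flags(cpuinfo_text: str) -> set:
    """Extract the first 'flags' line from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def unsupported_additions(cpu_arg: str, cpuinfo_text: str) -> list:
    flags = host_flags(cpuinfo_text)
    return [flag for flag in requested_additions(cpu_arg) if flag not in flags]


# Hypothetical host without smep and fsgsbase (sample text, not real output).
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu vme de pse erms rdrand\n"
print(unsupported_additions("SandyBridge,-invpcid,+erms,-bmi2,+smep,+fsgsbase",
                            SAMPLE_CPUINFO))
# prints ['smep', 'fsgsbase']
```

On a real host the second argument would be the contents of /proc/cpuinfo. As noted in comment 6, adding "enforce" to the -cpu option makes QEMU itself report an error for features the host does not support.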

Comment 19 Peixiu Hou 2016-10-20 05:22:04 UTC
Created attachment 1212326 [details]
system report kernel panic when boot a rhel7.3 guest with cpu flag as "-cpu host,+smep"

Comment 20 Peixiu Hou 2016-10-20 05:24:10 UTC
Created attachment 1212327 [details]
rhel7.3 guest normal booted with cpu flag as "-cpu host,+fsgsbase"

Comment 21 Peixiu Hou 2016-10-20 05:32:42 UTC
Hi Ladi, Jiri,

I tested this issue with a Linux guest; details follow:

1. Booting a rhel7.3 guest with the cpu flag "-cpu host,+smep" failed: the system reported a kernel panic (see the attached picture linux_cpu_flags_smep.png).
2. Booting a rhel7.3 guest with the cpu flag "-cpu host,+fsgsbase" succeeded; the guest cpu info is shown in the attached picture limux_cpu_flag_fsgsbase.png.

It seems that the Doc Text needs to differentiate between Linux and Windows guests.

Thanks~
Peixiu Hou

