Bug 808346

Summary: windows2k8-64 BSOD on boot with -cpu SandyBridge or Westmere & vPMU enabled
Product: Red Hat Enterprise Linux 6
Reporter: Miya Chen <michen>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED CURRENTRELEASE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Docs Contact:
Priority: high
Version: 6.2
CC: acathrow, areis, bsarathy, dyasny, gleb, juzhang, mkenneth, rhod, shuang, tburke, virt-maint, vrozenfe, yvugenfi
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2012-04-12 18:36:04 UTC
Attachments:
screen shot of BSOD (flags: none)

Description Miya Chen 2012-03-30 08:04:56 UTC
Description of problem:
A windows2k8-64 guest hits a BSOD when booted with -cpu SandyBridge.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.267.el6ev.x86_64
2.6.32-221.el6.x86_64
seabios-0.6.1.2-15.el6.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Boot windows2k8-64 guest with cpu model - SandyBridge
/usr/libexec/qemu-kvm -M rhel6.3.0 -enable-kvm -m 1G -smp 2 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -drive file=/root/win2008-64-virtio.qcow2,if=none,id=virtio0,format=qcow2,cache=none \
    -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=virtio0,id=virtio0-device,bootindex=0 \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=20:54:00:6a:c7:d8,bus=pci.0,addr=0x3,bootindex=1 \
    -usb -device usb-tablet \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
    -boot menu=on -monitor stdio -vga cirrus -vnc :10 \
    -cpu SandyBridge

  
Actual results:
Windows hits a BSOD immediately after the qemu command line is launched.
A screen shot is attached.
No dump file is produced in the guest.

Expected results:


Additional info:
1. Host cpu info:
processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Xeon(R) CPU E31280 @ 3.50GHz
stepping	: 7
cpu MHz		: 1600.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 6983.24
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

2. A win2k3-64 guest boots successfully with the same command line.

Comment 1 Miya Chen 2012-03-30 08:06:17 UTC
Created attachment 573910 [details]
screen shot of BSOD

Comment 3 Dor Laor 2012-04-01 12:13:46 UTC
Are you running on a SandyBridge host?
Can you retry with -cpu SandyBridge,enforce?
Can you retry with -cpu SandyBridge,-xsave?

Comment 4 Miya Chen 2012-04-02 09:09:22 UTC
(In reply to comment #3)
> Are you running on a SandyBridge host?
Yes, it is a SandyBridge host.

> Can you retry with -cpu SandyBridge,enforce?
Tried; same BSOD.

> Can you retry with -cpu SandyBridge,-xsave?
Tried; same BSOD.

Comment 7 Eduardo Habkost 2012-04-10 20:29:08 UTC
So, the exception code is 0xC0000096 (STATUS_PRIVILEGED_INSTRUCTION).
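For readers unfamiliar with NTSTATUS values, the code above can be split into its bit fields. This is a minimal sketch based on the general NTSTATUS layout (severity in bits 31:30, customer bit 29, facility in bits 27:16, code in bits 15:0); the helper name decode_ntstatus is made up for this illustration and is not part of any Windows API.

```python
def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": (status >> 30) & 0x3,    # 0=success, 1=informational, 2=warning, 3=error
        "customer": (status >> 29) & 0x1,    # 0 for Microsoft-defined codes
        "facility": (status >> 16) & 0xFFF,  # bits 27:16
        "code": status & 0xFFFF,             # bits 15:0
    }


fields = decode_ntstatus(0xC0000096)
print(fields)
```

For 0xC0000096 this gives severity 3 (error), facility 0 and code 0x96.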

This is where the crash happens:

0xfffff8000a8450d0:  push   %rsp
0xfffff8000a8450d1:  and    $0x24,%al
0xfffff8000a8450d3:  jae    0xfffff8000a8450d9
0xfffff8000a8450d5:  movb   $0x0,-0x1(%rcx)
0xfffff8000a8450d9:  add    $0x38,%rcx
0xfffff8000a8450dd:  sub    $0x1,%r8
0xfffff8000a8450e1:  jne    0xfffff8000a8450bc
0xfffff8000a8450e3:  jmp    0xfffff8000a8450fd
0xfffff8000a8450e5:  xor    %ecx,%ecx
0xfffff8000a8450e7:  callq  *-0x15bf5(%rip)        # 0xfffff8000a82f4f8
0xfffff8000a8450ed:  xor    %edx,%edx
0xfffff8000a8450ef:  lea    0x20(%rsp),%r8
0xfffff8000a8450f4:  lea    0xa(%rdx),%ecx
0xfffff8000a8450f7:  callq  *-0x15c05(%rip)        # 0xfffff8000a82f4f8
0xfffff8000a8450fd:  mov    -0xbe8c(%rip),%r9d        # 0xfffff8000a839278
0xfffff8000a845104:  xor    %r8d,%r8d
0xfffff8000a845107:  test   %r9d,%r9d
0xfffff8000a84510a:  je     0xfffff8000a84512e

This looks like the CPUID checking code. 0xA (set in %ecx) is probably the CPUID leaf being checked. I will assume that -0xbe8c(%rip) is where the CPUID EAX result is written.
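The absolute addresses shown in the disassembler's comments can be re-derived by hand. The sketch below assumes the usual x86-64 rule that a RIP-relative operand is resolved against the address of the next instruction; the helper name rip_relative_target is hypothetical, and each instruction length is taken as the gap to the next address in the listing.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def rip_relative_target(insn_addr, insn_len, disp):
    """Effective address = address of the next instruction + signed displacement."""
    return (insn_addr + insn_len + disp) & MASK64


# (address, length in bytes, displacement) read off the listing above.
mov_target = rip_relative_target(0xfffff8000a8450fd, 7, -0xbe8c)
call1_target = rip_relative_target(0xfffff8000a8450e7, 6, -0x15bf5)
call2_target = rip_relative_target(0xfffff8000a8450f7, 6, -0x15c05)

print(hex(mov_target))    # 0xfffff8000a839278
print(hex(call1_target))  # 0xfffff8000a82f4f8
print(hex(call2_target))  # 0xfffff8000a82f4f8
```

Both indirect calls resolve to the same pointer slot, and the mov reads from the address whose contents are dumped in this comment.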

This is the content of the memory at that address:
fffff8000a839278: 0x00000004 0x00000003 0x00000000 0x00000000
fffff8000a839288: 0x00000000 0x00000000 0x00000000 0x00000000

I don't know if this is the value seen by that code, because I am looking at the memory _after_ Windows already crashed.

0xfffff8000a84510c:  xor    %edx,%edx
0xfffff8000a84510e:  lea    0x186(%r8),%ecx
0xfffff8000a845115:  shr    $0x20,%rdx
0xfffff8000a845119:  xor    %eax,%eax
0xfffff8000a84511b:  wrmsr  

Here it's trying to write to MSR 0x186 (PerfEvtSel0). It is available only if CPUID.0AH:EAX[15:8] > 0, but leaf 0xA _is_ available on the rhel6.3.0 machine-type. Now we have to check whether, and why, KVM raises an exception when the guest tries to write to that MSR.
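To make the CPUID.0AH:EAX[15:8] condition concrete, here is a small sketch that decodes the EAX fields of leaf 0xA. The function names are hypothetical, and the sample value 0x07300403 is an illustrative encoding (version 3, 4 counters, 48-bit width), not a value captured from this guest.

```python
def decode_cpuid_0a_eax(eax):
    """Decode the EAX fields of CPUID leaf 0xA (architectural performance monitoring)."""
    return {
        "version_id": eax & 0xFF,                 # bits 7:0
        "num_gp_counters": (eax >> 8) & 0xFF,     # bits 15:8
        "counter_width": (eax >> 16) & 0xFF,      # bits 23:16
        "ebx_vector_length": (eax >> 24) & 0xFF,  # bits 31:24
    }


def perfevtsel0_expected(eax):
    """A guest may expect MSR 0x186 to exist only if at least one counter is reported."""
    return decode_cpuid_0a_eax(eax)["num_gp_counters"] > 0


sample = decode_cpuid_0a_eax(0x07300403)  # illustrative value, see note above
print(sample)
print(perfevtsel0_expected(0x07300403))   # True
print(perfevtsel0_expected(0x00000000))   # False: an all-zero leaf advertises no counters
```

A guest that sees a non-zero counter count has every reason to touch MSR 0x186, so the hypervisor must either back that MSR or report zero counters.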

Comment 8 Eduardo Habkost 2012-04-10 20:38:31 UTC
I just tested using -M rhel6.2.0 (that doesn't have the CPU monitoring leaf available), and it works as expected.

It also boots if using -M rhel6.3.0 -cpu SandyBridge,level=9, to disable the CPUID 0xA leaf.

We can't set level=9 on SandyBridge, though, as leaf 0xD is necessary for XSAVE.

Gleb, what do you think? Should we aim to get vPMU working smoothly on SandyBridge, or should we disable PMU on SandyBridge to avoid risk?

Comment 9 Eduardo Habkost 2012-04-10 20:41:35 UTC
Note that this bug affects Westmere too (-M rhel6.3.0 -cpu Westmere), as it has level=11.
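The effect of the level= property discussed in these comments can be modelled in a few lines: a basic CPUID leaf is only advertised to the guest when its number does not exceed the level. This is a simplified sketch; leaf_visible is a hypothetical helper, and the SandyBridge level of 13 is an assumption here (it matches the host's "cpuid level" in the description), while level=9 and Westmere's level=11 come from the comments above.

```python
PMU_LEAF = 0xA     # architectural performance monitoring
XSAVE_LEAF = 0xD   # processor extended state enumeration (XSAVE)


def leaf_visible(level, leaf):
    """A basic CPUID leaf is advertised only if it does not exceed the CPUID level."""
    return leaf <= level


# level=9 is the workaround from comment 8, 11 is Westmere per comment 9,
# 13 is an assumed SandyBridge default.
for name, level in [("SandyBridge,level=9", 9), ("Westmere", 11), ("SandyBridge", 13)]:
    print(name, "PMU leaf:", leaf_visible(level, PMU_LEAF),
          "XSAVE leaf:", leaf_visible(level, XSAVE_LEAF))
```

With level=9 the PMU leaf disappears, but so does leaf 0xD, which is why the workaround cannot become the SandyBridge default.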

Comment 10 Gleb Natapov 2012-04-11 09:01:13 UTC
(In reply to comment #8)
> I just tested using -M rhel6.2.0 (that doesn't have the CPU monitoring leaf
> available), and it works as expected.
> 
> It also boots if using -M rhel6.3.0 -cpu SandyBridge,level=9, to disable the
> CPUID 0xA leaf.
> 
> We can't set level=9 on SandyBridge, though, as leaf 0xD is necessary for
> XSAVE.
> 
> Gleb, what do you think? Should we aim to get vPMU working smoothly on
> SandyBridge, or should we disable PMU on SandyBridge to avoid risk?

From comment #1, the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in kernel-2.6.32-245.el6, so the configuration is not valid: it is not RHEL 6.3. But
we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA should return zeroes in all registers; if it does not, that is the bug that should be fixed.
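The version argument above amounts to comparing RHEL 6 kernel build numbers. A minimal sketch follows, assuming the release string has the form 2.6.32-N.el6; the helper names are hypothetical and z-stream releases with extra numeric components are not handled.

```python
import re

VPMU_FIRST_BUILD = 245  # per the comment above: introduced in kernel-2.6.32-245.el6


def el6_build_number(release):
    """Extract N from a release string such as '2.6.32-221.el6.x86_64'."""
    match = re.search(r"2\.6\.32-(\d+)", release)
    if match is None:
        raise ValueError("not a RHEL 6 2.6.32 kernel release: %r" % release)
    return int(match.group(1))


def kernel_has_vpmu(release):
    """True if the build is at or after the one said to introduce vPMU."""
    return el6_build_number(release) >= VPMU_FIRST_BUILD


print(kernel_has_vpmu("2.6.32-221.el6.x86_64"))  # False: the kernel from the report
print(kernel_has_vpmu("kernel-2.6.32-245.el6"))  # True
```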

Comment 11 Ademar Reis 2012-04-11 13:20:54 UTC
(In reply to comment #10)
> From comment #1, the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in
> kernel-2.6.32-245.el6, so the configuration is not valid: it is not RHEL 6.3. But
> we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA
> should return zeroes in all registers; if it does not, that is the bug that
> should be fixed.

Please retest with current packages.

Comment 12 Eduardo Habkost 2012-04-11 15:59:28 UTC
By looking at the kernel-2.6.32-221.el6 source code, it looks like KVM get_supported_cpuid() incorrectly returns the host CPU CPUID bits completely unmodified on leaf 0xA. I suppose the 6.2.0 kernel (-220.el6) also does that.

So this is a bug in the 6.2 kernel that will cause issues when using the 6.3 qemu-kvm binary. Is it a bug we want to fix in 6.2.z, or is it a use case we don't support?
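The behaviour asked for in comment 10 can be sketched as a filter on the leaf 0xA registers. This is an illustrative model only, not the kernel's get_supported_cpuid() code: filter_leaf_0a is a hypothetical name, the host register values are made up, and passing the host values through in the supported case is a simplification of whatever a vPMU-capable kernel actually reports.

```python
def filter_leaf_0a(host_regs, vpmu_supported):
    """Return the (eax, ebx, ecx, edx) a guest should see for CPUID leaf 0xA.

    Without vPMU support every register is zeroed, so the guest sees no
    performance counters and has no reason to touch the PMU MSRs.
    """
    if not vpmu_supported:
        return (0, 0, 0, 0)
    return host_regs  # simplification, see the note above


host_leaf_0a = (0x07300403, 0x00000000, 0x00000000, 0x00000603)  # made-up host values

print(filter_leaf_0a(host_leaf_0a, vpmu_supported=False))  # (0, 0, 0, 0)
print(filter_leaf_0a(host_leaf_0a, vpmu_supported=True))
```

The failure described in this comment corresponds to the pass-through branch being taken on a kernel that cannot back the advertised counters.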

Comment 14 Miya Chen 2012-04-12 10:01:02 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > From comment #1, the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in
> > kernel-2.6.32-245.el6, so the configuration is not valid: it is not RHEL 6.3. But
> > we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA
> > should return zeroes in all registers; if it does not, that is the bug that
> > should be fixed.
> 
> Please retest with current packages.

Tried with 2.6.32-264.el6.x86_64; the Windows guest boots successfully.

Comment 15 Ademar Reis 2012-04-12 18:36:04 UTC
I believe running RHEL6.3's qemu with the RHEL6.2 kernel is not supported, so I'm closing this bug. Please reopen if I'm wrong.