Description of problem:
windows2k8-64 BSOD when booting with -cpu SandyBridge

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.267.el6ev.x86_64
2.6.32-221.el6.x86_64
seabios-0.6.1.2-15.el6.x86_64

How reproducible:
every time

Steps to Reproduce:
1. Boot windows2k8-64 guest with cpu model SandyBridge:

/usr/libexec/qemu-kvm -M rhel6.3.0 -enable-kvm -m 1G -smp 2 \
  -rtc base=localtime,clock=host,driftfix=slew \
  -drive file=/root/win2008-64-virtio.qcow2,if=none,id=virtio0,format=qcow2,cache=none \
  -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=virtio0,id=virtio0-device,bootindex=0 \
  -netdev tap,id=hostnet0 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=20:54:00:6a:c7:d8,bus=pci.0,addr=0x3,bootindex=1 \
  -usb -device usb-tablet \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
  -boot menu=on -monitor stdio -vga cirrus -vnc :10 \
  -cpu SandyBridge

Actual results:
Windows BSOD immediately after launching the qemu command line. Screen shot is attached. No dump file in guest.

Expected results:

Additional info:
1. Host cpu info:

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Xeon(R) CPU E31280 @ 3.50GHz
stepping        : 7
cpu MHz         : 1600.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 6983.24
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

2. Tried win2k3-64 with the same command line; it boots successfully.
Created attachment 573910 [details] screen shot of BSOD
Are you running on a sandyBridge Host?
Can you re-try w/ -cpu sandyBridge,enforce
Can you re-try w/ -cpu sandyBridge,-xsave
(In reply to comment #3)
> Are you running on a sandyBridge Host?
yes, it is a SandyBridge host

> Can you re-try w/ -cpu sandyBridge,enforce
Tried, the same BSOD

> Can you re-try w/ -cpu sandyBridge,-xsave
Tried, the same BSOD
So, exception code is 0xC0000096 STATUS_PRIVILEGED_INSTRUCTION.

This is where the crash happens:

0xfffff8000a8450d0: push %rsp
0xfffff8000a8450d1: and $0x24,%al
0xfffff8000a8450d3: jae 0xfffff8000a8450d9
0xfffff8000a8450d5: movb $0x0,-0x1(%rcx)
0xfffff8000a8450d9: add $0x38,%rcx
0xfffff8000a8450dd: sub $0x1,%r8
0xfffff8000a8450e1: jne 0xfffff8000a8450bc
0xfffff8000a8450e3: jmp 0xfffff8000a8450fd
0xfffff8000a8450e5: xor %ecx,%ecx
0xfffff8000a8450e7: callq *-0x15bf5(%rip) # 0xfffff8000a82f4f8
0xfffff8000a8450ed: xor %edx,%edx
0xfffff8000a8450ef: lea 0x20(%rsp),%r8
0xfffff8000a8450f4: lea 0xa(%rdx),%ecx
0xfffff8000a8450f7: callq *-0x15c05(%rip) # 0xfffff8000a82f4f8
0xfffff8000a8450fd: mov -0xbe8c(%rip),%r9d # 0xfffff8000a839278
0xfffff8000a845104: xor %r8d,%r8d
0xfffff8000a845107: test %r9d,%r9d
0xfffff8000a84510a: je 0xfffff8000a84512e

This looks like the CPUID checking code. 0xA (set on %ecx) is probably the CPUID leaf being checked. I will assume that -0xbe8c(%rip) is where the CPUID EAX result is written. This is the content of the memory at that address:

fffff8000a839278: 0x00000004 0x00000003 0x00000000 0x00000000
fffff8000a839288: 0x00000000 0x00000000 0x00000000 0x00000000

I don't know if this is the value seen by that code, because I am looking at the memory _after_ Windows already crashed.

0xfffff8000a84510c: xor %edx,%edx
0xfffff8000a84510e: lea 0x186(%r8),%ecx
0xfffff8000a845115: shr $0x20,%rdx
0xfffff8000a845119: xor %eax,%eax
0xfffff8000a84511b: wrmsr

Here it's trying to write to MSR 0x186 (PerfEvtSel0). It is available only if CPUID.0AH:EAX[15:8] > 0, but leaf 0xA _is_ available on the rhel6.3.0 machine-type.

Now we have to check why/if KVM is raising an exception when the guest tries to write to that MSR.
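For readers following the analysis above, the CPUID.0AH:EAX check can be sketched roughly as below. This is a minimal Python illustration, not code taken from Windows or KVM. The bit layout of EAX follows the Intel SDM description of leaf 0AH; the function names and the sample value 0x07300403 used in the usage note are assumptions made for illustration and were not read from this guest.

```python
IA32_PERFEVTSEL0 = 0x186  # first performance event select MSR


def decode_cpuid_0a_eax(eax):
    """Split CPUID.0AH:EAX into its architectural PMU fields."""
    return {
        "version_id": eax & 0xFF,                 # EAX[7:0]
        "num_gp_counters": (eax >> 8) & 0xFF,     # EAX[15:8]
        "gp_counter_width": (eax >> 16) & 0xFF,   # EAX[23:16]
        "ebx_vector_length": (eax >> 24) & 0xFF,  # EAX[31:24]
    }


def perfevtsel_msrs(eax):
    """MSR addresses a guest may write if it trusts leaf 0xA (hypothetical helper)."""
    count = decode_cpuid_0a_eax(eax)["num_gp_counters"]
    return [IA32_PERFEVTSEL0 + i for i in range(count)]
```

With the illustrative value 0x07300403, `perfevtsel_msrs` yields four addresses starting at 0x186; with EAX equal to zero it yields none, so a guest following this logic would never reach the `wrmsr`.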
I just tested using -M rhel6.2.0 (that doesn't have the CPU monitoring leaf available), and it works as expected. It also boots if using -M rhel6.3.0 -cpu SandyBridge,level=9, to disable the CPUID 0xA leaf. We can't set level=9 on SandyBridge, though, as leaf 0xD is necessary for XSAVE. Gleb, what do you think? Should we aim to get vPMU working smoothly on SandyBridge, or should we disable PMU on SandyBridge to avoid risk?
Note that this bug affects Westmere too (-M rhel6.3.0 -cpu Westmere), as it has level=11.
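The effect of the level= setting discussed here can be sketched as follows. This is an illustrative Python snippet under the assumption that a basic CPUID leaf is exposed to the guest only when its number is at or below the configured level (the maximum basic leaf reported in CPUID.0:EAX); the function name and the default list of leaves are hypothetical, not part of qemu-kvm.

```python
def visible_basic_leaves(level, leaves_of_interest=(0xA, 0xB, 0xD)):
    """Return the leaves of interest that a guest can query at this CPUID level."""
    return [leaf for leaf in leaves_of_interest if leaf <= level]


# level=9 hides the PMU leaf 0xA, but also leaf 0xD, which XSAVE needs.
hidden_with_level_9 = visible_basic_leaves(9)
# level=11 (Westmere) still exposes leaf 0xA.
westmere = visible_basic_leaves(11)
```

This is why lowering the level works as a test but not as a fix for SandyBridge: any level low enough to hide leaf 0xA also hides leaf 0xD.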
(In reply to comment #8)
> I just tested using -M rhel6.2.0 (that doesn't have the CPU monitoring leaf
> available), and it works as expected.
>
> It also boots if using -M rhel6.3.0 -cpu SandyBridge,level=9, to disable the
> CPUID 0xA leaf.
>
> We can't set level=9 on SandyBridge, though, as leaf 0xD is necessary for
> XSAVE.
>
> Gleb, what do you think? Should we aim to get vPMU working smoothly on
> SandyBridge, or should we disable PMU on SandyBridge to avoid risk?

From comment #1 the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in kernel-2.6.32-245.el6. The configuration is not valid. It is not rhel6.3. But we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA should return zeroes in all registers; if it does not, that is the bug that should be fixed.
(In reply to comment #10)
> From comment #1 the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in
> kernel-2.6.32-245.el6. The configuration is not valid. It is not rhel6.3. But
> we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA
> should return zeroes in all registers; if it does not, that is the bug that
> should be fixed.

Please retest with current packages.
Looking at the kernel-2.6.32-221.el6 source code, it appears that KVM get_supported_cpuid() incorrectly returns the host CPUID bits completely unmodified on leaf 0xA. I suppose the 6.2.0 kernel (-220.el6) also does that.

So this is a bug in the 6.2 kernel that will cause issues if using the 6.3 qemu-kvm binary. Is it a bug we will want to fix on 6.2.z, or is it a use case we don't support?
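The behaviour requested earlier in the thread (leaf 0xA returning zeroes in all registers when the kernel has no vPMU) can be sketched as below. This is a hypothetical Python model of the filtering step, not the actual KVM get_supported_cpuid() code; the function name, the `vpmu_supported` flag, and the register tuple layout are assumptions made for illustration.

```python
PMU_LEAF = 0xA


def filter_supported_cpuid(leaf, host_regs, vpmu_supported):
    """Model of a supported-CPUID filter (hypothetical).

    host_regs is an (eax, ebx, ecx, edx) tuple as read from the host CPU.
    """
    if leaf == PMU_LEAF and not vpmu_supported:
        # No virtual PMU: advertise nothing, so the guest sees zero counters.
        return (0, 0, 0, 0)
    # Other leaves are passed through unchanged in this simplified model.
    return host_regs
```

The buggy behaviour described above corresponds to the pass-through branch being taken for leaf 0xA even though the kernel cannot emulate the PMU MSRs.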
(In reply to comment #11)
> (In reply to comment #10)
> > From comment #1 the kernel is 2.6.32-221.el6.x86_64. vPMU was introduced in
> > kernel-2.6.32-245.el6. The configuration is not valid. It is not rhel6.3. But
> > we shouldn't return garbage in leaf 0xA regardless. In that setup leaf 0xA
> > should return zeroes in all registers; if it does not, that is the bug that
> > should be fixed.
>
> Please retest with current packages.

Tried with 2.6.32-264.el6.x86_64; the Windows guest boots successfully.
I believe running RHEL6.3's qemu with the RHEL6.2 kernel is not supported, so I'm closing this bug. Please reopen if I'm wrong.