Bug 1421583

Summary: Windows 2016 fails to boot with cpu model SandyBridge and above on Skylake host
Product: Red Hat Enterprise Linux 6 Reporter: Guo, Zhiyi <zhguo>
Component: kernel Assignee: Radim Krčmář <rkrcmar>
kernel sub component: KVM QA Contact: Guo, Zhiyi <zhguo>
Status: CLOSED WONTFIX Docs Contact:
Severity: high    
Priority: unspecified CC: ailan, areis, bdas, chayang, juzhang, knoel, lijin, michen, mkenneth, ngu, rbalakri, rkrcmar, virt-maint, vrozenfe, zhguo
Version: 6.9   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-12-06 10:51:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
windows 2016 kernel dump log none

Description Guo, Zhiyi 2017-02-13 07:43:20 UTC
Description of problem:
Windows 2016 fails to boot with CPU model SandyBridge and above on a Skylake host.

Version-Release number of selected component (if applicable):
kernel version: 2.6.32-691.el6.x86_64
qemu: qemu-kvm-0.12.1.2-2.501.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a Windows 2016 guest with the following command line:
/usr/libexec/qemu-kvm -m 8G \
        -cpu SandyBridge,check \
        -smp 4,threads=2,cores=2 \
        -vnc :0 \
        -vga std \
        -drive file=/home/win2016.qcow2,id=windows-drive,if=none,format=qcow2,cache=none,werror=stop,rerror=stop \
        -device ide-drive,id=windows-drive,drive=windows-drive,bootindex=1 \
        -cdrom en_windows_server_2016_x64_dvd_9327751.iso \
        -boot menu=on \
        -monitor stdio \
        -netdev tap,id=idinWyYp,vhost=on -device e1000,mac=42:ce:a9:d2:4d:d7,id=idlbq7eA,netdev=idinWyYp

Actual results:
The Windows 2016 guest cannot boot and stops with the stop code KMODE EXCEPTION NOT HANDLED.

Expected results:
The Windows 2016 guest boots normally.

Additional info:
host info:
lscpu:
CPU family:            6
Model:                 94
Model name:            Intel(R) Xeon(R) CPU E3-1280 v5 @ 3.70GHz
Stepping:              3
CPU MHz:               800.000
BogoMIPS:              7392.06
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7

cpuinfo:
processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 94
model name	: Intel(R) Xeon(R) CPU E3-1280 v5 @ 3.70GHz
stepping	: 3
microcode	: 158
cpu MHz		: 800.000
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 1
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb xsaveopt pln pts dtherm hwp hwp_noitfy hwp_act_window hwp_epp tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx
bogomips	: 7392.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

The same failure occurs when booting the guest with CPU model Broadwell, Haswell, SandyBridge, or host. In these cases Windows cannot generate a kernel dump log when it hits the error.

With -cpu SandyBridge,-avx,check the Windows 2016 guest still fails to boot, but it can generate a kernel dump log. The kernel dump log is attached for analysis.

The Windows 2016 guest boots without error when using CPU model Westmere. It also boots without error with -cpu Westmere,+avx,check.
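
The CPU model variations listed above can be generated from a small script. This is only a minimal sketch: it reuses the qemu-kvm path, disk image, and options from the reproduction step, but drops the -cdrom, -boot, and network options for brevity. The helper name build_qemu_argv and the CPU_SPECS list are illustrative and not part of the original report. The script only prints the command lines; it does not start any guest.

```python
import shlex

QEMU_BIN = "/usr/libexec/qemu-kvm"   # path taken from the reproduction step
DISK_IMAGE = "/home/win2016.qcow2"   # image path taken from the reproduction step

# -cpu values mentioned in this report, in the order they were discussed.
CPU_SPECS = [
    "SandyBridge,check",
    "SandyBridge,-avx,check",
    "Haswell",
    "Broadwell",
    "host",
    "Westmere",
    "Westmere,+avx,check",
]


def build_qemu_argv(cpu_spec):
    """Return a qemu-kvm argument list for one -cpu value (hypothetical helper)."""
    drive = (
        "file=%s,id=windows-drive,if=none,format=qcow2,"
        "cache=none,werror=stop,rerror=stop" % DISK_IMAGE
    )
    return [
        QEMU_BIN,
        "-m", "8G",
        "-cpu", cpu_spec,
        "-smp", "4,threads=2,cores=2",
        "-vnc", ":0",
        "-vga", "std",
        "-drive", drive,
        "-device", "ide-drive,id=windows-drive,drive=windows-drive,bootindex=1",
        "-monitor", "stdio",
    ]


if __name__ == "__main__":
    # Print one shell-quoted command line per CPU specification.
    for spec in CPU_SPECS:
        print(" ".join(shlex.quote(arg) for arg in build_qemu_argv(spec)))
```

Each printed line can be run by hand on the Skylake host to repeat one of the observations above.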

Comment 1 Guo, Zhiyi 2017-02-13 07:48:17 UTC
Created attachment 1249775
windows 2016 kernel dump log

The issue reproduces only on a Skylake host. It cannot be reproduced on a Haswell host or a Broadwell host.

Comment 2 Guo, Zhiyi 2017-02-13 07:49:25 UTC
The issue also cannot be reproduced with a Windows 2012 R2 guest or a Windows 10 guest on the same Skylake host.

Comment 12 Radim Krčmář 2017-02-20 20:02:07 UTC
Changing the component to backport b65d6e17fe22 ("kvm: x86: mask out XSAVES").

No RHEL6 QEMU CPU models have XSAVES, so a proper CPU feature description could get rid of most problems, but a kernel patch is still needed because QEMU would pass all XSAVE features with "-cpu host" and therefore fail on Skylake.
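
To illustrate what "masking out XSAVES" means at the CPUID level, here is a minimal sketch. It assumes the fix amounts to clearing the XSAVES bit in CPUID.(EAX=0DH, ECX=1):EAX before the guest sees it, as the commit title suggests; it is not the kernel's actual code. The bit positions follow the Intel SDM, while the function names and the example value 0xF are illustrative assumptions.

```python
# Feature bits reported in CPUID.(EAX=0DH, ECX=1):EAX (Intel SDM).
XSAVE_SUBFEATURES = (
    ("xsaveopt", 1 << 0),
    ("xsavec", 1 << 1),
    ("xgetbv1", 1 << 2),  # XGETBV with ECX=1
    ("xsaves", 1 << 3),
)
XSAVES_BIT = 1 << 3


def decode(eax):
    """Return the names of the XSAVE sub-features set in eax."""
    return [name for name, bit in XSAVE_SUBFEATURES if eax & bit]


def mask_out_xsaves(host_eax):
    """Clear the XSAVES bit so the guest does not see the feature."""
    return host_eax & ~XSAVES_BIT


if __name__ == "__main__":
    host_eax = 0xF  # illustrative value: all four sub-features set
    guest_eax = mask_out_xsaves(host_eax)
    print("host :", decode(host_eax))
    print("guest:", decode(guest_eax))
```

With the illustrative host value, the guest-visible list keeps xsaveopt, xsavec, and xgetbv1 and drops xsaves, which is the effect "-cpu host" would need on a Skylake host.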

Comment 14 Jan Kurik 2017-12-06 10:51:07 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/