Bug 1346153

Summary: [server] KVM guest Windows server 2016 can not boot up in RHEL 6.8
Product: Red Hat Enterprise Linux 6 Reporter: Alpus Chen <achen35>
Component: qemu-kvm Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact: Yehuda Zimmerman <yzimmerm>
Priority: high    
Version: 6.8 CC: achen35, ailan, bdas, chayang, coli, dbayly, dhsia, hannan.nozari, jherrman, jkachuck, jsuchane, juzhang, juzou, kli2, kshieh, lijin, michen, mkenneth, phrdina, skinjo, virt-bugs, virt-maint, vrozenfe, wyu
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
Limited CPU support for Windows 10 and Windows Server 2016 guests

On a Red Hat Enterprise 6 host, Windows 10 and Windows Server 2016 guests can only be created when using the following CPU models:

* the Intel Xeon E series
* the Intel Xeon E7 family
* Intel Xeon v2, v3, and v4
* Opteron G2, G3, G4, G5, and G6

For these CPU models, also make sure to set the CPU model of the guest to match the CPU model detected by running the "virsh capabilities" command on the host. Using the application default or hypervisor default prevents the guests from booting properly.

To be able to use Windows 10 guests on Legacy Intel Core 2 processors (also known as Penryn) or Intel Xeon 55xx and 75xx processor families (also known as Nehalem), add the following flag to the Domain XML file, with either Penryn or Nehalem as MODELNAME:

   <cpu mode='custom' match='exact'>
     <model>MODELNAME</model>
     <feature name='erms' policy='require'/>
   </cpu>

Other CPU models are not supported, and both Windows 10 guests and Windows Server 2016 guests created on them are likely to become unresponsive during the boot process.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-08-26 10:28:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1313865, 1359965    
Attachments:
Description Flags
guest vm Windows server 2016 hangs none
BSOD when adding kvm module parameter "ignore_msrs=1" none
dmesg log none
sosreport none
sosreport log from Intel Sandy-Bridge system none

Description Alpus Chen 2016-06-14 06:37:54 UTC
Created attachment 1167770 [details]
guest vm Windows server 2016 hangs

Description of problem:

While trying to install a KVM guest "Windows Server 2016" from an ISO image, the installation stops at the Windows logo screen, and the message "kvm: 8580: cpu0 unhandled rdmsr: 0x3a" appears in dmesg when the guest VM powers on.

I also tried adding the kvm module parameter ignore_msrs=1, but then a BSOD occurs.

I searched the web but could not find a RHEL 6.8 KVM guest compatibility guide. Is Windows Server 2016 a supported guest OS on RHEL 6.8 KVM?


Version-Release number of selected component (if applicable):
Linux localhost.localdomain 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Alpus Chen 2016-06-14 06:39:32 UTC
Created attachment 1167783 [details]
BSOD when adding kvm module parameter "ignore_msrs=1"

Comment 2 Alpus Chen 2016-06-14 06:44:35 UTC
Created attachment 1167786 [details]
dmesg log

Comment 4 Joseph Kachuck 2016-06-14 14:40:05 UTC
Hello Lenovo,
Please attach a sosreport from a host system directly after seeing this issue.

Please confirm if you have been able to recreate this issue on more than one physical system.

Thank You
Joe Kachuck

Comment 5 Alpus Chen 2016-06-15 03:49:16 UTC
Created attachment 1168151 [details]
sosreport

Comment 6 Alpus Chen 2016-06-15 03:55:50 UTC
(In reply to Joseph Kachuck from comment #4)
> Hello Lenovo,
> Please attach a sosreport from a host system directly after seeing this
> issue.

  sosreport attached.

> Please confirm if you have been able to recreate this issue on more than one
> physical system.

  The issue is seen on two systems.

Comment 8 juzhang 2016-06-16 13:22:13 UTC
Hi Wyu,

Could you handle this issue?

Best Regards,
Junyi

Comment 10 Amnon Ilan 2016-06-20 11:01:44 UTC
Hi Alpus, 

Please note the following CPU model limitations for Win10 on RHEL6.8:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html
Limited CPU support for Windows 10 guests

WS2016 is similar. 
What is the CPU model you are using?

And one more question:
Are you trying to run it in a nested environment?

Comment 11 Bandan Das 2016-06-20 17:30:30 UTC
(In reply to wangyu from comment #9)
> (In reply to Amnon Ilan from comment #7)
> > Hi QE, 
> > Can you try reproducing it? I guess WS2016 should have the same CPU
> > limitations as Win10 on 6.8.
> > Thanks,
> > Amnon
> 
> Hi Amnon
> 
> QE has tested WS2016 with different cpu models on a rhel6.8 host. It has almost
> the same CPU limitations as win10-64 on RHEL6.7.z.

Do you see this message on the host ?
cpu0 ignored rdmsr: 0x3a

I just want to confirm whether the feature_control msr read failures are harmless.
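
To check a host for the messages discussed here, the kernel log can be filtered for rdmsr lines. This is only a minimal sketch: it assumes the messages follow the format quoted in this report, and rdmsr_lines is a hypothetical helper name, not part of any KVM tooling. It reads log text from stdin so that it can be fed from dmesg.

```shell
#!/bin/sh
# Hypothetical helper: print "<state> <msr>" for each KVM rdmsr message on stdin.
# Assumes the line format quoted in this bug, for example:
#   kvm: 8580: cpu0 unhandled rdmsr: 0x3a
rdmsr_lines() {
    sed -n \
        -e 's/.*cpu[0-9][0-9]* unhandled rdmsr: \(0x[0-9a-fA-F]*\).*/unhandled \1/p' \
        -e 's/.*cpu[0-9][0-9]* ignored rdmsr: \(0x[0-9a-fA-F]*\).*/ignored \1/p'
}
```

On the host it could be used as "dmesg | rdmsr_lines | sort | uniq -c" to count how often each MSR shows up as unhandled or ignored.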

Comment 12 Yu Wang 2016-06-21 01:36:05 UTC
> Do you see this message on the host ?
> cpu0 ignored rdmsr: 0x3a
> 
> I just want to confirm whether the feature_control msr read failures are
> harmless.

Hi Bandan Das

I didn't see any message like "cpu0 ignored rdmsr: 0x3a" on the host.

Thanks
Yu Wang

Comment 13 Bandan Das 2016-06-21 03:14:44 UTC
(In reply to wangyu from comment #12)
> > Do you see this message on the host ?
> > cpu0 ignored rdmsr: 0x3a
> > 
> > I just want to confirm whether the feature_control msr read failures are
> > harmless.
> 
> Hi Bandan Das
> 
> I didn't see any message like "cpu0 ignored rdmsr: 0x3a" on the host.

Thanks for confirming. 
Alpus, based on the above observation, now I am curious to know the answer to Amnon's second question :)  Are you running nested ?

> Thanks
> Yu Wang

Comment 14 Alpus Chen 2016-06-21 03:49:03 UTC
The CPU model on the system is an Intel Haswell cpu - Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. The Windows 2016 VM is running on a RHEL 6.8 KVM host, not nested.

It seems this cpu model is not supported. Will Red Hat provide a document to make users aware of this? Thanks.

Comment 15 Vadim Rozenfeld 2016-06-21 08:35:06 UTC
(In reply to Alpus Chen from comment #14)
> CPU model on the system is Intel Haswell cpu - Intel(R) Xeon(R) CPU E5-2680
> v3 @ 
> 2.50GHz, Windows 2016 VM is running on RHEL 6.8 KVM host, not nested.

By any chance, did you try to install/activate Hyper-V server features?
 
Vadim.

Comment 16 Alpus Chen 2016-06-21 09:05:52 UTC
(In reply to Vadim Rozenfeld from comment #15)
> By any chance, did you try to install/activate Hyper-V server features?
>  
> Vadim.

Can you tell me how to install/activate it?

Comment 17 Vadim Rozenfeld 2016-06-21 11:54:54 UTC
(In reply to Alpus Chen from comment #16)
> (In reply to Vadim Rozenfeld from comment #15)
> > By any chance, did you try to install/activate Hyper-V server features?
> >  
> > Vadim.
> 
> Can you direct me how to install/activate it?

You should be able to do it by installing Hyper-V Server Role ( Server Manager -> Local Server -> Manage -> Add Roles and Features ).

But my question was whether this role was already activated, because if it was and nested is turned off, then you will probably experience that SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD (#GP -> KiTrap0D).

Vadim.

Comment 18 Alpus Chen 2016-06-22 06:38:11 UTC
(In reply to Vadim Rozenfeld from comment #17)
> You should be able to do it by installing Hyper-V Server Role ( Server
> Manager -> Local Server -> Manage -> Add Roles and Features ).
> 
> But my question was if this role was activated already, because if it was
> and nested is turned off, then you probably can experience that
> SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD (#GP -> KiTrap0D).
> 
> Vadim.

I assume the setting you mentioned is inside the Windows OS. The Windows 2016 VM hangs at its first boot, so there is no chance to change that setting.

Comment 19 Amnon Ilan 2016-06-22 08:36:27 UTC
(In reply to Alpus Chen from comment #14)
> CPU model on the system is Intel Haswell cpu - Intel(R) Xeon(R) CPU E5-2680
> v3 @ 
> 2.50GHz, Windows 2016 VM is running on RHEL 6.8 KVM host, not nested.
> 
> Seems this cpu model is not supported, will Red Hat provide a document for
> user awareness? Thanks.

Hi Alpus, 

Based on our testing, Haswell cpu model should work.
The document Red Hat provides is for Win10 as I wrote above:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

WS2016 is not released by Microsoft yet (it's tech-preview).
We should have a similar document for WS2016 on the RN of the next RHEL 
release (assuming WS2016 is released before that).

Can you send us logs/sosreport from the other system? (trying to 
dig some more info to analyze it)

Comment 20 Vadim Rozenfeld 2016-06-22 09:11:44 UTC
Hi Alpus,

Could you please also specify the WS2016 preview and build numbers?
This information is usually included in the installation media name, for example:
en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso

Best regards,
Vadim.

Comment 21 Alpus Chen 2016-06-22 09:21:11 UTC
(In reply to Vadim Rozenfeld from comment #20)
> Hi Alpus,
> 
> Could you please also specify the WS2016 preview and build numbers?
> This information is usually included into the installation media name, for
> example:
> en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso
> 
> Best regards,
> Vadim.

This is the WS2016 ISO image I'm using - 14342.1000.160506-1708.RS1_RELEASE_SERVER_OEMRET_X64FRE_EN-US.ISO

Comment 22 Vadim Rozenfeld 2016-06-22 09:35:04 UTC
(In reply to Alpus Chen from comment #21)
> (In reply to Vadim Rozenfeld from comment #20)
> > Hi Alpus,
> > 
> > Could you please also specify the WS2016 preview and build numbers?
> > This information is usually included into the installation media name, for
> > example:
> > en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso
> > 
> > Best regards,
> > Vadim.
> 
> This is the WS2016 ISO image I'm using -
> 14342.1000.160506-1708.RS1_RELEASE_SERVER_OEMRET_X64FRE_EN-US.ISO

I see. It's an OEM image, not the MSDN-distributed one.

Comment 23 Alpus Chen 2016-06-23 08:11:50 UTC
(In reply to Amnon Ilan from comment #19)
> Can you send us logs/sosreport from the other system? (trying to 
> dig some more info to analyze it)

Reproduced the issue on the other system (Intel Sandy-Bridge cpu), sosreport log attached.

Comment 24 Alpus Chen 2016-06-23 08:13:41 UTC
Created attachment 1171334 [details]
sosreport log from Intel Sandy-Bridge system

Comment 25 Vadim Rozenfeld 2016-06-28 00:14:28 UTC
(In reply to Alpus Chen from comment #24)
> Created attachment 1171334 [details]
> sosreport log from Intel Sandy-Bridge system

Hello Alpus

Looks like 14342.1000.160506-1708 is just another Insider preview build.
In the past we saw a lot of problems with betas and release candidates that popped up and then disappeared between betas.
Can I ask you to try something more "official", something like Windows Server 2016 Essentials Technical Preview 5 (x64), which should be available for download through the msdn web-site?

Thanks,
Vadim.

Comment 26 Alpus Chen 2016-06-29 02:35:50 UTC
(In reply to Vadim Rozenfeld from comment #25)
> Hello Alpus
> 
> Looks like 14342.1000.160506-1708 is just another one Insider preview build.
> We saw a lot of problems with betas and release candidates in the past that
> just pop up and then disappeared between betas. 
> Can I ask you to give a try to something more "official", something like
> Windows Server 2016 Essentials Technical Preview 5 (x64) which should be
> available for download through msdn web-site?
> 
> Thanks,
> Vadim.

The issue can be reproduced with Windows Server 2016 TP5 image (en_windows_server_2016_technical_preview_5_x64_dvd_8512312.iso).

Comment 27 Vadim Rozenfeld 2016-06-29 09:34:52 UTC
Can QE try reproducing this issue with Windows Server 2016 TP5 ?

Thanks,
Vadim.

Comment 28 Yu Wang 2016-06-29 10:07:23 UTC
(In reply to Vadim Rozenfeld from comment #27)
> Can QE try reproducing this issue with Windows Server 2016 TP5 ?
> 
> Thanks,
> Vadim.

Hi Vadim,

The QE results in comment #9 were obtained with TP5 (en_windows_server_2016_technical_preview_5_x64_dvd_8512312.iso)

more info:
https://mojo.redhat.com/docs/DOC-1084102


cpu model:

Intel:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz 

AMD:
processor    : 31
vendor_id    : AuthenticAMD
cpu family    : 21
model        : 2
model name    : AMD Opteron(tm) Processor 6386 SE     


Thanks
Yu Wang

Comment 29 Alpus Chen 2016-07-13 10:15:08 UTC
Any update on this bug? Thanks.

Comment 30 Vadim Rozenfeld 2016-07-14 06:30:36 UTC
Hello Alpus,
I will update this case shortly.
Vadim.

Comment 31 Vadim Rozenfeld 2016-07-19 06:38:38 UTC
Looks like I just misunderstood this issue.
When Alpus mentioned the Haswell cpu in comment #14, he was referring to the system (host) cpu. Am I right, Alpus? The VM's cpu type, however, was qemu64, which is the default cpu type for -M rhel6.6.0. The problem with cpu type qemu64 is that it doesn't provide the +fsgsbase flag by default.

I was able to install WS2016 on rhel6.8 qemu with the following command line 

#!/bin/sh


QEMU=/home/vrozenfe/work/rhel6/qemu-kvm/x86_64-softmmu/qemu-system-x86_64
MACHINE=rhel6.6.0
IMG=ws2016tp5.qcow2
CDROM=en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso

sudo $QEMU -name W2016 -M $MACHINE -cpu qemu64,+fsgsbase -enable-kvm \
    -m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
    -uuid e1bd15ac-5081-f186-8798-0924490b40ce \
    -nodefconfig -nodefaults -monitor stdio \
    -rtc base=localtime,driftfix=slew -no-reboot -no-shutdown \
    -drive file=/home/vrozenfe/work/images/$IMG,if=none,id=drive-ide0-0-0,cache=off,werror=stop,rerror=stop \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 \
    -drive file=/run/media/vrozenfe/elements/isos/$CDROM,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \
    -netdev tap,id=hostnet0 \
    -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:c6:f1:dc,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
    -vga cirrus -msg timestamp=on

while it still hung after removing the "+fsgsbase" flag.

Best regards,
Vadim.
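
Before adding +fsgsbase to the guest cpu, it may be worth confirming that the host cpu exposes the flag at all. The following is a minimal sketch under that assumption: has_cpu_flag is a hypothetical helper (not a qemu or libvirt command) that looks for a whole-word flag on the "flags" lines of /proc/cpuinfo or of any file in the same format.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FLAG appears as a whole word on a "flags"
# line of a cpuinfo-style file (defaults to /proc/cpuinfo on the host).
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # Split the flags lines into one token per line, then match exactly.
    grep -E '^flags[[:space:]]*:' "$cpuinfo" | tr ' \t' '\n\n' | grep -qx -- "$flag"
}
```

For example, "has_cpu_flag fsgsbase && echo present" prints "present" only when the host reports the flag.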

Comment 33 Alpus Chen 2016-07-19 10:06:34 UTC
(In reply to Vadim Rozenfeld from comment #31)
> Looks like I just misunderstood this issue.
> When Alpus mentioned Haswell cpu in comment #14, he was referring to the
> system (host) cpu. Am I right Alpus? 

   Yes, it's the physical cpu installed on the system; sorry for the confusion.
 
> The problem with cpu type qemu64 is that it doesn't provide +fsgsbase flag by default.

  Verified that the WS2016 VM installs successfully after adding the +fsgsbase flag to cpu type qemu64.

> Best regards,
> Vadim.

Comment 34 Vadim Rozenfeld 2016-07-19 10:47:03 UTC
(In reply to Alpus Chen from comment #33)
> (In reply to Vadim Rozenfeld from comment #31)
> > Looks like I just misunderstood this issue.
> > When Alpus mentioned Haswell cpu in comment #14, he was referring to the
> > system (host) cpu. Am I right Alpus? 
> 
>    Yes, it's physical cpu installed on the system, sorry for the confusion.
>  
> > The problem with cpu type qemu64 is that it doesn't provide +fsgsbase flag by default.
> 
>   Verified that VM WS2016 can install successfully after adding +fsgsbase
> flag to cpu type qemu64.
> 
> > Best regards,
> > Vadim.

Thanks for your prompt reply.
I really don't know what the resolution for this issue should be.
Re-assigning it back to Amnon for his decision.

Best regards,
Vadim.

Comment 35 Amnon Ilan 2016-07-20 12:41:37 UTC
In general, the default cpu, qemu64, is the lowest common denominator in terms of cpu features, so that it can run on older real cpus.
I think the resolution is to document this in the doc mentioned above:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

The doc should mention picking the right cpu type (not the default).

Comment 36 Alpus Chen 2016-07-21 06:51:02 UTC
One question: I set up the VM cpu type with the following steps:

"Virtual Machine Manager" -> "Show virtual hardware details" -> "Processor" -> "Configuration" -> "Copy host CPU configuration"

The tool then automatically selects "SandyBridge" although the host (physical) cpu on the system is Haswell. Is this the expected result? If I manually select "Haswell", an error message pops up saying "Error starting domain: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: rtm, hle".

Reading the libvirt cpu map file /usr/share/libvirt/cpu_map.xml, it seems the tool selects the VM cpu type based on the flags the host cpu currently exports, so the VM cpu type may not match the host cpu type?

Comment 37 Jaroslav Suchanek 2016-07-25 16:00:08 UTC
Pavel, any insight into the issue from comment 36? Thanks.

Comment 38 Pavel Hrdina 2016-07-26 11:28:57 UTC
Hi, the reason libvirt (controlled by virt-manager) chooses SandyBridge is that there was a bug in the TSX feature (represented as RTM and HLE) on Haswell and early Broadwell processors. TSX was disabled on the affected processors via a microcode update.

The cpu definition stored in cpu_map.xml has those features defined for Haswell, but your host doesn't have them even though it's a Haswell processor, because they were disabled by the microcode update. Libvirt therefore falls back to the previous generation as the best match.

RHEL-7.2 contains libvirt with data for Haswell and Broadwell without TSX and it's correctly detected as Haswell-noTSX.
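
To see which model libvirt actually detected on a given host, the output of "virsh capabilities" can be inspected. The sketch below is only a text-based shortcut, not a real XML parser: host_cpu_model is a hypothetical helper, and it assumes that the first <model> element in the capabilities XML is the host cpu model.

```shell
#!/bin/sh
# Hypothetical helper: print the first <model> value found in capabilities
# XML read from stdin. Assumption: the host cpu model is the first <model>
# element in the output of "virsh capabilities".
host_cpu_model() {
    sed -n 's/.*<model>\([^<]*\)<\/model>.*/\1/p' | head -n 1
}
```

On the host it would be run as "virsh capabilities | host_cpu_model". On the system described in comment 36 this would presumably print SandyBridge rather than Haswell, for the reason given above.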

Comment 39 Alpus Chen 2016-07-28 03:42:11 UTC
Thanks for the explanation.

Per comment #35, will Red Hat update the document to mention picking the right cpu type (not the default)?

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

Comment 40 Amnon Ilan 2016-08-01 14:22:09 UTC
Yes, we plan to update with the following corrected text (and 
add Win2016 once it is released):

Limited CPU support for Windows 10 guests

On a Red Hat Enterprise 6 host, Windows 10 guests can only be created when using the following CPU models:

* the Intel Xeon E series
* the Intel Xeon E7 family
* Intel Xeon v2, v3, and v4
* Opteron G2, G3, G4, G5, and G6

For these CPU models, also make sure to set the CPU model of the guest to match the CPU model detected by running the "virsh capabilities" command on the host. Using the application default or hypervisor default prevents the guests from booting properly.

To be able to use Windows 10 guests on Legacy Intel Core 2 processors (also known as Penryn) or Intel Xeon 55xx and 75xx processor families (also known as Nehalem), add the following flag to the Domain XML file, with either Penryn or Nehalem as MODELNAME:

   <cpu mode='custom' match='exact'>
     <model>MODELNAME</model>
     <feature name='fsgsbase' policy='require'/>
   </cpu>

Other CPU models are not supported, and Windows 10 guests created on them are likely to terminate unexpectedly with a stop error, also known as the blue screen of death (BSOD).
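
As an illustration of the text above, the <cpu> element can be generated for a given MODELNAME before pasting it into the domain XML. This is a minimal sketch: cpu_element is a hypothetical helper name, it only accepts the two model names the text allows, and it defaults to the fsgsbase feature shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the <cpu> element from the text above for
# MODELNAME (Penryn or Nehalem). The feature name defaults to fsgsbase.
cpu_element() {
    model=$1
    feature=${2:-fsgsbase}
    case "$model" in
        Penryn|Nehalem) ;;
        *) echo "unsupported MODELNAME: $model" >&2; return 1 ;;
    esac
    cat <<EOF
<cpu mode='custom' match='exact'>
  <model>$model</model>
  <feature name='$feature' policy='require'/>
</cpu>
EOF
}
```

The output of "cpu_element Nehalem" could then be placed in the domain definition, for example while editing it with "virsh edit" followed by the domain name.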

Comment 41 Jiri Herrmann 2016-08-02 12:12:43 UTC
The Release Notes document with the updated description has been republished for the Customer Portal:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

If there is still anything amiss with the document, please let me know.

Comment 43 Amnon Ilan 2016-08-16 13:04:27 UTC
Closing this bz, feel free to reopen.

Comment 46 Hannan 2017-10-07 20:46:03 UTC
Hi guys,

I am having the same problem. Is there any solution for this? Does it work on CentOS 7.x?

Thanks

Comment 53 Red Hat Bugzilla 2023-09-14 23:59:41 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days