Bug 1346153 - [server] KVM guest Windows server 2016 can not boot up in RHEL 6.8 [NEEDINFO]
Summary: [server] KVM guest Windows server 2016 can not boot up in RHEL 6.8
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Vadim Rozenfeld
QA Contact: Virtualization Bugs
Docs Contact: Yehuda Zimmerman
URL:
Whiteboard:
Depends On:
Blocks: 1313865 1359965
Reported: 2016-06-14 06:37 UTC by Alpus Chen
Modified: 2018-08-26 10:28 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Limited CPU support for Windows 10 and Windows Server 2016 guests

On a Red Hat Enterprise Linux 6 host, Windows 10 and Windows Server 2016 guests can only be created when using the following CPU models:

* the Intel Xeon E series
* the Intel Xeon E7 family
* Intel Xeon v2, v3, and v4
* Opteron G2, G3, G4, G5, and G6

For these CPU models, also make sure to set the CPU model of the guest to match the CPU model detected by running the "virsh capabilities" command on the host. Using the application default or hypervisor default prevents the guests from booting properly.

To be able to use Windows 10 guests on Legacy Intel Core 2 processors (also known as Penryn) or Intel Xeon 55xx and 75xx processor families (also known as Nehalem), add the following flag to the Domain XML file, with either Penryn or Nehalem as MODELNAME:

   <cpu mode='custom' match='exact'>
     <model>MODELNAME</model>
     <feature name='erms' policy='require'/>
   </cpu>

Other CPU models are not supported, and both Windows 10 guests and Windows Server 2016 guests created on them are likely to become unresponsive during the boot process.
Clone Of:
Environment:
Last Closed: 2018-08-26 10:28:24 UTC
Target Upstream Version:
ailan: needinfo? (skinjo)


Attachments
guest vm Windows server 2016 hangs (139.29 KB, image/png), 2016-06-14 06:37 UTC, Alpus Chen
BSOD when adding kvm module parameter "ignore_msrs=1" (279.79 KB, image/png), 2016-06-14 06:39 UTC, Alpus Chen
dmesg log (83.34 KB, text/plain), 2016-06-14 06:44 UTC, Alpus Chen
sosreport (8.91 MB, application/x-xz), 2016-06-15 03:49 UTC, Alpus Chen
sosreport log from Intel Sandy-Bridge system (7.76 MB, application/x-xz), 2016-06-23 08:13 UTC, Alpus Chen

Description Alpus Chen 2016-06-14 06:37:54 UTC
Created attachment 1167770 [details]
guest vm Windows server 2016 hangs

Description of problem:

When I try to install a KVM guest "Windows Server 2016" from an ISO image, the installation stops at the Windows logo screen, and the message "kvm: 8580: cpu0 unhandled rdmsr: 0x3a" appears in dmesg when the guest VM powers on.

I also tried adding the kvm module parameter ignore_msrs=1, but then a BSOD occurs.

I searched the web but could not find a RHEL 6.8 KVM guest compatibility guide. Is Windows Server 2016 a supported guest OS on RHEL 6.8 KVM?
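
The rdmsr message quoted above can be pulled out of a longer kernel log with a small filter. This is a minimal sketch, not something taken from this report: the function name list_rdmsr_msrs is hypothetical, and the message format is assumed to match the single line quoted in the description ("... unhandled rdmsr: 0x3a") and the "ignored rdmsr" variant asked about in comment 11.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print each distinct
# MSR number that appears in an "unhandled rdmsr" or "ignored rdmsr" message.
list_rdmsr_msrs() {
    grep -E '(unhandled|ignored) rdmsr: 0x[0-9a-fA-F]+' |
        sed 's/.*rdmsr: \(0x[0-9a-fA-F]*\).*/\1/' |
        sort -u
}

# Example (run on the KVM host right after the guest hangs):
#   dmesg | list_rdmsr_msrs
```

For the log line in this report the filter prints 0x3a, the feature_control MSR referred to in comment 11.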


Version-Release number of selected component (if applicable):
Linux localhost.localdomain 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Alpus Chen 2016-06-14 06:39:32 UTC
Created attachment 1167783 [details]
BSOD when adding kvm module parameter "ignore_msrs=1"

Comment 2 Alpus Chen 2016-06-14 06:44:35 UTC
Created attachment 1167786 [details]
dmesg log

Comment 4 Joseph Kachuck 2016-06-14 14:40:05 UTC
Hello Lenovo,
Please attach a sosreport from a host system directly after seeing this issue.

Please confirm if you have been able to recreate this issue on more than one physical system.

Thank You
Joe Kachuck

Comment 5 Alpus Chen 2016-06-15 03:49:16 UTC
Created attachment 1168151 [details]
sosreport

Comment 6 Alpus Chen 2016-06-15 03:55:50 UTC
(In reply to Joseph Kachuck from comment #4)
> Hello Lenovo,
> Please attach a sosreport from a host system directly after seeing this
> issue.

  sosreport attached.

> Please confirm if you have been able to recreate this issue on more than one
> physical system.

  The issue is seen on two systems.

Comment 8 juzhang 2016-06-16 13:22:13 UTC
Hi Wyu,

Could you handle this issue?

Best Regards,
Junyi

Comment 10 Amnon Ilan 2016-06-20 11:01:44 UTC
Hi Alpus, 

Please note the following CPU model limitations for Win10 on RHEL6.8:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html
Limited CPU support for Windows 10 guests

WS2016 is similar. 
What is the CPU model you are using?

And one more question:
Are you trying to run it in a nested environment?

Comment 11 Bandan Das 2016-06-20 17:30:30 UTC
(In reply to wangyu from comment #9)
> (In reply to Amnon Ilan from comment #7)
> > Hi QE, 
> > Can you try reproducing it? I guess WS2016 should have the same CPU
> > limitations as Win10 on 6.8.
> > Thanks,
> > Amnon
> 
> Hi Amnon
> 
> QE have tested WS2016 with different cpu models on rhel6.8 host. It almost
> has the same CPU limitations for win10-64 on RHEL6.7.z.

Do you see this message on the host ?
cpu0 ignored rdmsr: 0x3a

I just want to confirm whether the feature_control msr read failures are harmless.

Comment 12 Yu Wang 2016-06-21 01:36:05 UTC
> Do you see this message on the host ?
> cpu0 ignored rdmsr: 0x3a
> 
> I just want to confirm whether the feature_control msr read failures are
> harmless.

Hi Bandan Das

I didn't see any message like "cpu0 ignored rdmsr: 0x3a" on the host.

Thanks
Yu Wang

Comment 13 Bandan Das 2016-06-21 03:14:44 UTC
(In reply to wangyu from comment #12)
> > Do you see this message on the host ?
> > cpu0 ignored rdmsr: 0x3a
> > 
> > I just want to confirm whether the feature_control msr read failures are
> > harmless.
> 
> Hi Bandan Das
> 
> I didn't see any message like "cpu0 ignored rdmsr: 0x3a" on the host.

Thanks for confirming. 
Alpus, based on the above observation, now I am curious to know the answer to Amnon's second question :)  Are you running nested ?

> Thanks
> Yu Wang

Comment 14 Alpus Chen 2016-06-21 03:49:03 UTC
The CPU in the system is an Intel Haswell CPU, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. The Windows 2016 VM is running on a RHEL 6.8 KVM host, not nested.

It seems this CPU model is not supported. Will Red Hat provide a document to make users aware of this? Thanks.

Comment 15 Vadim Rozenfeld 2016-06-21 08:35:06 UTC
(In reply to Alpus Chen from comment #14)
> CPU model on the system is Intel Haswell cpu - Intel(R) Xeon(R) CPU E5-2680
> v3 @ 
> 2.50GHz, Windows 2016 VM is running on RHEL 6.8 KVM host, not nested.

By any chance, did you try to install/activate Hyper-V server features?
 
Vadim.

Comment 16 Alpus Chen 2016-06-21 09:05:52 UTC
(In reply to Vadim Rozenfeld from comment #15)
> By any chance, did you try to install/activate Hyper-V server features?
>  
> Vadim.

Can you tell me how to install/activate it?

Comment 17 Vadim Rozenfeld 2016-06-21 11:54:54 UTC
(In reply to Alpus Chen from comment #16)
> (In reply to Vadim Rozenfeld from comment #15)
> > By any chance, did you try to install/activate Hyper-V server features?
> >  
> > Vadim.
> 
> Can you direct me how to install/activate it?

You should be able to do it by installing Hyper-V Server Role ( Server Manager -> Local Server -> Manage -> Add Roles and Features ).

But my question was whether this role was already activated, because if it was and nested is turned off, you will probably hit that SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD (#GP -> KiTrap0D).

Vadim.

Comment 18 Alpus Chen 2016-06-22 06:38:11 UTC
(In reply to Vadim Rozenfeld from comment #17)
> You should be able to do it by installing Hyper-V Server Role ( Server
> Manager -> Local Server -> Manage -> Add Roles and Features ).
> 
> But my question was if this role was activated already, because if it was
> and nested is turned off, then you probably can experience that
> SYSTEM_THREAD_EXCEPTION_NOT_HANDLED BSOD (#GP -> KiTrap0D).
> 
> Vadim.

I assume the setting you mentioned is made inside Windows. The Windows 2016 VM hangs on its very first boot, so I never get a chance to change that setting.

Comment 19 Amnon Ilan 2016-06-22 08:36:27 UTC
(In reply to Alpus Chen from comment #14)
> CPU model on the system is Intel Haswell cpu - Intel(R) Xeon(R) CPU E5-2680
> v3 @ 
> 2.50GHz, Windows 2016 VM is running on RHEL 6.8 KVM host, not nested.
> 
> Seems this cpu model is not supported, will Red Hat provide a document for
> user awareness? Thanks.

Hi Alpus, 

Based on our testing, Haswell cpu model should work.
The document Red Hat provides is for Win10 as I wrote above:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

WS2016 is not released by Microsoft yet (it's tech-preview).
We should have a similar document for WS2016 on the RN of the next RHEL 
release (assuming WS2016 is released before that).

Can you send us logs/sosreport from the other system? (We are trying to
dig up some more info to analyze it.)

Comment 20 Vadim Rozenfeld 2016-06-22 09:11:44 UTC
Hi Alpus,

Could you please also specify the WS2016 preview and build numbers?
This information is usually included in the installation media name, for example:
en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso

Best regards,
Vadim.

Comment 21 Alpus Chen 2016-06-22 09:21:11 UTC
(In reply to Vadim Rozenfeld from comment #20)
> Hi Alpus,
> 
> Could you please also specify the WS2016 preview and build numbers?
> This information is usually included into the installation media name, for
> example:
> en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso
> 
> Best regards,
> Vadim.

This is the WS2016 ISO image I'm using - 14342.1000.160506-1708.RS1_RELEASE_SERVER_OEMRET_X64FRE_EN-US.ISO

Comment 22 Vadim Rozenfeld 2016-06-22 09:35:04 UTC
(In reply to Alpus Chen from comment #21)
> (In reply to Vadim Rozenfeld from comment #20)
> > Hi Alpus,
> > 
> > Could you please also specify the WS2016 preview and build numbers?
> > This information is usually included into the installation media name, for
> > example:
> > en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso
> > 
> > Best regards,
> > Vadim.
> 
> This is the WS2016 ISO image I'm using -
> 14342.1000.160506-1708.RS1_RELEASE_SERVER_OEMRET_X64FRE_EN-US.ISO

I see. It's an OEM image, not an MSDN-distributed one.

Comment 23 Alpus Chen 2016-06-23 08:11:50 UTC
(In reply to Amnon Ilan from comment #19)
> Can you send us logs/sosreport from the other system? (trying to 
> dig some more info to analyze it)

Reproduced the issue on the other system (Intel Sandy-Bridge cpu), sosreport log attached.

Comment 24 Alpus Chen 2016-06-23 08:13:41 UTC
Created attachment 1171334 [details]
sosreport log from Intel Sandy-Bridge system

Comment 25 Vadim Rozenfeld 2016-06-28 00:14:28 UTC
(In reply to Alpus Chen from comment #24)
> Created attachment 1171334 [details]
> sosreport log from Intel Sandy-Bridge system

Hello Alpus

It looks like 14342.1000.160506-1708 is just another Insider preview build.
We have seen a lot of problems with betas and release candidates in the past that popped up and then disappeared between betas.
Could you try something more "official", such as Windows Server 2016 Essentials Technical Preview 5 (x64), which should be available for download from the MSDN website?

Thanks,
Vadim.

Comment 26 Alpus Chen 2016-06-29 02:35:50 UTC
(In reply to Vadim Rozenfeld from comment #25)
> Hello Alpus
> 
> Looks like 14342.1000.160506-1708 is just another one Insider preview build.
> We saw a lot of problems with betas and release candidates in the past that
> just pop up and then disappeared between betas. 
> Can I ask you to give a try to something more "official", something like
> Windows Server 2016 Essentials Technical Preview 5 (x64) which should be
> available for download through msdn web-site?
> 
> Thanks,
> Vadim.

The issue can be reproduced with Windows Server 2016 TP5 image (en_windows_server_2016_technical_preview_5_x64_dvd_8512312.iso).

Comment 27 Vadim Rozenfeld 2016-06-29 09:34:52 UTC
Can QE try reproducing this issue with Windows Server 2016 TP5 ?

Thanks,
Vadim.

Comment 28 Yu Wang 2016-06-29 10:07:23 UTC
(In reply to Vadim Rozenfeld from comment #27)
> Can QE try reproducing this issue with Windows Server 2016 TP5 ?
> 
> Thanks,
> Vadim.

Hi Vadim,

The QE results in comment #9 were obtained with TP5 (en_windows_server_2016_technical_preview_5_x64_dvd_8512312.iso).

more info:
https://mojo.redhat.com/docs/DOC-1084102


cpu model:

Intel:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz 

AMD:
processor    : 31
vendor_id    : AuthenticAMD
cpu family    : 21
model        : 2
model name    : AMD Opteron(tm) Processor 6386 SE     


Thanks
Yu Wang

Comment 29 Alpus Chen 2016-07-13 10:15:08 UTC
Any update on this bug? Thanks.

Comment 30 Vadim Rozenfeld 2016-07-14 06:30:36 UTC
Hello Alpus,
I will update this case shortly.
Vadim.

Comment 31 Vadim Rozenfeld 2016-07-19 06:38:38 UTC
It looks like I just misunderstood this issue.
When Alpus mentioned the Haswell CPU in comment #14, he was referring to the system (host) CPU. Am I right, Alpus? The VM's CPU type, however, was qemu64, which is the default CPU type for -M rhel6.6.0. The problem with CPU type qemu64 is that it doesn't provide the fsgsbase flag by default.

I was able to install WS2016 on RHEL 6.8 qemu with the following command line:

#!/bin/sh

# Locally built qemu-kvm binary, machine type, guest disk image and install ISO
QEMU=/home/vrozenfe/work/rhel6/qemu-kvm/x86_64-softmmu/qemu-system-x86_64
MACHINE=rhel6.6.0
IMG=ws2016tp5.qcow2
CDROM=en_microsoft_hyper-v_server_2016_technical_preview_5_x64_dvd_8512629.iso

# Start the guest with CPU type qemu64 plus the fsgsbase flag
sudo "$QEMU" -name W2016 -M "$MACHINE" -cpu qemu64,+fsgsbase -enable-kvm \
    -m 2048 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
    -uuid e1bd15ac-5081-f186-8798-0924490b40ce \
    -nodefconfig -nodefaults -monitor stdio \
    -rtc base=localtime,driftfix=slew -no-reboot -no-shutdown \
    -drive "file=/home/vrozenfe/work/images/$IMG,if=none,id=drive-ide0-0-0,cache=off,werror=stop,rerror=stop" \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 \
    -drive "file=/run/media/vrozenfe/elements/isos/$CDROM,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw" \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \
    -netdev tap,id=hostnet0 \
    -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:c6:f1:dc,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
    -vga cirrus -msg timestamp=on

whereas the guest still hung after I removed the "+fsgsbase" flag.

Best regards,
Vadim.
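
To see which CPU model a guest was actually started with, the -cpu argument can be read back from the QEMU command line. This is a minimal sketch under stated assumptions: the helper name qemu_cpu_arg is hypothetical, and it expects the command line as whitespace-separated text on stdin. An empty result means no -cpu option was passed, in which case the guest gets the machine type's default (qemu64 for -M rhel6.6.0, per the comment above).

```shell
#!/bin/sh
# Hypothetical helper: print the value that follows "-cpu" in a QEMU command
# line read on stdin; prints nothing if no -cpu option is present.
qemu_cpu_arg() {
    tr -s '[:space:]' '\n' |
        awk 'prev == "-cpu" { print; exit } { prev = $0 }'
}

# Example for a running guest (PID is a placeholder for the QEMU process ID;
# /proc/PID/cmdline separates arguments with NUL bytes):
#   tr '\0' ' ' < /proc/PID/cmdline | qemu_cpu_arg
```

For the command line in the comment above, the helper prints qemu64,+fsgsbase.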

Comment 33 Alpus Chen 2016-07-19 10:06:34 UTC
(In reply to Vadim Rozenfeld from comment #31)
> looks like just misunderstood this issue. 
> When Alpus mentioned Haswell cpu in comment #14, he was referencing to the
> system (host) cpu. Am I right Alpus? 

   Yes, it's physical cpu installed on the system, sorry for the confusion.
 
> The problem with cpu type qemu64 is that it doesn't provide +fsgsbase flag by default.

  Verified that VM WS2016 can install successfully after adding +fsgsbase flag to cpu type qemu64.

> Best regards,
> Vadim.

Comment 34 Vadim Rozenfeld 2016-07-19 10:47:03 UTC
(In reply to Alpus Chen from comment #33)
> (In reply to Vadim Rozenfeld from comment #31)
> > looks like just misunderstood this issue. 
> > When Alpus mentioned Haswell cpu in comment #14, he was referencing to the
> > system (host) cpu. Am I right Alpus? 
> 
>    Yes, it's physical cpu installed on the system, sorry for the confusion.
>  
> > The problem with cpu type qemu64 is that it doesn't provide +fsgsbase flag by default.
> 
>   Verified that VM WS2016 can install successfully after adding +fsgsbase
> flag to cpu type qemu64.
> 
> > Best regards,
> > Vadim.

Thanks for your prompt reply.
I really don't know what would be the resolution for this issue.
Re-assigning it back to Amnon for his decision.

Best regards,
Vadim.

Comment 35 Amnon Ilan 2016-07-20 12:41:37 UTC
In general, the default CPU type, qemu64, is the lowest common denominator in terms of CPU features, so that it can run on older real CPUs.
I think the resolution for that is to document it in the doc mentioned above:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

The doc should mention picking the right cpu type (not the default).

Comment 36 Alpus Chen 2016-07-21 06:51:02 UTC
One question: I set up the VM CPU type with the following steps:

"Virtual Machine Manager" -> "Show virtual hardware details" -> "Processor" -> "Configuration" -> "Copy host CPU configuration"

The tool then automatically selects "SandyBridge", although the host (physical) CPU in the system is Haswell. Is that the expected result? If I manually select "Haswell", an error message pops up saying "Error starting domain: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: rtm, hle".

Reading the libvirt CPU map file /usr/share/libvirt/cpu_map.xml, it seems the tool selects the VM CPU type based on the flags currently exported by the host CPU, so the VM CPU type may not match the host CPU type. Is that correct?

Comment 37 Jaroslav Suchanek 2016-07-25 16:00:08 UTC
Pavel, any insight into the issue from comment 36? Thanks.

Comment 38 Pavel Hrdina 2016-07-26 11:28:57 UTC
Hi, the reason libvirt (controlled by virt-manager) chooses SandyBridge is that there was a bug in the TSX feature (represented as RTM and HLE) on Haswell and early Broadwell processors. TSX was disabled on the affected processors via a microcode update.

The CPU definition stored in cpu_map.xml has those features defined for Haswell, but your host doesn't have them, even though it is a Haswell processor, because they were disabled by the microcode update. libvirt therefore falls back to the previous generation as the best match.

RHEL-7.2 contains libvirt with data for Haswell and Broadwell without TSX, so there the CPU is correctly detected as Haswell-noTSX.
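
The fallback described above can be checked by comparing the features a CPU model requires with the flags the host actually exposes. This is a minimal sketch: the function name missing_cpu_features is hypothetical, the feature names rtm and hle come from the error message in comment 36, and the input is assumed to look like /proc/cpuinfo, with a space-separated "flags" line.

```shell
#!/bin/sh
# Hypothetical helper: print each requested feature that is absent from the
# first "flags" line of cpuinfo-style text read on stdin.
missing_cpu_features() {
    flags=$(grep -E -m1 '^flags[[:space:]]*:' | cut -d: -f2)
    for feature in "$@"; do
        case " $flags " in
            *" $feature "*) ;;                  # feature is present
            *) printf '%s\n' "$feature" ;;      # feature is missing
        esac
    done
}

# Example (on the KVM host):
#   missing_cpu_features rtm hle < /proc/cpuinfo
```

If this lists rtm and hle on a Haswell host, that would be consistent with the microcode-disabled TSX case described in this comment.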

Comment 39 Alpus Chen 2016-07-28 03:42:11 UTC
Thanks for the explanation.

Per comment #35, will Red Hat update the document to mention picking the right cpu type (not the default)?

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

Comment 40 Amnon Ilan 2016-08-01 14:22:09 UTC
Yes, we plan to update with the following corrected text (and 
add Win2016 once it is released):

Limited CPU support for Windows 10 guests

On a Red Hat Enterprise Linux 6 host, Windows 10 guests can only be created when using the following CPU models:

* the Intel Xeon E series
* the Intel Xeon E7 family
* Intel Xeon v2, v3, and v4
* Opteron G2, G3, G4, G5, and G6

For these CPU models, also make sure to set the CPU model of the guest to match the CPU model detected by running the "virsh capabilities" command on the host. Using the application default or hypervisor default prevents the guests from booting properly.

To be able to use Windows 10 guests on Legacy Intel Core 2 processors (also known as Penryn) or Intel Xeon 55xx and 75xx processor families (also known as Nehalem), add the following flag to the Domain XML file, with either Penryn or Nehalem as MODELNAME:

   <cpu mode='custom' match='exact'>
     <model>MODELNAME</model>
     <feature name='fsgsbase' policy='require'/>
   </cpu>

Other CPU models are not supported, and Windows 10 guests created on them are likely to terminate unexpectedly with a stop error, also known as the blue screen of death (BSOD).
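
The host CPU model that the text above says to match can be read from the "virsh capabilities" output. This is a minimal sketch, assuming the output is pretty-printed XML with one element per line and that the first <model> element is the host CPU model; the function name host_cpu_model is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the text of the first <model> element found in
# capabilities XML read on stdin.
host_cpu_model() {
    sed -n 's:.*<model>\([^<]*\)</model>.*:\1:p' | head -n 1
}

# Example (on the KVM host):
#   virsh capabilities | host_cpu_model
```

The printed name is what would go in place of MODELNAME when the guest CPU model is set to match the host.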

Comment 41 Jiri Herrmann 2016-08-02 12:12:43 UTC
The Release Notes document with the updated description has been republished for the Customer Portal:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/known_issues_virtualization.html

If there is still anything amiss with the document, please let me know.

Comment 43 Amnon Ilan 2016-08-16 13:04:27 UTC
Closing this bz, feel free to reopen.

Comment 46 Hannan 2017-10-07 20:46:03 UTC
Hi guys,

I am having the same problem. Is there a solution for this? Does it work on CentOS 7.x?

Thanks

