Bug 1879149 - Fail to boot Win10 32bit guest(BSOD) on EPYC host
Summary: Fail to boot Win10 32bit guest(BSOD) on EPYC host
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.5
Hardware: x86_64
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Marek Kedzierski
QA Contact: liunana
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-15 14:28 UTC by FuXiangChun
Modified: 2020-09-25 09:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-25 08:15:05 UTC
Target Upstream Version:
Embargoed:


Attachments
screenshot (30.55 KB, image/png), 2020-09-15 14:38 UTC, FuXiangChun
analysis of Windows kernel crashdump (3.71 KB, text/plain), 2020-09-24 12:35 UTC, Marek Kedzierski

Description FuXiangChun 2020-09-15 14:28:22 UTC
Description of problem:
A Win10 32-bit guest fails to boot on an EPYC host.

Version-Release number of selected component (if applicable):
qemu-kvm: qemu-kvm-rhev-2.10.0-21.el7_5.10.x86_64
kernel: kernel-3.10.0-862.14.4.el7.x86_64
spice: spice-server-0.14.0-2.el7_5.5.x86_64
seabios: seabios-bin-1.11.0-2.el7.noarch
seavgabios: seavgabios-bin-1.11.0-2.el7.noarch
sgabios: sgabios-bin-0.20110622svn-4.el7.noarch
ipxe: ipxe-roms-qemu-20170123-1.git4e85b27.el7_4.1.noarch
virtio-win: virtio-win-1.9.12-4.el7.iso

How reproducible:
100%

Steps to Reproduce:
1. /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox off  \
-machine pc  \
-nodefaults \
-device qxl-vga,bus=pci.0,addr=0x2 \
-device pci-bridge,id=pci_bridge,bus=pci.0,addr=0x4,chassis_nr=1 \
-m 30720  \
-smp 32,maxcpus=32,cores=16,threads=1,sockets=2  \
-cpu 'EPYC',+kvm_pv_unhalt \
-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=unsafe,format=qcow2,file=/home/win10-32-virtio.qcow2 \
-device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,serial=SYSTEM_DISK0,bus=pci.0,addr=0x8 \
-rtc base=localtime,clock=host  \
-boot menu=off,strict=off,order=cdn,once=d  \
-no-hpet \
-enable-kvm \
-vnc :1 \
-monitor stdio

Actual results:
The guest hits a BSOD during boot.

Expected results:
The VM boots and works normally.

Additional info:

Comment 2 FuXiangChun 2020-09-15 14:38:36 UTC
Created attachment 1714947
screenshot

Comment 3 FuXiangChun 2020-09-15 14:39:18 UTC
The guest boots if the Opteron_G5 CPU model is used instead of EPYC.

Comment 12 Marek Kedzierski 2020-09-24 12:35:05 UTC
Created attachment 1716330
analysis of Windows kernel crashdump

Comment 13 Marek Kedzierski 2020-09-24 12:40:49 UTC
This problem is not related to virtio-win drivers.

According to KVM traces:
msr_read c001102c = 0x (#GP)

which is consistent with Windows kernel analysis:

....
STACK_TEXT:  
8a403a38 83c505af 00000011 00000000 83c5035c hal!HalpErrataApplyPerProcessor+0x2112
8a403a4c 83c4ffb2 00000011 00000000 80d6c150 hal!HalpErrataInitSystem+0x4f
8a403a70 83c4ff78 80d6c150 83c4ff4d 8a403c2c hal!HalpInitSystemHelper+0x2c
8a403a78 83c4ff4d 8a403c2c 83acd216 00000001 hal!HalpInitSystemPhase1+0x18
8a403a80 83acd216 00000001 80d6c150 00000000 hal!HalInitSystem+0x1d
8a403c2c 83877f7f 00000000 8a403c70 8343a166 nt!Phase1InitializationDiscard+0x15e
8a403c38 8343a166 80d6c150 8f048f88 00000000 nt!Phase1Initialization+0x21
8a403c70 8358d9bd 83877f5e 80d6c150 00000000 nt!PspSystemThreadStartup+0x4a
8a403c7c 00000000 00000000 38573847 38793866 nt!KiThreadStartup+0x15
...

EXCEPTION_RECORD:  8a4038d8 -- (.exr 0xffffffff8a4038d8)
ExceptionAddress: 83c51f4a (hal!HalpErrataApplyPerProcessor+0x00002112)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 0
 
CONTEXT:  8a403480 -- (.cxr 0xffffffff8a403480)
eax=8a403a02 ebx=00000117 ecx=c001102c edx=8a403a37 esi=00000000 edi=00000004
eip=83c51f4a esp=8a403a30 ebp=8a403a38 iopl=0         nv up ei ng nz ac pe cy
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00210297
hal!HalpErrataApplyPerProcessor+0x2112:
83c51f4a 0f32            rdmsr
....

So the guest executes rdmsr with ecx=c001102c (that is, it reads MSR 0xc001102c), which results in a fault.

A similar issue was already reported (and that report provides a fix):

https://bugzilla.redhat.com/show_bug.cgi?id=1593190
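
For anyone repeating this analysis, the faulting MSR can be pulled out of KVM trace text with a small filter. The following is only a sketch: the helper name faulting_msrs is hypothetical, and it assumes the trace lines have the same "msr_read <msr> = <value> (#GP)" shape as the line quoted above. The capture commands in the comments are one way such a trace might be obtained, not necessarily the way it was done here.

```shell
#!/bin/sh
# Hypothetical helper: read KVM trace text on stdin and print each MSR access
# that raised #GP, as "<read|write> <msr-number>", without duplicates.
# Assumes lines shaped like: "msr_read c001102c = 0x (#GP)".
faulting_msrs() {
    sed -nE 's/.*msr_(read|write) ([0-9a-f]+) = .*\(#GP\).*/\1 \2/p' | sort -u
}

# Possible usage on the host (assumption: trace-cmd is installed, run as root):
#   trace-cmd record -e kvm:kvm_msr    # start the guest, stop with Ctrl-C
#   trace-cmd report | faulting_msrs
```

Accesses that completed without a fault are ignored by the filter, so only candidates such as c001102c remain.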

Comment 14 Marek Kedzierski 2020-09-24 13:19:53 UTC
(In reply to Marek Kedzierski from comment #13)

So the fix is to add ignore_msrs=1 to the KVM configuration (it is a parameter of the kvm kernel module).
After that, the machine boots.
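
As a rough illustration of what "adding ignore_msrs=1" could look like on the host, here is a minimal sketch. The helper name kvm_ignore_msrs_line and the file name /etc/modprobe.d/kvm.conf are assumptions, and the commands in the comments need root. Note that this setting makes KVM ignore accesses to unhandled MSRs for every guest on the host, so it is better treated as a workaround than as a substitute for a fixed kernel.

```shell
#!/bin/sh
# Hypothetical helper: print the modprobe.d option line for the kvm module.
# $1 is the desired ignore_msrs value (defaults to 1).
kvm_ignore_msrs_line() {
    printf 'options kvm ignore_msrs=%s\n' "${1:-1}"
}

# Persistent setting, applied the next time the kvm module is loaded
# (assumed file name; run as root):
#   kvm_ignore_msrs_line 1 > /etc/modprobe.d/kvm.conf
#
# Runtime toggle, assuming the parameter is writable on the running kernel
# (run as root):
#   echo 1 > /sys/module/kvm/parameters/ignore_msrs
```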

Comment 15 Dr. David Alan Gilbert 2020-09-24 14:58:09 UTC
(In reply to Marek Kedzierski from comment #14)
> So the fix is to add ignore_msrs=1 to the KVM configuration.
> After that machine boots.

Well, maybe, but we have the fix in kernel-3.10.0-1053.el7 according to:
https://bugzilla.redhat.com/show_bug.cgi?id=1593190#c46

Why are you running with such an old host kernel?

Comment 16 FuXiangChun 2020-09-25 05:21:47 UTC
(In reply to Dr. David Alan Gilbert from comment #15)
> Well, maybe, but we have the fix in kernel-3.10.0-1053.el7 according to:
> https://bugzilla.redhat.com/show_bug.cgi?id=1593190#c46
> 
> Why are you running with such an old host kernel?

I hit this bug on a 7.5 host. I have just updated the host kernel to the latest RHEL 7.5.z (3.10.0-862.52.1.el7.x86_64), and the issue is gone. So the latest RHEL 7.5.z fixes this bug. Thanks.
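
A quick way to check whether a 7.5.z host is at least at the kernel that worked here is a version-sort comparison. This is only a sketch: the helper name kernel_at_least is hypothetical, it relies on GNU sort -V, and the threshold is simply the release reported as working in this comment, not necessarily the first 7.5.z build that carries the fix. The comparison is only meaningful within the same z-stream (3.10.0-862.*).

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $1 is the same as or newer
# than release $2, using version sort (GNU sort -V).
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on the host (threshold taken from this comment):
#   if kernel_at_least "$(uname -r)" 3.10.0-862.52.1.el7.x86_64; then
#       echo "host kernel is at least the release that worked here"
#   fi
```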

Comment 17 Dr. David Alan Gilbert 2020-09-25 08:15:05 UTC
As per comment 16, this was actually fixed a long time ago.