Bug 1972487

Summary: [virtio-win][whql test] Win10-64 & Win2019 & Win2022(vhost=on) hit BSOD when run '2c_Mini6RSSSendRecv (Multi-Group Win8+)'
Product: Red Hat Enterprise Linux 9 Reporter: leidwang <leidwang>
Component: virtio-win Assignee: ybendito
virtio-win sub component: virtio-win-prewhql QA Contact: Ke Ma <mama>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: ailan, lijin, mdean, qizhu, ybendito, yvugenfi
Version: 9.0 Keywords: Triaged
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: 9.0   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-17 15:35:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1968315, 2057757    
Attachments:
Description Flags
screenshot none
Build 21050 for test none

Description leidwang@redhat.com 2021-06-16 03:51:24 UTC
Description of problem:
Win2019 hits a BSOD when running '2c_Mini6RSSSendRecv (Multi-Group Win8+)'.

Version-Release number of selected component (if applicable):
kernel-4.18.0-310.el8.x86_64
qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.x86_64
virtio-win-prewhql-199

How reproducible:
100%

Steps to Reproduce:
1. Boot up the ws2019 guests:
/usr/libexec/qemu-kvm -name 199NIC196435CUD -enable-kvm -m 6G -smp 8 -uuid bbee5854-bdf1-421f-a6e3-95d28b84ed18 -nodefaults -cpu Skylake-Server,hv_stimer,hv_synic,hv_time,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi -chardev socket,id=charmonitor,path=/tmp/199NIC196435CUD,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=199NIC196435CUD,node-name=my_file -blockdev driver=raw,node-name=my,file=my_file -device ide-hd,drive=my,id=ide0-0-0,bus=ide.0,unit=0 -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/kvm_autotest_root/iso/ISO/Win2019/en_windows_server_2019_updated_may_2020_x64_dvd_5651846f.iso,node-name=my_cd,read-only=on -blockdev driver=raw,node-name=mycd,file=my_cd,read-only=on -device ide-cd,drive=mycd,id=ide0-1-0,bus=ide.1,unit=0 -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=199NIC196435CUD.iso,node-name=my_iso,read-only=on -blockdev driver=raw,node-name=myiso,file=my_iso,read-only=on -device ide-cd,drive=myiso,id=ide0-1-1 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -M q35 -device pcie-root-port,bus=pcie.0,id=root1.0,multifunction=on,port=0x10,chassis=1,addr=0x7 -device pcie-root-port,bus=pcie.0,id=root2.0,port=0x11,chassis=2,addr=0x7.0x1 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000e,bus=root1.0,netdev=hostnet0,id=net0,mac=00:52:19:66:ec:1e -vga std -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=8 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:3c:36:5f:4d,bus=root2.0,mq=on,vectors=18

/usr/libexec/qemu-kvm -name 199NIC196435SUD -enable-kvm -m 6G -smp 8 -uuid 4eab8737-a46f-435e-a124-14f2eebfc81e -nodefaults -cpu Skylake-Server,hv_stimer,hv_synic,hv_time,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi -chardev socket,id=charmonitor,path=/tmp/199NIC196435SUD,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=199NIC196435SUD,node-name=my_file -blockdev driver=raw,node-name=my,file=my_file -device ide-hd,drive=my,id=ide0-0-0,bus=ide.0,unit=0 -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/kvm_autotest_root/iso/ISO/Win2019/en_windows_server_2019_updated_may_2020_x64_dvd_5651846f.iso,node-name=my_cd,read-only=on -blockdev driver=raw,node-name=mycd,file=my_cd,read-only=on -device ide-cd,drive=mycd,id=ide0-1-0,bus=ide.1,unit=0 -blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=199NIC196435SUD.iso,node-name=my_iso,read-only=on -blockdev driver=raw,node-name=myiso,file=my_iso,read-only=on -device ide-cd,drive=myiso,id=ide0-1-1 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -M q35 -device pcie-root-port,bus=pcie.0,id=root1.0,multifunction=on,port=0x10,chassis=1,addr=0x7 -device pcie-root-port,bus=pcie.0,id=root2.0,port=0x11,chassis=2,addr=0x7.0x1 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000e,bus=root1.0,netdev=hostnet0,id=net0,mac=00:52:3b:30:b1:84 -vga std -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=8 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:36:73:d0:ab,bus=root2.0,mq=on,vectors=18
2. Submit '2c_Mini6RSSSendRecv (Multi-Group Win8+)' to hlk-1809.
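
For reference, the virtio-net device in the command lines above pairs queues=8 on the tap netdev with vectors=18 on the device. That matches the commonly documented multiqueue sizing of 2*N+2 MSI-X vectors (N RX queues, N TX queues, plus configuration and control). Below is a minimal Python sketch of that consistency check. The helper names `parse_opts` and `expected_vectors` are hypothetical, and the option strings are copied from the first command line in step 1.

```python
def parse_opts(arg):
    """Split a QEMU 'type,key=value,...' option string into (type, dict)."""
    head, *rest = arg.split(",")
    opts = {}
    for item in rest:
        key, _, value = item.partition("=")
        opts[key] = value
    return head, opts


def expected_vectors(queues):
    # Assumption: 2*N + 2 sizing (N RX + N TX + config + control).
    return 2 * queues + 2


# Option strings taken from the reproducer command line.
netdev = ("tap,script=/etc/qemu-ifup-private,downscript=no,"
          "id=hostnet1,vhost=on,queues=8")
device = ("virtio-net-pci,netdev=hostnet1,id=net1,"
          "mac=00:52:3c:36:5f:4d,bus=root2.0,mq=on,vectors=18")

_, netdev_opts = parse_opts(netdev)
_, device_opts = parse_opts(device)
queues = int(netdev_opts["queues"])
vectors = int(device_opts["vectors"])

# Report whether the configured vector count matches the assumed sizing rule.
print(queues, vectors, vectors == expected_vectors(queues))
```

Under that assumption the reproducer's queue and vector counts are consistent with each other, so the sizing itself is unlikely to be the source of the problem.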

Actual results:
Guest (199NIC196435SUD) hits a BSOD.

Expected results:
Job passed
Additional info:

Comment 1 leidwang@redhat.com 2021-06-16 03:53:18 UTC
Created attachment 1791437 [details]
screenshot

Comment 3 leidwang@redhat.com 2021-06-17 05:33:37 UTC
Win10-64 also hits a BSOD when running "Run RSC Tests". The stop code is the same as in comment 1.

Comment 5 leidwang@redhat.com 2021-06-17 06:02:44 UTC
(In reply to leidwang from comment #3)
> Win10-64 also hit BSOD when run "Run RSC Tests". Stop code same as comment1.

Sorry, Win10-64 hits the BSOD when running "2c_Mini6RSSSendRecv (Multi-Group Win8+)". The stop code is the same as in comment 1.

Comment 6 leidwang@redhat.com 2021-06-18 03:49:12 UTC
Win2022 (vhost=on) hits a BSOD when running "2c_Mini6RSSSendRecv (Multi-Group Win8+)". The stop code is the same as in comment 1.

Comment 31 Yvugenfi@redhat.com 2021-07-29 05:56:30 UTC
The fix was merged upstream.

Comment 32 leidwang@redhat.com 2021-08-05 03:11:37 UTC
Tested this job (2c_Mini6RSSSendRecv (Multi-Group Win8+)) with virtio-win-prewhql-0.1-206 (win2022, win2019),
but it still hits a BSOD. The stop code is the same as before.

Can we move this BZ to 8.6? This BZ will not block our testing.

Comment 33 Yvugenfi@redhat.com 2021-08-05 08:34:31 UTC
(In reply to leidwang from comment #32)
> Test this job(2c_Mini6RSSSendRecv (Multi-Group Win8+)) with
> virtio-win-prewhql-0.1-206(win2022,win2019), 
> but still hit BSOD. Stop code is the same as before.
> 
> Can we move this BZ to 8.6? This BZ will not block our testing.

If it is not a blocker, let's move it to RHEL8.6

Comment 34 Yvugenfi@redhat.com 2021-08-05 08:34:58 UTC
Please upload the new dump as well. Thanks.

Comment 35 Yvugenfi@redhat.com 2021-08-05 08:48:20 UTC
(In reply to leidwang from comment #32)
> Test this job(2c_Mini6RSSSendRecv (Multi-Group Win8+)) with
> virtio-win-prewhql-0.1-206(win2022,win2019), 
> but still hit BSOD. Stop code is the same as before.
> 
> Can we move this BZ to 8.6? This BZ will not block our testing.

Moving to RHEL8.6 based on comment#32

Comment 37 yimsong 2021-09-02 06:55:23 UTC
Hit this issue when running the job "2c_Mini6RSSSendRecv (Multi-Group Win8+)" in Win11 WHQL testing; the stop code is the same as in comment 1.
pkg:
qemu-kvm-6.0.0-28.module+el8.5.0+12271+fffa967b.x86_64
kernel-4.18.0-330.el8.x86_64
seabios-1.14.0-1.module+el8.4.0+8855+a9e237a9.x86_64
virtio-win-prewhql-0.1-207.iso

Comment 39 ybendito 2021-09-23 15:15:09 UTC
Please run this specific test ('2c_Mini6RSSSendRecv (Multi-Group Win8+)') at least 5 times, unless it crashes earlier:
1. on virtio-win build 162
2. on virtio-win build 171
Please use Server2022 release (build 20348) with respective HCK (do not use the evaluation build 20344).
In case of crashes please share the dump files.
Thanks in advance.
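
The repetition requested here (run the job up to five times, stopping at the first crash) can be sketched as a small loop. This is only an illustration under stated assumptions: `run_job` is a hypothetical callable standing in for however the HLK job is actually submitted, and it is assumed to return True when the guest crashed.

```python
def run_until_crash(run_job, attempts=5):
    """Call run_job up to `attempts` times.

    Returns the 1-based number of the first run that crashed,
    or None if every run completed without a crash.
    """
    for attempt in range(1, attempts + 1):
        if run_job():  # hypothetical: True means the guest hit a BSOD
            return attempt
    return None


# Simulated outcomes for illustration only: the fourth run crashes.
outcomes = iter([False, False, False, True, False])
first_crash = run_until_crash(lambda: next(outcomes))
print(first_crash)
```

Recording the attempt number alongside each memory dump makes it easier to compare results between builds 162 and 171.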

Comment 40 ybendito 2021-10-04 18:15:40 UTC
Created attachment 1829162 [details]
Build 21050 for test

Comment 41 ybendito 2021-10-04 18:18:40 UTC
Please try the fix candidate build 21050 (comment #40) at least 5 times.
In case of crash please collect the memory dumps.

Comment 58 Yvugenfi@redhat.com 2022-01-18 13:23:17 UTC
Merged upstream: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/659

Waiting for downstream build.

Comment 61 leidwang@redhat.com 2022-02-26 01:36:01 UTC
Tested the job with build 216 on Win2022, Win2019, and Win10-64.

1. No BSOD on Win2022.

2. Hit a BSOD on the fourth test on Win2019.
Stop code: SYSTEM THREAD EXCEPTION NOT HANDLED

3. Hit a BSOD on the fourth test on Win10-64.
Stop code: SYSTEM THREAD EXCEPTION NOT HANDLED

Hi Yan, I will upload the dump files later. Could you please help check whether the crash is related to the RSS test? Thanks!

Comment 64 leidwang@redhat.com 2022-02-26 01:49:22 UTC
(In reply to leidwang from comment #61)
> Tested the job with 216 build on Win2022,Win2019,Win10-64.
> 
> 1.Not hit BSOD on Win2022
> 
> 2.Hit BSOD at the fourth test on Win2019
> stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> 
> 2.Hit BSOD at the fourth test on Win10-64
Sorry, hit the BSOD at the first test on Win10-64.
> stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> 
> Hi Yan,I will upload dump files later,could you please help to check if the
> crash is related to the RSS test?Thanks!

Comment 66 ybendito 2022-02-27 23:11:38 UTC
(In reply to leidwang from comment #64)
> (In reply to leidwang from comment #61)
> > Tested the job with 216 build on Win2022,Win2019,Win10-64.
> > 
> > 1.Not hit BSOD on Win2022
> > 
> > 2.Hit BSOD at the fourth test on Win2019
> > stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> > 
> > 2.Hit BSOD at the fourth test on Win10-64
> Sorry,hit BSOD at the first test on Win10-64
> > stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> > 
> > Hi Yan,I will upload dump files later,could you please help to check if the
> > crash is related to the RSS test?Thanks!

Hi Leidwang,
Can you please provide the exact qemu and kernel versions you used when reproducing the problem?
In both dumps the problem is different from the initial one, and we want to try reproducing it.
Thanks

Comment 67 leidwang@redhat.com 2022-02-28 00:57:49 UTC
(In reply to ybendito from comment #66)
> (In reply to leidwang from comment #64)
> > (In reply to leidwang from comment #61)
> > > Tested the job with 216 build on Win2022,Win2019,Win10-64.
> > > 
> > > 1.Not hit BSOD on Win2022
> > > 
> > > 2.Hit BSOD at the fourth test on Win2019
> > > stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> > > 
> > > 2.Hit BSOD at the fourth test on Win10-64
> > Sorry,hit BSOD at the first test on Win10-64
> > > stop code:SYSTEM THREAD EXCEPTION NOT HANDLED
> > > 
> > > Hi Yan,I will upload dump files later,could you please help to check if the
> > > crash is related to the RSS test?Thanks!
> 
> Hi Leidwang,
> Can you please provide exact qemu and kernel versions when you reproduce the
> problem.
> In both dumps the problem is different that initial one and we want to try
> reproducing it.
> Thanks

qemu: qemu-kvm-6.2.0-7.el9.x86_64
kernel: kernel-5.14.0-57.kpq0.el9.x86_64

Comment 68 leidwang@redhat.com 2022-03-01 10:20:49 UTC
Based on our discussions at the Windows team meeting, the BSOD we encountered this time was a completely different problem.
So we'll close this bz and open a new bug to track new issues. Thanks!

Comment 69 Yvugenfi@redhat.com 2022-03-08 09:42:01 UTC
(In reply to leidwang from comment #68)
> Based on our discussions at the Windows team meeting, the BSOD we
> encountered this time was a completely different problem.
> So we'll close this bz and open a new bug to track new issues. Thanks!

Do you have a new BZ number?

Comment 70 leidwang@redhat.com 2022-03-08 10:24:03 UTC
(In reply to Yvugenfi from comment #69)
> (In reply to leidwang from comment #68)
> > Based on our discussions at the Windows team meeting, the BSOD we
> > encountered this time was a completely different problem.
> > So we'll close this bz and open a new bug to track new issues. Thanks!
> 
> Do you have a new BZ number?

https://bugzilla.redhat.com/show_bug.cgi?id=2059569

Comment 72 ybendito 2022-04-19 08:25:32 UTC
Reopened as the fix was reverted. Under investigation again.

Comment 74 Yvugenfi@redhat.com 2022-05-16 07:27:10 UTC
*** Bug 2059569 has been marked as a duplicate of this bug. ***

Comment 76 errata-xmlrpc 2022-05-17 15:35:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: virtio-win), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:3890