Bug 1401846 - [virtio-win][netkvm] Guest win2008-32&64 occurs BSoD when running job Ethernet - NDISTest 6.0
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: ybendito
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 1401400
 
Reported: 2016-12-06 09:04 UTC by Peixiu Hou
Modified: 2017-06-19 06:10 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-19 06:05:22 UTC
Target Upstream Version:


Attachments
128nic200864 whql job details (5.74 MB, application/vnd.ms-cab-compressed)
2016-12-08 13:37 UTC, Peixiu Hou
129NIC200832 wlk package (1.08 MB, application/vnd.ms-cab-compressed)
2017-02-08 05:53 UTC, Peixiu Hou

Description Peixiu Hou 2016-12-06 09:04:53 UTC
Description of problem:
Win2008 32-bit and 64-bit guests hit a BSoD when running the job Ethernet - NDISTest 6.0 under q35.

Version-Release number of selected component (if applicable):
kernel-3.10.0-524.el7.x86_64
qemu-kvm-rhev-2.6.0-27.el7.x86_64
seabios-1.9.1-5.el7.x86_64.rpm
virtio-win-prewhql-128

How reproducible:
3/3

Steps to Reproduce:
1. Boot a guest:
SUT: /usr/libexec/qemu-kvm -name 128NIC200864CTI -enable-kvm -m 4G -smp 4 -uuid 7ddff42f-b0e5-4ec7-97eb-32bb87cde1d5 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/128NIC200864CTI,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -drive file=128NIC200864CTI,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=128NIC200864CTI.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:7f:4f:47:f2 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -M q35 -device ioh3420,bus=pcie.0,id=root1.0,slot=1 -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet2,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:6d:4f:63:36,mq=on,vectors=10 -monitor stdio
Support: /usr/libexec/qemu-kvm -name 128NIC200864STI -enable-kvm -m 4G -smp 4 -uuid 410688d3-4652-4adc-9770-c164e556f39b -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/128NIC200864STI,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -drive file=128NIC200864STI,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=128NIC200864STI.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:4b:38:71:28 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -vga cirrus -M q35 -device ioh3420,bus=pcie.0,id=root1.0,slot=1 -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:23:20:b2:9f,bus=root1.0,mq=on,vectors=10 -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet2,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:1b:4f:8a:66,mq=on,vectors=10

2. Run the job "Ethernet - NDISTest 6.0".
3. Check the guest status.
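
The two command lines in step 1 are long, and the virtio-net-pci devices are not attached the same way on each side. A minimal Python sketch for pulling out the `-device` options so they can be compared is shown below. The helper name `device_options` is hypothetical, and the two strings are shortened excerpts of the SUT and Support command lines above, not the full invocations.

```python
import shlex

def device_options(cmdline, driver):
    """Return one dict of options per '-device <driver>,...' argument."""
    args = shlex.split(cmdline)
    devices = []
    for flag, value in zip(args, args[1:]):
        if flag != "-device":
            continue
        name, _, rest = value.partition(",")
        if name != driver:
            continue
        # "key=value" items become dict entries; [::2] drops the "=" separator.
        devices.append(dict(item.partition("=")[::2] for item in rest.split(",") if item))
    return devices

# Excerpts (machine type and -device arguments only) from the report.
sut = ("qemu-kvm -M q35 -device ioh3420,bus=pcie.0,id=root1.0,slot=1 "
       "-device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:6d:4f:63:36,mq=on,vectors=10")
support = ("qemu-kvm -M q35 -device ioh3420,bus=pcie.0,id=root1.0,slot=1 "
           "-device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:23:20:b2:9f,bus=root1.0,mq=on,vectors=10 "
           "-device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:1b:4f:8a:66,mq=on,vectors=10")

for label, cmd in (("SUT", sut), ("Support", support)):
    for dev in device_options(cmd, "virtio-net-pci"):
        print(label, dev["id"], "bus =", dev.get("bus", "(not specified)"))
```

In these excerpts only the Support machine's net1 names `bus=root1.0` explicitly; the other virtio-net-pci devices give no `bus` option.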

Actual results:
BSOD

Expected results:
Pass

Additional info:
Memory Dump file location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/virtio-win/NDISTest-6.0/

Comment 4 Peixiu Hou 2016-12-07 03:11:46 UTC
The debug info from the memory dump file is as follows:

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffff80000003, The exception code that was not handled
Arg2: fffffa60066767cb, The address that the exception occurred at
Arg3: fffffa6003b4da38, Exception Record Address
Arg4: fffffa6003b4d410, Context Record Address

Debugging Details:
------------------


EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid

FAULTING_IP: 
NDProt60+747cb
fffffa60`066767cb cc              int     3

EXCEPTION_RECORD:  fffffa6003b4da38 -- (.exr 0xfffffa6003b4da38)
ExceptionAddress: fffffa60066767cb (NDProt60+0x00000000000747cb)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 0000000000000000

CONTEXT:  fffffa6003b4d410 -- (.cxr 0xfffffa6003b4d410;r)
rax=fffffa8005fcead0 rbx=0000000000000000 rcx=0000000000000001
rdx=0000000000000000 rsi=fffffa8006825078 rdi=fffffa80049463d0
rip=fffffa60066767cb rsp=fffffa6003b4dc70 rbp=0000000000000080
 r8=ffffffffffffffff  r9=8101010101010100 r10=810101010100e0e0
r11=fffffa8006c73030 r12=fffffa600668a470 r13=0000000000000000
r14=fffffa80049463d0 r15=fffffa60017dbcc0
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282
NDProt60+0x747cb:
fffffa60`066767cb cc              int     3
Last set context:
rax=fffffa8005fcead0 rbx=0000000000000000 rcx=0000000000000001
rdx=0000000000000000 rsi=fffffa8006825078 rdi=fffffa80049463d0
rip=fffffa60066767cb rsp=fffffa6003b4dc70 rbp=0000000000000080
 r8=ffffffffffffffff  r9=8101010101010100 r10=810101010100e0e0
r11=fffffa8006c73030 r12=fffffa600668a470 r13=0000000000000000
r14=fffffa80049463d0 r15=fffffa60017dbcc0
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282
NDProt60+0x747cb:
fffffa60`066767cb cc              int     3
Resetting default scope

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0x7E

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.

EXCEPTION_PARAMETER1:  0000000000000000

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

LAST_CONTROL_TRANSFER:  from fffffa600668a4c3 to fffffa60066767cb

STACK_TEXT:  
fffffa60`03b4dc70 fffffa60`0668a4c3 : fffffa80`06825048 fffffa60`0101de10 00000000`00000000 fffffa60`019acf80 : NDProt60+0x747cb
fffffa60`03b4dd00 fffff800`01878f37 : fffffa80`06825078 00000000`00010286 fffffa60`03b4dd78 00000000`00000001 : NDProt60+0x884c3
fffffa60`03b4dd50 fffff800`016ab616 : fffffa60`017d8180 fffffa80`069b0bb0 fffffa60`017e1d40 fffffa60`017d87f0 : nt!PspSystemThreadStartup+0x57
fffffa60`03b4dd80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16


FOLLOWUP_IP: 
NDProt60+747cb
fffffa60`066767cb cc              int     3

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  NDProt60+747cb

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: NDProt60

IMAGE_NAME:  NDProt60.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4d4c09c8

STACK_COMMAND:  .cxr 0xfffffa6003b4d410 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_VRF_NDProt60+747cb

BUCKET_ID:  X64_0x7E_VRF_NDProt60+747cb

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:x64_0x7e_vrf_ndprot60+747cb

FAILURE_ID_HASH:  {10996e7b-e4c9-3684-dd80-850743c43426}

Followup: MachineOwner


Best Regards~
Peixiu Hou

Comment 5 ybendito 2016-12-07 18:30:20 UTC
The debug extension ndtkd.dll does not work with the test driver ndprot60.

According to the disassembly, the most probable message from the test driver was
"This could possibly be a bug in the tester! Please forward this to ndistd alias."
pCurrentNetBufferList
[testsrc\nettest\ndis\ndistest\commengine\legacy\simplesendcommmanager.cpp @ 896]
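
The numbers in the comment 4 dump are consistent with a deliberate breakpoint inside the test driver rather than a memory fault. The following Python sketch cross-checks them; all values are copied from the dump, and the variable and function names are only illustrative.

```python
STATUS_BREAKPOINT = 0x80000003  # NTSTATUS shown as "Breakpoint" in the ERROR_CODE line
INT3_OPCODE = 0xCC              # single-byte x86 "int 3" instruction

def exception_code(arg1):
    """Arg1 of bugcheck 0x7E is the exception code sign-extended to 64 bits."""
    return arg1 & 0xFFFFFFFF

arg1 = 0xFFFFFFFF80000003            # Arg1 from the dump
exception_address = 0xFFFFFA60066767CB  # Arg2, reported as NDProt60+0x747cb
return_address = 0xFFFFFA600668A4C3     # from LAST_CONTROL_TRANSFER
faulting_byte = bytes.fromhex("cc")[0]  # byte disassembled at the exception address

# Module base implied by "NDProt60+0x747cb".
base = exception_address - 0x747CB

print(hex(exception_code(arg1)))   # 0x80000003
print(faulting_byte == INT3_OPCODE)  # True
print(hex(base))                   # 0xfffffa6006602000
print(hex(return_address - base))  # 0x884c3, matching the second stack frame
```

Both stack frames resolve into NDProt60.sys, which fits the reading that the test driver hit its own assertion.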

Comment 6 ybendito 2016-12-07 18:31:42 UTC
Is it possible to try with '-M pc'?

Comment 7 Yvugenfi@redhat.com 2016-12-08 09:38:48 UTC
Also, is it possible to narrow down the exact test that failed in NDISTest 6.0?

Comment 8 Peixiu Hou 2016-12-08 13:37:14 UTC
Created attachment 1229480
128nic200864 whql job details

Hi,

I tried this case with '-M pc'. It passes normally and no BSOD occurs.
Build version: virtio-win-prewhql-128

Boot cli:
/usr/libexec/qemu-kvm -name 128NIC200864CMW -enable-kvm -m 4G -smp 4 -uuid fe90161f-31bf-40b5-97fa-5891c0ab8ee2 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/128NIC200864CMW,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -drive file=128NIC200864CMW,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=128NIC200864CMW.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:75:79:c0:01 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -M pc -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:61:25:e6:6d,bus=pci.0,mq=on,vectors=10 -netdev tap,script=/etc/qemu-ifup1,downscript=no,id=hostnet2,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:23:59:24:5a,bus=pci.0,mq=on,vectors=10

For the job run details, please refer to the attachment.

Best Regards~
Peixiu Hou

Comment 9 ybendito 2016-12-20 15:06:12 UTC
1. Please provide data regarding the host CPU.
2. Please do a reference run with -M q35 and prewhql-126.

Comment 10 ybendito 2016-12-23 14:00:50 UTC
Please also do a reference run with -M q35 and 'disable-modern=on' on build 128.

Comment 11 Peixiu Hou 2016-12-27 09:47:00 UTC
Hi,

1. Host cpu info:
-------------------------------------------------------------------------------
processor	: 15
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
stepping	: 5
microcode	: 0x19
cpu MHz		: 2260.934
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4521.29
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:
-------------------------------------------------------------------------------

2. Ran the job with -M q35 and prewhql-126: it passed.
   Ran the job with -M q35 and 'disable-modern=on' on build 128: a BSOD also occurred, but the bugcheck code changed to 0x50, and after the guest rebooted, the job passed.

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: e88188ba, memory referenced.
Arg2: 00000001, value 0 = read operation, 1 = write operation.
Arg3: 8188d263, If non-zero, the instruction address which referenced the bad memory
	address.
Arg4: 00000002, (reserved)

Debugging Details:
------------------


WRITE_ADDRESS:  e88188ba 

FAULTING_IP: 
nt!memset+23
8188d263 8807            mov     byte ptr [edi],al

MM_INTERNAL_CODE:  2

DEFAULT_BUCKET_ID:  CODE_CORRUPTION

BUGCHECK_STR:  0x50

PROCESS_NAME:  System

CURRENT_IRQL:  0

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

TRAP_FRAME:  8dbe3568 -- (.trap 0xffffffff8dbe3568)
ErrCode = 00000002
eax=00000000 ebx=857cd0e8 ecx=00000002 edx=0000008a esi=e88188ba edi=e88188ba
eip=8188d263 esp=8dbe35dc ebp=8dbe36ad iopl=0         nv up ei pl nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00210202
nt!memset+0x23:
8188d263 8807            mov     byte ptr [edi],al          ds:0023:e88188ba=??
Resetting default scope

LAST_CONTROL_TRANSFER:  from 81894db4 to 818df36d

STACK_TEXT:  
8dbe3550 81894db4 00000001 e88188ba 00000000 nt!MmAccessFault+0x10a
8dbe3550 8188d263 00000001 e88188ba 00000000 nt!KiTrap0E+0xdc
8dbe35dc 806d9873 e88188ba 00000000 0000008c nt!memset+0x23
8dbe36ad 18806e18 038dbe37 e88188ba a0857cd0 NDIS!ndisQueryDeviceOid+0x1f
WARNING: Frame IP not in any known module. Following frames may be wrong.
8dbe36c5 8b000000 00000000 3f000000 81000f00 0x18806e18
8dbe36c9 00000000 3f000000 81000f00 00002801 0x8b000000


STACK_COMMAND:  kb

CHKIMG_EXTENSION: !chkimg -lo 50 -d !NDIS
    806d9848-806d984d  6 bytes - NDIS!ndisDummyIrpHandler+bc
	[ 5f 5e 5b 5d c2 08:f0 00 05 00 00 00 ]
    806d984f-806d9859  11 bytes - NDIS!ndisDummyIrpHandler+c3 (+0x07)
	[ 90 90 90 90 90 8b ff 55:0a 00 00 00 00 24 00 00 ]
17 errors : !NDIS (806d9848-806d9859)

MODULE_NAME: memory_corruption

IMAGE_NAME:  memory_corruption

FOLLOWUP_NAME:  memory_corruption

DEBUG_FLR_IMAGE_TIMESTAMP:  0

MEMORY_CORRUPTOR:  LARGE

FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE

BUCKET_ID:  MEMORY_CORRUPTION_LARGE

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:memory_corruption_large

FAILURE_ID_HASH:  {e29154ac-69a4-0eb8-172a-a860f73c0a3c}

Followup: memory_corruption
---------

Best Regards~
Peixiu Hou
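
The `!chkimg` section of the 0x50 dump reports two modified byte ranges inside NDIS. As a quick consistency check, the Python sketch below parses those range lines and confirms that the range lengths add up to the "17 errors" summary. The text is copied from the output above; the regular expression is an assumption based on these few lines only, not a general parser for `!chkimg` output.

```python
import re

CHKIMG_OUTPUT = """\
    806d9848-806d984d  6 bytes - NDIS!ndisDummyIrpHandler+bc
    806d984f-806d9859  11 bytes - NDIS!ndisDummyIrpHandler+c3 (+0x07)
17 errors : !NDIS (806d9848-806d9859)
"""

# start-end, byte count, and the symbol the range falls in.
RANGE_RE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+)\s+(\d+) bytes? - (\S+)", re.M)

def corrupted_ranges(text):
    """Return (start, length, symbol) for each modified range, validating the lengths."""
    ranges = []
    for start, end, count, symbol in RANGE_RE.findall(text):
        start, end, count = int(start, 16), int(end, 16), int(count)
        if end - start + 1 != count:
            raise ValueError(f"range {start:x}-{end:x} does not span {count} bytes")
        ranges.append((start, count, symbol))
    return ranges

ranges = corrupted_ranges(CHKIMG_OUTPUT)
total = sum(length for _, length, _ in ranges)
print(total)  # 17
```

The return address in the NDIS!ndisQueryDeviceOid frame (806d9873) lies 0x1a bytes past the end of the modified region.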

Comment 18 Peixiu Hou 2017-02-08 05:53:04 UTC
Created attachment 1248551
129NIC200832 wlk package

Comment 21 Peixiu Hou 2017-02-09 08:59:36 UTC
Hi,

I tried to filter the job's failure, but it cannot be filtered. I also updated to the latest filter package for WLK and ran the job again; the failure still cannot be filtered.


Thanks~
Peixiu

Comment 22 xiagao 2017-03-22 08:39:58 UTC
(In reply to Peixiu Hou from comment #4)
> The memory dump file debug info as following:

Hit the same BSOD in 2008-32 guest with virtio-win-prewhql-134

Comment 23 xiagao 2017-03-23 02:34:15 UTC
(In reply to xiagao from comment #22)


> Hit the same BSOD in 2008-32 guest with virtio-win-prewhql-134

Also hit BSOD in 2008-64 guest.

Besides, I ran it several more times in the 2008-32 guest and hit a different error.
Reported error message:
"1c_wmicoverage failed; Unable to query error = 0x80041008; Unable to obtain class name from guid {234E1FBF-37DC-4882-B01E-18F47CC0A40E}"

That 1c_wmicoverage failure is NOT filtered out by the latest filter package for WLK.

Comment 24 ybendito 2017-03-28 09:44:15 UTC
Please retry with 0.1-135 and -M pc, and provide the dump file if the problem happens.
Please retry with 0.1-135 and q35, and also provide the dump file if the problem happens. For each retry, please use a dedicated comment and provide the exact qemu command line and the qemu/SeaBIOS versions.

 

Regarding the 1c_wmicoverage problem: please ignore it in this BZ. If it can't be filtered out, please open a dedicated BZ for it and we'll investigate.

Comment 25 ybendito 2017-05-04 16:18:42 UTC
Note that the problem with CDROM on q35 was fixed in qemu-kvm-rhev-2.8.0-5.el7.
Please retry with virtio-win-prewhql-136 with pc and with q35.
For each retry, please use a dedicated comment and provide the exact qemu command line and the qemu/SeaBIOS versions. In case a BSOD happens:
- please note on which side of the setup the BSOD happened
- please note which exact test caused the BSOD (it is possible that, in case of a BSOD, the log file will not be collected by WLK); set both machines to the 'Unsafe' state and locate all the log files of the test on the client machines under c:\wlk\JobsWorkingDir
- please collect and share the dump file

Comment 26 ybendito 2017-05-05 07:35:23 UTC
Note: from the analysis, this is very similar to the existing manual errata 3162.

For example, it certainly applies to the following cases:
NDISTest 6.0 - 2c_Mini6RSSSendRecv failed with BSOD on Win7 32/64 (covers all OS's)
NDISTest 6.0 - 1c_Mini6Send failed on win8-32 guest with BSOD

For the final decision on whether we need to open a new support request or use the existing errata, it would be good to know the exact test where it happens.

Comment 27 Peixiu Hou 2017-05-06 00:40:23 UTC
Hi,

I tried this issue with virtio-win-prewhql-136:

1. With -M pc, tried 3 times: hit BSOD 1 time (occurred on the Client machine) and passed 2 times.
2. With -M q35, tried 2 times: BSOD 2 times.
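
For comparison with the later runs, the counts above can be expressed as BSOD rates per machine type. This is a small Python sketch using only the numbers reported in this comment for virtio-win-prewhql-136 (the second item refers to the q35 machine type); the `Run` record is a hypothetical helper.

```python
from collections import namedtuple

Run = namedtuple("Run", "machine attempts bsods")

# Counts as reported in this comment.
runs = [Run("pc", 3, 1), Run("q35", 2, 2)]

def bsod_rate(run):
    """Fraction of attempts that ended in a BSOD."""
    return run.bsods / run.attempts

for run in runs:
    print(f"-M {run.machine}: {run.bsods}/{run.attempts} runs hit BSOD ({bsod_rate(run):.0%})")
```

With so few attempts these rates only indicate that the failure is not specific to q35; they say nothing reliable about relative frequency.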

All dump files (for pc and q35) and the JobsWorkingDir log files have been uploaded to the following location:
http://fileshare.englab.nay.redhat.com/pub/section2/coredump/var/crash/bug1401846/

Used version:
kernel-3.10.0-657.el7.x86_64
qemu-kvm-rhev-2.9.0-1.el7.x86_64
seabios-1.10.2-2.el7.x86_64

qemu command line:
1. With pc: 
/usr/libexec/qemu-kvm -name 136NIC200832CVP -enable-kvm -m 4G -smp 4 -uuid 176f5b48-d65d-4920-b9ac-a8d137760fa4 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/136NIC200832CVP,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -drive file=136NIC200832CVP,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x86_dvd_342333.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=136NIC200832CVP.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:00:7e:bb:f9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga std -M pc -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:4f:4e:94:1f,bus=pci.0,mq=on,vectors=10 -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet2,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:27:5b:80:65,bus=pci.0,mq=on,vectors=10 -monitor stdio

2. With q35: /usr/libexec/qemu-kvm -name 136NIC200832C7C -enable-kvm -m 4G -smp 4 -uuid 8424c9f9-b4af-4061-932c-7c0bfcbb1dbf -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/136NIC200832C7C,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb -drive file=136nic200832-q35.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x86_dvd_342333.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=136NIC200832C7C.vfd,if=floppy,id=drive-fdc0-0-0,format=raw,cache=none -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:49:7f:6d:84 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:3 -vga std -M q35 -device ioh3420,bus=pcie.0,id=root1.0,slot=1 -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:6c:06:78:d3,bus=root1.0,mq=on,vectors=10 -monitor stdio -boot order=cd,menu=on

BTW, which path holds the new fixed netkvm file in virtio-win-prewhql-136 or later? There are 3 paths for the netkvm file: virtio-win-prewhql-0.1-136\Wlh, virtio-win-prewhql-0.1-136\Wnet, and virtio-win-prewhql-0.1-136\Wxp. All of the results above used the netkvm file in the \Wlh path. Please help to confirm, thanks a lot~


Best Regards~
Peixiu

Comment 28 ybendito 2017-05-06 21:03:28 UTC
From the dump analysis (note that the 2008 ndprot60 drivers do not seem to be supported by ndtkd.dll, so it is hard to obtain exact information about the assertion/breakpoint):
According to the disassembly, this is
"d:\6229t\testsrc\nettest\ndis\ndistest\commengine\legacy\simplesendcommmanager.cpp", line 896, assert "pCurrentNetBufferList", i.e. it seems very similar to manual errata 3162.
Unfortunately, the JobsWorkingDirectory does not contain any logs, so we can't even locate the test that caused the BSOD.

Comment 29 ybendito 2017-05-08 11:16:12 UTC
(In reply to Peixiu Hou from comment #27)
> BTW, which path is the new fixed netkvm file on virtio-win-prewhql-136 or
> later? there are 3 path for netkvm file, like
> virtio-win-prewhql-0.1-136\Wlh, virtio-win-prewhql-0.1-136\Wnet,
> virtio-win-prewhql-0.1-136\Wxp, all upper results are used netkvm file in
> \Wlh path. please help to confirm, thanks a lot~
> 
> 
> Best Regards~
> Peixiu

Yes, the 'wlh' driver is the correct one for 2008.

Comment 30 ybendito 2017-05-08 11:18:09 UTC
Is it possible to know which exact NDIS6.0 test caused the BSOD?

Comment 31 ybendito 2017-05-09 12:31:54 UTC
1. Disable automatic reboot on BSOD on the clients.
2. Start the test from WLK Studio.
3. When the BSOD happens (the machine is not restarted automatically), reboot the machine from qemu.
4. When the machine has restarted, wait until the message 'WTT is restarting the computer' appears, then cancel the reboot with 'shutdown /a'.
5. Keep the c:\wlk\JobsWorkingDir directory (the JobRuns and Tasks directories must not be empty!), zip it, and attach it to the BZ. I hope these logs will show which test caused the BSOD.
6. Reboot the machine with 'shutdown -r'. Upon the next reboot, WLK removes the logs.
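
Step 5 can be scripted so that an empty log directory is caught before the final reboot removes everything. The following is a minimal Python sketch, assuming Python is available on the client machine; the function name and the archive name in the usage comment are hypothetical, while the directory names come from the steps above.

```python
import shutil
from pathlib import Path

def archive_wlk_logs(jobs_dir, archive_base):
    """Zip the WLK working directory after checking that the log folders are populated."""
    jobs_dir = Path(jobs_dir)
    for name in ("JobRuns", "Tasks"):
        sub = jobs_dir / name
        if not sub.is_dir() or not any(sub.iterdir()):
            raise RuntimeError(f"{sub} is missing or empty; the logs may already have been removed")
    # make_archive appends ".zip" to archive_base and returns the full archive path.
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(jobs_dir))

# Example call on the client (archive name is a placeholder):
# archive_wlk_logs(r"c:\wlk\JobsWorkingDir", r"c:\bug1401846-JobsWorkingDir")
```

Running this between steps 4 and 6 gives an early failure if JobRuns or Tasks is empty, instead of discovering that after the upload.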

Comment 32 Peixiu Hou 2017-05-10 07:06:11 UTC
Hi,

Thanks a lot for your advice. The log files (including the JobRuns and Tasks directories) can be downloaded from this location:

http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/virtio-win/bug1401846/

Tried with -M pc on win2008-32: it passed.
Tried with -M pc on win2008-64: hit BSOD (2/2).
Tried with -M q35 on win2008-32: hit BSOD (1/1).
According to the NDISTest log file, the last sub-task before the BSOD occurred was 1c_Mini6Send.

Used version:
qemu-kvm-rhev-2.9.0-3.el7.x86_64
kernel-3.10.0-664.el7.x86_64
virtio-win-prewhql-136

Best Regards~
Peixiu

Comment 33 ybendito 2017-05-12 08:30:16 UTC
Agreed, the failed test is 1c_Mini6Send in both cases; the manual errata is 3162.

