Bug 1147203

Summary: [virtio-win][whql][netkvm]win2k8-64 bsod(7e) when run job "Ethernet - NDISTest 6.0"
Product: Red Hat Enterprise Linux 7 Reporter: lijin <lijin>
Component: virtio-win Assignee: Yvugenfi <yvugenfi>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 7.1 CC: dfleytma, juzhang, knoel, mdeng, michen, rbalakri, virt-bugs, virt-maint, vrozenfe, yvugenfi
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virtio-win-prewhql-0.1-101 Doc Type: Bug Fix
Doc Text:
NO_DOCS
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-24 08:44:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description lijin 2014-09-28 05:58:50 UTC
Description of problem:
The win2k8-64 guest hits a BSOD (7e) when running the job "Ethernet - NDISTest 6.0"; the win2k8-32 guest passes this job.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.1.0-4.el7.x86_64
kernel-3.10.0-165.el7.x86_64
seabios-1.7.5-4.el7.x86_64
virtio-win-prewhql-92

How reproducible:
3/3

Steps to Reproduce:
1. Boot the guests with:
nic1:
/usr/libexec/qemu-kvm -name 092NIC200864CJP -enable-kvm -m 4G -smp 4 -uuid 1b7849f9-e9e8-40eb-a752-1ca519a327fb -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/092NIC200864CJP,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=092NIC200864CJP,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=092NIC200864CJP.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:00:18:23:42,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:2 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:25:2c:83:0b,bus=pci.0,mq=on -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet2,vhost=on -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:7e:18:bc:4e,bus=pci.0,mq=on -monitor stdio

nic2:
/usr/libexec/qemu-kvm -name 092NIC200864SJP -enable-kvm -m 2G -smp 2 -uuid 8b8d7b54-056e-4240-ae26-d8e39bf23993 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/092NIC200864SJP,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=092NIC200864SJP,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=092NIC200864SJP.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:35:32:c9:6a,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:3 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:2a:4e:a7:73,bus=pci.0,mq=on -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet2,vhost=on -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:52:63:52:f9:a7,bus=pci.0,mq=on

2. Submit the job in WLK.

Actual results:
The guest hits a BSOD (7e) and the job fails.

Expected results:
The job passes with no BSOD.

Additional info:
windbg info:
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffff80000003, The exception code that was not handled
Arg2: fffffa60062837cb, The address that the exception occurred at
Arg3: fffffa6005048a38, Exception Record Address
Arg4: fffffa6005048410, Context Record Address

Debugging Details:
------------------


EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid

FAULTING_IP: 
NDProt60+747cb
fffffa60`062837cb cc              int     3

EXCEPTION_RECORD:  fffffa6005048a38 -- (.exr 0xfffffa6005048a38)
ExceptionAddress: fffffa60062837cb (NDProt60+0x00000000000747cb)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 0000000000000000

CONTEXT:  fffffa6005048410 -- (.cxr 0xfffffa6005048410)
rax=fffffa8002b1e990 rbx=0000000000000000 rcx=0000000000000001
rdx=0000000000000000 rsi=fffffa8003abfe08 rdi=fffffa8001919c10
rip=fffffa60062837cb rsp=fffffa6005048c70 rbp=0000000000000080
 r8=ffffffffffffffff  r9=8101010101010100 r10=810101010100e0e0
r11=fffffa8003bfe030 r12=fffffa6006297470 r13=0000000000000000
r14=fffffa8001919c10 r15=fffffa60005efcc0
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282
NDProt60+0x747cb:
fffffa60`062837cb cc              int     3
Resetting default scope

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0x7E

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.

EXCEPTION_PARAMETER1:  0000000000000000

LAST_CONTROL_TRANSFER:  from fffffa60062974c3 to fffffa60062837cb

STACK_TEXT:  
fffffa60`05048c70 fffffa60`062974c3 : fffffa80`03abfdd8 fffffa60`01025e10 fffffa60`049b9620 00000000`00000000 : NDProt60+0x747cb
fffffa60`05048d00 fffff800`018c5f37 : fffffa80`03abfe08 00000000`00010286 fffffa60`05048d78 00000000`00000001 : NDProt60+0x884c3
fffffa60`05048d50 fffff800`016f8616 : fffffa60`005ec180 fffffa80`03c35bb0 fffffa80`029c7060 fffffa60`005ec950 : nt!PspSystemThreadStartup+0x57
fffffa60`05048d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16


FOLLOWUP_IP: 
NDProt60+747cb
fffffa60`062837cb cc              int     3

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  NDProt60+747cb

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: NDProt60

IMAGE_NAME:  NDProt60.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4d4c09c8

STACK_COMMAND:  .cxr 0xfffffa6005048410 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_VRF_NDProt60+747cb

BUCKET_ID:  X64_0x7E_VRF_NDProt60+747cb

Followup: MachineOwner
---------
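A minimal sketch of the arithmetic behind the dump above, using only values copied from it. It truncates the sign-extended Arg1 to a 32-bit NTSTATUS, reads the severity from the top two bits, and subtracts the reported offset from the faulting address to get the load address of NDProt60. The lookup tables and function names are illustrative assumptions, not part of any debugger API.

```python
# Values copied from the !analyze -v output above.
ARG1 = 0xFFFFFFFF80000003          # bugcheck Arg1: unhandled exception code
FAULTING_IP = 0xFFFFFA60062837CB   # bugcheck Arg2: address of the exception
OFFSET_IN_MODULE = 0x747CB         # from "NDProt60+747cb"

# Illustrative tables (assumption: only the entries needed here are listed).
KNOWN_STATUS = {0x80000003: "STATUS_BREAKPOINT"}
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_exception_code(arg1):
    """Truncate the sign-extended 64-bit argument to a 32-bit NTSTATUS."""
    status = arg1 & 0xFFFFFFFF
    severity = SEVERITY[status >> 30]  # top two bits hold the severity
    return status, severity, KNOWN_STATUS.get(status, "unknown")


def module_base(address, offset):
    """Load address of a module, given an address inside it and its offset."""
    return address - offset


status, severity, name = decode_exception_code(ARG1)
print(f"{status:#010x} {severity} {name}")
print(f"NDProt60 base: {module_base(FAULTING_IP, OFFSET_IN_MODULE):#x}")
```

The NTSTATUS reading agrees with the ERROR_CODE line (a breakpoint) and with the `int 3` instruction at the faulting IP. The second NDProt60 frame in STACK_TEXT (fffffa60062974c3, offset 0x884c3) yields the same base, which is a quick consistency check on the offsets.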

Comment 3 lijin 2014-09-28 09:23:25 UTC
win2k8-32 hit this issue once when running the job "Ethernet – NDISTest6.5 (Manual)". The dump files are located in the same place as in comment #1.

Comment 5 Yossi Hindin 2015-01-15 10:01:53 UTC
Please test this bug using build virtio-win-prewhql-0.1-100.

Comment 7 lijin 2015-02-03 07:44:18 UTC
Tested with virtio-win-prewhql-100: the 2k8-64 guest passed this job with no BSOD and no errors.

package info:
    virtio-win-prewhql-0.1-100
    qemu-kvm-rhev-2.1.2-20.el7.x86_64
    kernel-3.10.0-223.el7.x86_64
    seabios-1.7.5-7.el7.x86_64
    spice-server-0.12.4-9.el7.x86_64

Comment 8 lijin 2015-02-12 09:36:07 UTC
(In reply to lijin from comment #7)
> test with virtio-win-prewhql-100,2k8-64 guest passed this job,no bsod,no
> error
> 
> package info:
>     virtio-win-prewhql-0.1-100
>     qemu-kvm-rhev-2.1.2-20.el7.x86_64
>     kernel-3.10.0-223.el7.x86_64
>     seabios-1.7.5-7.el7.x86_64
>     spice-server-0.12.4-9.el7.x86_64

This job was run with "-netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:2d:14:6c:31,bus=pci.0,mq=on"

Comment 9 Mike Cao 2015-02-12 09:47:10 UTC
Can still hit this issue with queues=4, in the same environment as comment #7.

CLI:/usr/libexec/qemu-kvm -name 100NIC200864CQM -enable-kvm -m 4G -smp 4 -uuid 8fd8b052-f4bd-44fd-9633-0068c39ada1c -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/100NIC200864CQM,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=100NIC200864CQM,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2008_datacenter_enterprise_standard_sp2_x64_dvd_342336.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=100NIC200864CQM.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:41:27:90:13,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:2 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=
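The difference between the passing run in comment 8 and the failing run here is in the `-netdev` options. A rough sketch for pulling those options out of a QEMU command line follows; the two option strings are copied from the comments, while the helper name `netdev_options` is a hypothetical one introduced for illustration.

```python
import shlex


def netdev_options(cmdline):
    """Map each -netdev id to a dict of its key=value options."""
    tokens = shlex.split(cmdline)
    result = {}
    for flag, value in zip(tokens, tokens[1:]):
        if flag != "-netdev":
            continue
        # First comma-separated field is the backend type, the rest are pairs.
        backend, *pairs = value.split(",")
        opts = dict(pair.split("=", 1) for pair in pairs)
        opts["backend"] = backend
        result[opts["id"]] = opts
    return result


# Option strings copied from comment 8 and comment 9 respectively.
COMMENT_8 = ("-netdev tap,script=/etc/qemu-ifup-private,downscript=no,"
             "id=hostnet1,vhost=on")
COMMENT_9 = ("-netdev tap,script=/etc/qemu-ifup-private,downscript=no,"
             "id=hostnet1,vhost=on,queues=4")

for label, cmd in (("comment 8", COMMENT_8), ("comment 9", COMMENT_9)):
    opts = netdev_options(cmd)["hostnet1"]
    print(label, "queues =", opts.get("queues", "not set"))
```

Run against the full command lines, the same helper would list every tap backend, which makes it easier to spot that only the comment 9 setup requests four queues on the host side.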

Comment 11 Yossi Hindin 2015-03-02 17:28:30 UTC
Hi

   We recently fixed a very similar bug. Please rerun the test using build 101.

   Regards,
      Joseph Hindin

Comment 12 Yossi Hindin 2015-03-03 13:21:29 UTC
For informational purposes only: Windows crashed after a successful test run. The dump analysis is below:
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffff80000003, The exception code that was not handled
Arg2: fffff8000165f738, The address that the exception occurred at
Arg3: fffffa60017f3fa8, Exception Record Address
Arg4: fffffa60017f3980, Context Record Address

Debugging Details:
------------------


EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid

FAULTING_IP: 
nt!DebugPrompt+18
fffff800`0165f738 c3              ret

EXCEPTION_RECORD:  fffffa60017f3fa8 -- (.exr 0xfffffa60017f3fa8)
ExceptionAddress: fffff8000165f738 (nt!DebugPrompt+0x0000000000000018)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 0000000000000002

CONTEXT:  fffffa60017f3980 -- (.cxr 0xfffffa60017f3980;r)
rax=0000000000000002 rbx=fffffa8000d42aa0 rcx=fffffa60052fc1a0
rdx=fffffa60017f0044 rsi=fffffa8001b6f1a0 rdi=fffffa60052fc1e5
rip=fffff8000165f737 rsp=fffffa60017f41e8 rbp=fffffa60017f45b8
 r8=fffffa60017f4278  r9=fffffa60052f0002 r10=0000000000000000
r11=fffffa60017f4238 r12=fffffa8000d42aa0 r13=0000000000000000
r14=0000000000000001 r15=0000000000000001
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
nt!DebugPrompt+0x17:
fffff800`0165f737 cc              int     3
Last set context:
rax=0000000000000002 rbx=fffffa8000d42aa0 rcx=fffffa60052fc1a0
rdx=fffffa60017f0044 rsi=fffffa8001b6f1a0 rdi=fffffa60052fc1e5
rip=fffff8000165f737 rsp=fffffa60017f41e8 rbp=fffffa60017f45b8
 r8=fffffa60017f4278  r9=fffffa60052f0002 r10=0000000000000000
r11=fffffa60017f4238 r12=fffffa8000d42aa0 r13=0000000000000000
r14=0000000000000001 r15=0000000000000001
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
nt!DebugPrompt+0x17:
fffff800`0165f737 cc              int     3
Resetting default scope

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0x7E

PROCESS_NAME:  System

CURRENT_IRQL:  0

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.

EXCEPTION_PARAMETER1:  0000000000000002

ANALYSIS_VERSION: 6.3.9600.17029 (debuggers(dbg).140219-1702) x86fre

LAST_CONTROL_TRANSFER:  from fffff800016d12ec to fffff8000165f737

STACK_TEXT:  
fffffa60`017f41e8 fffff800`016d12ec : fffffa80`00d42aa0 fffff800`016a220c fffffa80`02538590 fffffa60`0521dc4d : nt!DebugPrompt+0x17
fffffa60`017f41f0 fffffa60`05270598 : fffffa80`025418b0 fffffa60`0085f6a0 fffffa60`052fc1f0 fffffa60`017f4188 : nt!DbgPrompt+0x3c
fffffa60`017f4240 fffffa60`052122b6 : fffffa60`00000001 fffffa60`052e1a40 ffffffff`00000151 81010101`01010100 : NDProt60+0x6a598
fffffa60`017f42a0 fffffa60`052178c7 : fffffa80`021c1df0 fffffa60`017f4450 fffffa80`0213b040 fffffa80`00d1ed38 : NDProt60+0xc2b6
fffffa60`017f4300 fffffa60`0520ff85 : fffffa80`021c1df0 fffffa60`017f4450 fffffa80`0213b040 fffffa60`017f4410 : NDProt60+0x118c7
fffffa60`017f4360 fffffa60`0520c9ec : fffffa80`01eaaf60 fffffa60`017f4450 fffffa80`0213b040 00000000`00010282 : NDProt60+0x9f85
fffffa60`017f4390 fffffa60`009a5f17 : fffffa60`017f4450 fffffa80`0213b040 fffffa80`01b6f100 fffffa60`00808200 : NDProt60+0x69ec
fffffa60`017f4400 fffffa60`009a6dd5 : fffffa80`01b6f1a0 fffffa80`019f3c00 00000000`00160700 fffffa80`019f3c70 : NDIS!ndisUnbindProtocol+0x1f7
fffffa60`017f4510 fffffa60`009a9011 : fffffa80`01b6f101 fffffa60`017f4568 fffffa80`019f3c70 fffffa80`025c47b0 : NDIS!ndisCloseMiniportBindings+0x3e5
fffffa60`017f4620 fffffa60`00997cbe : fffffa60`0085f110 00000000`00000002 fffffa80`01b6f1a0 fffffa80`025b8fd0 : NDIS!ndisPnPRemoveDevice+0x1e1
fffffa60`017f47b0 fffff800`01a7058a : fffff980`040b4dc0 fffff980`040b4dc0 fffffa80`01b6f050 00000000`00000000 : NDIS! ?? ::LNCPHCLB::`string'+0x71ec
fffffa60`017f4850 fffff800`01a6f31a : fffff980`040b4f68 fffffa80`01b6f050 fffffa80`00000001 fffffa80`025f5800 : nt!IovCallDriver+0x34a
fffffa60`017f4890 fffff800`01a7058a : fffff980`040b4dc0 00000000`00000002 fffffa80`01fd7040 fffffa80`0269d940 : nt!ViFilterDispatchPnp+0xea
fffffa60`017f48c0 fffff800`018605f2 : fffff980`040b4dc0 00000000`00000000 fffff980`040b4dc0 fffffa80`0269d940 : nt!IovCallDriver+0x34a
fffffa60`017f4900 fffff800`01a3ef61 : fffffa80`017c2a30 00000000`00000000 fffffa80`017c32f0 00000000`00000002 : nt!IopSynchronousCall+0x10a
fffffa60`017f4970 fffff800`017384d6 : fffff880`0570ba60 fffff880`0570ba60 fffff880`0570ba80 00000000`00000000 : nt!IopRemoveDevice+0x101
fffffa60`017f4a30 fffff800`01a3eaa4 : fffffa80`017c32f0 00000000`00000000 00000000`00000002 fffffa80`01f4a4b0 : nt!PnpRemoveLockedDeviceNode+0x1a6
fffffa60`017f4a80 fffff800`01a3ebc0 : 00000000`00000000 fffffa80`017c3201 fffff880`057700b0 fffff800`3f051397 : nt!PnpDeleteLockedDeviceNode+0x44
fffffa60`017f4ab0 fffff800`01a43277 : 00000000`00000002 00000000`00000000 00000000`00000000 fffffa80`00000000 : nt!PnpDeleteLockedDeviceNodes+0xa0
fffffa60`017f4b20 fffff800`01a438ac : fffffa60`00000000 00000000`00010200 fffffa60`017f4c00 00000000`00000000 : nt!PnpProcessQueryRemoveAndEject+0xbe7
fffffa60`017f4c70 fffff800`0194290a : 00000000`00000001 fffffa80`0265c1f0 fffff880`05858970 fffff800`01862800 : nt!PnpProcessTargetDeviceEvent+0x4c
fffffa60`017f4ca0 fffff800`0166c8c3 : fffff800`01862850 fffff880`05858970 fffff800`0179c8f8 fffffa80`00d1ebb0 : nt! ?? ::NNGAKEGL::`string'+0x502d7
fffffa60`017f4cf0 fffff800`0186ff37 : fffffa80`0265c1f0 00000000`00000000 fffffa80`00d1ebb0 00000000`00000080 : nt!ExpWorkerThread+0xfb
fffffa60`017f4d50 fffff800`016a2616 : fffffa60`005ec180 fffffa80`00d1ebb0 fffffa60`005f5d40 fffffa80`00d1e138 : nt!PspSystemThreadStartup+0x57
fffffa60`017f4d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16


FOLLOWUP_IP: 
NDProt60+6a598
fffffa60`05270598 8b442444        mov     eax,dword ptr [rsp+44h]

SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  NDProt60+6a598

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: NDProt60

IMAGE_NAME:  NDProt60.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4d4c09c8

STACK_COMMAND:  .cxr 0xfffffa60017f3980 ; kb

FAILURE_BUCKET_ID:  X64_0x7E_VRF_NDProt60+6a598

BUCKET_ID:  X64_0x7E_VRF_NDProt60+6a598

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:x64_0x7e_vrf_ndprot60+6a598

FAILURE_ID_HASH:  {6e92616a-7217-0813-744a-4e931898daa3}

Followup: MachineOwner
---------
---
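The analysis above blames frame 2 (SYMBOL_STACK_INDEX: 2, NDProt60+6a598) rather than the kernel frames on top of it. The sketch below reproduces that result with a simple heuristic: walk the STACK_TEXT frames and return the first one whose module is not `nt`. This is an assumption made for illustration and not a description of how `!analyze` chooses the follow-up frame; the function names are hypothetical.

```python
import re

# The first four frames, copied verbatim from the STACK_TEXT above.
STACK_TEXT = """\
fffffa60`017f41e8 fffff800`016d12ec : fffffa80`00d42aa0 fffff800`016a220c fffffa80`02538590 fffffa60`0521dc4d : nt!DebugPrompt+0x17
fffffa60`017f41f0 fffffa60`05270598 : fffffa80`025418b0 fffffa60`0085f6a0 fffffa60`052fc1f0 fffffa60`017f4188 : nt!DbgPrompt+0x3c
fffffa60`017f4240 fffffa60`052122b6 : fffffa60`00000001 fffffa60`052e1a40 ffffffff`00000151 81010101`01010100 : NDProt60+0x6a598
fffffa60`017f42a0 fffffa60`052178c7 : fffffa80`021c1df0 fffffa60`017f4450 fffffa80`0213b040 fffffa80`00d1ed38 : NDProt60+0xc2b6
"""


def frames(stack_text):
    """Return (module, symbol) for each 'sp ret : args : symbol' line."""
    parsed = []
    for line in stack_text.strip().splitlines():
        _addresses, _args, symbol = line.split(" : ", 2)
        symbol = symbol.strip()
        # The module name ends at '!' (symbolized) or '+' (offset only).
        module = re.split(r"[!+]", symbol, maxsplit=1)[0]
        parsed.append((module, symbol))
    return parsed


def first_frame_outside(stack_text, skip=("nt",)):
    """Index and symbol of the first frame not in the skipped modules."""
    for index, (module, symbol) in enumerate(frames(stack_text)):
        if module not in skip:
            return index, symbol
    return None


print(first_frame_outside(STACK_TEXT))
```

On these frames the heuristic lands on the same frame the debugger reported, the NDProt60 call that led into `nt!DbgPrompt`.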

Comment 13 Mike Cao 2015-03-17 02:46:49 UTC
Verified this issue on virtio-win-prewhql-101

Steps same as comment#0

Actual Results:
The job passes.

Based on the above, this issue has already been fixed.
Moving status to VERIFIED.

Comment 16 errata-xmlrpc 2015-11-24 08:44:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2513.html