Bug 1188790

Summary: NetKVM driver crashed on pausing in MPE test
Product: Red Hat Enterprise Linux 7 Reporter: Yossi Hindin <yhindin>
Component: virtio-win Assignee: Yvugenfi <yvugenfi>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2 CC: knoel, lijin, michen, rbalakri, virt-maint, vrozenfe, yvugenfi
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Windows   
Whiteboard: Fixed_Not_Ship
Fixed In Version: virtio-win-prewhql-0.1-101 Doc Type: Bug Fix
Doc Text:
NO_DOCS
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-24 08:49:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yossi Hindin 2015-02-03 17:55:20 UTC
Description of problem: NetKVM driver crashed on pausing in MPE test


Version-Release number of selected component (if applicable):

NetKVM v. 100


How reproducible:

1/2
 
Steps to Reproduce:
1. Run HCK MPE test

Actual results:

The driver crashes during the HCK run, close to the pausing transition


Expected results:

Test passes


Additional info:

Comment 2 Mike Cao 2015-02-04 06:00:36 UTC
Hi, Yossi 

QE just finished the virtio-win-prewhql-100 NetKVM WHQL test and did not hit this issue.

Mike

Comment 3 lijin 2015-02-12 06:08:47 UTC
win8-32 hit a BSOD five times when running this job with "queues=4":
three times with bugcheck code "a"
twice with bugcheck code "c4"

package info:
qemu-kvm-rhev-2.1.2-20.el7.x86_64
kernel-3.10.0-223.el7.x86_64
seabios-1.7.5-4.el7.x86_64
virtio-win-prewhql-100

NIC1:
/usr/libexec/qemu-kvm -name 100NICWIN832CJA -enable-kvm -m 2G -smp 2 -uuid bc61d6d1-6ea7-4379-8c8d-e8c4bd457cb6 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/100NICWIN832CJA,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=100NICWIN832CJA,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_8_enterprise_x86_dvd_917587.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=100NICWIN832CJA.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:45:73:31:b4,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:14:36:19:1f,bus=pci.0,mq=on,vectors=10

NIC2:
/usr/libexec/qemu-kvm -name 100NICWIN832SJA -enable-kvm -m 2G -smp 2 -uuid 0d0ba6f2-6226-416f-b9d4-5f34766146b1 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/100NICWIN832SJA,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=100NICWIN832SJA,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_8_enterprise_x86_dvd_917587.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=100NICWIN832SJA.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:44:19:23:a4,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:2d:14:6c:31,bus=pci.0,mq=on,vectors=10
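
Both command lines above pass queues=4 together with vectors=10, which matches the commonly documented multiqueue sizing for virtio-net of 2*N+2 MSI-X vectors (one RX and one TX vector per queue pair, plus one for configuration changes and one for the control queue). The following is a minimal sketch for checking a command line against that rule. The function name mq_vectors_ok is hypothetical and not part of QEMU or virtio-win, and the check only looks at the first queues= and vectors= values it finds.

```python
import re


def mq_vectors_ok(cmdline: str) -> bool:
    """Return True if vectors == 2 * queues + 2 on a qemu-kvm command line.

    Hypothetical helper for illustration only. It reads the first
    "queues=" and "vectors=" values found, so it assumes a single
    multiqueue virtio-net device, as in the command lines in this bug.
    """
    queues = re.search(r"\bqueues=(\d+)", cmdline)
    vectors = re.search(r"\bvectors=(\d+)", cmdline)
    if queues is None or vectors is None:
        # One of the two options is missing, so there is nothing to compare.
        return False
    return int(vectors.group(1)) == 2 * int(queues.group(1)) + 2
```

Applied to the virtio-net-pci portion of either NIC1 or NIC2 (queues=4, vectors=10), the check holds, so the vector count itself does not look misconfigured in this reproduction.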

Comment 4 lijin 2015-02-12 06:10:31 UTC
code "a" windbg info:
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: a459ee60, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, bitfield :
	bit 0 : value 0 = read operation, 1 = write operation
	bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 812e4960, address which referenced memory

Debugging Details:
------------------


WRITE_ADDRESS:  a459ee60 Special pool

CURRENT_IRQL:  2

FAULTING_IP: 
nt!KfAcquireSpinLock+1a
812e4960 f00fba2f00      lock bts dword ptr [edi],0

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

BUGCHECK_STR:  AV

PROCESS_NAME:  System

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

TRAP_FRAME:  9c037c1c -- (.trap 0xffffffff9c037c1c)
ErrCode = 00000002
eax=84235800 ebx=a459ee00 ecx=a459ee02 edx=00000008 esi=a459ee60 edi=a459ee60
eip=812e4960 esp=9c037c90 ebp=9c037ca8 iopl=0         nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010246
nt!KfAcquireSpinLock+0x1a:
812e4960 f00fba2f00      lock bts dword ptr [edi],0   ds:0023:a459ee60=????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from 813d6840 to 8135fccc

STACK_TEXT:  
9c037bfc 813d6840 0000000a a459ee60 00000002 nt!KiBugCheck2
9c037bfc 812e4960 0000000a a459ee60 00000002 nt!KiTrap0E+0x2c8
9c037c94 8170aed6 a459ede8 a27302a8 a459ee60 nt!KfAcquireSpinLock+0x1a
9c037ca8 82499148 a459ede8 a27302a8 9c6d8e08 nt!VerifierKfAcquireSpinLock+0x67
9c037ce4 8243cba6 9c037d00 9c6d8e30 8143f3c0 ndis!ndisHandleProtocolUnbindNotification+0x155
9c037d0c 82465dba a46dcfd8 a459ede8 9c037d74 ndis!ndisQueuedUnbindAdapter+0xbd
9c037d1c 812e9854 a46dcfd8 8435d040 00000000 ndis!ndisWorkItemHandler+0xe
9c037d74 8132c415 00000000 00cd9d3a 00000000 nt!ExpWorkerThread+0x111
9c037db0 813d8039 812e9747 00000000 00000000 nt!PspSystemThreadStartup+0x4a
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x19


STACK_COMMAND:  kb

FOLLOWUP_IP: 
ndis!ndisHandleProtocolUnbindNotification+155
82499148 8b5770          mov     edx,dword ptr [edi+70h]

SYMBOL_STACK_INDEX:  4

SYMBOL_NAME:  ndis!ndisHandleProtocolUnbindNotification+155

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: ndis

IMAGE_NAME:  ndis.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5010ac18

BUCKET_ID_FUNC_OFFSET:  155

FAILURE_BUCKET_ID:  AV_VRF_ndis!ndisHandleProtocolUnbindNotification

BUCKET_ID:  AV_VRF_ndis!ndisHandleProtocolUnbindNotification

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:av_vrf_ndis!ndishandleprotocolunbindnotification

FAILURE_ID_HASH:  {34126f4d-173c-d14c-2444-c1fe159fa11e}

Followup: MachineOwner
---------
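
For readers less familiar with bugcheck 0xA, the four arguments in the dump above can be decoded mechanically using the bit descriptions that WinDbg prints next to Arg3. The sketch below is illustrative only: decode_bugcheck_a is a hypothetical helper, not a WinDbg or Windows API, and it covers just the two bits (0 and 3) that the dump itself documents.

```python
def decode_bugcheck_a(arg1: int, arg2: int, arg3: int, arg4: int) -> dict:
    """Decode IRQL_NOT_LESS_OR_EQUAL (0xA) arguments into readable fields.

    Hypothetical helper. Bit meanings are taken from the dump text:
    bit 0 set means a write operation, bit 3 set means an execute operation.
    """
    if arg3 & 0x8:
        operation = "execute"
    elif arg3 & 0x1:
        operation = "write"
    else:
        operation = "read"
    return {
        "address": hex(arg1),      # memory referenced
        "irql": arg2,              # IRQL at the time of the access
        "operation": operation,    # decoded from the Arg3 bitfield
        "faulting_ip": hex(arg4),  # address of the instruction that faulted
    }


# Values copied from the code "a" dump in this comment.
info = decode_bugcheck_a(0xA459EE60, 0x2, 0x1, 0x812E4960)
```

With the values from this dump, the decoded fields describe a write to a459ee60 at IRQL 2 from the instruction at 812e4960, which agrees with the WRITE_ADDRESS and FAULTING_IP lines (the lock bts inside nt!KfAcquireSpinLock touching a special pool address).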

Comment 5 lijin 2015-02-12 06:11:22 UTC
code "c4" windbg info:

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught.  This is
because the driver was specified in the registry as being suspect (by the
administrator) and the kernel has enabled substantial checking of this driver.
If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will
be among the most commonly seen crashes.
Arguments:
Arg1: 000000e0, Calling OS Kernel API with user-mode address as parameter.
Arg2: 00000000, Address used as API parameter.
Arg3: 00000001, Size in bytes of the address range used as API parameter.
Arg4: 00000000

Debugging Details:
------------------

*** ERROR: Module load completed but symbols could not be loaded for ndprot630.sys

OVERLAPPED_MODULE: Address regions for 'ndprot630' and 'ndprot630.sys' overlap

BUGCHECK_STR:  0xc4_e0

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

PROCESS_NAME:  ndistest.exe

CURRENT_IRQL:  0

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

DEVICE_OBJECT: 8aa0d4c0

DRIVER_OBJECT: 8a716dd8

IMAGE_NAME:  ndprot630.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  52cf01e0

MODULE_NAME: ndprot630

FAULTING_MODULE: 9f020000 ndprot630

LAST_CONTROL_TRANSFER:  from 81ab4818 to 81711cb0

STACK_TEXT:  
98a9f674 81ab4818 000000c4 000000e0 00000000 nt!KeBugCheckEx
98a9f69c 81aae5f9 000000c4 000000e0 00000000 nt!VerifierBugCheckIfAppropriate+0x3d
98a9f6c4 81abc583 00000000 00000001 870c53c8 nt!VfUtilSynchronizationObjectSanityChecks+0x32
98a9f6ec 8288a11c 00000000 00000000 00000000 nt!VerifierKeWaitForSingleObject+0xda
98a9f710 9f0cd779 00000000 0000000a 870c9570 ndis!NdisWaitEvent+0x2e
WARNING: Stack unwind information not available. Following frames may be wrong.
98a9f724 9f0cd755 0000000a 870c9570 98a9f760 ndprot630+0xad779
98a9f734 9f05b19d 0000000a ffffffff 82ea9890 ndprot630+0xad755
98a9f760 9f057997 a546afd0 00000030 82ea9890 ndprot630+0x3b19d
98a9f7c4 9f045b51 28000005 00000000 a546afd0 ndprot630+0x37997
98a9f9b0 9f04e666 28000005 00000000 a546afd0 ndprot630+0x25b51
98a9fad4 9f022e3c 28000005 00000000 a546afd0 ndprot630+0x2e666
98a9fb3c 9f0a80bf 28000005 00000000 a546afd0 ndprot630+0x2e3c
98a9fba8 9f0277cf 8aa0d4c0 a48d6f68 a48d6fd8 ndprot630+0x880bf
98a9fbc8 81aacf4b 8aa0d4c0 a48d6f68 94070000 ndprot630+0x77cf
98a9fbe8 81652a9f 81862f1c 94079bc0 a48d6f68 nt!IovCallDriver+0x2e3
98a9fbfc 81862f1c a48d6ffc a48d6f68 94079bc0 nt!IofCallDriver+0x62
98a9fc50 81862991 8aa0d4c0 00000000 8185fa01 nt!IopSynchronousServiceTail+0x121
98a9fcf0 818625c9 8aa0d4c0 a48d6f68 00000000 nt!IopXxxControlFile+0x3ac
98a9fd24 817852fc 00000fb4 00000e54 00000000 nt!NtDeviceIoControlFile+0x2a
98a9fd24 776d6954 00000fb4 00000e54 00000000 nt!KiFastCallEntry+0x12c
0bd2e998 00000000 00000000 00000000 00000000 0x776d6954


STACK_COMMAND:  kb

FOLLOWUP_IP: 
ndprot630+ad779
9f0cd779 8be5            mov     esp,ebp

SYMBOL_STACK_INDEX:  5

SYMBOL_NAME:  ndprot630+ad779

FOLLOWUP_NAME:  MachineOwner

FAILURE_BUCKET_ID:  0xc4_e0_VRF_ndprot630+ad779

BUCKET_ID:  0xc4_e0_VRF_ndprot630+ad779

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:0xc4_e0_vrf_ndprot630+ad779

FAILURE_ID_HASH:  {5d82b425-d37a-214f-f4ec-bff802e43225}

Followup: MachineOwner
---------

Comment 7 Vadim Rozenfeld 2015-03-03 02:43:52 UTC
Should be fixed in build 101, available at http://download.devel.redhat.com/brewroot/packages/virtio-win-prewhql/0.1/101/win/virtio-win-prewhql-0.1.zip

Comment 8 lijin 2015-03-10 08:11:16 UTC
The job passes on all Windows guests with build 101.

Comment 12 lijin 2015-07-17 07:35:47 UTC
Changing status to VERIFIED according to comment 8.

Comment 14 errata-xmlrpc 2015-11-24 08:49:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2513.html