Bug 1359072 - [virtio-win][netkvm][whql]many whql jobs occurred BSOD(DRIVER_VERIFIER_DETECTED_VIOLATION (c4)) on build 122
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Ladi Prosek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-22 08:54 UTC by Yu Wang
Modified: 2016-11-04 08:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
NO_DOCS
Clone Of:
Environment:
Last Closed: 2016-11-04 08:55:30 UTC
Target Upstream Version:




Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:2609 | normal | SHIPPED_LIVE | virtio-win bug fix and enhancement update | 2016-11-03 15:27:12 UTC

Description Yu Wang 2016-07-22 08:54:30 UTC
Description of problem:
Many WHQL jobs hit a BSOD on build 122.

Guest OS:
win2008R2: almost all of the DF jobs, for example:
* DF - Concurrent Hardware And Operating System (CHAOS) Test (Certification)
* DF - Fuzz misc API with zero length query test (Certification)
* DF - Fuzz Misc API test (Certification)
* DF - Fuzz open and close test (Certification)
* DF - Fuzz Query and Set File Information Test (Certification)

win8-32: almost all of the DF jobs (the same ones as on win2008R2) and some NDISTest jobs, for example:
* NDISTest 6.0 - [1 Machine] - 1c_Registry
* NDISTest 6.0 - [1 Machine] - 1c_FaultHandling

Version-Release number of selected component (if applicable):
virtio-win-prewhql-122

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the following command:
/usr/libexec/qemu-kvm \
    -name 122NICWIN864CVN \
    -enable-kvm \
    -m 3G \
    -smp 4 \
    -uuid 951a7c70-ebcd-44d6-a237-5cdd9fc3948f \
    -nodefconfig \
    -nodefaults \
    -chardev socket,id=charmonitor,path=/tmp/122NICWIN864CVN,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=localtime,driftfix=slew \
    -boot order=cd,menu=on \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -drive file=122NICWIN864CVN,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none \
    -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -drive file=en_windows_8_enterprise_x64_dvd_917522.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
    -drive file=122NICWIN864CVN.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none \
    -global isa-fdc.driveA=drive-fdc0-0-0 \
    -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 \
    -device e1000,netdev=hostnet0,id=net0,mac=00:52:16:63:8f:34,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=isa_serial0 \
    -device usb-tablet,id=input0 \
    -vnc 0.0.0.0:0 \
    -vga cirrus \
    -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on,queues=4 \
    -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:03:22:c1:d9,bus=pci.0,mq=on,vectors=10,disable-legacy=on,disable-modern=off \
    -monitor stdio

2. Submit the job.


Actual results:
BSOD

Expected results:
pass

Additional info:
1. All of these jobs pass with virtio-win-prewhql-117, so this is a regression.

Comment 4 Yu Wang 2016-07-24 13:27:16 UTC
Microsoft (R) Windows Debugger Version 6.3.9600.16384 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [I:\wyu\win2008R2\3066-MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available


************* Symbol Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       SRV*c:\symbols\*http://msdl.microsoft.com/download/symbols
Symbol search path is: SRV*c:\symbols\*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows 7 Kernel Version 7601 (Service Pack 1) MP (4 procs) Free x64
Product: Server, suite: TerminalServer DataCenter SingleUserTS
Built by: 7601.17514.amd64fre.win7sp1_rtm.101119-1850
Machine Name:
Kernel base = 0xfffff800`0141d000 PsLoadedModuleList = 0xfffff800`01662e90
Debug session time: Fri Jul 22 03:05:28.546 2016 (UTC + 8:00)
System Uptime: 0 days 0:01:03.546
Loading Kernel Symbols
...............................................................
................................................................

Loading User Symbols

Loading unloaded module list
....
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck C4, {62, fffffa80056be6a8, fffffa80056bf5e0, 1}

*** ERROR: Module load completed but symbols could not be loaded for netkvm.sys
Probably caused by : netkvm.sys

Followup: MachineOwner
---------

2: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught.  This is
because the driver was specified in the registry as being suspect (by the
administrator) and the kernel has enabled substantial checking of this driver.
If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will
be among the most commonly seen crashes.
Arguments:
Arg1: 0000000000000062, A driver has forgotten to free its pool allocations prior to unloading.
Arg2: fffffa80056be6a8, name of the driver having the issue.
Arg3: fffffa80056bf5e0, verifier internal structure with driver information.
Arg4: 0000000000000001, total # of (paged+nonpaged) allocations that weren't freed.
	Type !verifier 3 drivername.sys for info on the allocations
	that were leaked that caused the bugcheck.

Debugging Details:
------------------


BUGCHECK_STR:  0xc4_62

IMAGE_NAME:  netkvm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5784c852

MODULE_NAME: netkvm

FAULTING_MODULE: fffff88002d00000 netkvm

VERIFIER_DRIVER_ENTRY: dt nt!_MI_VERIFIER_DRIVER_ENTRY fffffa80056bf5e0
Symbol nt!_MI_VERIFIER_DRIVER_ENTRY not found.

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

PROCESS_NAME:  System

CURRENT_IRQL:  2

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

LAST_CONTROL_TRANSFER:  from fffff800019243dc to fffff8000149d640

STACK_TEXT:  
fffff880`01fd16c8 fffff800`019243dc : 00000000`000000c4 00000000`00000062 fffffa80`056be6a8 fffffa80`056bf5e0 : nt!KeBugCheckEx
fffff880`01fd16d0 fffff800`0193354a : 00000000`00000001 00000000`00000000 fffff880`02d00000 00000000`00000001 : nt!VerifierBugCheckIfAppropriate+0x3c
fffff880`01fd1710 fffff800`015865f0 : 00000000`00000000 00000000`00000000 fffff880`01e5d180 00000000`00000000 : nt!VfPoolCheckForLeaks+0x4a
fffff880`01fd1750 fffff800`0184d4fe : fffffa80`056be5f0 00000000`00000000 00000000`00000000 00000000`00000018 : nt!VfTargetDriversRemove+0x160
fffff880`01fd17f0 fffff800`01871f53 : 00000000`00000000 00000000`000e0082 00000000`00000000 00000000`00000001 : nt!VfDriverUnloadImage+0x2e
fffff880`01fd1820 fffff800`018723cd : 00000000`00000000 fffffa80`056be5f0 00000000`00000000 00000000`00010200 : nt!MiUnloadSystemImage+0x283
fffff880`01fd1890 fffff800`019138f1 : 00000000`00000000 00000000`00000016 fffffa80`03ce24b0 00000000`00000018 : nt!MmUnloadSystemImage+0x4d
fffff880`01fd18d0 fffff800`014a7514 : 00000000`00000000 00000000`00000016 fffffa80`03ce24b0 00000000`00000004 : nt!IopDeleteDriver+0x41
fffff880`01fd1900 fffff800`0170971d : 00000000`00000016 fffffa80`05781050 fffffa80`03ce2600 fffff8a0`026e5de0 : nt!ObfDereferenceObject+0xd4
fffff880`01fd1960 fffff800`014a7514 : fffffa80`057576a0 fffff800`01883e6e fffff8a0`04e7a5a0 fffffa80`057576a0 : nt!IopDeleteDevice+0x49
fffff880`01fd1990 fffff800`0159647b : 00000000`00000016 fffff8a0`04e7a590 00000000`00004800 00000000`00000000 : nt!ObfDereferenceObject+0xd4
fffff880`01fd19f0 fffff800`01883ed4 : fffffa80`04b88390 00000000`00000000 00000000`00000002 fffffa80`04b87060 : nt!PnpRemoveLockedDeviceNode+0x23b
fffff880`01fd1a40 fffff800`01883fe0 : 00000000`00000000 fffff8a0`0512ba01 fffff8a0`0503b830 ffffe965`5f7a60ab : nt!PnpDeleteLockedDeviceNode+0x44
fffff880`01fd1a70 fffff800`01914e54 : 00000000`00000002 00000000`00000000 fffffa80`04b88390 fffff8a0`00000000 : nt!PnpDeleteLockedDeviceNodes+0xa0
fffff880`01fd1ae0 fffff800`019154ac : fffff880`00000000 00000000`00010200 fffff880`01fd1c00 00000000`00000000 : nt!PnpProcessQueryRemoveAndEject+0xc34
fffff880`01fd1c20 fffff800`017fe6ac : 00000000`00000000 fffffa80`061f35b0 fffff8a0`0512ba20 fffff800`0163a600 : nt!PnpProcessTargetDeviceEvent+0x4c
fffff880`01fd1c50 fffff800`014a7a21 : fffff800`01708db8 fffff8a0`0512ba20 fffff800`0163a658 fffffa80`03cdeb60 : nt! ?? ::NNGAKEGL::`string'+0x5cd3b
fffff880`01fd1cb0 fffff800`0173acce : 00000000`00000000 fffffa80`03cdeb60 00000000`00000080 fffffa80`03ca79e0 : nt!ExpWorkerThread+0x111
fffff880`01fd1d40 fffff800`0148efe6 : fffff880`01e5d180 fffffa80`03cdeb60 fffff880`01e67fc0 00000000`00000000 : nt!PspSystemThreadStartup+0x5a
fffff880`01fd1d80 00000000`00000000 : fffff880`01fd2000 fffff880`01fcc000 fffff880`01fd11d0 00000000`00000000 : nt!KiStartSystemThread+0x16


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

FAILURE_BUCKET_ID:  X64_0xc4_62_VRF_LEAKED_POOL_IMAGE_netkvm.sys

BUCKET_ID:  X64_0xc4_62_VRF_LEAKED_POOL_IMAGE_netkvm.sys

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:x64_0xc4_62_vrf_leaked_pool_image_netkvm.sys

FAILURE_ID_HASH:  {103b59e3-ae81-da9a-b6ab-b56111a21557}

Followup: MachineOwner
---------
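
The bugcheck arguments in the dump above (Arg1 = 0x62, Arg4 = 1) together with the nt!VfPoolCheckForLeaks frame point to an unload-time check of pool allocations that the driver never returned. A rough user-mode model of that accounting in C might look like the sketch below. Every name in it is hypothetical, malloc/free stand in for kernel pool routines, and the real Driver Verifier is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-mode model; none of these names are Windows APIs. */
enum {
    BUGCHECK_DRIVER_VERIFIER = 0xC4, /* DRIVER_VERIFIER_DETECTED_VIOLATION */
    LEAKED_POOL_AT_UNLOAD    = 0x62  /* Arg1 as reported in the dump */
};

typedef struct {
    const char *name;        /* driver image name, e.g. "netkvm.sys" */
    size_t      outstanding; /* allocations made but not yet freed */
} tracked_driver;

typedef struct {
    unsigned code; /* 0 means no violation was found */
    unsigned arg1; /* violation subtype */
    size_t   arg4; /* number of allocations that were not freed */
} unload_report;

/* Allocate on behalf of a driver and count the allocation. */
void *tracked_alloc(tracked_driver *drv, size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        drv->outstanding++;
    return p;
}

/* Free on behalf of a driver and drop the count. */
void tracked_free(tracked_driver *drv, void *p)
{
    if (p == NULL)
        return;
    assert(drv->outstanding > 0);
    drv->outstanding--;
    free(p);
}

/* At unload, any allocation still outstanding is reported as a leak. */
unload_report check_leaks_on_unload(const tracked_driver *drv)
{
    unload_report r = {0, 0, 0};
    if (drv->outstanding != 0) {
        r.code = BUGCHECK_DRIVER_VERIFIER;
        r.arg1 = LEAKED_POOL_AT_UNLOAD;
        r.arg4 = drv->outstanding;
    }
    return r;
}
```

In this model, a driver that makes two tracked allocations and frees only one gets a report with code 0xC4, arg1 0x62 and arg4 1, which has the same shape as the arguments in the dump.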

Comment 5 Ladi Prosek 2016-07-25 08:55:59 UTC
The nonpaged memory allocated in virtio_reserve_queue_memory is never freed, which is the leak Driver Verifier reports when the driver unloads. The fix is in and a new build will be out shortly.
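
This report does not include the actual patch, so the following is only a sketch of the general pattern such a fix tends to follow: every reservation made at initialization gets a matching release on the teardown path. The structure, function names and the allocation counter are all hypothetical, and malloc/free stand in for nonpaged pool allocation; this is not the netkvm source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the netkvm or VirtIO library sources. */
static size_t live_allocations; /* crude substitute for verifier tracking */

typedef struct {
    void  *queue_memory;      /* block reserved for queue bookkeeping */
    size_t queue_memory_size; /* size of that block in bytes */
} device_ctx;

/* Reserve queue memory once; returns 0 on success, -1 on failure. */
int reserve_queue_memory(device_ctx *dev, size_t size)
{
    assert(dev->queue_memory == NULL); /* reserving twice would leak */
    dev->queue_memory = malloc(size);  /* stands in for a pool allocation */
    if (dev->queue_memory == NULL)
        return -1;
    dev->queue_memory_size = size;
    live_allocations++;
    return 0;
}

/* The counterpart that a leak like this one is missing. Safe to call twice. */
void release_queue_memory(device_ctx *dev)
{
    if (dev->queue_memory == NULL)
        return;
    free(dev->queue_memory);
    dev->queue_memory = NULL;
    dev->queue_memory_size = 0;
    live_allocations--;
}

/* Teardown path: release everything reserved during initialization. */
void device_shutdown(device_ctx *dev)
{
    release_queue_memory(dev);
}

size_t live_allocation_count(void)
{
    return live_allocations;
}
```

Clearing the pointer after freeing keeps the release idempotent, so a teardown path that runs more than once neither double-frees nor miscounts.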

Comment 7 Peixiu Hou 2016-07-27 04:41:02 UTC
Hit the same issue on the following systems:

Guest OS:
win2012-64/R2 and win8.1-32/64: almost all of the DF jobs (the same ones as on win2008R2) and some NDISTest jobs, for example:
* NDISTest 6.0 - [1 Machine] - 1c_Registry
* NDISTest 6.0 - [1 Machine] - 1c_FaultHandling
* NDISTest 6.0 - [1 Machine] - 1c_NdisRequestCov
* NDISTest 6.0 - [1 Machine] - OffloadMisc
* NDISTest 6.0 - [1 Machine] - 1c_Mini6RSSOids


Win10-32/64/2016: almost all of the DF jobs (the same ones as on win2008R2) and some NDISTest jobs, for example:
* NDISTest 6.0 - [1 Machine] - 1c_Registry
* NDISTest 6.0 - [1 Machine] - 1c_FaultHandling
* NDISTest 6.0 - [1 Machine] - 1c_NdisRequestCov
* NDISTest 6.0 - [1 Machine] - OffloadMisc
* NDISTest 6.0 - [1 Machine] - 1c_Mini6RSSOids
* NDISTest 6.0 - [1 Machine] - Stats
* NDISTest 6.0 - [1 Machine] - Reset
* NDISTest 6.0 - [1 Machine] - SingleEtherType
* NDISTest 6.0 - [1 Machine] - OffloadLSO
* NDISTest 6.0 - [1 Machine] - CheckConnectivity

Comment 8 Ladi Prosek 2016-07-27 07:16:51 UTC
virtio-win-prewhql-123 has the fix. Peixiu Hou, can you please try this build?

Comment 9 Peixiu Hou 2016-07-29 09:44:50 UTC
Hi Ladi,

I retested all of the DF jobs and the NDISTest jobs listed above with virtio-win-prewhql-123 on Win2012-R2, and all of them pass. Thanks a lot~~


Best Regards~
Peixiu Hou

Comment 10 Ladi Prosek 2016-07-29 09:47:19 UTC
Excellent, thank you!

Comment 11 lijin 2016-08-01 02:50:58 UTC
Changed status to VERIFIED according to comment #9.

Comment 13 errata-xmlrpc 2016-11-04 08:55:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2609.html

