Bug 912926 - [WHQL][netkvm] win2k3 32/64 and win xp got 0x000000D1 BSOD while running Sleep and PNP (disable and enable) with IO Before and After (Certification) (id 2067) on HCK
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: virtio-win
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Yan Vugenfirer
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 957435 958699
Reported: 2013-02-20 02:04 UTC by Min Deng
Modified: 2013-12-06 07:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: An interrupt can arrive during device shutdown. Consequence: An interrupt storm occurs on the guest and the guest hangs. Fix: Disable interrupts and synchronize device shutdown with the DIRQL of the device. Result: Device shutdown can no longer cause guest hangs.
Clone Of:
Environment:
Last Closed: 2013-11-22 00:03:38 UTC
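
The fix summarized in the Doc Text above (disable interrupts, then synchronize shutdown with the device's DIRQL) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct device`, `model_isr`, `synchronize_with_isr` and the two shutdown functions are hypothetical names, not netkvm code, and the model is single-threaded, so a late interrupt is simulated by calling the ISR at a fixed point during shutdown. In an NDIS 5.x miniport the synchronization step would typically go through NdisMSynchronizeWithInterrupt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified device state; not the real netkvm adapter context. */
struct queue { int pending; };

struct device {
    bool interrupts_enabled;  /* stands in for the device's interrupt enable state */
    struct queue *rx;         /* resource that the ISR dereferences */
};

enum isr_result { ISR_NOT_OURS, ISR_HANDLED, ISR_WOULD_FAULT };

/* Model ISR: reports the problem instead of crashing on freed state. */
static enum isr_result model_isr(struct device *dev)
{
    if (!dev->interrupts_enabled)
        return ISR_NOT_OURS;        /* interrupt ignored, device is quiesced */
    if (dev->rx == NULL)
        return ISR_WOULD_FAULT;     /* a real driver would bugcheck here */
    dev->rx->pending = 0;
    return ISR_HANDLED;
}

/* Assumption: runs fn so that the ISR cannot interleave with it.
 * It is a plain call here because the model is single-threaded. */
static void synchronize_with_isr(struct device *dev, void (*fn)(struct device *))
{
    fn(dev);
}

static void disable_interrupts(struct device *dev)
{
    dev->interrupts_enabled = false;
}

static void free_resources(struct device *dev)
{
    free(dev->rx);
    dev->rx = NULL;
}

/* Unsafe order: resources are freed while interrupts can still fire. */
static enum isr_result shutdown_unsafe(struct device *dev)
{
    free_resources(dev);
    enum isr_result r = model_isr(dev);  /* simulated late interrupt */
    disable_interrupts(dev);
    return r;
}

/* Safe order: quiesce the interrupt path first, then free. */
static enum isr_result shutdown_safe(struct device *dev)
{
    synchronize_with_isr(dev, disable_interrupts);
    free_resources(dev);
    return model_isr(dev);               /* simulated late interrupt */
}

static struct device make_device(void)
{
    struct device dev = { true, calloc(1, sizeof(struct queue)) };
    return dev;
}
```

In this model, `shutdown_unsafe` lets the simulated interrupt reach freed state, while `shutdown_safe` makes the same interrupt a no-op because the interrupt path is disabled before any resource is released.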


Attachments
Screenshot-1 (23.60 KB, image/png)
2013-02-20 02:04 UTC, Min Deng
Dumpfiles (34.27 MB, application/x-zip-compressed)
2013-02-20 02:09 UTC, Min Deng
Analysis of dump (4.72 KB, application/octet-stream)
2013-02-20 02:23 UTC, Min Deng


Links
System: Red Hat Product Errata
ID: RHBA-2013:1729
Priority: normal
Status: SHIPPED_LIVE
Summary: virtio-win bug fix and enhancement update
Last Updated: 2013-11-21 00:39:25 UTC

Description Min Deng 2013-02-20 02:04:22 UTC
Created attachment 699786
Screenshot-1

Description of problem:
Windows 2003 32-bit and Windows 2003 64-bit guests hit a BSOD while running the Sleep and PNP (disable and enable) with IO Before and After (Certification) job on HCK.
Version-Release number of selected component (if applicable):
virtio-win-prewhql-0.1-54
How reproducible:
Always; the job failed in 3 out of 3 attempts.

Steps to Reproduce:
1. Boot up the guests with the following command lines:
a. /usr/libexec/qemu-kvm -M rhel6.4.0 -m 2G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2k3-32-nic2.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device rtl8139,netdev=hostnet0,mac=00:32:25:57:41:18,bus=pci.0,addr=0x4 -uuid ebaa20d6-e4b7-45f2-b735-81c857b471a9 -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win2k3-32-nic-54-2,server,nowait -mon chardev=111a,mode=readline -name win2332-2 -netdev tap,sndbuf=0,id=hostnet1,script=/etc/qemu-ifup-private,downscript=no -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:36:21:37:43:08,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :2 -vga cirrus
b. /usr/libexec/qemu-kvm -M rhel6.4.0 -m 2G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2k3-32-nic1.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device rtl8139,netdev=hostnet0,mac=00:32:25:57:41:18,bus=pci.0,addr=0x4 -uuid 1c46b6bd-f4e1-46dc-a495-a4ebdec6e665 -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win2k3-32-nic-54-1,server,nowait -mon chardev=111a,mode=readline -name win2332 -netdev tap,sndbuf=0,id=hostnet1,script=/etc/qemu-ifup-private,downscript=no -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:34:41:33:23:78,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :1 -vga cirrus

2. Submit the Sleep and PNP (disable and enable) with IO Before and After (Certification) job on HCK.
  
Actual results:
The job fails with a BSOD.

Expected results:
The job passes.

Additional info:

Comment 1 Min Deng 2013-02-20 02:09:46 UTC
Created attachment 699787
Dumpfiles

The archive includes the win2k3 32-bit and 64-bit dump files.

Comment 2 Min Deng 2013-02-20 02:23:41 UTC
Created attachment 699788
Analysis of dump

Comment 4 Mike Cao 2013-04-07 05:44:00 UTC
This comment is for QE purposes.

 kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000004, memory referenced
Arg2: d0000007, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: b9b67a17, address which referenced memory

Debugging Details:
------------------

*** No owner thread found for resource 808a5920
*** No owner thread found for resource 808a5920
*** No owner thread found for resource 808a5920

READ_ADDRESS:  00000004 

CURRENT_IRQL:  7

FAULTING_IP: 
netkvm+5a17
b9b67a17 8b4804          mov     ecx,dword ptr [eax+4]

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0xD1

PROCESS_NAME:  System

TRAP_FRAME:  f78fa7a8 -- (.trap 0xfffffffff78fa7a8)
ErrCode = 00000000
eax=00000000 ebx=f78fa87b ecx=01ce0e93 edx=0c2bc6e0 esi=8a0dea20 edi=00000000
eip=b9b67a17 esp=f78fa81c ebp=f78fa820 iopl=0         nv up ei pl nz na po nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010202
netkvm+0x5a17:
b9b67a17 8b4804          mov     ecx,dword ptr [eax+4] ds:0023:00000004=????????
Resetting default scope

LOCK_ADDRESS:  808a59a0 -- (!locks 808a59a0)

Resource @ nt!IopDeviceTreeLock (0x808a59a0)    Shared 1 owning threads
     Threads: 89ccc3c8-01<*> 
1 total locks, 1 locks currently held

PNP_TRIAGE: 
	Lock address  : 0x808a59a0
	Thread Count  : 1
	Thread address: 0x89ccc3c8
	Thread wait   : 0x2029b

LAST_CONTROL_TRANSFER:  from b9b67a17 to 8088c9fb

STACK_TEXT:  
f78fa7a8 b9b67a17 badb0d00 0c2bc6e0 00000000 nt!KiTrap0E+0x2a7
WARNING: Stack unwind information not available. Following frames may be wrong.
f78fa820 b9b68eb2 8a0dea20 00000072 89948638 netkvm+0x5a17
f78fa83c b9b629d2 8a0dea20 f78fa87b f78fa86c netkvm+0x6eb2
f78fa84c f7228409 f78fa86b f78fa87b 8a0dea20 netkvm+0x9d2
f78fa86c 8088d760 89948638 010deeec 00000000 NDIS!ndisMIsr+0x36
f78fa890 8088d709 899d1500 00000181 f78fa940 nt!KiChainedDispatch2ndLvl+0x48
f78fa890 809bbb08 899d1500 00000181 f78fa940 nt!KiChainedDispatch+0x29
f78fa910 f720be45 89791020 00004000 08a36000 nt!VfFreeCommonBuffer
f78fa940 f722d748 899d1540 00004000 00000001 NDIS!ndisFreeSharedMemory+0x69
f78fa960 b9b62e21 899d1540 00004000 00000001 NDIS!NdisMFreeSharedMemory+0x26
f78fa980 b9b66a76 8a0dea20 8a0dee28 8a0dea20 netkvm+0xe21
f78fa994 b9b66aaf 8a0dea20 8a0dea20 b9b66b7c netkvm+0x4a76
f78fa9c0 b9b62967 8a0dea20 809b7224 899d1540 netkvm+0x4aaf
f78fa9dc f721f92e 8a0dea20 88fdb908 899d1540 netkvm+0x967
f78faa18 f721fc06 899d178c f72207d9 8a084f00 NDIS!ndisMCommonHaltMiniport+0x375
f78faa20 f72207d9 8a084f00 00000000 899d1540 NDIS!ndisMHaltMiniport+0x21
f78fab4c f721cbe4 80a5ff00 899d1440 00000000 NDIS!ndisPnPRemoveDevice+0x189
f78fab7c 809b550c 899d1440 8a084f00 8a085000 NDIS!ndisPnPDispatch+0x192
f78fabac 8081df43 8090d804 f78fabe4 8090d804 nt!IovCallDriver+0x112
f78fabb8 8090d804 89cdb4a8 89cdb4a8 89cda160 nt!IofCallDriver+0x13
f78fabe4 8090da81 899d1440 f78fac10 00000000 nt!IopSynchronousCall+0xb8
f78fac38 808239e6 89cdb4a8 00000002 00000000 nt!IopRemoveDevice+0x97
f78fac60 8090f672 e14ee628 00000016 e16923d8 nt!IopRemoveLockedDeviceNode+0x160
f78fac78 8090f6d9 89cda160 00000002 e16923d8 nt!IopDeleteLockedDeviceNode+0x34
f78facac 80911637 89cdb4a8 026923d8 00000002 nt!IopDeleteLockedDeviceNodes+0x3f
f78fad40 80911908 f78fad7c 89cbe2fc e1c42148 nt!PiProcessQueryRemoveAndEject+0x7ad
f78fad5c 80911b36 f78fad7c 89ccc3c8 808ae5fc nt!PiProcessTargetDeviceEvent+0x2a
f78fad80 808804ab 893b0378 00000000 89ccc3c8 nt!PiWalkDeviceList+0x1d2
f78fadac 80949c7a 893b0378 00000000 00000000 nt!ExpWorkerThread+0xeb
f78faddc 8088e0f2 808803c0 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16


STACK_COMMAND:  kb

FOLLOWUP_IP: 
netkvm+5a17
b9b67a17 8b4804          mov     ecx,dword ptr [eax+4]

SYMBOL_STACK_INDEX:  1

SYMBOL_NAME:  netkvm+5a17

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: netkvm

IMAGE_NAME:  netkvm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  511205a3

FAILURE_BUCKET_ID:  0xD1_VRF_netkvm+5a17

BUCKET_ID:  0xD1_VRF_netkvm+5a17

Followup: MachineOwner
---------

3: kd> 

Resource @ nt!IopDeviceTreeLock (0x808a59a0)    Shared 1 owning threads
     Threads: 89ccc3c8-01<*> 
1 total locks, 1 locks currently held
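
Reading the analysis above: the faulting instruction `mov ecx,dword ptr [eax+4]` ran with eax=00000000, which matches READ_ADDRESS 00000004, and the stack shows netkvm being entered from NDIS!ndisMIsr while the halt path was inside NDIS!NdisMFreeSharedMemory. The address arithmetic can be sketched as below. `struct hypothetical_ring` and `effective_address` are made-up names for illustration only; the dump does not say which netkvm structure was actually being read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: any structure whose second member sits at offset 4.
 * The real structure behind netkvm+0x5a17 is not identified in the dump. */
struct hypothetical_ring {
    uint32_t first;   /* offset 0 */
    uint32_t second;  /* offset 4 */
};

/* Address accessed by `mov ecx, dword ptr [eax+disp]` on a 32-bit guest. */
static uint32_t effective_address(uint32_t eax, uint32_t displacement)
{
    return eax + displacement;
}
```

With eax set to 0 and a displacement of `offsetof(struct hypothetical_ring, second)`, the computed address is 4, the same value the bugcheck reports as the memory referenced. That pattern is consistent with a member read through a NULL pointer.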

Comment 5 Dmitry Fleytman 2013-04-20 12:21:17 UTC
This bug is a regression introduced by commit:

commit 138a62a00ddf5bfe4273e42c403bc6517874f538
Author: Yan Vugenfirer <yvugenfi@redhat.com>
Date:   Mon Jan 7 23:52:41 2013 +0200

    [NetKVM] BZ 878442 - Interrupt handlers refactored, Interrupt timestamping made debug only feature

Comment 8 Dmitry Fleytman 2013-05-12 12:25:07 UTC
*** Bug 959011 has been marked as a duplicate of this bug. ***

Comment 9 Min Deng 2013-05-24 03:05:52 UTC
Hi Yan

   QE re-tested the job on win2k3-32, win2k3-64 and winxp guests with build 61. The results are as follows.
   Actual results: the job passes without any errors.
   Expected results: the job passes.

   So the issue has been fixed in build 61, thanks.
Best Regards,
Min

Comment 10 Mike Cao 2013-05-28 11:01:45 UTC
Based on comment #9, this issue has already been fixed.
Moving status to VERIFIED.

Comment 16 errata-xmlrpc 2013-11-22 00:03:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1729.html

