Bug 912926
Summary: | [WHQL][netkvm] win2k3 32/64 and win xp got 0x000000D1 BSOD while running Sleep and PNP (disable and enable) with IO Before and After (Certification) (id 2067) on HCK | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Min Deng <mdeng> | ||||||||
Component: | virtio-win | Assignee: | Yvugenfi <yvugenfi> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||||
Severity: | high | Docs Contact: | |||||||||
Priority: | high | ||||||||||
Version: | 6.5 | CC: | acathrow, bcao, bsarathy, dfleytma, juzhang, lnovich, michen, qzhang, yvugenfi | ||||||||
Target Milestone: | rc | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||
Doc Text: |
Cause:
An interrupt can arrive while the device is shutting down.
Consequence:
An interrupt storm occurs on the guest, and the guest hangs.
Fix:
Disable interrupts and synchronise device shutdown with the device's DIRQL.
Result:
Device shutdown no longer causes the guest to hang.
|
Story Points: | --- | ||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2013-11-22 00:03:38 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | |||||||||||
Bug Blocks: | 957435, 958699 | ||||||||||
Attachments: |
|
Created attachment 699787 [details]
Dumpfiles
It includes win2k3 32/64 dump files
Created attachment 699788 [details]
Analysis of dump
This comment is for QE purposes.

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000004, memory referenced
Arg2: d0000007, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: b9b67a17, address which referenced memory

Debugging Details:
------------------

*** No owner thread found for resource 808a5920
*** No owner thread found for resource 808a5920
*** No owner thread found for resource 808a5920

READ_ADDRESS: 00000004

CURRENT_IRQL: 7

FAULTING_IP:
netkvm+5a17
b9b67a17 8b4804 mov ecx,dword ptr [eax+4]

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: System

TRAP_FRAME: f78fa7a8 -- (.trap 0xfffffffff78fa7a8)
ErrCode = 00000000
eax=00000000 ebx=f78fa87b ecx=01ce0e93 edx=0c2bc6e0 esi=8a0dea20 edi=00000000
eip=b9b67a17 esp=f78fa81c ebp=f78fa820 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
netkvm+0x5a17:
b9b67a17 8b4804 mov ecx,dword ptr [eax+4] ds:0023:00000004=????????
Resetting default scope

LOCK_ADDRESS: 808a59a0 -- (!locks 808a59a0)

Resource @ nt!IopDeviceTreeLock (0x808a59a0) Shared 1 owning threads
Threads: 89ccc3c8-01<*>
1 total locks, 1 locks currently held

PNP_TRIAGE:
Lock address : 0x808a59a0
Thread Count : 1
Thread address: 0x89ccc3c8
Thread wait : 0x2029b

LAST_CONTROL_TRANSFER: from b9b67a17 to 8088c9fb

STACK_TEXT:
f78fa7a8 b9b67a17 badb0d00 0c2bc6e0 00000000 nt!KiTrap0E+0x2a7
WARNING: Stack unwind information not available. Following frames may be wrong.
f78fa820 b9b68eb2 8a0dea20 00000072 89948638 netkvm+0x5a17
f78fa83c b9b629d2 8a0dea20 f78fa87b f78fa86c netkvm+0x6eb2
f78fa84c f7228409 f78fa86b f78fa87b 8a0dea20 netkvm+0x9d2
f78fa86c 8088d760 89948638 010deeec 00000000 NDIS!ndisMIsr+0x36
f78fa890 8088d709 899d1500 00000181 f78fa940 nt!KiChainedDispatch2ndLvl+0x48
f78fa890 809bbb08 899d1500 00000181 f78fa940 nt!KiChainedDispatch+0x29
f78fa910 f720be45 89791020 00004000 08a36000 nt!VfFreeCommonBuffer
f78fa940 f722d748 899d1540 00004000 00000001 NDIS!ndisFreeSharedMemory+0x69
f78fa960 b9b62e21 899d1540 00004000 00000001 NDIS!NdisMFreeSharedMemory+0x26
f78fa980 b9b66a76 8a0dea20 8a0dee28 8a0dea20 netkvm+0xe21
f78fa994 b9b66aaf 8a0dea20 8a0dea20 b9b66b7c netkvm+0x4a76
f78fa9c0 b9b62967 8a0dea20 809b7224 899d1540 netkvm+0x4aaf
f78fa9dc f721f92e 8a0dea20 88fdb908 899d1540 netkvm+0x967
f78faa18 f721fc06 899d178c f72207d9 8a084f00 NDIS!ndisMCommonHaltMiniport+0x375
f78faa20 f72207d9 8a084f00 00000000 899d1540 NDIS!ndisMHaltMiniport+0x21
f78fab4c f721cbe4 80a5ff00 899d1440 00000000 NDIS!ndisPnPRemoveDevice+0x189
f78fab7c 809b550c 899d1440 8a084f00 8a085000 NDIS!ndisPnPDispatch+0x192
f78fabac 8081df43 8090d804 f78fabe4 8090d804 nt!IovCallDriver+0x112
f78fabb8 8090d804 89cdb4a8 89cdb4a8 89cda160 nt!IofCallDriver+0x13
f78fabe4 8090da81 899d1440 f78fac10 00000000 nt!IopSynchronousCall+0xb8
f78fac38 808239e6 89cdb4a8 00000002 00000000 nt!IopRemoveDevice+0x97
f78fac60 8090f672 e14ee628 00000016 e16923d8 nt!IopRemoveLockedDeviceNode+0x160
f78fac78 8090f6d9 89cda160 00000002 e16923d8 nt!IopDeleteLockedDeviceNode+0x34
f78facac 80911637 89cdb4a8 026923d8 00000002 nt!IopDeleteLockedDeviceNodes+0x3f
f78fad40 80911908 f78fad7c 89cbe2fc e1c42148 nt!PiProcessQueryRemoveAndEject+0x7ad
f78fad5c 80911b36 f78fad7c 89ccc3c8 808ae5fc nt!PiProcessTargetDeviceEvent+0x2a
f78fad80 808804ab 893b0378 00000000 89ccc3c8 nt!PiWalkDeviceList+0x1d2
f78fadac 80949c7a 893b0378 00000000 00000000 nt!ExpWorkerThread+0xeb
f78faddc 8088e0f2 808803c0 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
netkvm+5a17
b9b67a17 8b4804 mov ecx,dword ptr [eax+4]

SYMBOL_STACK_INDEX: 1

SYMBOL_NAME: netkvm+5a17

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: netkvm

IMAGE_NAME: netkvm.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 511205a3

FAILURE_BUCKET_ID: 0xD1_VRF_netkvm+5a17

BUCKET_ID: 0xD1_VRF_netkvm+5a17

Followup: MachineOwner
---------

3: kd>
Resource @ nt!IopDeviceTreeLock (0x808a59a0) Shared 1 owning threads
Threads: 89ccc3c8-01<*>
1 total locks, 1 locks currently held

This bug is a regression introduced by the following commit:

commit 138a62a00ddf5bfe4273e42c403bc6517874f538
Author: Yan Vugenfirer <yvugenfi>
Date: Mon Jan 7 23:52:41 2013 +0200

[NetKVM] BZ 878442 - Interrupt handlers refactored, Interrupt timestamping made debug only feature

For NDIS6.x:
http://git.engineering.redhat.com/?p=users/yvugenfi/internal-kvm-guest-drivers-windows/.git;a=commit;h=26672bdc93faf65762e655546bb28bf8846d53e2
For NDIS 5.1:
http://git.engineering.redhat.com/?p=users/yvugenfi/internal-kvm-guest-drivers-windows/.git;a=commit;h=60cb9f22f566b8595ec47dbf5b3bc099a2b097fe

*** Bug 959011 has been marked as a duplicate of this bug. ***

Hi Yan,
QE re-tested the job on win2k3-32, win2k3-64 and winxp guests with build 61. The results are as follows.
Actual results: the job passes without any errors.
Expected results: the job passes.
So the issue has been fixed in build 61, thanks.
Best Regards,
Min

Based on comment #9, this issue has been fixed. Moving status to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-1729.html
|
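For readers triaging the dump above, the following is a minimal sketch of the arithmetic behind two of its fields. All numeric values are copied from the !analyze -v output in this report; the helper names effective_address and module_base are made up for illustration and are not part of any debugger API. A zero base register plus a small displacement is commonly the signature of reading a structure field through a NULL pointer, which is consistent with (but not proof of) the interrupt handler touching state that the halt path had already torn down.

```python
# Values copied from the !analyze -v output in this report.
EAX = 0x00000000           # eax in the trap frame
DISPLACEMENT = 0x4         # from "mov ecx,dword ptr [eax+4]"
READ_ADDRESS = 0x00000004  # Arg1 (memory referenced)
FAULTING_IP = 0xB9B67A17   # Arg4 (address which referenced memory)
IP_OFFSET = 0x5A17         # from the symbol "netkvm+5a17"


def effective_address(base: int, displacement: int) -> int:
    """Address read by a 32-bit [base+displacement] operand (hypothetical helper)."""
    return (base + displacement) & 0xFFFFFFFF


def module_base(ip: int, offset: int) -> int:
    """Load address implied by a "module+offset" symbol (hypothetical helper)."""
    return ip - offset


# The computed operand address matches the READ_ADDRESS reported by the debugger.
print(hex(effective_address(EAX, DISPLACEMENT)))  # 0x4
# Implied load address of netkvm.sys in this dump.
print(hex(module_base(FAULTING_IP, IP_OFFSET)))   # 0xb9b62000
```

The same subtraction applied to the other netkvm frames in STACK_TEXT (for example return address b9b68eb2 and offset 0x6eb2) gives the same base, which is a quick sanity check that the frames belong to one module image.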
Created attachment 699786 [details]
Screenshot-1

Description of problem:
Windows 2003 32-bit and Windows 2003 64-bit guests got a BSOD while running "Sleep and PNP (disable and enable) with IO Before and After (Certification)" on HCK.

Version-Release number of selected component (if applicable):
virtio-win-prewhql-0.1-54

How reproducible:
3 out of 3 attempts failed

Steps to Reproduce:
1. Boot up the guests with the following CLI:
a. /usr/libexec/qemu-kvm -M rhel6.4.0 -m 2G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2k3-32-nic2.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device rtl8139,netdev=hostnet0,mac=00:32:25:57:41:18,bus=pci.0,addr=0x4 -uuid ebaa20d6-e4b7-45f2-b735-81c857b471a9 -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win2k3-32-nic-54-2,server,nowait -mon chardev=111a,mode=readline -name win2332-2 -netdev tap,sndbuf=0,id=hostnet1,script=/etc/qemu-ifup-private,downscript=no -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:36:21:37:43:08,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :2 -vga cirrus
b. /usr/libexec/qemu-kvm -M rhel6.4.0 -m 2G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2k3-32-nic1.raw,format=raw,if=none,id=drive-virtio0,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device rtl8139,netdev=hostnet0,mac=00:32:25:57:41:18,bus=pci.0,addr=0x4 -uuid 1c46b6bd-f4e1-46dc-a495-a4ebdec6e665 -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win2k3-32-nic-54-1,server,nowait -mon chardev=111a,mode=readline -name win2332 -netdev tap,sndbuf=0,id=hostnet1,script=/etc/qemu-ifup-private,downscript=no -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:34:41:33:23:78,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :1 -vga cirrus
2. Submit the job "Sleep and PNP (disable and enable) with IO Before and After (Certification)" on HCK.

Actual results:
The job failed with a BSOD.

Expected results:
The job passes.

Additional info:
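The Doc Text for this bug describes the fix as disabling interrupts and synchronising device shutdown with the device's DIRQL, and the stack trace shows an interrupt handler (NDIS!ndisMIsr) running on top of a halt path that was freeing shared memory. The ordering problem can be sketched in a small user-space model. This is not the NetKVM code: the class ModelAdapter, its fields and both halt methods are hypothetical, and a plain lock stands in for "running at the device's DIRQL".

```python
import threading


class ModelAdapter:
    """Toy stand-in for an adapter context (hypothetical, not the driver's structures)."""

    def __init__(self):
        # Assumption: this lock models code that runs synchronised with the ISR.
        self.isr_lock = threading.Lock()
        self.interrupts_enabled = True
        # Assumption: this dict models queue memory that the halt path releases.
        self.rx_queue = {"pending": 3}

    def isr(self) -> bool:
        """Model interrupt handler; returns True if the interrupt was handled."""
        with self.isr_lock:
            if not self.interrupts_enabled:
                return False  # device is shutting down, leave the queues alone
            return self.rx_queue["pending"] >= 0

    def halt_unsafe(self) -> None:
        """Release the queue without telling the ISR first (the racy ordering)."""
        self.rx_queue = None

    def halt_synchronised(self) -> None:
        """Disable interrupts under the ISR lock, then release the queue."""
        with self.isr_lock:
            self.interrupts_enabled = False
        self.rx_queue = None


if __name__ == "__main__":
    racy = ModelAdapter()
    racy.halt_unsafe()
    try:
        racy.isr()
    except TypeError as exc:
        print("late interrupt touched released state:", exc)

    fixed = ModelAdapter()
    fixed.halt_synchronised()
    print("late interrupt handled:", fixed.isr())
```

In the model, an interrupt delivered after halt_unsafe() dereferences released state, while the same interrupt after halt_synchronised() is declined before any queue access. The real driver has to achieve the equivalent ordering with kernel facilities rather than a Python lock.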