Bug 1122364
Summary: | [virtio-win][vioser][rhel6]win2k8r2 guest bsod(7e) when do continuous hotplug/unplug during virtio serial in use with driver verifier enabled | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | lijin <lijin> |
Component: | virtio-win | Assignee: | Ladi Prosek <lprosek> |
virtio-win sub component: | virtio-win-prewhql | QA Contact: | Virtualization Bugs <virt-bugs> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | medium | CC: | ailan, ghammer, hhuang, lijin, lmiksik, lprosek, michen, phou, rbalakri, virt-maint, vrozenfe, wyu, yvugenfi |
Version: | 7.2 | ||
Target Milestone: | rc | ||
Target Release: | 7.2 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
NO_DOCS
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2016-11-04 08:43:32 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
lijin
2014-07-23 05:21:41 UTC
guest hit this while enable driver verifier; guest won't bsod with driver verifier disabled(try 3 times) Does it reproduce if you increase the all sleep times from 3 to 10? One more question. Does it also reproduce on a rhel-7 host? 1.increase the all sleep times from 3 to 10 ---> still hit this issue 2.can reproduce on rhel7 host: package info: qemu-kvm-rhev-1.5.3-60.el7ev_0.2.x86_64 kernel-3.10.0-133.el7.x86_64 seabios-1.7.2.2-12.el7.x86_64 lijin, pls try whether you can reproduce it w/ virtio-win-1.7.1 (In reply to Mike Cao from comment #7) > lijin, pls try whether you can reproduce it w/ virtio-win-1.7.1 can reproduce it with virtio-win-1.7.1 Is the verifier is set with the standard settings or all the flags are enabled (see bug 1113910)? Is it reproducible when the serial ports are not in use? Thanks. 1.the verifier is set with the standard settings:# verifier.exe /standard /driver vioser.sys 2.cannot reproduce this issue when the serial ports are not in use winxp hit the same issue with build94 How reproducible: 2/5 package info: virtio-win-prewhql-94 qemu-kvm-rhev-2.1.2-12.el7.x86_64 kernel-3.10.0-208.el7.x86_64 seabios-1.7.5-5.el7.x86_64 spice-server-0.12.4-8.el7.x86_64 qxl-win-0.1-21 steps: 1.boot guest with: /usr/libexec/qemu-kvm -drive file=winxp-new.raw,if=none,cache=none,media=disk,format=raw,id=drive-ide0-0-1 -device ide-drive,id=ide1,drive=drive-ide0-0-1,bus=ide.1,unit=1,bootindex=1 -spice port=5900,disable-ticketing -vga qxl -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -qmp tcp:0:4444,server,nowait -usb -device usb-tablet,id=tablet1 -monitor stdio -boot menu=on -netdev tap,script=/etc/qemu-ifup,id=hostnet1 -device rtl8139,netdev=hostnet1,id=net1,mac=00:52:54:fe:48:68 -cpu SandyBridge -M pc -smp 2 -m 2G -enable-kvm \ -device virtio-serial-pci,id=virtio-serial0,max_ports=31 -chardev socket,id=channel1,host=127.0.0.1,port=12345,server,nowait -device virtserialport,nr=1,chardev=channel1,name=com.redhat.rhevm.vdsm1,bus=virtio-serial0.0,id=port1 -chardev socket,id=channel2,host=127.0.0.1,port=12346,server,nowait -device virtserialport,nr=2,chardev=channel2,name=com.redhat.rhevm.vdsm2,bus=virtio-serial0.0,id=port2 2.Transferring data form host to guest via port0 in a loop eg:in the guest # for ((;;)) ;do python VirtIOChannel_guest_reieve.py com.redhat.rhevm.vdsm1;done on the host # for ((;;)) ;do python serial-host-send.py; sleep 1; done 3.Transferring data from guest to host via port1 in a loop eg: on the host # for ((;;)) ; do python serial-host-receive.py; sleep 1; done in the guest #for ((;;)) ; do python VirtIOChannel_guest_send.py com.redhat.rhevm.vdsm2;done 4.Continuous hotplug/unplug virtioserialport: # cat hot-port.sh #!/bin/bash # some simply scripts for balloon device hotplug/unplug in a loop let i=0 exec 3<>/dev/tcp/localhost/4444 #note modify this to qmp port echo -e "{ 'execute': 'qmp_capabilities' }" >&3 read response <&3 echo $response while [ $i -lt 100 ] do echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'port1' }}">&3 ; sleep 3 ; read response <&3 ; echo "$i: $response" echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'port2' }}">&3 ; sleep 3 ; read response <&3 ; echo "$i: $response" sleep 3 ; echo -e "{'execute':'device_add','arguments':{'driver':'virtserialport','chardev':'channel1','name':'com.redhat.rhevm.vdsm1','bus':'virtio-serial0.0','id':'port1'}}">&3 ; sleep 3 ; read response <&3 echo "$i: $response" echo -e "{'execute':'device_add','arguments':{'driver':'virtserialport','chardev':'channel2','name':'com.redhat.rhevm.vdsm2','bus':'virtio-serial0.0','id':'port2'}}">&3 ; sleep 3 ; read response <&3 echo "$i: $response" let i=$i+1 done actual results: during step4,guest bsod with 7E windbg info: 1: kd> !analyze -v ******************************************************************************* * * * Bugcheck Analysis * * * ******************************************************************************* SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e) This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address. Arguments: Arg1: 80000003, The exception code that was not handled Arg2: 8052b5d0, The address that the exception occurred at Arg3: bad037a8, Exception Record Address Arg4: bad034a4, Context Record Address Debugging Details: ------------------ *** No owner thread found for resource 8055b4e0 *** No owner thread found for resource 8055b4e0 *** No owner thread found for resource 8055b4e0 EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid FAULTING_IP: nt!DbgBreakPoint+0 8052b5d0 cc int 3 EXCEPTION_RECORD: bad037a8 -- (.exr 0xffffffffbad037a8) ExceptionAddress: 8052b5d0 (nt!DbgBreakPoint) ExceptionCode: 80000003 (Break instruction exception) ExceptionFlags: 00000000 NumberParameters: 3 Parameter[0]: 00000000 Parameter[1]: 8052b704 Parameter[2]: 00000061 CONTEXT: bad034a4 -- (.cxr 0xffffffffbad034a4;r) eax=89770b70 ebx=00000000 ecx=8052b704 edx=00000061 esi=804faea6 edi=00000102 eip=8052b5d0 esp=bad03870 ebp=bad0388c iopl=0 nv up ei pl nz na po nc cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000202 nt!DbgBreakPoint: 8052b5d0 cc int 3 Last set context: eax=89770b70 ebx=00000000 ecx=8052b704 edx=00000061 esi=804faea6 edi=00000102 eip=8052b5d0 esp=bad03870 ebp=bad0388c iopl=0 nv up ei pl nz na po nc cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000202 nt!DbgBreakPoint: 8052b5d0 cc int 3 Resetting default scope DEFAULT_BUCKET_ID: DRIVER_FAULT BUGCHECK_STR: 0x7E PROCESS_NAME: System ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION} Breakpoint A breakpoint has been reached. EXCEPTION_PARAMETER1: 00000000 EXCEPTION_PARAMETER2: 8052b704 EXCEPTION_PARAMETER3: 00000061 ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre LOCK_ADDRESS: 8055b560 -- (!locks 8055b560) Resource @ nt!IopDeviceTreeLock (0x8055b560) Shared 1 owning threads Threads: 89bb43c8-01<*> 1 total locks, 1 locks currently held PNP_TRIAGE: Lock address : 0x8055b560 Thread Count : 1 Thread address: 0x89bb43c8 Thread wait : 0x9751 LAST_CONTROL_TRANSFER: from 805cff2a to 804f9f33 STACK_TEXT: bad0386c b1847c57 8977ffb8 8977fe00 00000000 nt!DbgBreakPoint bad0388c b18374af 8977ffb8 b185dd18 768801f8 wdf01000!_FX_DRIVER_GLOBALS::WaitForSignal+0x5e bad038ac b18392a7 00000000 b185ffc8 89a54d00 wdf01000!FxIoQueue::StopProcessingForPower+0xcf bad038cc b1830e61 00000001 bad038f4 b184ecdc wdf01000!FxPkgIo::StopProcessingForPower+0xbd bad038d8 b184ecdc 89628150 00000001 b18602f8 wdf01000!FxDeviceToMx::FxPkgIo_StopProcessingForPower+0x16 bad038f4 b184dd81 89a54d00 89a54e0c 89a54d00 wdf01000!FxPkgPnp::PowerGotoD3Stopped+0x47 bad0397c b184ebb2 00000314 89a54e0c 89a54d00 wdf01000!FxPkgPnp::PowerEnterNewState+0x11c bad039a0 b184f5bb bad039b8 b18621d4 89a54d00 wdf01000!FxPkgPnp::PowerProcessEventInner+0x171 bad039c4 b1854e73 00000001 bad03a58 b1854832 wdf01000!FxPkgPnp::PowerProcessEvent+0x15c bad039d0 b1854832 89a54d00 89a54e7c 89a54d00 wdf01000!FxPkgPnp::PowerPolStopping+0x1a bad03a58 b1855716 0000055b 89a54e7c 89a54d00 wdf01000!FxPkgPnp::PowerPolicyEnterNewState+0x11c bad03a7c b1856388 bad03a94 00000008 89a54d00 wdf01000!FxPkgPnp::PowerPolicyProcessEventInner+0x185 bad03aa0 b1852b74 00000001 00000000 b185309c wdf01000!FxPkgPnp::PowerPolicyProcessEvent+0x172 bad03aac b185309c bad03adc b1852484 89a54d00 wdf01000!FxPkgPnp::PnpPowerPolicySurpriseRemove+0xc bad03ab4 b1852484 89a54d00 89a54da8 89a54d00 wdf01000!FxPkgPnp::PnpEventFailedIoStarting+0xd bad03adc b1852db2 00000127 89a54da8 89a54d00 wdf01000!FxPkgPnp::PnpEnterNewState+0x104 bad03b00 b185347a bad03b18 89770b70 89a54d00 wdf01000!FxPkgPnp::PnpProcessEventInner+0x149 bad03b24 b184c540 00000400 00000000 89a54d00 wdf01000!FxPkgPnp::PnpProcessEvent+0x13e bad03b38 b185067a bad03b70 89a54d00 bad03b68 wdf01000!FxPkgPnp::PnpSurpriseRemoval+0x29 bad03b48 b184be02 89a54d00 bad03b70 8a406f68 wdf01000!FxPkgPdo::_PnpSurpriseRemoval+0x1e bad03b68 b1828a3f 8a406f68 bad03b90 b1828c63 wdf01000!FxPkgPnp::Dispatch+0x207 bad03b74 b1828c63 89a23830 8a406f68 89a23830 wdf01000!FxDevice::Dispatch+0x7f bad03b90 804ef18f 89a23830 8a406f68 806e6428 wdf01000!FxDevice::DispatchWithLock+0x7b bad03ba0 80658128 8a406ffc bad03c40 8a406f68 nt!IopfCallDriver+0x31 bad03bc4 80592b63 89a23830 89a23830 00000017 nt!IovCallDriver+0xa0 bad03bf0 80592dc5 89a23830 bad03c1c 00000000 nt!IopSynchronousCall+0xb7 bad03c44 805938e4 89a23830 00000017 e1549ee8 nt!IopRemoveDevice+0x93 bad03c60 80594786 00000308 e1549ee8 00000000 nt!IopSurpriseRemoveLockedDeviceNode+0x90 bad03c74 805947fd 896096f8 00000003 e1549ee8 nt!IopDeleteLockedDeviceNode+0x24 bad03ca8 8059a1d2 89a23830 02549ee8 00000003 nt!IopDeleteLockedDeviceNodes+0x3f bad03d3c 8059a72e bad03d78 806e6974 e1d88c28 nt!PiProcessQueryRemoveAndEject+0x4d2 bad03d58 8059a874 bad03d78 89a22d80 8056485c nt!PiProcessTargetDeviceEvent+0x2a bad03d7c 8053876d 89a22d80 00000000 89bb43c8 nt!PiWalkDeviceList+0xea bad03dac 805cff64 89a22d80 00000000 00000000 nt!ExpWorkerThread+0xef bad03ddc 805460de 8053867e 00000001 00000000 nt!PspSystemThreadStartup+0x34 00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16 FOLLOWUP_IP: wdf01000!_FX_DRIVER_GLOBALS::WaitForSignal+5e b1847c57 8d45f4 lea eax,[ebp-0Ch] SYMBOL_STACK_INDEX: 1 SYMBOL_NAME: wdf01000!_FX_DRIVER_GLOBALS::WaitForSignal+5e FOLLOWUP_NAME: MachineOwner MODULE_NAME: wdf01000 IMAGE_NAME: wdf01000.sys DEBUG_FLR_IMAGE_TIMESTAMP: 4a5bbf28 STACK_COMMAND: .cxr 0xffffffffbad034a4 ; kb FAILURE_BUCKET_ID: 0x7E_BADMEMREF_wdf01000!_FX_DRIVER_GLOBALS::WaitForSignal+5e BUCKET_ID: 0x7E_BADMEMREF_wdf01000!_FX_DRIVER_GLOBALS::WaitForSignal+5e ANALYSIS_SOURCE: KM FAILURE_ID_HASH_STRING: km:0x7e_badmemref_wdf01000!_fx_driver_globals::waitforsignal+5e FAILURE_ID_HASH: {7f9ce2e3-e036-29c0-4327-6bd86a1c7722} Followup: MachineOwner --------- (In reply to lijin from comment #11) > # cat hot-port.sh > #!/bin/bash > # some simply scripts for balloon device hotplug/unplug in a loop > let i=0 > exec 3<>/dev/tcp/localhost/4444 #note modify this to qmp port > echo -e "{ 'execute': 'qmp_capabilities' }" >&3 > read response <&3 > echo $response > while [ $i -lt 100 ] > do > echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'port1' }}">&3 ; > sleep 3 ; > read response <&3 ; > echo "$i: $response" > echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'port2' }}">&3 ; > sleep 3 ; > read response <&3 ; > echo "$i: $response" > sleep 3 ; > echo -e > "{'execute':'device_add','arguments':{'driver':'virtserialport','chardev': > 'channel1','name':'com.redhat.rhevm.vdsm1','bus':'virtio-serial0.0','id': > 'port1'}}">&3 ; > sleep 3 ; > read response <&3 > echo "$i: $response" > echo -e > "{'execute':'device_add','arguments':{'driver':'virtserialport','chardev': > 'channel2','name':'com.redhat.rhevm.vdsm2','bus':'virtio-serial0.0','id': > 'port2'}}">&3 ; > sleep 3 ; > read response <&3 > echo "$i: $response" > let i=$i+1 > done modify all "sleep 3" to "sleep 7" to give guest enough time to unplug,still hit this issue(tried three times,guest bsod once) still hit this issue on build 102 package info: kernel-2.6.32-540.el6.x86_64 qemu-kvm-rhev-0.12.1.2-2.454.el6.x86_64 seabios-0.6.1.2-29.el6.x86_64 spice-server-0.12.4-12.el6.x86_64 virtio-win-prewhql-102 steps same as comment #0 Actual Results: on build 86 and 102,guest bsod with 7e when hotunplug/hotplug in loop while serial ports in use; Based on above,this issue has NOT been fixed. win2012R2 guest also hit this issue,dump file is located in the same place as comment#16 Taking a look. I was able to repro this. It seems that only one port and pushing data through it from guest to host is enough to hit it. With verifier off, this manifests as the write call hanging forever after the port is unplugged. Maybe an IRP is never completed? Digging deeper... win2012R2 still hit this issue with following package: virtio-win-1.7.5-0.el6.noarch qemu-kvm-rhev-0.12.1.2-2.482.el6.x86_64 kernel-2.6.32-595.el6.x86_64 seabios-0.6.1.2-30.el6.x86_64 (In reply to Ladi Prosek from comment #22) > I was able to repro this. It seems that only one port and pushing data > through it from guest to host is enough to hit it. Good to hear. I was unable to reproduce it on my setup. Do you have a window crash dump file to share? Thanks. > With verifier off, this manifests as the write call hanging forever after > the port is unplugged. Maybe an IRP is never completed? Digging deeper... Yes, I have a dump. I'll start copying it over but I suspect that it will be the same thing that lijin shared earlier. At least the blame callstack shown by !analyze is identical. Here's something potentially more useful - verbose debug output of the driver just before the hang. One thing I still want to verify is that the last write in the log actually corresponds to the userspace write that hangs. 00102727 62.04384613 VIOSerialPortCreate Port id = 3 00102728 62.04384995 --> VIOSerialReclaimConsumedBuffers 00102729 62.04385757 <-- VIOSerialReclaimConsumedBuffers Full: 0 00102730 62.04386139 --> VIOSerialSendCtrlMsg vq = FFFFD0016E41D000 00102731 62.04387283 <-- VIOSerialSendCtrlMsg cnt = 0 00102732 62.04388046 <-- VIOSerialPortCreate 00102733 62.04390335 --> VIOSerialPortDeviceControl 00102734 62.04391098 <-- VIOSerialPortDeviceControl 00102735 62.04421997 --> VIOSerialPortWrite Request: 00001FFF2ADCF7E8 Length: 16384 00102736 62.04423141 --> VIOSerialWillWriteBlock 00102737 62.04425049 --> VIOSerialReclaimConsumedBuffers 00102738 62.04426193 <-- VIOSerialReclaimConsumedBuffers Full: 0 00102739 62.04426575 <-- VIOSerialWillWriteBlock 00102740 62.04428864 --> VIOSerialSendBuffers Buffer: FFFFE000D4400000 Length: 16384 00102741 62.04431152 <-- VIOSerialSendBuffers 00102742 62.04431534 <-- VIOSerialPortWrite 00102743 62.04757309 --> VIOSerialInterruptIsr 00102744 62.04758835 <-- VIOSerialInterruptIsr Serviced: 1 00102745 62.04761887 --> VIOSerialInterruptIsr 00102746 62.04762268 --> VIOSerialInterruptDpc 00102747 62.04762650 <-- VIOSerialInterruptIsr Serviced: 1 00102748 62.04763031 --> VIOSerialCtrlWorkHandler 00102749 62.04763412 --> VIOSerialQueuesInterruptDpc 00102750 62.04764175 <-- VIOSerialCtrlWorkHandler 00102751 62.04764557 --> VIOSerialGetInBuf 00102752 62.04764938 <-- VIOSerialInterruptDpc 00102753 62.04765320 <-- VIOSerialGetInBuf 00102754 62.04765701 --> VIOSerialReclaimConsumedBuffers 00102755 62.04767990 <-- VIOSerialReclaimConsumedBuffers Full: 0 00102756 62.04768372 <-- VIOSerialQueuesInterruptDpc 00102757 62.04793549 --> VIOSerialPortClose 00102758 62.04793930 --> VIOSerialSendCtrlMsg vq = FFFFD0016E41D000 00102759 62.04795456 <-- VIOSerialSendCtrlMsg cnt = 0 00102760 62.04796219 --> VIOSerialDiscardPortDataLocked 00102761 62.04796600 <-- VIOSerialDiscardPortDataLocked 00102762 62.04797363 --> VIOSerialReclaimConsumedBuffers 00102763 62.04798126 <-- VIOSerialReclaimConsumedBuffers Full: 0 00102764 62.04798508 <-- VIOSerialPortClose 00102765 62.04801941 VIOSerialPortCreate Port id = 3 00102766 62.04803085 --> VIOSerialReclaimConsumedBuffers 00102767 62.04803467 <-- VIOSerialReclaimConsumedBuffers Full: 0 00102768 62.04803848 --> VIOSerialSendCtrlMsg vq = FFFFD0016E41D000 00102769 62.04804611 <-- VIOSerialSendCtrlMsg cnt = 0 00102770 62.04805756 <-- VIOSerialPortCreate 00102771 62.04807281 --> VIOSerialPortDeviceControl 00102772 62.04807663 <-- VIOSerialPortDeviceControl 00102773 62.04811859 --> VIOSerialInterruptIsr 00102774 62.04812241 <-- VIOSerialInterruptIsr Serviced: 1 00102775 62.04813004 --> VIOSerialInterruptDpc 00102776 62.04813766 --> VIOSerialCtrlWorkHandler 00102777 62.04814529 <-- VIOSerialCtrlWorkHandler 00102778 62.04814911 <-- VIOSerialInterruptDpc 00102779 62.04817963 --> VIOSerialInterruptIsr 00102780 62.04819107 <-- VIOSerialInterruptIsr Serviced: 1 00102781 62.04819489 --> VIOSerialInterruptDpc 00102782 62.04819870 --> VIOSerialCtrlWorkHandler 00102783 62.04820633 <-- VIOSerialCtrlWorkHandler 00102784 62.04821396 <-- VIOSerialInterruptDpc 00105619 62.22101593 --> VIOSerialInterruptIsr 00105620 62.22103119 <-- VIOSerialInterruptIsr Serviced: 1 00105621 62.22104263 --> VIOSerialInterruptIsr 00105622 62.22104263 <-- VIOSerialInterruptIsr Serviced: 1 00105623 62.22105789 --> VIOSerialInterruptDpc 00105624 62.22105789 --> VIOSerialCtrlWorkHandler 00105625 62.22106171 --> VIOSerialHandleCtrlMsg 00105626 62.22106552 VIOSerialFindPortById port = 3 00105627 62.22107315 VIOSerialFindPortById id = 3 port = 0xFFFFE000D6B5B198 00105628 62.22107315 VIRTIO_CONSOLE_PORT_REMOVE id = 3 00105629 62.22108078 --> VIOSerialRemovePort DeviceId: 0 PortId: 3 00105630 62.22108459 --> VIOSerialEvtChildListIdentificationDescriptionCompare 00105631 62.22108841 <-- VIOSerialEvtChildListIdentificationDescriptionCompare 00105632 62.22109222 <-- VIOSerialRemovePort 00105633 62.22109604 <-- VIOSerialHandleCtrlMsg 00105634 62.22109985 --> VIOSerialAddInBuf buf = FFFFE000D4E01110 00105635 62.22110748 <-- VIOSerialAddInBuf 00105636 62.22111130 <-- VIOSerialCtrlWorkHandler 00105637 62.22111511 <-- VIOSerialInterruptDpc 00105638 62.22111893 --> VIOSerialQueuesInterruptDpc 00105639 62.22111893 <-- VIOSerialQueuesInterruptDpc 00105640 68.22449493 --> VIOSerialInterruptIsr 00105641 68.22449493 <-- VIOSerialInterruptIsr Serviced: 1 00105642 68.22450256 --> VIOSerialInterruptDpc 00105643 68.22450256 --> VIOSerialCtrlWorkHandler 00105644 68.22451019 --> VIOSerialHandleCtrlMsg 00105645 68.22451782 VIOSerialFindPortById port = 3 00105646 68.22451782 <-- VIOSerialFindPortById 00105647 68.22452545 Invlid index 3 in control packet 00105648 68.22452545 VIRTIO_CONSOLE_PORT_OPEN invalid id = 3 00105649 68.22453308 <-- VIOSerialHandleCtrlMsg 00105650 68.22454071 --> VIOSerialAddInBuf buf = FFFFE000D4E28FD0 00105651 68.22454071 <-- VIOSerialAddInBuf 00105652 68.22454834 --> VIOSerialHandleCtrlMsg 00105653 68.22454834 VIOSerialFindPortById port = 3 00105654 68.22454834 <-- VIOSerialFindPortById 00105655 68.22455597 VIOSerialAddPort DeviceId = 0 :: PortId = 3 00105656 68.22455597 --> VIOSerialEvtChildListIdentificationDescriptionCompare 00105657 68.22456360 <-- VIOSerialEvtChildListIdentificationDescriptionCompare 00105658 68.22457123 --> VIOSerialEvtChildListIdentificationDescriptionDuplicate 00105659 68.22457123 <-- VIOSerialEvtChildListIdentificationDescriptionDuplicate 00105660 68.22457123 WdfChildListAddOrUpdateChildDescriptionAsPresent = 0x0. 00105661 68.22457886 <-- VIOSerialHandleCtrlMsg 00105662 68.22457886 --> VIOSerialAddInBuf buf = FFFFE000D4E28F90 00105663 68.22458649 <-- VIOSerialAddInBuf 00105664 68.22459412 <-- VIOSerialCtrlWorkHandler 00105665 68.22459412 <-- VIOSerialInterruptDpc And this is what happened after I killed the hanging process. Only now was the D0 exit delivered to the driver. 00105666 1474.66308594 --> VIOSerialPortWriteRequestCancel Request: 0x00001FFF2ADCF7E8 00105667 1474.66308594 <-- VIOSerialPortWriteRequestCancel 00105668 1474.66308594 --> VIOSerialPortEvtDeviceD0Exit TargetState: 5 00105669 1474.66308594 --> VIOSerialDisableInterruptQueue disable 00105670 1474.66308594 <-- VIOSerialDisableInterruptQueue 00105671 1474.66308594 --> VIOSerialDiscardPortDataLocked 00105672 1474.66308594 <-- VIOSerialDiscardPortDataLocked 00105673 1474.66308594 --> VIOSerialReclaimConsumedBuffers 00105674 1474.66308594 <-- VIOSerialReclaimConsumedBuffers Full: 1 00105675 1474.66308594 --> VIOSerialFreeBuffer buf = FFFFE000D4FE9ED0, buf->va_buf = FFFFE000D4FEE000 00105676 1474.66308594 <-- VIOSerialFreeBuffer ... 00105931 1474.66406250 <-- VIOSerialPortEvtDeviceD0Exit This is what I believe is happening: 1) Upon receiving the VIRTIO_CONSOLE_PORT_REMOVE command, VIOSerialRemovePort() marks the port as "missing" 2) When the inflight write completes (QEMU explicitly flushes out all unconsumed buffers from the queue after port removal), the driver is notified as usual but fails to handle the event because VIOSerialQueuesInterruptDpc() enumerates only "present" ports 3) Therefore VIOSerialReclaimConsumedBuffers() is never called and the write request is never completed I have verified that immediately completing the inflight write request in VIOSerialRemovePort() solves the problem but it may not be the right way to fix this. I'll continue investigating next week and will send out a patch once I properly wrap my head around all this :) Change product from rhel6 to rhel7 for the following reasons: 1.virtio-win share the same stream on rhel6 and rhel7; 2.QE and Dev have an agreement that bug fix of virtio-win on rhel7 should also work on rhel6,we will track virtio-win bug via rhel7 product. On rhel6: This issue can be reproduced with virtio-win-prewhql-102. Verified this issue with virtio-win-prewhql-115, not hit BSOD. On rhel7: This issue cannot be reproduced with virtio-win-prewhql-102. Verified this issue with virtio-win-prewhql-115, not hit BSOD. Based on above,this issue has been fixed. According to comment#36, the bug has been fixed ,change status to verified. Thanks wyu Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2609.html |