Bug 1054640

Summary: [virtio-win][netkvm]windows 8.1 x86 BSOD on DRIVER_POWER_STATE_FAILURE (9f)
Product: Red Hat Enterprise Linux 7
Reporter: Chao Yang <chayang>
Component: virtio-win
Assignee: Yvugenfi <yvugenfi>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 7.0
CC: chayang, ghammer, hhuang, juzhang, knoel, mdeng, michen, rbalakri, virt-bugs, virt-maint, vrozenfe
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text: NO_DOCS
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-24 08:39:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1100308
Bug Blocks:
Attachments (Description | Flags):
BSOD log | none
minidump of this bsod | none
script to repeatedly hot plug/unplug virtio-net-nic | none

Description Chao Yang 2014-01-17 07:43:50 UTC
Created attachment 851443 [details]
BSOD log

Description of problem:
Booted a Windows 8.1 x86 guest, then hot plugged and unplugged virtio-net-pci 300 times. The guest hit a BSOD when it was shut down.

Version-Release number of selected component (if applicable):
qemu-kvm-1.5.3-31.el7.x86_64
3.10.0-66.el7.x86_64
prewhql-74

How reproducible:
1/1

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
CLI:
/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.0.0 -cpu Opteron_G3,hv_spinlocks=0x1fff,hv_relaxed,hv_vapic -smp 2,sockets=2,threads=2,cores=2,maxcpus=8 -m 4096 -enable-kvm -rtc base=localtime,clock=host,driftfix=slew -drive file=/home/chayang/windows_8_32.qcow2_v3.bak.bak.bak,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,bus=pci.0 -device scsi-hd,bus=scsi0.0,drive=drive-virtio-disk,id=disk,bootindex=1 -device virtio-balloon-pci,id=balloon0 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:1a:4a:42:48:71 -k en-us -boot menu=on -qmp tcp:0:4451,server,nowait -vnc :4 -vga cirrus -monitor stdio -S
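
For readers who want to approximate the reproduction, the following is a minimal sketch of a hot unplug/plug loop driven over the QMP socket from the command line above (`-qmp tcp:0:4451`). It is not the attached script, whose contents are not shown in this report. The function names, the fixed delay, the ordering (unplug first, since the guest boots with the NIC present), and the optional final unplug are all assumptions made for illustration. `qmp_capabilities`, `device_del`, and `device_add` are standard QMP commands; the netdev `hostnet0` is left in place and reused on every iteration.

```python
import json
import socket
import time


def build_hotplug_commands(iterations, netdev="hostnet0",
                           dev_id="virtio-net-pci0",
                           mac="00:1a:4a:42:48:71", final_unplug=True):
    """Build the QMP command list for repeated unplug/plug cycles.

    Assumption: the guest starts with the NIC plugged, so each cycle
    unplugs first and then plugs the device back in.
    """
    unplug = {"execute": "device_del", "arguments": {"id": dev_id}}
    plug = {"execute": "device_add",
            "arguments": {"driver": "virtio-net-pci", "netdev": netdev,
                          "id": dev_id, "mac": mac}}
    cmds = [{"execute": "qmp_capabilities"}]  # required before other commands
    for _ in range(iterations):
        cmds.append(unplug)
        cmds.append(plug)
    if final_unplug:
        cmds.append(unplug)  # leave the device unplugged at the end
    return cmds


def run(cmds, host="127.0.0.1", port=4451, delay=3.0):
    """Send the commands to a QMP socket, one per line.

    Assumption: a fixed delay gives the guest time to finish each
    hot plug/unplug; the line read back may be an asynchronous event
    rather than the command's direct response.
    """
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("r")
        reader.readline()  # discard the QMP greeting banner
        for cmd in cmds:
            sock.sendall((json.dumps(cmd) + "\r\n").encode())
            print(reader.readline().strip())
            time.sleep(delay)
```

Calling `run(build_hotplug_commands(300))` against a running guest would perform 300 cycles. A more robust loop would wait for the `DEVICE_DELETED` event after each `device_del` instead of sleeping.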

Comment 2 Yvugenfi@redhat.com 2014-01-19 10:10:00 UTC
Please attach the dump file or upload it to an accessible location (zip it first).

Best regards,
Yan.

Comment 3 Chao Yang 2014-01-20 02:44:51 UTC
Created attachment 852598 [details]
minidump of this bsod

Comment 4 Yvugenfi@redhat.com 2014-01-20 13:52:32 UTC
Another question - what was the end result of the 300 plug\unplug script?

Device plugged or device unplugged?

Thanks,
Yan.

Comment 5 Yvugenfi@redhat.com 2014-01-20 14:38:40 UTC
Microsoft (R) Windows Debugger Version 6.2.9200.16384 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [E:\temp\Yan\dumps\BZ1054640\011714-12250-01.dmp\011714-12250-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: C:\Users\yan\symbols\local;SRV*C:\Users\yan\symbols\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows 8 Kernel Version 9600 MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 9600.16404.x86fre.winblue_gdr.130913-2141
Machine Name:
Kernel base = 0x8141d000 PsLoadedModuleList = 0x81616218
Debug session time: Fri Jan 17 19:50:30.217 2014 (UTC + 2:00)
System Uptime: 0 days 16:08:08.249
Loading Kernel Symbols
...............................................................
........................................................
Loading User Symbols
Loading unloaded module list
..........
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9F, {4, 12c, 86064400, 82698b50}

Implicit thread is now 86064400
Probably caused by : pci.sys

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 00000004, The power transition timed out waiting to synchronize with the Pnp
	subsystem.
Arg2: 0000012c, Timeout in seconds.
Arg3: 86064400, The thread currently holding on to the Pnp lock.
Arg4: 82698b50, nt!TRIAGE_9F_PNP on Win7

Debugging Details:
------------------

Implicit thread is now 86064400

DRVPOWERSTATE_SUBCODE:  4

IMAGE_NAME:  pci.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  52158f12

MODULE_NAME: pci

FAULTING_MODULE: 82bbd000 pci

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

BUGCHECK_STR:  0x9F

PROCESS_NAME:  System

CURRENT_IRQL:  2

STACK_TEXT:  
a32c3734 81465a62 0000c800 00000000 86064400 nt!KiSwapContext+0x19
a32c3790 81465501 a32c38a0 86064400 00000000 nt!KiSwapThread+0x172
a32c37c4 814634d6 00000000 00000000 a32c38a0 nt!KiCommitThreadWait+0x141
a32c387c 8ca9775a a32c38a0 00000000 00000000 nt!KeWaitForSingleObject+0x176
a32c3908 8ca85a09 86b4a1c8 00000000 856c60e8 ndis!NdisMSleep+0x5d
a32c3920 8cab912e 8ca79690 856c60e8 856c6da0 ndis!ndisMInvokeHalt+0x32
a32c3960 8ca85a58 85d73010 856c60e8 856c60e8 ndis!ndisMCommonHaltMiniport+0x2b5
a32c3970 8cab95e6 00000002 856c60e8 8ca7a7dc ndis!ndisMHaltMiniport+0x39
a32c3a98 8ca8deb8 856c60e8 00000000 81463360 ndis!ndisPnPRemoveDevice+0x198
a32c3aa8 8cac169e 856c60e8 867e7c58 00000000 ndis!ndisPnPRemoveDeviceEx+0x51
a32c3ad4 8ca8a4da 867e7c58 a32c3afa a32c3afb ndis!ndisPnPIrpRemoveDevice+0x51d1
a32c3afc 81461aff 856c6030 867e7c58 856c6030 ndis!ndisPnPDispatch+0x161
a32c3b18 816c2e8e 85757a50 85673b80 00000002 nt!IofCallDriver+0x3f
a32c3b54 816fba39 c00000bb 00000000 00000000 nt!IopSynchronousCall+0xa2
a32c3bb0 814b3a91 00000000 85673b80 85673b01 nt!IopRemoveDevice+0xad
a32c3be4 816fd8fa a3c53988 00000000 85673b80 nt!PnpRemoveLockedDeviceNode+0x17f
a32c3bf8 816fd89a 0000002f a3c53988 00000000 nt!PnpDeleteLockedDeviceNode+0x36
a32c3c34 816fccbd 00000002 00000001 0000002f nt!PnpDeleteLockedDeviceNodes+0x68
a32c3cd8 816f450a 00000000 a7e95530 00000001 nt!PnpProcessQueryRemoveAndEject+0x3e5
a32c3cf8 816f0fab 816f0d10 86064400 84cae498 nt!PnpProcessTargetDeviceEvent+0x76
a32c3d24 814a9611 84cae498 00000000 86064400 nt!PnpDeviceEventWorker+0x29b
a32c3d70 814b777a 00000000 dbb38df0 00000000 nt!ExpWorkerThread+0xff
a32c3db0 81532fe1 814a9512 00000000 00000000 nt!PspSystemThreadStartup+0x58
a32c3dc8 81464726 a32c3de0 81465387 849f46e0 nt!KiThreadStartup+0x15
a32c3dd0 81465387 849f46e0 a32c3e04 a32c4000 nt!PsLeavePriorityRegion+0x12
a32c3ddc a32c4000 a32c1000 00000000 00000000 nt!KeReleaseInStackQueuedSpinLock+0x37
WARNING: Frame IP not in any known module. Following frames may be wrong.
a32c3e04 00000000 00000000 00000000 00000000 0xa32c4000


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

FAILURE_BUCKET_ID:  0x9F_4_netkvm_IMAGE_pci.sys

BUCKET_ID:  0x9F_4_netkvm_IMAGE_pci.sys

Followup: MachineOwner
---------

0: kd> !ndiskd.miniports
The full list of miniports is not included in this dump file.  Ndiskd will
grovel through the available memory and search for possible miniport data.

This process may take time; press CTRL+C or CTRL+Break to interrupt.

    Miniport                                                                    
    856c60e8 - Red Hat VirtIO Ethernet Adapter #3

Done.  Found 1 miniport(s).  This list may be incomplete.
0: kd> !ndiskd.miniport 856c60e8


MINIPORT

    Red Hat VirtIO Ethernet Adapter #3

    Ndis handle        856c60e8
    Ndis API version   v6.30
*** WARNING: Unable to verify timestamp for netkvm.sys
    Adapter context    86b4a1c8
    Miniport driver    85d73010 - netkvm   v104.232
    Network interface  8667b008

    Media type         802.3
    Device path        \??\PCI#VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00#3&13c0b0c5&0&30#{ad498944-762f-11d0-8dcb-00c04fc3358c}\{2527AD68-8C49-4189-8D9F-D807E367C67C}
    Device object      856c6030            More information
    MAC address        [MAC address at 853b8fd0 is unavailble]


AUTOMATIC DIAGNOSTICS

    Power management is not enabled        Why not?



STATE

    Miniport           PAUSED
    Device PnP         REMOVED             Show state history
    Datapath           DIVERTED_BECAUSE_MEDIA_DISCONNECTED
    NBL status         NDIS_STATUS_MEDIA_DISCONNECTED
    Operational status DOWN
    Operational flags  DOWN_NOT_CONNECTED
    Admin status       ADMIN_UP
    Media              MediaConnectUnknown
    Power              D0
    References         2                   Show detail
    Total resets       0
    Pending OID        None
    Flags              BUS_MASTER, 64BIT_DMA, SG_DMA, DEFAULT_PORT_ACTIVATED,
                       SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,
                       NOT_MEDIA_CONNECTED
    PnP flags          PM_SUPPORTED, REMOVE_IN_PROGRESS, DEVICE_POWER_ENABLED,
                       NO_HALT_ON_SUSPEND, REJECT_REQUESTS, HALTING,
                       HARDWARE_DEVICE, CANCELLED_WAKEUP_TIMER


BINDINGS

    Open List          Open                Protocol           Context           
    No protocols have an open binding

    Filter List        Filter              Filter Driver      Context           
    No filters are attached


MORE INFORMATION

    Driver handlers                        Task offloads
    Power management                       PM protocol offloads
    Pending OIDs                           Timers
    Pending NBLs
    Wake-on-LAN (WoL)                      Packet filter
    Receive queues                         Receive filtering
    RSS                                    NIC switch
    Hardware resources                     Selective suspend
    NDIS ports                             WMI guids
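
As a reading aid for the bugcheck line in the debugger output above, here is a small sketch that decodes `BugCheck 9F, {4, 12c, 86064400, 82698b50}` into its arguments. The helper names are hypothetical, and the subcode table contains only the one subcode that appears in this dump. The debugger prints the values in hexadecimal, so Arg2 `12c` is 300 seconds (5 minutes).

```python
import re

# Only the subcode seen in this dump; other 0x9F subcodes are omitted.
SUBCODES_9F = {
    0x4: "The power transition timed out waiting to synchronize "
         "with the Pnp subsystem",
}


def parse_bugcheck(line):
    """Parse a WinDbg 'BugCheck XX, {a, b, c, d}' line into integers."""
    m = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line.strip())
    if m is None:
        raise ValueError("not a BugCheck line: %r" % line)
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


def describe_9f(args):
    """Summarize the DRIVER_POWER_STATE_FAILURE arguments for subcode 4."""
    subcode, timeout, thread, _triage = args
    return {
        "subcode": SUBCODES_9F.get(subcode, "unknown subcode"),
        "timeout_seconds": timeout,
        "pnp_lock_thread": "%08x" % thread,
    }


code, args = parse_bugcheck("BugCheck 9F, {4, 12c, 86064400, 82698b50}")
info = describe_9f(args)
```

Here `info["pnp_lock_thread"]` is `86064400`, the thread whose stack is listed under STACK_TEXT; that stack shows the thread waiting in `ndis!NdisMSleep`, called from `ndis!ndisMInvokeHalt`, during device removal.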

Comment 6 Chao Yang 2014-01-21 01:10:21 UTC
(In reply to Yan Vugenfirer from comment #4)
> Another question - what was the end result of the 300 plug\unplug script?
> 
> Device plugged or device unplugged?
> 

Device got unplugged. 

> Thanks,
> Yan.

Comment 7 Yvugenfi@redhat.com 2014-01-21 09:29:42 UTC
(In reply to Chao Yang from comment #6)
> (In reply to Yan Vugenfirer from comment #4)
> > Another question - what was the end result of the 300 plug\unplug script?
> > 
> > Device plugged or device unplugged?
> > 
> 
> Device got unplugged. 
> 
> > Thanks,
> > Yan.

Thanks! According to the dump, the driver is not unloaded and is still present.

Can you attach the script that you used for plug\unplug?

Thanks,
Yan.

Comment 8 Chao Yang 2014-01-21 09:57:42 UTC
Created attachment 853068 [details]
script to repeatedly hot plug/unplug virtio-net-nic

Comment 9 Chao Yang 2014-01-21 09:59:38 UTC
(In reply to Yan Vugenfirer from comment #7)
> (In reply to Chao Yang from comment #6)
> > (In reply to Yan Vugenfirer from comment #4)
> > > Another question - what was the end result of the 300 plug\unplug script?
> > > 
> > > Device plugged or device unplugged?
> > > 
> > 
> > Device got unplugged. 
> > 
> > > Thanks,
> > > Yan.
> 
> Thanks! According to the dump the drivers is not unloaded and still present.
> 
> Can you attach the script that you used for plug\unplug?
> 

Please check comment 8. I performed 300 iterations before shutting down the guest, and I did not remove the netdev in each loop.

> Thanks,
> Yan.

Comment 11 Yvugenfi@redhat.com 2014-05-14 13:29:56 UTC
(In reply to Chao Yang from comment #9)
> (In reply to Yan Vugenfirer from comment #7)
> > (In reply to Chao Yang from comment #6)
> > > (In reply to Yan Vugenfirer from comment #4)
> > > > Another question - what was the end result of the 300 plug\unplug script?
> > > > 
> > > > Device plugged or device unplugged?
> > > > 
> > > 
> > > Device got unplugged. 
> > > 
> > > > Thanks,
> > > > Yan.
> > 
> > Thanks! According to the dump the drivers is not unloaded and still present.
> > 
> > Can you attach the script that you used for plug\unplug?
> > 
> 
> Please check Comment 8, I performed 300 iterations before shutting down
> guest. And I didn't remove netdev for each loop.
> 
> > Thanks,
> > Yan.

Hello,

Does this crash reproduce consistently?
What is the networking configuration on the host? Are you using a Linux bridge, Open vSwitch, or macvtap?


Best regards,
Yan.

Comment 15 Yvugenfi@redhat.com 2014-06-18 10:22:09 UTC
Hi,

Can you set the QA ACK flag, please?

Thanks,
Yan.

Comment 16 Mike Cao 2014-07-22 01:36:22 UTC
dengmin, please help verify this bug.

Comment 17 Min Deng 2014-07-31 08:13:53 UTC
QE verified the bug on virtio-win-prewhql-0.1-87.
For detailed steps, please refer to comment 14.
The guest now works without a BSOD, so the issue has been fixed. Thanks.

kernel-3.10.0-128.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev_0.2.x86_64

Comment 18 Mike Cao 2014-08-20 06:39:09 UTC
Moving status to VERIFIED according to comment #17.

Comment 23 errata-xmlrpc 2015-11-24 08:39:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2513.html