Bug 1295637 - [virtio-win][netkvm][rhel6] win2012 guest BSOD with DRIVER_POWER_STATE_FAILURE(9f) on shutdown after netdev_del & device_del while copying files in guest
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.4
Assigned To: ybendito
QA Contact: xiywang
Blocks: 1401400 1395265
Reported: 2016-01-05 00:02 EST by lijin
Modified: 2017-08-01 23:24 EDT
Fixed In Version: qemu-kvm-rhev-2.9.0-1.el7
Last Closed: 2017-08-01 19:29:42 EDT
Type: Bug

Description lijin 2016-01-05 00:02:40 EST
Description of problem:
While the guest is copying files from a Samba server, delete the netdev and then the virtio-net-pci device, then shut down the guest. The guest stays in the shutting-down state for a long time and then hits a BSOD with DRIVER_POWER_STATE_FAILURE(9f).

Version-Release number of selected component (if applicable):
virtio-win-1.7.5-0.el6.noarch
kernel-2.6.32-595.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.482.el6.x86_64
seabios-0.6.1.2-30.el6.x86_64

How reproducible:
80%

Steps to Reproduce:
1. Boot a win2012 guest with virtio-net-pci:
/usr/libexec/qemu-kvm -M pc -cpu SandyBridge -M pc -enable-kvm -m 2G -smp 2 -nodefconfig -nodefaults -rtc base=localtime,driftfix=slew -boot order=cd,menu=on \
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
-drive file=win2012-pre.qcow2,if=none,id=drive-ide0-0-0,format=qcow2,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -qmp tcp:0:4444,server,nowait -monitor stdio \
-drive file=en_windows_server_2012_x64_dvd_915478.iso,if=none,id=drive-ide0-0-1,format=raw,cache=none,media=cdrom -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
-cdrom /usr/share/virtio-win/virtio-win.iso -fda /usr/share/virtio-win/virtio-win_amd64.vfd \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on -device virtio-net-pci,vectors=10,netdev=hostnet0,id=net0,mac=00:52:3b:65:ee:ff

2. Enable Driver Verifier for the netkvm driver in the guest with standard settings.

3. Check the network in the qemu monitor:
(qemu) info network 
Devices not on any VLAN:
  hostnet0: ifname=tap0,script=/etc/qemu-ifup,downscript=no peer=net0
  net0: model=virtio-net-pci,macaddr=00:52:3b:65:ee:ff peer=hostnet0

4. Copy a 3G file from the Samba server to the guest.

5. During step 4, delete the netdev first, then delete the virtio-net-pci device:
(qemu) netdev_del hostnet0
(qemu) device_del net0

6. Check the network in the qemu monitor again:
(qemu) info network 
Devices not on any VLAN:
  net0: model=virtio-net-pci,macaddr=00:52:3b:65:ee:ff peer=hostnet0

7. Shut down the guest:
(qemu) system_powerdown 
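
The monitor commands in steps 5 and 7 can also be issued over the QMP socket that the command line in step 1 opens with `-qmp tcp:0:4444,server,nowait`. The following is only a minimal sketch: the helper names `build_unplug_sequence` and `send_sequence` are hypothetical, `localhost` is an assumed host address, and it assumes the qemu build in use exposes `netdev_del`, `device_del` and `system_powerdown` over QMP. It sends the commands back to back, so the file copy from step 4 still has to be running when it is started.

```python
import json
import socket


def build_unplug_sequence(netdev_id="hostnet0", device_id="net0"):
    """Return the QMP commands in the order used in steps 5 and 7."""
    return [
        {"execute": "qmp_capabilities"},  # leave QMP negotiation mode
        {"execute": "netdev_del", "arguments": {"id": netdev_id}},
        {"execute": "device_del", "arguments": {"id": device_id}},
        {"execute": "system_powerdown"},
    ]


def send_sequence(commands, host="localhost", port=4444, timeout=10.0):
    """Send each command and collect its reply (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        greeting = json.loads(stream.readline())  # QMP greeting banner
        replies = []
        for command in commands:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            while True:
                line = stream.readline()
                if not line:
                    raise ConnectionError("QMP socket closed unexpectedly")
                message = json.loads(line)
                # Asynchronous events carry an "event" key; skip them.
                if "return" in message or "error" in message:
                    replies.append(message)
                    break
        return greeting, replies
```

Calling `send_sequence(build_unplug_sequence())` on the host would run the whole sequence; an `"error"` reply for `netdev_del` or `device_del` indicates the command was not accepted.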

Actual results:
After step 6, net0 is still listed by (qemu) info network. After step 7, the guest stays in the shutting-down state for a long time (more than 10 minutes) and then hits a BSOD (DRIVER_POWER_STATE_FAILURE).
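
The check in step 6 (is a deleted id still listed by `info network`?) can be automated on captured monitor output. This is a rough sketch, not part of any qemu tooling: `listed_ids` and `lingering` are hypothetical helper names, and the regular expression only targets the two `info network` layouts that appear in this report.

```python
import re

# An entry line is "<id>: ..." with optional indentation and an optional
# leading backslash (the newer layout prints peers as " \ hostnet0: ...").
_ENTRY = re.compile(r"^\s*(?:\\\s*)?([\w.-]+):\s")


def listed_ids(info_network_text):
    """Return the ids listed in captured `info network` output."""
    ids = []
    for line in info_network_text.splitlines():
        match = _ENTRY.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def lingering(info_network_text, removed_ids):
    """Return the ids that were deleted but are still listed."""
    present = set(listed_ids(info_network_text))
    return [dev_id for dev_id in removed_ids if dev_id in present]
```

Feeding it the step 6 output together with `["hostnet0", "net0"]` reports `net0` as the id that is still present.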

Expected results:
The guest shuts down without a BSOD.

Additional info:
1. If the virtio-net-pci device is deleted first and the netdev second, this issue is NOT hit.
2. The issue CAN be reproduced with virtio-win-1.7.4-1.el6_7.2, so it is not a regression.
3. With the image mounted on a rhel7 host, the issue CAN be reproduced (tried 5 times, hit once).
4. The windbg info:
0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time.
Arguments:
Arg1: 0000000000000004, The power transition timed out waiting to synchronize with the Pnp
	subsystem.
Arg2: 0000000000000258, Timeout in seconds.
Arg3: fffffa80038dc180, The thread currently holding on to the Pnp lock.
Arg4: fffff802cd3f8800, nt!TRIAGE_9F_PNP on Win7 and higher

Debugging Details:
------------------

Implicit thread is now fffffa80`038dc180

DRVPOWERSTATE_SUBCODE:  4

IMAGE_NAME:  pci.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  5010ab1f

MODULE_NAME: pci

FAULTING_MODULE: fffff88000a7a000 pci

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

BUGCHECK_STR:  0x9F

PROCESS_NAME:  System

CURRENT_IRQL:  2

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

STACK_TEXT:  
fffff802`cd3f87c8 fffff802`cdfea5b2 : 00000000`0000009f 00000000`00000004 00000000`00000258 fffffa80`038dc180 : nt!KeBugCheckEx
fffff802`cd3f87d0 fffff802`ce1e928a : fffff802`00000000 fffff802`00000000 00000000`00000000 00000000`00000000 : nt!PnpBugcheckPowerTimeout+0x6e
fffff802`cd3f8830 fffff802`cdf228b4 : fffffa80`04c98460 fffffa80`04c8b080 fffff802`cd3f8b18 00000000`00000000 : nt!PopBuildDeviceNotifyListWatchdog+0x16
fffff802`cd3f8860 fffff802`cdf22ed5 : fffffa80`04bcf980 fffff802`cdeec4f8 fffff802`ce173f00 00000000`0000005d : nt!KiProcessExpiredTimerList+0x214
fffff802`cd3f89a0 fffff802`cdf22d88 : fffff802`ce171180 fffff802`ce173f80 00000000`00000002 00000000`00008545 : nt!KiExpireTimerTable+0xa9
fffff802`cd3f8a40 fffff802`cdf1ce76 : fffffa80`04bcf980 00000000`ffffffff fffffa80`04bcfc58 00000000`00000000 : nt!KiTimerExpiration+0xc8
fffff802`cd3f8af0 fffff802`cdf2157a : fffff802`ce171180 fffff802`ce171180 00000000`00183de0 fffff802`ce1cb880 : nt!KiRetireDpcList+0x1f6
fffff802`cd3f8c60 00000000`00000000 : fffff802`cd3f9000 fffff802`cd3f3000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x5a


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

IMAGE_VERSION:  6.2.9200.16384

FAILURE_BUCKET_ID:  0x9F_VRF_4_netkvm_IMAGE_pci.sys

BUCKET_ID:  0x9F_VRF_4_netkvm_IMAGE_pci.sys

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:0x9f_vrf_4_netkvm_image_pci.sys

FAILURE_ID_HASH:  {a8531532-7f86-cd3a-f904-434ce6bf30f1}

Followup: MachineOwner
---------
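
The bugcheck arguments in the dump are printed in hexadecimal. Decoding them shows that Arg2 (0x258) is 600 seconds, which matches the "more than 10 minutes" hang seen before the BSOD. A small sketch of the arithmetic (`decode_9f_args` is a hypothetical helper name):

```python
def decode_9f_args(arg1_hex, arg2_hex):
    """Decode Arg1 (subcode) and Arg2 (timeout) of a 0x9F bugcheck."""
    timeout_seconds = int(arg2_hex, 16)
    return {
        "subcode": int(arg1_hex, 16),
        "timeout_seconds": timeout_seconds,
        "timeout_minutes": timeout_seconds / 60,
    }


decoded = decode_9f_args("0000000000000004", "0000000000000258")
print(decoded)
```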
Comment 3 Yu Wang 2016-06-12 04:58:18 EDT
Reproduced this issue on virtio-win-1.7.5-0.el6.noarch.
Verified this issue on virtio-win-prewhql-118.

Steps as in comment#0.

On virtio-win-1.7.5-0.el6.noarch, the guest hits the BSOD as in comment#0.
On virtio-win-prewhql-118, the guest shuts down without a BSOD, and "net0: model=virtio-net-pci,macaddr=00:52:3b:65:ee:ff peer=hostnet0" is no longer listed after the device is deleted.

Package versions:

kernel-2.6.32-642.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.491.el6.x86_64
virtio-win-prewhql-118


Based on the above, this bug has been fixed.

Thanks
Yu Wang
Comment 4 Yu Wang 2016-06-12 05:11:06 EDT
According to comment#3, changing status to VERIFIED.

Thanks
Yu Wang
Comment 6 Peixiu Hou 2016-09-12 05:16:33 EDT
Reproduced this issue on virtio-win-1.7.5-0.el6.noarch.
Tried to verify on virtio-win-prewhql-126, but the issue still reproduces (tried 4 times, hit 2 times).
Steps as in comment#0. After system_powerdown, the guest stays in the shutting-down state for a long time (more than 10 minutes) and then hits a BSOD (9f).


The memory dump has been uploaded to this location:
http://fileshare.englab.nay.redhat.com/pub/section2/images_backup/virtio-win/bug1295637/BSOD_MEMORY.DMP.zip
Comment 8 ybendito 2017-01-10 09:00:35 EST
The root cause is packet(s) stuck in qemu after the NIC link goes down (or after the network device is removed).
A fix is in progress:
http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg01366.html
Changing component to qemu-kvm.
Comment 14 xiywang 2017-05-11 03:17:57 EDT
Hi ybendito,

Is this bug fixed in qemu-kvm-rhev-2.9.0-1.el7, even though it is a rhel6 bug?
Or did you mean el6?

Thanks,
Xiyue
Comment 15 ybendito 2017-05-11 13:39:18 EDT
(In reply to xiywang from comment #14)
> Hi ybendito,
> 
> Is this bug fixed in qemu-kvm-rhev-2.9.0-1.el7, even though it is a rhel6 bug?
> Or did you mean el6?
> 
> Thanks,
> Xiyue

This is the qemu version in which the fix was applied; it is sufficient for verifying that the solution is correct.
If it is later decided to backport the fix to a different qemu stream, that will be done as well.
Comment 16 xiywang 2017-06-12 23:39:55 EDT
Verified on qemu-kvm-rhev-2.9.0-9.el7.x86_64

1. boot a guest
/usr/libexec/qemu-kvm -M pc -cpu IvyBridge -m 4096 -smp 4 -enable-kvm -nodefconfig -nodefaults -rtc base=localtime,driftfix=slew \
-device piix3-usb-uhci,id=usb -drive file=/home/win2012.img,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -qmp tcp:0:4444,server,nowait -monitor stdio \
-netdev tap,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,id=hostnet0,vhost=on -device virtio-net-pci,vectors=10,netdev=hostnet0,id=net0,mac=00:52:3b:65:ab:12

2. enable netkvm driver verifier in guest
# verifier /querysettings
# verifier.exe /standard /driver netkvm.sys
reboot guest
# verifier /querysettings
...
Verified drivers:

netkvm.sys

3. check network status
(qemu) info network
net0: index=0,type=nic,model=virtio-net-pci,macaddr=00:52:3b:65:ab:12
 \ hostnet0: index=0,type=tap,ifname=tap0,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown

4. copy a large file from external server to guest

5. during step4, delete netdev and virtio-net-pci device
(qemu) netdev_del hostnet0
(qemu) device_del net0

6. check network status
(qemu) info network

7. shutdown guest
(qemu) system_powerdown

Tried 5 times and could not reproduce. Set to Verified.
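
How much weight five clean runs carry depends on the reproduction rate before the fix. As a rough back-of-the-envelope sketch, assuming the runs are independent and the rate is constant (both are assumptions, and `p_all_clean` is a hypothetical helper name), the chance of seeing no BSOD in n runs of an unfixed build is (1 - rate)^n. The rates below are the ones reported earlier in this bug: 80% (comment#0), 2 of 4 (comment#6), and 1 of 5 on a rhel7 host (comment#0, additional info).

```python
def p_all_clean(rate, runs):
    """Probability that `runs` independent attempts all miss the bug."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - rate) ** runs


# Reproduction rates taken from the comments in this bug.
for label, rate in [("80%", 0.8), ("2 of 4", 0.5), ("1 of 5", 0.2)]:
    print(f"{label}: {p_all_clean(rate, 5):.5f}")
```

At the 80% rate five clean runs would be very unlikely without the fix (about 0.03%); at the 1-in-5 rate they would happen by chance roughly a third of the time.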
Comment 18 errata-xmlrpc 2017-08-01 19:29:42 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392