Bug 1419900

Summary: BSOD of vioser.sys on Win10 in S0->S4->S0 flow
Product: Red Hat Enterprise Linux 7 Reporter: ybendito
Component: virtio-win Assignee: Ladi Prosek <lprosek>
virtio-win sub component: virtio-win-prewhql QA Contact: Virtualization Bugs <virt-bugs>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: ghammer, juzhang, lijin, lmiksik, lprosek, michen, peliu, vrozenfe, xiagao, ybendito
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 12:55:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
WDF log none

Description ybendito 2017-02-07 11:12:10 UTC
Description of problem:


Version-Release number of selected component (if applicable):
prewhql-0.1-126

How reproducible:

Happened during normal usage, in a hibernate and resume-from-hibernation flow

Additional info:
Dump file at \\sharkan.eng.lab.tlv.redhat.com\win-team\BZs\vioser

Comment 2 xiagao 2017-02-10 04:49:53 UTC
(In reply to ybendito from comment #0)
> Description of problem:
> 
> 
> Version-Release number of selected component (if applicable):
> prewhql-0.1-126
> 
> How reproducible:
> 
> Happened during normal usage in hibernation-return from hibernation flow
> 
> Additional info:
> Dump file at \\sharkan.eng.lab.tlv.redhat.com\win-team\BZs\vioser

Hi ybendito,

I tried to reproduce it with a win10-32 guest, but could not. Could you give more info?

The package versions and steps I used are as follows.

Pkg Version:
qemu-kvm-rhev-2.8.0-3.el7.x86_64
kernel-3.10.0-556.el7.x86_64
virtio-win-prewhql-0.1-131
qxlwddm-0.1-12

Number of attempts:
10

Steps:
1. Boot up the win10-32 guest:
/usr/libexec/qemu-kvm -name 130BLKW10D32FEJ -enable-kvm -m 4G -smp 4 -uuid e991c891-9a73-4186-9f5f-8bc005279819 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/131INPW10D64RKP,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb \
-drive file=win10-32.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=00:52:03:78:b8:e6 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -device virtio-serial-pci,id=virtio-serial0,max_ports=16 -chardev socket,id=channel0,path=/tmp/helloword,server,nowait -device virtserialport,chardev=channel0,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port0 -monitor stdio -spice id=on,disable-ticketing,port=5910 -vga qxl -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. Install the qxl and virtio serial drivers.

3. Enable hibernation in the guest.

4. Hibernate the guest.

5. Boot up the guest again.

Results:
After step 5, the guest runs normally and there is no BSOD.

Comment 3 xiagao 2017-02-10 05:31:56 UTC
(In reply to xiagao from comment #2)
> (In reply to ybendito from comment #0)
> > Description of problem:
> > 
> > 
> > Version-Release number of selected component (if applicable):
> > prewhql-0.1-126
> > 
> > How reproducible:
> > 
> > Happened during normal usage in hibernation-return from hibernation flow
> > 
> > Additional info:
> > Dump file at \\sharkan.eng.lab.tlv.redhat.com\win-team\BZs\vioser
> 
> Hi ybendito,
> 
> I tried to reproduce it with win10-32 guest, but failed. Could you give more
> info?
> 
> The info and steps i reproduced is as following.
> 
> Pkg Version:
> qemu-kvm-rhev-2.8.0-3.el7.x86_64
> kernel-3.10.0-556.el7.x86_64
> virtio-win-prewhql-0.1-131

Sorry, I wrote the wrong version; the correct one is virtio-win-prewhql-0.1-126.


> qxlwddm-0.1-12
> 
> Try times:
> 10
> 
> Steps:
> 1.boot up win10-32 guest
> /usr/libexec/qemu-kvm -name 130BLKW10D32FEJ -enable-kvm -m 4G -smp 4 -uuid
> e991c891-9a73-4186-9f5f-8bc005279819 -nodefconfig -nodefaults -chardev
> socket,id=charmonitor,path=/tmp/131INPW10D64RKP,server,nowait -mon
> chardev=charmonitor,id=monitor,mode=control -rtc
> base=localtime,driftfix=slew -boot order=cd,menu=on -device
> piix3-usb-uhci,id=usb \
> -drive
> file=win10-32.raw,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,
> cache=none -device
> ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
> -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device
> e1000,netdev=hostnet0,id=net0,mac=00:52:03:78:b8:e6 -chardev
> pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0
> -device usb-tablet,id=input0 -device
> virtio-serial-pci,id=virtio-serial0,max_ports=16 -chardev
> socket,id=channel0,path=/tmp/helloword,server,nowait -device
> virtserialport,chardev=channel0,name=com.redhat.rhevm.vdsm,bus=virtio-
> serial0.0,id=port0 -monitor stdio -spice id=on,disable-ticketing,port=5910
> -vga qxl -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0
> 
> 2.install qxl and virtio serial driver
> 
> 3.enable Hibernate in guest
> 
> 4.hibernation
> 
> 5.boot up guest again
> 
> Results:
> after step5, guest is running well and no bsod.

correct

Comment 4 Gal Hammer 2017-02-12 15:33:54 UTC
I was unable to reproduce it. Can you please provide your qemu command line and which virtio drivers were installed?

Thanks.

Comment 5 Ladi Prosek 2017-02-14 11:22:23 UTC
Created attachment 1250175 [details]
WDF log

Attaching the output of !wdflogdump.

It looks like there was a power-up failure:

410: FxPkgPnp::PowerWaking - EvtDeviceD0Entry WDFDEVICE 0x00001FFF18335FD8 !devobj 0xFFFFE000E7CC8900, old state WdfPowerDeviceD3 failed, 0xc000000d(STATUS_INVALID_PARAMETER)

and the driver didn't recover well.

From the dump it looks like VIOSerialEvtDeviceReleaseHardware has already run, but child devices (ports) are still attached to the parent.

Calling WdfDeviceInitSetReleaseHardwareOrderOnFailure might help here.
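
As a triage aid, a short Python sketch can pull lines like the one quoted above out of a saved !wdflogdump output. The regular expression and the function name find_failures are assumptions made for illustration; the pattern is modelled only on the single log line in this comment, so other WDF log lines may need a different pattern.

```python
import re

# Assumed pattern: modelled only on the one log line quoted in this comment.
FAILED = re.compile(
    r"(?P<index>\d+): (?P<where>\S+) - (?P<callback>\w+) .*"
    r"failed, (?P<status>0x[0-9a-fA-F]{8})\((?P<name>\w+)\)"
)


def find_failures(log_text):
    """Return one dict per log line that reports a failed callback."""
    matches = (FAILED.search(line) for line in log_text.splitlines())
    return [m.groupdict() for m in matches if m]


# The log line from this comment, used as sample input.
SAMPLE = (
    "410: FxPkgPnp::PowerWaking - EvtDeviceD0Entry WDFDEVICE "
    "0x00001FFF18335FD8 !devobj 0xFFFFE000E7CC8900, old state "
    "WdfPowerDeviceD3 failed, 0xc000000d(STATUS_INVALID_PARAMETER)"
)

if __name__ == "__main__":
    for hit in find_failures(SAMPLE):
        print(hit["callback"], hit["status"], hit["name"])
```

Run against the attached log, this would list every callback that reported a failure status, which makes it easier to spot the D0 entry failure among unrelated entries.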

Comment 6 Ladi Prosek 2017-03-16 14:52:54 UTC
I can reproduce this. Steps:

1) Boot a VM with virtio-serial-pci and a couple of ports attached to it
2) Hibernate the VM
3) Restart the VM *without* virtio-serial-pci
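
The steps above can be sketched as two QEMU command lines that differ only in the virtio-serial options. This is a minimal sketch, not the exact setup used here: the base options are a trimmed copy of the command line from comment 2 and have not been tested in this trimmed form, and the port names (org.example.portN) and socket paths (/tmp/channelN) are hypothetical.

```python
import shlex

# Base options trimmed from the command line in comment 2 (assumption:
# this is not necessarily the configuration used for this reproduction).
BASE = [
    "/usr/libexec/qemu-kvm", "-enable-kvm", "-m", "4G", "-smp", "4",
    "-drive", "file=win10-32.raw,if=none,id=drive-ide0-0-0,format=raw,cache=none",
    "-device", "ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0",
    "-vga", "qxl",
    "-global", "PIIX4_PM.disable_s4=0",
]


def serial_args(port_count):
    """Options for one virtio-serial-pci controller with port_count ports."""
    args = ["-device", "virtio-serial-pci,id=virtio-serial0,max_ports=16"]
    for i in range(port_count):
        args += [
            # Hypothetical socket path and port name, for illustration only.
            "-chardev", f"socket,id=channel{i},path=/tmp/channel{i},server,nowait",
            "-device", f"virtserialport,chardev=channel{i},name=org.example.port{i},"
                       f"bus=virtio-serial0.0,id=port{i}",
        ]
    return args


def qemu_argv(with_serial, port_count=2):
    """Full argv; with_serial=False drops the controller and its ports."""
    return BASE + (serial_args(port_count) if with_serial else [])


if __name__ == "__main__":
    print("step 1 (boot, then hibernate):", shlex.join(qemu_argv(True)))
    print("step 3 (resume from S4):      ", shlex.join(qemu_argv(False)))
```

Keeping everything except the virtio-serial options identical between the two boots means the guest resumes from its hibernation file and finds only the serial controller missing.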

Comment 7 Gal Hammer 2017-03-19 08:35:41 UTC
(In reply to Ladi Prosek from comment #6)
> I can reproduce this. Steps:
> 
> 1) Boot a VM with virtio-serial-pci and a couple of ports attached to it
> 2) Hibernate the VM
> 3) Restart the VM *without* virtio-serial-pci

Oh. That bug again... (Once fixed in commit 582dd183).

The problem might be that the driver doesn't get a notification about the removal and tries to access the device. I think a real solution should be on the QEMU side.

Comment 8 Ladi Prosek 2017-03-20 09:15:14 UTC
(In reply to Gal Hammer from comment #7)
> (In reply to Ladi Prosek from comment #6)
> > I can reproduce this. Steps:
> > 
> > 1) Boot a VM with virtio-serial-pci and a couple of ports attached to it
> > 2) Hibernate the VM
> > 3) Restart the VM *without* virtio-serial-pci
> 
> Oh. That bug again... (Once fixed in commit 582dd183).

582dd183 is viorng, but yes, the same bug. Maybe resume from S4 with the device removed should be part of the test plan.

> The problem might be that the driver doesn't get a notification about the
> removal and try to access the device. I think a real solution should be in
> QEMU side.

It would be interesting to see what the ACPI spec says about removing devices while sleeping. This must be a problem with real HW too, undocking a sleeping laptop for example.

Comment 10 peliu@redhat.com 2017-03-28 08:27:23 UTC
Reproduced this issue with virtio-win-prewhql-0.1-126.
Verified the fix with virtio-win-prewhql-0.1-135.

Steps: same as comment #6.

With virtio-win-prewhql-0.1-126 in a win10-32 guest, I can reproduce this issue.

With virtio-win-prewhql-0.1-135 in a win10-32 guest, the problem no longer occurs.
So this issue has been fixed, thanks.

Comment 11 lijin 2017-03-29 02:02:16 UTC
Changing status to VERIFIED according to comment #10.

Comment 14 errata-xmlrpc 2017-08-01 12:55:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2341