Bug 873960 - [virtio-win][serial]Hibernate [S4] or Sleep [S3] leads guest get BSOD[0x000000D1] during data transferring
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: virtio-win
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Yan Vugenfirer
QA Contact: Virtualization Bugs
Keywords: Regression
Depends On: 873971
Blocks:
Reported: 2012-11-07 00:40 EST by dengmin
Modified: 2013-02-21 05:41 EST

Doc Type: Bug Fix
Last Closed: 2013-02-21 05:41:06 EST
Type: Bug

Attachments
Screenshot and related dump (257.97 KB, application/x-zip-compressed)
2012-11-07 00:47 EST, dengmin
scripts (12.13 KB, application/x-zip-compressed)
2012-11-07 00:51 EST, dengmin

Description dengmin 2012-11-07 00:40:34 EST
Description of problem:
The guest hits a BSOD as soon as it hibernates (S4) while data is being transferred from guest to host. Seen on a Windows 7 64-bit guest.

Version-Release number of selected component (if applicable):
virtio-win-prewhql-0.1-43
kernel-2.6.32-338.el6.x86_64
qemu-kvm-0.12.1.2-2.331.el6.x86_64

How reproducible:
Reproduced in 2 of 2 attempts.

Steps to Reproduce:
1. Boot the guest:
  /usr/libexec/qemu-kvm -m 4G -smp 4 -usb -device usb-tablet,id=tablet0 -drive file=win7-64-run.raw,if=none,id=drive-virtio0-0-0,format=raw,werror=stop,rerror=stop,cache=none -device ide-drive,drive=drive-virtio0-0-0,id=virti0-0-0,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device rtl8139,netdev=hostnet0,id=net0,mac=00:12:10:94:a3:f8 -uuid 8677bf9b-3cf6-47a1-b3e9-94213b801c87 -monitor stdio -spice id=on,disable-ticketing,port=5912 -vga qxl -device virtio-serial-pci,id=virtio-serial0,max_ports=16 -chardev socket,id=channel0,path=/tmp/helloworld,server,nowait -device virtserialport,nr=2,chardev=channel0,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port0 -device virtio-serial-pci,id=virtio-serial1,max_ports=31 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,nr=2,chardev=channel1,name=com.redhat.rhevm.vdsm1,bus=virtio-serial1.0,id=port1 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -device virtio-balloon-pci,addr=0x6,bus=pci.0,id=balloon1  
2. Transfer data from guest to host with the scripts in the attachment.
3. Put the guest into S4 (hibernate).

Actual results:
The guest hits a BSOD with bugcheck code 0x000000D1.
Expected results:
The guest enters S4 successfully.

Additional info:
A screenshot and the scripts will be uploaded to the bug.
Comment 2 dengmin 2012-11-07 00:47:59 EST
Created attachment 639834
Screenshot and related dump
Comment 3 dengmin 2012-11-07 00:51:08 EST
Created attachment 639835
scripts

On the guest, run the following script in Cygwin:
for ((;;))
do
  python VirtIochannel_Guest_send com.redhat.rhevm.vdsm
done

On the host, run:
for ((;;))
do
  python serial-host-receive.py /tmp/helloworld
done
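
For readers without the attachment, the host side of this test can be sketched roughly as below. This is a minimal sketch and not the attached serial-host-receive.py, whose contents are not shown in this report. It assumes that QEMU is the listening end of the UNIX socket (the command line uses server,nowait), so the host script connects as a client. The function name read_channel and its parameters are hypothetical.

```python
import socket


def read_channel(path, max_bytes=None, chunk_size=4096):
    """Connect to the UNIX socket at `path` and return the bytes received.

    Reading stops when the peer closes the connection or, if `max_bytes`
    is given, once that many bytes have been collected.
    Hypothetical helper for illustration only.
    """
    received = bytearray()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        while max_bytes is None or len(received) < max_bytes:
            want = chunk_size
            if max_bytes is not None:
                # Never ask for more than the remaining budget.
                want = min(chunk_size, max_bytes - len(received))
            chunk = sock.recv(want)
            if not chunk:  # peer closed the connection
                break
            received.extend(chunk)
    return bytes(received)
```

With the command line from the description, read_channel("/tmp/helloworld", max_bytes=1 << 20) would read up to 1 MiB from the first port. A byte limit is suggested because the socket may stay open after the guest stops writing, in which case waiting for end-of-stream would block.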
Comment 5 dengmin 2012-11-07 01:01:12 EST
By the way, S3 triggers the same error, so please double-check that case as well.
Comment 6 Yan Vugenfirer 2012-11-11 10:56:46 EST
Looks like a synchronization error.

We are trying to access port data after "port" was removed:


1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000010, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff880035ccc62, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: GetPointerFromAddress: unable to read from fffff800028fe100
 0000000000000010 

CURRENT_IRQL:  2

FAULTING_IP: 
vioser!VIOSerialDiscardPortDataLocked+6a [c:\cygwin\tmp\build\source\internal-kvm-guest-drivers-windows\vioserial\sys\port.c @ 398]
fffff880`035ccc62 488b3cc8        mov     rdi,qword ptr [rax+rcx*8]

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0xD1

PROCESS_NAME:  python.exe

TRAP_FRAME:  fffff88003b19770 -- (.trap 0xfffff88003b19770)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000002
rdx=0000057ffb72e678 rsi=0000000000000000 rdi=0000000000000000
rip=fffff880035ccc62 rsp=fffff88003b19900 rbp=fffffa80049cb1d8
 r8=fffff880035d4168  r9=0000000000000000 r10=fffffa80048cc4d0
r11=fffff88003b19920 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei ng nz na pe nc
vioser!VIOSerialDiscardPortDataLocked+0x6a:
fffff880`035ccc62 488b3cc8        mov     rdi,qword ptr [rax+rcx*8] ds:9000:00000000`00000010=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff800026ce569 to fffff800026cefc0

STACK_TEXT:  
fffff880`03b19628 fffff800`026ce569 : 00000000`0000000a 00000000`00000010 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
fffff880`03b19630 fffff800`026cd1e0 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiBugCheckDispatch+0x69
fffff880`03b19770 fffff880`035ccc62 : 00000000`00000000 fffffa80`04a1a430 00000000`249e6908 fffff8a0`00000008 : nt!KiPageFault+0x260
fffff880`03b19900 fffff880`035cd520 : fffffa80`04a1a430 fffffa80`049f34b0 00000000`00000000 0000057f`fc427178 : vioser!VIOSerialDiscardPortDataLocked+0x6a [c:\cygwin\tmp\build\source\internal-kvm-guest-drivers-windows\vioserial\sys\port.c @ 398]
fffff880`03b19930 fffff880`00f9ca76 : fffffa80`048cc4d0 00000000`00000000 fffffa80`048cc4d0 00000000`00000000 : vioser!VIOSerialPortClose+0xa4 [c:\cygwin\tmp\build\source\internal-kvm-guest-drivers-windows\vioserial\sys\port.c @ 1167]
fffff880`03b19960 fffff880`00f9c2cf : 00000000`00000001 fffffa80`03bd8e80 00000000`00000000 fffffa80`04a1a070 : Wdf01000!FxPkgGeneral::OnClose+0xaa
fffff880`03b199c0 fffff880`00f90245 : fffffa80`03b7f1d0 00000000`00000001 fffffa80`0518dee0 00000000`00000000 : Wdf01000!FxPkgGeneral::Dispatch+0x14f
fffff880`03b19a20 fffff800`029c6f2e : 00000000`00000070 00000000`00000001 fffffa80`0518dee0 00000000`00000000 : Wdf01000!FxDevice::Dispatch+0xa9
fffff880`03b19a50 fffff800`026d81d4 : fffffa80`03b993f8 fffffa80`03b99060 fffffa80`036d64b0 fffff880`03b19bc0 : nt!IopDeleteFile+0x11e
fffff880`03b19ae0 fffff800`029c1ae4 : fffffa80`03b99060 00000000`00000002 fffffa80`0380bb50 00000000`00000002 : nt!ObfDereferenceObject+0xd4
fffff880`03b19b40 fffff800`029c2094 : 00000000`00000230 fffffa80`03b99060 fffff8a0`02542d10 00000000`00000230 : nt!ObpCloseHandleTableEntry+0xc4
fffff880`03b19bd0 fffff800`026ce253 : fffffa80`0380bb50 fffff880`03b19ca0 00000000`7efdb000 00000000`022cab70 : nt!ObpCloseHandle+0x94
fffff880`03b19c20 00000000`77bd140a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`0008e318 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x77bd140a


STACK_COMMAND:  kb

FOLLOWUP_IP: 
vioser!VIOSerialDiscardPortDataLocked+6a [c:\cygwin\tmp\build\source\internal-kvm-guest-drivers-windows\vioserial\sys\port.c @ 398]
fffff880`035ccc62 488b3cc8        mov     rdi,qword ptr [rax+rcx*8]

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  vioser!VIOSerialDiscardPortDataLocked+6a

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: vioser

IMAGE_NAME:  vioser.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  50906b32

FAILURE_BUCKET_ID:  X64_0xD1_vioser!VIOSerialDiscardPortDataLocked+6a

BUCKET_ID:  X64_0xD1_vioser!VIOSerialDiscardPortDataLocked+6a

Followup: MachineOwner
---------
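
The fault address in the dump above can be cross-checked against the trap frame. The faulting instruction reads from [rax+rcx*8]; with the trap-frame values rax=0 and rcx=2 this gives 0x10, which matches Arg1 (memory referenced). A base of zero is consistent with a pointer that is no longer valid once the port has been removed, although the trap frame itself warns that some register values may be zeroed or incorrect. A small sketch of the arithmetic (the helper name effective_address is made up for illustration):

```python
def effective_address(base, index, scale, displacement=0):
    """Compute an x86-64 effective address: base + index*scale + displacement.

    The result is wrapped to 64 bits, as the CPU would do.
    """
    return (base + index * scale + displacement) & 0xFFFFFFFFFFFFFFFF


# Register values taken from the TRAP_FRAME section of the dump.
rax = 0x0000000000000000  # base register
rcx = 0x0000000000000002  # index register

# mov rdi, qword ptr [rax+rcx*8]
addr = effective_address(rax, rcx, 8)
print(hex(addr))  # 0x10, the same value as Arg1 in the bugcheck arguments
```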
Comment 7 Mike Cao 2012-11-12 22:31:55 EST
QE can hit this issue in both directions: guest to host and host to guest.
Comment 8 Gal Hammer 2012-11-19 04:39:23 EST
The fix for bz#873971 seems to handle this crash as well.
Comment 10 dengmin 2012-11-27 01:10:24 EST
Hi Gal,
Re-tested the bug on a win7 guest with build 46. There is no BSOD when doing S4, and the same holds for S3. The fix works, thanks for your work. If there are any issues, please let me know.
Best Regards
Min
Comment 11 dengmin 2012-11-27 01:16:28 EST
Verified the bug on build 46; it was originally found on build 43.
Steps:
Please refer to the command line in comment 0.

Actual results:
The virtio serial port works well after the guest goes through S3 and S4; there are no BSODs.
Expected results:
The virtio serial port works well after the guest goes through S3 and S4; there are no BSODs.

So the bug is fixed, thanks a lot!
Best Regards,
Min
Comment 12 Mike Cao 2012-11-27 01:20:16 EST
Based on comment #11, this issue has been fixed.
Moving status to VERIFIED.
Comment 13 Gal Hammer 2012-11-27 02:46:33 EST
(In reply to comment #10)

>    Re-test the bug on win7 guest vai build 46,there is not any BSOD issue
> while doing S4,so does S3.The fix should works,thanks for your work.Any
> issues please let me know.   

You're welcome. It was nothing :-).

The only issue which is related to this fix that I'm aware of is BZ 878291.
Comment 14 errata-xmlrpc 2013-02-21 05:41:06 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0441.html