Bug 890485

Summary: [virtio-win][scsi][9F] BSOD after S4/reboot while CrystalDiskMark puts a fairly heavy IO load on virtio-scsi disks in the guest
Product: Red Hat Enterprise Linux 6
Reporter: dawu
Component: virtio-win
Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.4
CC: acathrow, bcao, bsarathy, ghammer, juzhang, michen, rhod
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-24 14:07:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 896495

Description dawu 2012-12-27 08:25:22 UTC
Description of problem:
A BSOD with bugcheck code 9F sometimes occurs after S4 (hibernate) or reboot while CrystalDiskMark is putting a fairly heavy IO load on the virtio-scsi disks in the guest.

Version-Release number of selected component (if applicable):
kernel-2.6.32-348.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.344.el6.x86_64
virtio-win-prewhql-0.1-49

How reproducible:
60%

Steps to Reproduce:
1. Start the guest with 4 virtio-scsi disks that cover 3 cache modes (writeback/none/writethrough):

  /usr/libexec/qemu-kvm -m 2G -smp 2 -cpu cpu64-rhel6,+x2apic \
  -usb -device usb-tablet \
  -drive file=win2k8-64-scsi-49.qcow2,format=qcow2,if=none,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=writeback,aio=native,id=scsi-disk0 -device virtio-scsi-pci,id=bus0 -device scsi-hd,bus=bus0.0,drive=scsi-disk0,id=disk0,serial=test0,bootindex=1 \
  -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,mac=00:10:16:23:78:01,bus=pci.0,addr=0x4 \
  -uuid 00e5dd78-317c-499b-92ad-c3b819901407 \
  -rtc base=localtime -no-kvm-pit-reinjection \
  -monitor stdio -name win08R2-blk \
  -spice disable-ticketing,port=5931 -vga qxl \
  -drive file=disk1.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=none,aio=native,id=scsi-disk1 -device virtio-scsi-pci,id=bus1 -device scsi-hd,bus=bus1.0,drive=scsi-disk1,id=disk1,serial=test1 \
  -drive file=disk2.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=none,aio=native,id=scsi-disk2 -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk2,id=disk2,serial=test2 \
  -drive file=disk3.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=writethrough,aio=native,id=scsi-disk3 -device virtio-scsi-pci,id=bus3 -device scsi-hd,bus=bus3.0,drive=scsi-disk3,id=disk3,serial=test3 \
  -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. Run CrystalDiskMark on every disk with the largest option values, such as "9  4000M" for the system disk.

3. Let CrystalDiskMark run for a sufficiently long time, but go to the next step before the run ends.

4. Hibernate (S4) or reboot the guest.
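
For anyone scripting this setup, the per-disk part of the command line in step 1 can be generated instead of typed by hand. The following is a minimal sketch, not part of the original report: the helper names `scsi_disk_args` and `build_disk_args` are hypothetical, the option order is normalized (assuming the duplicated `if=none` and `format=qcow2` on the system disk are not significant), and only the `-drive`/`-device` options for the four disks are produced.

```python
import shlex

# Image files and cache modes taken from the command line in step 1.
DISKS = [
    ("win2k8-64-scsi-49.qcow2", "writeback"),
    ("disk1.qcow2", "none"),
    ("disk2.qcow2", "none"),
    ("disk3.qcow2", "writethrough"),
]


def scsi_disk_args(index, image, cache, boot=False):
    """Return the qemu-kvm arguments for one virtio-scsi disk (hypothetical helper)."""
    drive_id = f"scsi-disk{index}"
    bus_id = f"bus{index}"
    drive = (
        f"file={image},if=none,media=disk,format=qcow2,"
        f"rerror=stop,werror=stop,cache={cache},aio=native,id={drive_id}"
    )
    device = (
        f"scsi-hd,bus={bus_id}.0,drive={drive_id},"
        f"id=disk{index},serial=test{index}"
    )
    if boot:
        # Only the system disk carries bootindex=1 in the report.
        device += ",bootindex=1"
    # One virtio-scsi-pci controller per disk, as in the report.
    return ["-drive", drive, "-device", f"virtio-scsi-pci,id={bus_id}", "-device", device]


def build_disk_args(disks=DISKS):
    """Concatenate the arguments for all disks; disk 0 is the boot disk."""
    args = []
    for i, (image, cache) in enumerate(disks):
        args += scsi_disk_args(i, image, cache, boot=(i == 0))
    return args


if __name__ == "__main__":
    # Print a shell-safe fragment to append to the rest of the qemu-kvm command.
    print(" ".join(shlex.quote(a) for a in build_disk_args()))
```

Keeping the cache mode as a parameter makes it easy to check whether the BSOD depends on a particular cache setting.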
  
Actual results:
The guest sometimes hits a BSOD with bugcheck code 9F.
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9F, {3, fffffa8001b729e0, fffffa8002d5d790, fffffa80018c5690}

*** ERROR: Module load completed but symbols could not be loaded for vioscsi.sys
Probably caused by : vioscsi.sys

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 0000000000000003, A device object has been blocking an Irp for too long a time
Arg2: fffffa8001b729e0, Physical Device Object of the stack
Arg3: fffffa8002d5d790, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
Arg4: fffffa80018c5690, The blocked IRP

Debugging Details:
------------------


DRVPOWERSTATE_SUBCODE:  3

IRP_ADDRESS:  fffffa80018c5690

DEVICE_OBJECT: fffffa8001b729e0

DRIVER_OBJECT: fffffa8001b63960

IMAGE_NAME:  vioscsi.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  50b73b33

MODULE_NAME: vioscsi

FAULTING_MODULE: fffffa6000cca000 vioscsi

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

BUGCHECK_STR:  0x9F

PROCESS_NAME:  System

CURRENT_IRQL:  2

LAST_CONTROL_TRANSFER:  from fffff8000170a76d to fffff800016af450

STACK_TEXT:  
fffff800`026ea9f8 fffff800`0170a76d : 00000000`0000009f 00000000`00000003 fffffa80`01b729e0 fffffa80`02d5d790 : nt!KeBugCheckEx
fffff800`026eaa00 fffff800`016b30dd : fffff800`026eaaf0 00000000`00000000 00000000`00000001 fffffa60`005efb00 : nt! ?? ::FNODOBFM::`string'+0x17cec
fffff800`026eaa70 fffff800`016b2818 : fffff800`026eacd0 fffffa60`022fe702 fffff800`026eacc8 00000000`00000001 : nt!KiTimerListExpire+0x30d
fffff800`026eaca0 fffff800`016b372f : 0000019a`fa3c774f 00000000`00000000 00000000`00000001 fffff800`017cba80 : nt!KiTimerExpiration+0x1d8
fffff800`026ead10 fffff800`016b38e2 : fffff800`017c8680 fffff800`017c8680 00000000`00000000 fffff800`017cdb80 : nt!KiRetireDpcList+0x1df
fffff800`026ead80 fffff800`01880860 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x62
fffff800`026eadb0 00000000`fffff800 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!zzz_AsmCodeRange_End+0x4
fffff800`026e40b0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00680000`00000000 : 0xfffff800


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

FAILURE_BUCKET_ID:  X64_0x9F_disk.sys_CNVIRP_IMAGE_vioscsi.sys

BUCKET_ID:  X64_0x9F_disk.sys_CNVIRP_IMAGE_vioscsi.sys

Followup: MachineOwner
---------
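
For triage across several dumps, the `BugCheck` summary line in the output above can be decoded with a short script. This is a sketch only: `parse_bugcheck` and `describe_9f` are hypothetical names, and the argument descriptions are copied from the `!analyze -v` text in this report, so they are assumed to apply only to code 9F with Arg1 equal to 3.

```python
import re

# Descriptions copied from the !analyze -v output in this report
# (valid for DRIVER_POWER_STATE_FAILURE with Arg1 == 3 only).
ARG_MEANINGS_9F_3 = (
    "A device object has been blocking an Irp for too long a time",
    "Physical Device Object of the stack",
    "nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack",
    "The blocked IRP",
)

# Matches lines such as: BugCheck 9F, {3, fffffa80..., fffffa80..., fffffa80...}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")


def parse_bugcheck(line):
    """Return (code, [arg1..argN]) as integers from a kd summary line."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("not a BugCheck summary line")
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


def describe_9f(line):
    """Pair each argument with its description; only handles 9F with Arg1 == 3."""
    code, args = parse_bugcheck(line)
    if code != 0x9F or len(args) != 4 or args[0] != 3:
        raise ValueError("only BugCheck 9F with Arg1 == 3 is covered by this sketch")
    return [(f"{value:016x}", meaning) for value, meaning in zip(args, ARG_MEANINGS_9F_3)]


if __name__ == "__main__":
    summary = "BugCheck 9F, {3, fffffa8001b729e0, fffffa8002d5d790, fffffa80018c5690}"
    for value, meaning in describe_9f(summary):
        print(value, meaning)
```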

Expected results:
The guest should complete S4 and reboot successfully without any error.

Additional info:

Comment 3 RHEL Program Management 2012-12-31 06:48:25 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 5 Gal Hammer 2013-03-14 15:46:23 UTC
Is it related to the bios fix (bug 846912)?

Comment 6 dawu 2013-03-15 03:12:33 UTC
(In reply to comment #5)
> Is it related to the bios fix (bug 846912)?

Hi Gal,

You are right. It looks more like bug 846519, but all of them should be related to the BIOS fix (bug 846912).

Thanks,
Best Regards,
Dawn

Comment 8 lijin 2013-07-24 08:26:04 UTC
Reproduced this issue with seabios-0.6.1.2-26.el6.x86_64 and virtio-win-prewhql-0.1-49.
Verified the fix with seabios-0.6.1.2-28.el6.x86_64 and virtio-win-prewhql-0.1-65.

Steps:
1. Boot the win2k8-64 guest:
/usr/libexec/qemu-kvm  \
-drive file=win2k8-64.qcow2,if=none,cache=none,media=disk,format=qcow2,id=drive-scsi-0 \
-device virtio-scsi-pci,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=drive-scsi-0,id=hd1 \
-usb -device usb-tablet \
-monitor stdio \
-chardev socket,id=aaaa,path=/tmp/tttt,server,nowait \
-mon chardev=aaaa,mode=readline \
-spice disable-ticketing,port=5900 -vga qxl \
-chardev file,path=/root/console.log,id=serial1 \
-device isa-serial,chardev=serial1,id=s1 \
-cpu 'Penryn' -M pc \
-smp 2,cores=2,threads=1,sockets=1 -m 2G \
-enable-kvm \
-drive file=disk1.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=none,aio=native,id=scsi-disk1 -device virtio-scsi-pci,id=bus1 -device scsi-hd,bus=bus1.0,drive=scsi-disk1,id=disk1,serial=test1 \
-drive file=disk2.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=writeback,aio=native,id=scsi-disk2 -device virtio-scsi-pci,id=bus2 -device scsi-hd,bus=bus2.0,drive=scsi-disk2,id=disk2,serial=test2 \
-drive file=disk3.qcow2,if=none,media=disk,format=qcow2,rerror=stop,werror=stop,cache=writethrough,aio=native,id=scsi-disk3 -device virtio-scsi-pci,id=bus3 -device scsi-hd,bus=bus3.0,drive=scsi-disk3,id=disk3,serial=test3 \
-global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. Run CrystalDiskMark on every disk with the largest option values, such as "9  4000M" for the system disk.

3. Let CrystalDiskMark run for a sufficiently long time, but go to the next step before the run ends.

4. Do S4 in the guest.

Actual Results:
On seabios-0.6.1.2-26.el6.x86_64, the guest hit a BSOD with code 9F (1 BSOD in 3 attempts).
On seabios-0.6.1.2-28.el6.x86_64, the guest works well with no BSOD (3 attempts).

Based on the above, this issue has already been fixed.

Comment 9 Mike Cao 2013-07-24 14:07:54 UTC
Thanks for the results.

Closing.

*** This bug has been marked as a duplicate of bug 896495 ***