Bug 921917
| Summary: | [whql][vioser][Balloon][9F] BSOD happened when ran job "Sleep and PNP (disable and enable) with IO Before and After (Certification)" on HCK on win2k8-32 | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | dawu | ||||
| Component: | virtio-win | Assignee: | Gal Hammer <ghammer> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 6.5 | CC: | acathrow, bcao, bsarathy, juzhang, kzhang, lijin, lnovich, mdeng, michen, qzhang, rhod | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2013-11-05 10:13:21 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 896495, 912287 | ||||||
| Attachments: |
Description
dawu
2013-03-15 09:17:15 UTC
Following is the bug dump analysis:
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 00000003, A device object has been blocking an Irp for too long a time
Arg2: 8a58d030, Physical Device Object of the stack
Arg3: 8a58d030, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
Arg4: 8a0d4eb8, The blocked IRP
Debugging Details:
------------------
*************************************************************************
*** ***
*** ***
*** Either you specified an unqualified symbol, or your debugger ***
*** doesn't have full symbol information. Unqualified symbol ***
*** resolution is turned off by default. Please either specify a ***
*** fully qualified symbol module!symbolname, or enable resolution ***
*** of unqualified symbols by typing ".symopt- 100". Note that ***
*** enabling unqualified symbol resolution with network symbol ***
*** server shares in the symbol path may cause the debugger to ***
*** appear to hang for long periods of time when an incorrect ***
*** symbol name is typed or the network symbol server is down. ***
*** ***
*** For some commands to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: USBHUB!_DEVICE_EXTENSION_PDO ***
*** ***
*************************************************************************
DRVPOWERSTATE_SUBCODE: 3
DRIVER_OBJECT: 8a40aad8
IMAGE_NAME: usbhub.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 49e01fe2
MODULE_NAME: usbhub
FAULTING_MODULE: 8a3bc000 usbhub
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
BUGCHECK_STR: 0x9F
PROCESS_NAME: System
CURRENT_IRQL: 2
STACK_TEXT:
8190aacc 818463bb 0000009f 00000003 8a58d030 nt!KeBugCheckEx+0x1e
8190ab28 81845fd8 8190ab94 803d3078 803d3000 nt!PopCheckIrpWatchdog+0x1ad
8190ab68 818bf26b 819234e0 00000000 0fabe140 nt!PopCheckForIdleness+0x343
8190ac88 818beea1 8190acd0 895b1402 8190acd8 nt!KiTimerListExpire+0x367
8190ace8 818bf595 00000000 00000000 00035839 nt!KiTimerExpiration+0x2a0
8190ad50 818bd7dd 00000000 0000000e 00000000 nt!KiRetireDpcList+0xba
8190ad54 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x49
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
FAILURE_BUCKET_ID: 0x9F_VRF_3_IMAGE_usbhub.sys
BUCKET_ID: 0x9F_VRF_3_IMAGE_usbhub.sys
Followup: MachineOwner

Guest CLI for comment #4:
/usr/libexec/qemu-kvm -m 6G -smp 4 -cpu cpu64-rhel6,+x2apic --nodefaults -drive file=win2k8-32-balloon-54.raw,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,cache=none,format=raw -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,mac=00:11:44:13:49:06,bus=pci.0,addr=0x4,id=net0 -uuid 22a8e40a-a410-4180-9406-e6b2d62435e3 -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/monitor-win2k8-32-balloon-54,server,nowait -mon chardev=111a,mode=readline -name win2k8-32-balloon-54 -vnc :1 -device virtio-balloon-pci,addr=0x6,bus=pci.0 -rtc base=localtime,clock=host,driftfix=slew -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vga cirrus

Tried to test this issue with two scenarios; still hit the same BSOD as comment #4:
1. Tested without cache=none
CLI:
/usr/libexec/qemu-kvm -m 6G -smp 4 -cpu cpu64-rhel6,+x2apic --nodefaults -drive file=win2k8-32-balloon-54.raw,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,format=raw -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,mac=00:11:44:13:49:06,bus=pci.0,addr=0x4,id=net0 -uuid 22a8e40a-a410-4180-9406-e6b2d62435e3 -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/monitor-win2k8-32-balloon-54,server,nowait -mon chardev=111a,mode=readline -name win2k8-32-balloon-54 -vnc :1 -device virtio-balloon-pci,addr=0x6,bus=pci.0 -rtc base=localtime,clock=host,driftfix=slew -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vga cirrus
2. Tried with bios.bin from bug https://bugzilla.redhat.com/show_bug.cgi?id=912561, still hit the same BSOD as described in comment #4.
CLI:
/usr/libexec/qemu-kvm -m 6G -smp 4 -cpu cpu64-rhel6,+x2apic --nodefaults -drive file=win2k8-32-balloon-54.raw,if=none,id=drive-ide0-0-0,werror=stop,rerror=stop,format=raw -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,id=hostnet0,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,mac=00:11:44:13:49:06,bus=pci.0,addr=0x4,id=net0 -uuid 22a8e40a-a410-4180-9406-e6b2d62435e3 -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/monitor-win2k8-32-balloon-54,server,nowait -mon chardev=111a,mode=readline -name win2k8-32-balloon-54 -vnc :1 -device virtio-balloon-pci,addr=0x6,bus=pci.0 -rtc base=localtime,clock=host,driftfix=slew -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vga cirrus -bios /home/bios.bin
Best Regards,
Dawn

(In reply to comment #4)
> According to logs from Comment #3, try to remove options related USB and add --nodefaults, still hit 9f BSOD,following is the details for analysis log:
> --------------------------------------------------------------------------------
> 0: kd> !analyze -v
> *******************************************************************************
> * *
> * Bugcheck Analysis *
> * *
> *******************************************************************************
>
> DRIVER_POWER_STATE_FAILURE (9f)
> A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
> Arguments:
> Arg1: 00000003, A device object has been blocking an Irp for too long a time
> Arg2: 892cc6b0, Physical Device Object of the stack
> Arg3: 897a5030, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
> Arg4: 92b3eed8, The blocked IRP
>
> Debugging Details:
> ------------------
>
>
> DRVPOWERSTATE_SUBCODE: 3
>
> IMAGE_NAME: pci.sys
>
> DEBUG_FLR_IMAGE_TIMESTAMP: 49e01a44
>
> MODULE_NAME: pci
>
> FAULTING_MODULE: 81e58000 pci
>
> DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
>
> BUGCHECK_STR: 0x9F
>
> PROCESS_NAME: System
>
> CURRENT_IRQL: 2
>
> STACK_TEXT:
> 8190facc 8184b3bb 0000009f 00000003 892cc6b0 nt!KeBugCheckEx+0x1e
> 8190fb28 8184afd8 8190fb94 81914878 81914800 nt!PopCheckIrpWatchdog+0x1ad
> 8190fb68 818c426b 819284e0 00000000 85c0b440 nt!PopCheckForIdleness+0x343
> 8190fc88 818c3e2b 8190fcd0 8a890802 8190fcd8 nt!KiTimerListExpire+0x367
> 8190fce8 818c4595 00000000 00000000 0002c858 nt!KiTimerExpiration+0x22a
> 8190fd50 818c27dd 00000000 0000000e 00000000 nt!KiRetireDpcList+0xba
> 8190fd54 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x49
>
>
> STACK_COMMAND: kb
>
> FOLLOWUP_NAME: MachineOwner
>
> FAILURE_BUCKET_ID: 0x9F_VRF_3_E1G60I32_IMAGE_pci.sys
>
> BUCKET_ID: 0x9F_VRF_3_E1G60I32_IMAGE_pci.sys
>
> Followup: MachineOwner
> ---------
>
FYI, the dump above is the same as in https://bugzilla.redhat.com/show_bug.cgi?id=929015, seen when we run the job on the serial driver.

*** Bug 929015 has been marked as a duplicate of this bug. ***

Tested in build 62, hit the same issue. I will upload the dump file later.

Created attachment 752495
build62 BSOD dump file

Removed "serial" from the bug's title for two reasons:
1. Each driver should have its own bug.
2. virtio-serial build 64 passed HCK tests on win2k8 without BSOD.

Because of Microsoft's recent changes regarding WHQL certification, we plan to stop our efforts to pass HCK tests for older Windows versions (XP, 2003 and 2008) and use the WLK tests.
Do you know what the status of the latest balloon driver is? Is it WLK certified?
Thanks.

(In reply to Gal Hammer from comment #17)
> Because of Microsoft's recent changes regarding WHQL certification, we plan to
> stop our efforts to pass HCK tests for older Windows versions (XP, 2003 and
> 2008) and use the WLK tests.
MSFT's new certification plan will only come into operation around next year. QE would like to keep our current test strategy of certifying all drivers on HCK 2.0 during the rhel6.5.0 stage.
> Do you know what the status of the latest balloon driver is? Is it WLK
> certified?
We will verify it soon.
Mike
>
> Thanks.

win2k8-32 hit the same issue on virtio-win-prewhql-65

(In reply to guo jiang from comment #19)
> win2k8-32 hit the same issue on virtio-win-prewhql-65
Was it 100% reproducible?

(In reply to Gal Hammer from comment #12)
> Removed "serial" from the bug's title for two reasons:
>
> 1. Each driver should have its own bug.
> 2. virtio-serial build 64 passed HCK tests on win2k8 without BSOD.
QE re-added "serial" to the bug's title because:
1. The issue does not seem to be related to the driver itself; both the balloon driver and the vioser driver hit it on win2k8-32.
2. Both balloon and vioserial can pass without S3/S4 enabled, while it is very easy for us to hit this issue when S3/S4 is enabled.
Mike

(In reply to Mike Cao from comment #20)
> (In reply to guo jiang from comment #19)
> > win2k8-32 hit the same issue on virtio-win-prewhql-65
>
> Was it 100% reproducible?
Hi, Mike
QE reproduced this issue 100% (4/4) with S3/S4 enabled.
Jiang Guo

We have a workaround that allows us to pass the WHQL test, so this is not a critical bug anymore. Will keep it open to investigate and find the root cause, but it won't be fixed in RHEL6.5.

win2k8-32 still hits the same issue with virtio-win-prewhql-67 on a rhel6.5 host.
Package version:
* Red Hat Enterprise Linux Server release 6.4 (Santiago)
* kernel-2.6.32-393.el6.x86_64
* qemu-kvm-rhev-0.12.1.2-2.377.el6.x86_64
* virtio-win-prewhql-0.1-66
* spice-server-0.12.0-14.el6.x86_64
* seabios-0.6.1.2-27.el6.x86_64
* vgabios-0.6b-3.7.el6.noarch
Boot CLI:
/usr/libexec/qemu-kvm -M rhel6.5.0 -m 6G -smp 8,cores=8 -cpu cpu64-rhel6,+x2apic,family=0xf -usb -device usb-tablet -drive file=win2k8-32.raw,if=none,id=drive-ide0-0-0,format=raw,rerror=stop,werror=stop,cache=none -device ide-drive,drive=drive-ide0-0-0,id=ide0-0-0-0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device e1000,netdev=hostnet0,mac=00:21:2c:13:a3:31,id=net0 -uuid 7c5dc2e7-ef73-4cf4-91c7-bbf2a6a924b6 -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/monitor-win2k8-32-serial-67,server,nowait -mon chardev=111a,mode=readline -name win2k8-32-serial-67 -vnc :1 -vga cirrus -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0 -chardev socket,id=channel0,path=/tmp/tt,server,nowait -device virtserialport,chardev=channel0,name=org.linux-kvm.port.0,bus=virtio-serial0.0 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio
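Since the thread reports that the BSOD only reproduces when S3/S4 are enabled in the guest, it can help to confirm mechanically that a given boot CLI actually carries the `-global PIIX4_PM.disable_s3=0` and `-global PIIX4_PM.disable_s4=0` options. The following is a minimal Python sketch, not part of the original report; the function names are hypothetical, and it assumes that a sleep state counts as enabled only when the corresponding `disable_sN` property is explicitly set to `0` on the command line.

```python
import shlex


def piix4_pm_globals(cmdline: str) -> dict:
    """Collect `-global PIIX4_PM.<prop>=<value>` settings from a qemu-kvm CLI string."""
    tokens = shlex.split(cmdline)
    settings = {}
    # Walk adjacent token pairs so each `-global` flag is matched with its argument.
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-global" and value.startswith("PIIX4_PM."):
            prop, _, val = value[len("PIIX4_PM."):].partition("=")
            settings[prop] = val
    return settings


def sleep_states_enabled(cmdline: str) -> dict:
    """Report whether S3/S4 are explicitly enabled on the command line.

    Assumption of this sketch: an absent setting is treated as "not enabled".
    """
    pm = piix4_pm_globals(cmdline)
    return {
        "S3": pm.get("disable_s3") == "0",
        "S4": pm.get("disable_s4") == "0",
    }


if __name__ == "__main__":
    cli = (
        "/usr/libexec/qemu-kvm -m 6G -smp 4 "
        "-global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 "
        "-monitor stdio -vga cirrus"
    )
    print(sleep_states_enabled(cli))
```

Running it against a CLI without the two `-global` options reports both states as not enabled, which matches the configuration in which the thread says the jobs pass.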
WinDbg info:
..............
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 9F, {3, 89e21828, 895dbbf8, 871d4e48}
Probably caused by : acpi.sys
Followup: MachineOwner
---------
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time (usually 10 minutes).
Arguments:
Arg1: 00000003, A device object has been blocking an Irp for too long a time
Arg2: 89e21828, Physical Device Object of the stack
Arg3: 895dbbf8, nt!TRIAGE_9F_POWER on Win7, otherwise the Functional Device Object of the stack
Arg4: 871d4e48, The blocked IRP
Debugging Details:
------------------
OVERLAPPED_MODULE: Address regions for 'hiber_dumpata' and 'vioser.sys' overlap
DRVPOWERSTATE_SUBCODE: 3
IMAGE_NAME: acpi.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 49e01a37
MODULE_NAME: acpi
FAULTING_MODULE: 81e03000 acpi
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
BUGCHECK_STR: 0x9F
PROCESS_NAME: System
CURRENT_IRQL: 2
STACK_TEXT:
8190eacc 8184a3bb 0000009f 00000003 89e21828 nt!KeBugCheckEx+0x1e
8190eb28 81849fd8 8190eb94 86a02078 86a02000 nt!PopCheckIrpWatchdog+0x1ad
8190eb68 818c326b 819274e0 00000000 9ca5b440 nt!PopCheckForIdleness+0x343
8190ec88 818c2e2b 8190ecd0 8aecdf02 8190ecd8 nt!KiTimerListExpire+0x367
8190ece8 818c3595 00000000 00000000 0002d383 nt!KiTimerExpiration+0x22a
8190ed50 818c17dd 00000000 0000000e 00000000 nt!KiRetireDpcList+0xba
8190ed54 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x49
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
FAILURE_BUCKET_ID: 0x9F_VRF_3_i8042prt_IMAGE_acpi.sys
BUCKET_ID: 0x9F_VRF_3_i8042prt_IMAGE_acpi.sys
Followup: MachineOwner
---------
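The dumps in this bug all share bugcheck 9F with subcode 3 but blame different images (usbhub.sys, pci.sys, acpi.sys), so when comparing several dumps it may be convenient to pull the four arguments out of the WinDbg summary line. Below is a small Python sketch under stated assumptions: the helper names are hypothetical, the argument labels are taken from the `!analyze -v` text quoted above, and the summary line is assumed to have the `BugCheck 9F, {a, b, c, d}` shape shown in this report.

```python
import re

# Labels follow the "Arguments:" section of the !analyze -v output for subcode 3.
ARG_LABELS_9F = (
    "subcode",
    "physical_device_object",
    "functional_device_object",  # nt!TRIAGE_9F_POWER on Win7
    "blocked_irp",
)


def parse_bugcheck(line: str):
    """Return (code, [args]) parsed from a 'BugCheck XX, {a, b, c, d}' line."""
    match = re.search(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if match is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


def describe_9f(line: str) -> dict:
    """Map the four arguments of a 0x9F bugcheck to the labels above."""
    code, args = parse_bugcheck(line)
    if code != 0x9F or len(args) != len(ARG_LABELS_9F):
        raise ValueError("expected a 0x9F bugcheck with four arguments")
    return dict(zip(ARG_LABELS_9F, args))


if __name__ == "__main__":
    info = describe_9f("BugCheck 9F, {3, 89e21828, 895dbbf8, 871d4e48}")
    for name, value in info.items():
        print("%s: %08x" % (name, value))
```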

WHQL tests for Windows 2008 no longer use HCK 2.0 and are done using WLK. AFAIK the bug doesn't occur on the WLK suite.