Bug 1126378
Summary: | [virtio-win][vioscsi][rhel6]win2012 guest bsod(d1) when shutdown guest with multi virtio-scsi devices on the same scsi controller | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | lijin <lijin> |
Component: | virtio-win | Assignee: | Vadim Rozenfeld <vrozenfe> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.1 | CC: | famz, hhuang, knoel, lijin, michen, rbalakri, virt-maint |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
Attaching more than four LUNs to the same virtio-scsi controller can cause a BSOD.
Consequence:
Under some circumstances, the VM crashes with a BSOD.
Fix:
Add a pending SRB queue that keeps unsubmitted SRBs for future processing instead of failing them.
Result:
The virtio-scsi device driver can service up to 254 LUNs attached to the same controller.
*NOTE this bug has the same Doc context as bz#1195920 because these two bugs are closely related and were fixed with the same patch.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2015-11-24 08:43:24 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
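The Fix field of the Doc Text above describes keeping unsubmitted SRBs on a pending queue instead of failing them. A minimal Python sketch of that idea follows. It is not the vioscsi driver code: `BoundedRing`, `Adapter` and `run_demo` are hypothetical names made up for illustration, and the sketch assumes a ring with a fixed number of slots that frees one slot per completion.

```python
from collections import deque


class BoundedRing:
    """Toy stand-in for a queue with a fixed number of request slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_flight = []

    def try_add(self, request):
        # Refuse the request when every slot is taken.
        if len(self.in_flight) >= self.capacity:
            return False
        self.in_flight.append(request)
        return True

    def complete_one(self):
        # Complete the oldest in-flight request, or return None if idle.
        return self.in_flight.pop(0) if self.in_flight else None


class Adapter:
    """Parks requests that do not fit in the ring instead of failing them."""

    def __init__(self, ring):
        self.ring = ring
        self.pending = deque()

    def submit(self, request):
        # Queue behind earlier pending requests so ordering is preserved.
        if self.pending or not self.ring.try_add(request):
            self.pending.append(request)

    def on_completion(self):
        done = self.ring.complete_one()
        # A slot may have been freed: move pending requests into the ring.
        while self.pending and self.ring.try_add(self.pending[0]):
            self.pending.popleft()
        return done


def run_demo(capacity=2, count=5):
    """Submit more requests than the ring holds and drain them all."""
    adapter = Adapter(BoundedRing(capacity))
    for n in range(count):
        adapter.submit(f"srb{n}")
    completed = []
    while True:
        done = adapter.on_completion()
        if done is None:
            break
        completed.append(done)
    return completed


if __name__ == "__main__":
    print(run_demo())
```

In this toy model every submitted request eventually completes even when more requests arrive than the ring has slots, which is the behavior the Fix field asks for.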
Description
lijin
2014-08-04 09:52:43 UTC
A win2012 guest hit a similar issue while running Iometer on multiple SCSI disks: the guest BSODs with the "d1" error code.

How reproducible:
2/3

Testing matrix:
1. NTFS, cache=none|writethrough|writeback|unsafe
2. FAT32, cache=none|writethrough|writeback|unsafe
3. FAT, cache=none|writethrough|writeback|unsafe

Steps:
1. Boot the guest with one SCSI system disk and 12 data disks:
/usr/libexec/qemu-kvm \
-drive file=win2k12-scsi-iso.qcow2,if=none,cache=none,media=disk,format=qcow2,werror=stop,rerror=stop,id=drive-scsi-0 \
-device virtio-scsi-pci,id=scsi0 \
-device scsi-hd,id=disk-scsi-0,drive=drive-scsi-0,bootindex=1 \
-global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
-usb -device usb-tablet -monitor stdio \
-chardev file,path=/root/console.log,id=serial1 -device isa-serial,chardev=serial1,id=s1 \
-cpu SandyBridge -M rhel6.6.0 -smp 4 -m 4G -enable-kvm \
-qmp tcp:0:4444,server,nowait -vnc :0 -vga cirrus \
-device virtio-scsi-pci,id=scsi1 \
-drive file=ntfs-none.raw,if=none,cache=none,media=disk,format=raw,id=drive-scsi-1 \
-device scsi-hd,id=disk-scsi-1,drive=drive-scsi-1 \
-drive file=ntfs-back.raw,if=none,cache=writeback,media=disk,format=raw,id=drive-scsi-2 \
-device scsi-hd,id=disk-scsi-2,drive=drive-scsi-2 \
-drive file=ntfs-through.raw,if=none,cache=writethrough,media=disk,format=raw,id=drive-scsi-3 \
-device scsi-hd,id=disk-scsi-3,drive=drive-scsi-3 \
-drive file=ntfs-unsafe.raw,if=none,cache=unsafe,media=disk,format=raw,id=drive-scsi-4 \
-device scsi-hd,id=disk-scsi-4,drive=drive-scsi-4 \
-drive file=fat-none.raw,if=none,cache=none,media=disk,format=raw,id=drive-scsi-5 \
-device scsi-hd,id=disk-scsi-5,drive=drive-scsi-5 \
-drive file=fat-back.raw,if=none,cache=writeback,media=disk,format=raw,id=drive-scsi-6 \
-device scsi-hd,id=disk-scsi-6,drive=drive-scsi-6 \
-drive file=fat-through.raw,if=none,cache=writethrough,media=disk,format=raw,id=drive-scsi-7 \
-device scsi-hd,id=disk-scsi-7,drive=drive-scsi-7 \
-drive file=fat-unsafe.raw,if=none,cache=unsafe,media=disk,format=raw,id=drive-scsi-8 \
-device scsi-hd,id=disk-scsi-8,drive=drive-scsi-8 \
-drive file=fat32-none.raw,if=none,cache=none,media=disk,format=raw,id=drive-scsi-9 \
-device scsi-hd,id=disk-scsi-9,drive=drive-scsi-9 \
-drive file=fat32-back.raw,if=none,cache=writeback,media=disk,format=raw,id=drive-scsi-10 \
-device scsi-hd,id=disk-scsi-10,drive=drive-scsi-10 \
-drive file=fat32-through.raw,if=none,cache=writethrough,media=disk,format=raw,id=drive-scsi-11 \
-device scsi-hd,id=disk-scsi-11,drive=drive-scsi-11 \
-drive file=fat32-unsafe.raw,if=none,cache=unsafe,media=disk,format=raw,id=drive-scsi-12 \
-device scsi-hd,id=disk-scsi-12,drive=drive-scsi-12
2. Format the disks as NTFS/FAT32/FAT.
3. Run Iometer on all of those disks together (including the system disk).

Actual result:
The guest BSODs with d1.

The WinDbg info:
3: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: fffff98000030008, memory referenced
Arg2: 0000000000000007, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff8800103683b, address which referenced memory

Debugging Details:
------------------

READ_ADDRESS: fffff98000030008

CURRENT_IRQL: 7

FAULTING_IP:
vioscsi+283b
fffff880`0103683b 8b4608 mov eax,dword ptr [rsi+8]

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: AV

PROCESS_NAME: System

TAG_NOT_DEFINED_c000000f: FFFFF88002D01FB0

TRAP_FRAME: fffffa8005921b00 -- (.trap 0xfffffa8005921b00)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=0000000000000000 rsp=0100050500000000 rbp=fffffa8005921b00
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up di pl nz na pe nc
00000000`00000000 ?? ???
Resetting default scope

EXCEPTION_RECORD: fffff800a7f4e12a -- (.exr 0xfffff800a7f4e12a)
ExceptionAddress: 20418b2b74c98548
ExceptionCode: 8348c033
ExceptionFlags: 90c328c4
NumberParameters: 611631237
Parameter[0]: 0f41c88b44000002
Parameter[1]: 0fca3ac2b60f08b6
Parameter[2]: 0000c0c08149c147
Parameter[3]: e675c9ff49d08a00
Parameter[4]: 8348909090c3c28a
Parameter[5]: 3b43b5e8c93328ec
Parameter[6]: c328c48348c03300
Parameter[7]: d18b4cc933459090
Parameter[8]: 0280fa813774d285
Parameter[9]: f30d8d4838730000
Parameter[10]: 81048bc28b00217b
Parameter[11]: eac1d08b2874c085
Parameter[12]: 0fc0b60f443f2406
Parameter[13]: 4c08ca548b49cab7
Parameter[14]: 41c1920f41c2a30f

LAST_CONTROL_TRANSFER: from fffff800a7e87369 to fffff800a7e88040

STACK_TEXT:
fffff880`02cfa428 fffff800`a7e87369 : 00000000`0000000a fffff980`00030008 00000000`00000007 00000000`00000000 : nt!KeBugCheckEx
fffff880`02cfa430 fffff800`a7e85be0 : 00000000`00000000 fffffa80`05ac0620 fffff800`a8103100 fffff880`02cfa570 : nt!KiBugCheckDispatch+0x69
fffff880`02cfa570 fffff880`0103683b : fffffa80`04f76c70 fffff880`0371a3e0 fffffa80`04f76c70 00000000`00000063 : nt!KiPageFault+0x260
fffff880`02cfa700 fffff800`a7e81106 : fffff880`02cd65c8 fffff880`0000086c fffff880`0316ac00 00000000`ffffffff : vioscsi+0x283b
fffff880`02cfa730 fffff800`a858bfdf : fffff800`a7f4e12a fffffa80`05921c70 fffffa80`05921b00 fffff880`02cdce40 : nt!KiInterruptDispatch+0x1d6
fffff880`02cfa8c8 fffff800`a7f4e12a : fffffa80`05921c70 fffffa80`05921b00 fffff880`02cdce40 fffff880`016b3a5b : hal!HalProcessorIdle+0xf
fffff880`02cfa8d0 fffff800`a7eb69a0 : fffff880`02cd65c8 00000000`00369e99 00000000`00000000 00000000`00000000 : nt!PpmIdleDefaultExecute+0xa
fffff880`02cfa900 fffff800`a7eb53c0 : 00000000`00000000 00010e15`00010e15 fffff880`02cfab68 fffff880`02cfab70 : nt!PpmIdleExecuteTransition+0x47f
fffff880`02cfab20 fffff800`a7eb354c : fffff880`02cd1180 fffff880`02cd1180 00000000`00000000 fffff880`02cdce40 : nt!PoIdle+0x1e0
fffff880`02cfac60 00000000`00000000 : fffff880`02cfb000 fffff880`02cf5000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x2c

STACK_COMMAND: kb

FOLLOWUP_IP:
vioscsi+283b
fffff880`0103683b 8b4608 mov eax,dword ptr [rsi+8]

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: vioscsi+283b

FOLLOWUP_NAME: MachineOwner

IMAGE_NAME: Unknown_Image

DEBUG_FLR_IMAGE_TIMESTAMP: 0

BUCKET_ID: RAISED_IRQL_USER_FAULT

MODULE_NAME: Unknown_Module

Followup: MachineOwner
---------

Move back to rhel7.1.0 as Fam can 100% reproduce it when debugging https://bugzilla.redhat.com/show_bug.cgi?id=1125796

Please test with the latest virtio-win driver

https://brewweb.devel.redhat.com/buildinfo?buildID=361029

It has some improvements on virtio-scsi driver, and also fixes bug 1125796.

(In reply to Fam Zheng from comment #5)
> Please test with the latest virtio-win driver
>
> https://brewweb.devel.redhat.com/buildinfo?buildID=361029
>
> It has some improvements on virtio-scsi driver, and also fixes bug 1125796.

Retried the scenario in comment #3 five times with virtio-win-prewhql-89: Iometer
finishes without any error, and the guest works fine.

(In reply to lijin from comment #6)
> (In reply to Fam Zheng from comment #5)
> > Please test with the latest virtio-win driver
> >
> > https://brewweb.devel.redhat.com/buildinfo?buildID=361029
> >
> > It has some improvements on virtio-scsi driver, and also fixes bug 1125796.
>
> Retried the scenario in comment #3 five times with virtio-win-prewhql-89: Iometer
> finishes without any error, and the guest works fine.

Since we would like to ship -88, can you retest this on virtio-win-prewhql-88?

Thanks,
Mike

(In reply to Mike Cao from comment #7)
> (In reply to lijin from comment #6)
> > (In reply to Fam Zheng from comment #5)
> > > Please test with the latest virtio-win driver
> > >
> > > https://brewweb.devel.redhat.com/buildinfo?buildID=361029
> > >
> > > It has some improvements on virtio-scsi driver, and also fixes bug 1125796.
> >
> > Retried the scenario in comment #3 five times with virtio-win-prewhql-89: Iometer
> > finishes without any error, and the guest works fine.
>
> Since we would like to ship -88, can you retest this on
> virtio-win-prewhql-88?
>
> Thanks,
> Mike

Retested with build 88; the guest still BSODs with d1.

As the guest only BSODed once in comment 0's scenario, it's not easy to reproduce
this issue; both b86 and b93 work fine during my reproduction attempts.
But in comment 3's scenario (nearly 100% reproducible), the guest BSODs with b86 and
works fine with b93. Hope this info can help.

package info:
virtio-win-prewhql-86/93
qemu-kvm-rhev-0.12.1.2-2.444.el6.x86_64
kernel-2.6.32-492.el6.x86_64
seabios-0.6.1.2-28.el6.x86_64
spice-server-0.12.4-11.el6.x86_64

(In reply to lijin from comment #12)
> As the guest only BSODed once in comment 0's scenario, it's not easy to reproduce

It should be easier to reproduce this problem under heavy read/write load.
Let's say, copying a huge file to 5 disks simultaneously while shutting the guest down.

> this issue; both b86 and b93 work fine during my reproduction attempts.
> But in comment 3's scenario (nearly 100% reproducible), the guest BSODs with b86 and
> works fine with b93. Hope this info can help.
>
> package info:
> virtio-win-prewhql-86/93
> qemu-kvm-rhev-0.12.1.2-2.444.el6.x86_64
> kernel-2.6.32-492.el6.x86_64
> seabios-0.6.1.2-28.el6.x86_64
> spice-server-0.12.4-11.el6.x86_64

(In reply to Vadim Rozenfeld from comment #13)
> (In reply to lijin from comment #12)
> > As the guest only BSODed once in comment 0's scenario, it's not easy to reproduce
>
> It should be easier to reproduce this problem under heavy read/write load.
> Let's say, copying a huge file to 5 disks simultaneously while shutting the
> guest down.

Thanks Vadim, it works.
The guest BSODs with build 86 and works fine with build 93.

Steps:
1. Boot a win2012 guest with 256 disks (one 50G system disk, 255 500M data disks)
2. Format several disks (about 11 disks)
3. Run Iometer on those disks
4. Shut down the guest

> > this issue; both b86 and b93 work fine during my reproduction attempts.
> > But in comment 3's scenario (nearly 100% reproducible), the guest BSODs with b86 and
> > works fine with b93. Hope this info can help.
> >
> > package info:
> > virtio-win-prewhql-86/93
> > qemu-kvm-rhev-0.12.1.2-2.444.el6.x86_64
> > kernel-2.6.32-492.el6.x86_64
> > seabios-0.6.1.2-28.el6.x86_64
> > spice-server-0.12.4-11.el6.x86_64

Reproduced this issue on virtio-win-prewhql-86
Verified this issue on virtio-win-prewhql-100

package info:
qemu-kvm-rhev-0.12.1.2-2.454.el6.x86_64
kernel-2.6.32-539.el6.x86_64
seabios-0.6.1.2-29.el6.x86_64

Steps: same as comment #14

Actual Results:
on virtio-win-prewhql-86, run twice, guest BSOD once;
on virtio-win-prewhql-100, run five times, guest can shut down correctly, no BSOD

Based on the above, this issue has been fixed already.

Reproduced this issue on virtio-win-prewhql-86
Verified this issue on virtio-win-prewhql-102

package info:
qemu-kvm-rhev-0.12.1.2-2.458.el6.x86_64
kernel-2.6.32-540.el6.x86_64
seabios-0.6.1.2-29.el6.x86_64

Steps: same as comment #14

Actual Results:
on virtio-win-prewhql-86, BSOD happens when shutting down the guest while running the Iometer test;
on virtio-win-prewhql-102, run five times, guest can shut down correctly, no BSOD

Based on the above, this issue has been fixed already.

The problem was fixed together with bz#1195920
Can we close it or move to verified?

Thanks,
Vadim.

(In reply to Vadim Rozenfeld from comment #21)
> The problem was fixed together with bz#1195920
> Can we close it or move to verified?

Yes. According to comment #20, this issue has been fixed.
Change the status to verified

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2513.html
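
The reproduction command line in the description repeats one -drive/-device pair per data disk, following the testing matrix of three filesystems by four cache modes. A minimal Python sketch that regenerates those 12 pairs is given here for convenience. The image names, option strings and IDs are copied from the description; the function name `data_disk_args` and the constant names are hypothetical, and the sketch only builds the argument list, it does not launch qemu-kvm.

```python
# Filesystem prefixes and cache modes, in the order used in the description.
FILESYSTEMS = ["ntfs", "fat", "fat32"]
# (cache= value, suffix used in the image file name)
CACHE_MODES = [
    ("none", "none"),
    ("writeback", "back"),
    ("writethrough", "through"),
    ("unsafe", "unsafe"),
]


def data_disk_args(first_index=1):
    """Return the -drive/-device arguments for the 12 data disks."""
    args = []
    index = first_index
    for fs in FILESYSTEMS:
        for cache, suffix in CACHE_MODES:
            args += [
                "-drive",
                f"file={fs}-{suffix}.raw,if=none,cache={cache},"
                f"media=disk,format=raw,id=drive-scsi-{index}",
                "-device",
                f"scsi-hd,id=disk-scsi-{index},drive=drive-scsi-{index}",
            ]
            index += 1
    return args


if __name__ == "__main__":
    # Print one option and its value per line for easy comparison.
    args = data_disk_args()
    for option, value in zip(args[0::2], args[1::2]):
        print(option, value)
```

The list can be appended to the fixed part of the command line (system disk, controllers, machine options) when rebuilding the test setup.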