Bug 1119966

Summary: [whql][netkvm][RHEL6]guests bsod (0xd1) when running job "NDISTest 6.5 - [1 Machine] - StandardizedKeywords"
Product: Red Hat Enterprise Linux 7 Reporter: lijin <lijin>
Component: virtio-win Assignee: Yvugenfi <yvugenfi>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.2 CC: dfleytma, ghammer, hhuang, knoel, mdeng, michen, ovasik, rbalakri, virt-maint, vrozenfe, yvugenfi
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: Fixed_Not_Ship
Fixed In Version: Doc Type: Bug Fix
Doc Text:
NO_DOCS
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-24 08:42:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description lijin 2014-07-16 02:26:49 UTC
Description of problem:
Nearly all Windows guests BSOD with bugcheck 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) when running the job "NDISTest 6.5 - [1 Machine] - StandardizedKeywords".

Version-Release number of selected component (if applicable):
kernel-3.10.0-133.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev_0.2.x86_64
seabios-1.7.2.2-12.el7.x86_64
virtio-win-prewhql-87
spice-server-0.12.4-5.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guests with:
nic1:
/usr/libexec/qemu-kvm -name 087NIC2012R2C9C -enable-kvm -m 6G -smp 8 -uuid 4775d1b0-da3d-42b0-a036-f98e6982b046 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/087NIC2012R2C9C,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=087NIC2012R2C9C,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=087NIC2012R2C9C.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:17:60:b1:08,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:0 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,mq=on,id=net1,mac=00:52:67:0b:c8:c7,bus=pci.0

nic2:
/usr/libexec/qemu-kvm -name 087NIC2012R2SWW -enable-kvm -m 6G -smp 8 -uuid 716167d9-bba9-4c38-a6ce-00bfafcac3c0 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/tmp/087NIC2012R2SWW,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime,driftfix=slew -boot order=cd,menu=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=087NIC2012R2SWW,if=none,id=drive-ide0-0-0,format=raw,serial=mike_cao,cache=none -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive file=en_windows_server_2012_r2_x64_dvd_2707946.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=087NIC2012R2SWW.vfd,if=none,id=drive-fdc0-0-0,format=raw,cache=none -global isa-fdc.driveA=drive-fdc0-0-0 -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=00:52:56:25:06:25,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=isa_serial0 -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -vga cirrus -netdev tap,script=/etc/qemu-ifup-private,downscript=no,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,mq=on,id=net1,mac=00:52:3a:5a:c7:6c,bus=pci.0

2. Submit the job in HCK 2.1.

Actual results:
The guest BSODs and the job fails.

Expected results:
The job passes without a guest BSOD.

Additional info:

windbg info:
1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000028, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff8000065807a, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: fffff803ac739360: Unable to get special pool info
fffff803ac739360: Unable to get special pool info
 0000000000000028 

CURRENT_IRQL:  2

FAULTING_IP: 
NDIS!NdisAcquireRWLockRead+3a
fffff800`0065807a 488b74c720      mov     rsi,qword ptr [rdi+rax*8+20h]

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

BUGCHECK_STR:  AV

PROCESS_NAME:  System

TRAP_FRAME:  ffffd000207ff5d0 -- (.trap 0xffffd000207ff5d0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000001 rbx=0000000000000000 rcx=0000000000000000
rdx=ffffd000207ff7b8 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8000065807a rsp=ffffd000207ff760 rbp=0000000000000010
 r8=0000000000000001  r9=0000000000000010 r10=ffffe000034a2000
r11=ffffd000207ffc50 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na pe nc
NDIS!NdisAcquireRWLockRead+0x3a:
fffff800`0065807a 488b74c720      mov     rsi,qword ptr [rdi+rax*8+20h] ds:00000000`00000028=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff803ac5dfbe9 to fffff803ac5d40a0

STACK_TEXT:  
ffffd000`207ff488 fffff803`ac5dfbe9 : 00000000`0000000a 00000000`00000028 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
ffffd000`207ff490 fffff803`ac5de43a : 00000000`00000000 ffffd000`207ff7b8 00000000`00000000 ffffd000`207ff5d0 : nt!KiBugCheckDispatch+0x69
ffffd000`207ff5d0 fffff800`0065807a : ffffd000`21b8e000 00000000`00000001 00000000`00001214 00000000`00000007 : nt!KiPageFault+0x23a
ffffd000`207ff760 fffff800`01e43a81 : ffffe000`03499be0 00000000`00000010 00000000`00000000 00000000`00000007 : NDIS!NdisAcquireRWLockRead+0x3a
ffffd000`207ff790 fffff800`01e370f1 : 0003eeb0`00783000 00000000`00000000 ffffe000`0e0fd990 00000000`00000000 : netkvm+0xfa81
ffffd000`207ff7d0 fffff800`01e41183 : ffffe000`034761a0 fffff803`00000040 ffffd000`207ff969 00000000`00000000 : netkvm+0x30f1
ffffd000`207ff810 fffff800`006585f1 : ffffe000`00000000 00001f80`00000045 00000000`000f51f8 ffffffff`ffd07e50 : netkvm+0xd183
ffffd000`207ff8a0 fffff803`ac507e90 : ffffd000`207ffc60 ffffffff`ff09de3e 00000000`000f51f8 fffff803`ac431b7d : NDIS!ndisInterruptDpc+0x1b2
ffffd000`207ff9d0 fffff803`ac507111 : fffff803`ac5e0c8f 00000000`00000001 fffff803`ac78a000 fffff803`ac484000 : nt!KiExecuteAllDpcs+0x1b0
ffffd000`207ffb20 fffff803`ac5d7bea : ffffd000`207d5180 ffffd000`207d5180 ffffd000`207e1200 ffffe000`000b0780 : nt!KiRetireDpcList+0xe1
ffffd000`207ffda0 00000000`00000000 : ffffd000`20800000 ffffd000`207fa000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x5a


STACK_COMMAND:  kb

FOLLOWUP_IP: 
netkvm+fa81
fffff800`01e43a81 837b0402        cmp     dword ptr [rbx+4],2

SYMBOL_STACK_INDEX:  4

SYMBOL_NAME:  netkvm+fa81

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: netkvm

IMAGE_NAME:  netkvm.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  53bb6d3f

FAILURE_BUCKET_ID:  AV_VRF_netkvm+fa81

BUCKET_ID:  AV_VRF_netkvm+fa81

Followup: MachineOwner
---------
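
The faulting address in Arg1 can be cross-checked against the faulting instruction and the trap-frame registers. The sketch below is only a worked calculation, not part of the original analysis: it takes rdi and rax from the TRAP_FRAME above (which WinDbg warns may be zeroed or incorrect) and applies the usual x86-64 base + index*scale + displacement rule. The helper name effective_address is hypothetical.

```python
# Register values copied from the TRAP_FRAME in the dump above.
# WinDbg notes these may be zeroed or incorrect, so treat this as a sanity check.
rdi = 0x0000000000000000   # base register in [rdi+rax*8+20h]
rax = 0x0000000000000001   # index register
SCALE = 8                  # the "*8" in the addressing mode
DISPLACEMENT = 0x20        # the "+20h" in the addressing mode


def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """Hypothetical helper: x86-64 effective address, wrapped to 64 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFFFFFFFFFF


fault_addr = effective_address(rdi, rax, SCALE, DISPLACEMENT)
print(hex(fault_addr))  # 0x28, matching Arg1 (memory referenced)
```

If the trap-frame values are right, the base register is zero, which would be consistent with a NULL pointer being dereferenced inside NdisAcquireRWLockRead at IRQL 2. Whether netkvm passed a NULL lock is an inference from the dump, not something this report confirms.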

Comment 3 lijin 2014-07-16 03:13:27 UTC
Guests also hit this issue when running:
1810: "NDISTest 6.0\NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv (Multi-Group Win8+)"
1789: "NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv"

The dump files are also uploaded in the same place as in comment 1.

Comment 4 Min Deng 2014-07-18 05:25:14 UTC
(In reply to lijin from comment #3)
> Guests also hit this issue when running:
> 
> The dump files are also uploaded in the same place as in comment 1.

 Please also take a look at the win8-64 guest, which has the same issue with:
 "NDISTest 6.5 - [1 Machine] - StandardizedKeywords"
 "NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv (Multi-Group Win8+)"
 For the win7-32 guest:
 "NDISTest 6.5 - [1 Machine] - StandardizedKeywords"

Comment 6 Shuang Yu 2014-07-22 03:08:56 UTC
Please also take a look at the win2012 guest, which has the same issue with:
 NDISTest 6.5 - [1 Machine] - StandardizedKeywords
 NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv
For the win2008R2 guest:
 NDISTest 6.5 - [1 Machine] - StandardizedKeywords
 NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv
For the win8.1-32 guest:
 NDISTest 6.5 - [1 Machine] - StandardizedKeywords
 NDISTest 6.0 - [2 Machine] - 2c_Mini6RSSSendRecv

Comment 12 lijin 2015-03-31 08:49:16 UTC
The job passed on all guests with build 101.

So this issue has been fixed.

Comment 15 lijin 2015-07-17 07:04:22 UTC
Changing status to VERIFIED according to comment 12.

Comment 17 errata-xmlrpc 2015-11-24 08:42:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2513.html