Bug 590596

Summary: Ethernet NDIS Test 6.0 BSOD for WHQL testing
Product: Red Hat Enterprise Linux 6 Reporter: Xiaoli Tian <xtian>
Component: virtio-win Assignee: Yan Vugenfirer <yvugenfi>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: jiabwang, llim, szhou, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-05-19 08:25:09 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
Description                                                   Flags
screenshot of BSOD                                            none
log of job running before BSOD                                none
Minidump file of BSOD                                         none
Debug analysis of kernel memory dump file for Test Server     none
Debug analysis of kernel memory dump file for Test Client     none

Description Xiaoli Tian 2010-05-10 09:37:40 UTC
Created attachment 412770 [details]
screenshot of BSOD

Description of problem:
A BSOD occurs while running the job that starts the NDIS test client. The log of the job that ran before the BSOD, a screenshot, and the minidump file are attached.

Version-Release number of selected component (if applicable):
virtio-win driver version: 05/03/2010, 6.0.209.605
kernel: 2.6.32-19.el6, x86_64

How reproducible:
always

Steps to Reproduce:
1. Prepare for WHQL network testing.
2. Start running NDIS Test 6.0.
  
Actual results:
BSOD

Expected results:
Pass

Additional info:
If more information is needed, I can analyze the dump files with the debugging tools and upload the results.

Comment 1 Xiaoli Tian 2010-05-10 09:40:22 UTC
Created attachment 412771 [details]
log of job running before BSOD

Comment 2 Xiaoli Tian 2010-05-10 09:44:47 UTC
Created attachment 412772 [details]
Minidump file of BSOD

Comment 3 Yan Vugenfirer 2010-05-10 09:49:50 UTC
(In reply to comment #1)
> Created an attachment (id=412771) [details]
> log of job running before BSOD    

Please attach the log of the failing job (the previous job's log is OK and is not related to the failure).
Which sub-case was it exactly?

Comment 5 Yan Vugenfirer 2010-05-10 10:08:16 UTC
Please set the system to save a kernel dump and rerun the test.
The kernel dump should be saved in %windir%\memory.dmp

For now it looks like a crash in the MS test driver, but I will need to see the kernel memory dump to draw further conclusions.

Comment 6 RHEL Product and Program Management 2010-05-10 12:10:15 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 7 Xiaoli Tian 2010-05-11 02:30:40 UTC
Created attachment 413025 [details]
Debug analysis of kernel memory dump file for Test Server

Comment 8 Xiaoli Tian 2010-05-11 02:31:34 UTC
Created attachment 413026 [details]
Debug analysis of kernel memory dump file for Test Client

Comment 9 Yaniv Kaul 2010-05-11 06:25:09 UTC
1. It would be helpful to always include the exact qemu command line.
2. It would be helpful to actually post the resulting stack here, so that if we see a similar stack in the future, we can search for it in Bugzilla. As an attachment it's quite difficult to search. Remove all the 'symbols missing' alerts (and get symbols!)
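
A minimal Python sketch of the cleanup asked for above, before pasting debugger output into a comment. It assumes the symbol alerts are the lines the debugger prefixes with `*** ERROR:` or `*** WARNING:`; the function name `strip_symbol_alerts` and the sample text are hypothetical and not taken from the attachments.

```python
def strip_symbol_alerts(debugger_output: str) -> str:
    """Drop symbol alert lines and blank padding lines from pasted debugger output."""
    noise_prefixes = ("*** ERROR:", "*** WARNING:")  # assumed alert prefixes
    kept = []
    for line in debugger_output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip the blank lines that pad the paste
        if stripped.startswith(noise_prefixes):
            continue  # symbol alert, not useful when searching for a stack
        kept.append(stripped)
    return "\n".join(kept)


# Hypothetical sample: two alert lines followed by two fields worth keeping.
sample = """
*** WARNING: Unable to verify timestamp for ndistest.sys
*** ERROR: Module load completed but symbols could not be loaded for ndistest.sys

BUGCHECK_STR:  0x1E

SYMBOL_NAME:  ndistest+2b625
"""
cleaned = strip_symbol_alerts(sample)
print(cleaned)
```

Only the alert lines and blank lines are removed; every other line is kept verbatim so the stack stays searchable.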

Comment 10 Xiaoli Tian 2010-05-11 07:48:19 UTC
(In reply to comment #9)
> 1. It would be helpful to always include the exact qemu command line.
> 2. It would be helpful to actually post here the result of the stack - so if in
> the future we'll have a similar stack, we can search for it in Bugzilla. As
> attachment it's quite difficult. Remove all the 'symbols missing' alerts (and
> get symbols!)    

Thanks, I will post the results here directly in the future.
1)qemu command:

2)analyzed result:
*************************************************************************
Probably caused by : ndistest.sys ( ndistest+2b625 )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: 0000000000000000, The exception code that was not handled
Arg2: 0000000000000000, The address that the exception occurred at
Arg3: 0000000000000000, Parameter 0 of the exception
Arg4: 0000000000000000, Parameter 1 of the exception

Debugging Details:


ADDITIONAL_DEBUG_TEXT:  

Use '!findthebuild' command to search for the target build information.

If the build information is available, run '!findthebuild -s ; .reload' to set symbol path and load symbols.



FAULTING_MODULE: fffff80001406000 nt



DEBUG_FLR_IMAGE_TIMESTAMP:  4b082765



EXCEPTION_CODE: (Win32) 0 (0) - The operation completed successfully.



FAULTING_IP: 

+0

00000000`00000000 ??              ???



EXCEPTION_PARAMETER1:  0000000000000000



EXCEPTION_PARAMETER2:  0000000000000000



DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT



BUGCHECK_STR:  0x1E



CURRENT_IRQL:  0



LAST_CONTROL_TRANSFER:  from fffff8000146e32e to fffff800014765d0



STACK_TEXT:  

fffff800`01319338 fffff800`0146e32e : fffffa80`0521b718 fffff800`013194c0 fffff800`01319ab0 fffff800`014a3524 : nt!KeBugCheck

fffff800`01319340 fffff800`0149c2ed : fffff800`01684b88 fffff800`015bda40 fffff800`01406000 fffff800`0131a248 : nt!KiCpuId+0x41e

fffff800`01319370 fffff800`014a3950 : fffff800`015c5b24 fffff800`013193e8 fffff800`0131a248 fffff800`01406000 : nt!KeReleaseQueuedSpinLock+0xdd

fffff800`013193a0 fffff800`014b08df : fffff800`0131a248 fffff800`01319ab0 fffff800`00000000 00000000`00000000 : nt!FsRtlLookupLastBaseMcbEntry+0x4d0

fffff800`01319a80 fffff800`01475c42 : fffff800`0131a248 fffff800`0131a618 fffff800`0131a2f0 00000000`00000000 : nt!FsRtlInitializeBaseMcbEx+0x430b

fffff800`0131a110 fffff800`01473a74 : fffff800`0131a318 fffff880`01786801 00000000`00000000 fffff880`0469c5fa : nt!KeSynchronizeExecution+0x3e32

fffff800`0131a2f0 fffff880`0469c625 : fffff880`0469b8e5 fffff880`08000000 fffffa80`069cec30 00000000`00000065 : nt!KeSynchronizeExecution+0x1c64

fffff800`0131a488 fffff880`0469b8e5 : fffff880`08000000 fffffa80`069cec30 00000000`00000065 00000000`00000003 : ndistest+0x2b625

fffff800`0131a490 fffff800`014826a6 : fffffa80`069ced40 fffffa80`06e2cce0 00000000`078ae162 00000000`01cae7c8 : ndistest+0x2a8e5

fffff800`0131a570 fffff800`01481a26 : fffffa80`06e22c68 fffffa80`06e22c68 00000000`00000000 00000000`00000000 : nt!ExReleaseResourceAndLeavePriorityRegion+0xbe

fffff800`0131a5e0 fffff800`0148257e : 00000013`ce76f5a2 fffff800`0131ac58 00000000`00084eb5 fffff800`015f4928 : nt!KeRemoveQueueEx+0xb66

fffff800`0131ac30 fffff800`01481d97 : 00000007`171247c7 00000007`00084eb5 00000007`171247e2 00000000`000000b5 : nt!ExEnterPriorityRegionAndAcquireResourceShared+0x1f6

fffff800`0131acd0 fffff800`0147edfa : fffff800`015f0e80 fffff800`015fec40 00000000`00000000 fffff800`0152f170 : nt!KeRemoveQueueEx+0xed7

fffff800`0131ad80 00000000`00000000 : fffff800`0131b000 fffff800`01315000 fffff800`0131ad40 00000000`00000000 : nt!KeUpdateSystemTime+0x47a





STACK_COMMAND:  kb



FOLLOWUP_IP: 

ndistest+2b625

fffff880`0469c625 b001            mov     al,1



SYMBOL_STACK_INDEX:  7



SYMBOL_NAME:  ndistest+2b625

Comment 11 Yaniv Kaul 2010-05-11 08:01:16 UTC
I'm still not seeing the qemu command line. Also, the KVM version and the host CPU might be relevant.

Comment 12 Xiaoli Tian 2010-05-11 09:00:58 UTC
(In reply to comment #11)
> I'm still not seeing the qemu command line. Also, the KVM version and the host
> CPU might be relevant.    

qemu command:"/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -name nic1-08-R2 -m 6G -smp 4 -net nic,vlan=1,macaddr=00:32:49:7a:26:f1,model=rtl8139 -net tap,vlan=1,script=/etc/qemu-ifup -net nic,vlan=2,macaddr=00:1b:e1:15:b1:12,model=virtio -net tap,vlan=2,script=/etc/qemu-ifup-private -net nic,vlan=3,macaddr=00:5b:3a:46:13:22,model=virtio -net tap,vlan=3,script=/etc/qemu-ifup-private -drive file=/home/WHQL/win08R2-1.qcow2,if=ide -uuid 0483f474-b3a1-4131-af31-1a16177911c0 -vnc :1&"

KVM Version:
qemu-kvm-0.12.1.2-2.17.el6.x86_64

CPU Info:
processor       : 15
cpu family      : 15
model           : 26
model name      : Intel(R) Xeon(R) CPU E5520 @2.27GHZ
stepping        : 5
cpu MHz         : 2261.260
cache size      : 8192KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
bogomips        : 4521.29
clflush size    : 64
address sizes   : 40 bits physical, 48 bits virtual

Comment 13 Yaniv Kaul 2010-05-11 09:07:44 UTC
I think the command line is missing -cpu flag.
For example:
-cpu qemu64,+sse2 (not sure what the default in RHEL 6 is!).
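
A quick way to check a qemu command line for an explicit -cpu option, sketched in Python. The helper name `has_cpu_flag` is hypothetical, and the command line below is a shortened form of the one reported in this bug; the check only looks for the literal `-cpu` token and says nothing about which CPU model qemu-kvm picks by default.

```python
import shlex


def has_cpu_flag(command_line: str) -> bool:
    """Return True if the qemu command line passes an explicit -cpu option."""
    tokens = shlex.split(command_line)
    return "-cpu" in tokens


# Shortened form of the reported command line (no -cpu option present).
reported = (
    "/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack "
    "-name nic1-08-R2 -m 6G -smp 4 "
    "-net nic,vlan=2,macaddr=00:1b:e1:15:b1:12,model=virtio "
    "-net tap,vlan=2,script=/etc/qemu-ifup-private "
    "-drive file=/home/WHQL/win08R2-1.qcow2,if=ide -vnc :1"
)
with_cpu = reported + " -cpu qemu64,+sse2"

print(has_cpu_flag(reported))  # False
print(has_cpu_flag(with_cpu))  # True
```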

Comment 14 Yan Vugenfirer 2010-05-11 09:32:02 UTC
(In reply to comment #13)
> I think the command line is missing -cpu flag.
> For example:
> -cpu qemu64,+sse2 (not sure what the default in RHEL 6 is!).    

Based on the stack, the bug check comes right after the "KiCpuId" function call:

fffff800`01319338 fffff800`0146e32e : fffffa80`0521b718 fffff800`013194c0
fffff800`01319ab0 fffff800`014a3524 : nt!KeBugCheck

fffff800`01319340 fffff800`0149c2ed : fffff800`01684b88 fffff800`015bda40
fffff800`01406000 fffff800`0131a248 : nt!KiCpuId+0x41e
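
The same reading of the stack can be sketched in Python: parse each `kb` frame, take the call site directly below nt!KeBugCheck, and find the first frame outside the kernel image. The names `FRAME_RE`, `call_sites`, and `module_of` are hypothetical, and the frames are copied from the stack posted earlier with each frame on a single line. Since the dump was analyzed without full symbols, the function names are only as reliable as the symbols the debugger had available.

```python
import re

# One `kb` frame: child-SP, return address, argument words, then the call site.
FRAME_RE = re.compile(
    r"^(?P<child_sp>[0-9a-f`]+)\s+(?P<ret_addr>[0-9a-f`]+)\s+:"
    r"\s+(?P<args>[0-9a-f` ]+?)\s+:\s+(?P<symbol>\S+)$"
)


def call_sites(stack_text):
    """Return the call-site column of each frame, top of stack first."""
    sites = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            sites.append(match.group("symbol"))
    return sites


def module_of(site):
    """The module name is whatever precedes the first '!' or '+'."""
    return re.split(r"[!+]", site, maxsplit=1)[0]


STACK = """
fffff800`01319338 fffff800`0146e32e : fffffa80`0521b718 fffff800`013194c0 fffff800`01319ab0 fffff800`014a3524 : nt!KeBugCheck
fffff800`01319340 fffff800`0149c2ed : fffff800`01684b88 fffff800`015bda40 fffff800`01406000 fffff800`0131a248 : nt!KiCpuId+0x41e
fffff800`0131a2f0 fffff880`0469c625 : fffff880`0469b8e5 fffff880`08000000 fffffa80`069cec30 00000000`00000065 : nt!KeSynchronizeExecution+0x1c64
fffff800`0131a488 fffff880`0469b8e5 : fffff880`08000000 fffffa80`069cec30 00000000`00000065 00000000`00000003 : ndistest+0x2b625
"""

sites = call_sites(STACK)
caller_of_bugcheck = sites[1]  # frame directly below nt!KeBugCheck
first_non_kernel = next(s for s in sites if module_of(s) != "nt")
print(caller_of_bugcheck)  # nt!KiCpuId+0x41e
print(first_non_kernel)    # ndistest+0x2b625
```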

Comment 15 Xiaoli Tian 2010-05-19 08:25:09 UTC
I tested this job again on a newer RHEL 6.0 build with kernel-2.6.32-24.el6.x86_64 and qemu-kvm-0.12.1.2-2.51.el6.x86_64, using the latest virtio-win driver (05/17/2010, 6.0.209.605). No BSOD occurred, but bug 590945 still exists in this job.
I'll close this bug as not a bug.