Bug 1003522

Summary: guest bsod(4E) while running iozone.exe in Rhevm 3.2
Product: Red Hat Enterprise Linux 6
Reporter: lijin <lijin>
Component: virtio-win
Assignee: Vadim Rozenfeld <vrozenfe>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.5
CC: acathrow, bcao, bsarathy, chayang, juzhang, lijin, michen, mkenneth, qzhang, rhod, virt-maint, vrozenfe
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: virtio-win-prewhql-0.1-68
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-01 16:32:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description lijin 2013-09-02 09:06:10 UTC
Description of problem:
Running iozone in a Windows guest in RHEVM 3.2 causes the guest to BSOD with bugcheck code 4E.

Version-Release number of selected component (if applicable):
RHEV-toolsSetup_3.2_13.iso
virtio-win-prewhql-67
qemu-kvm-rhev-0.12.1.2-2.355.el6.x86_64
kernel-2.6.32-358.el6.x86_64

How reproducible:
2/5

Steps to Reproduce:
1. Boot a win2k8-32 guest in RHEVM.
2. Run iozone on the system disk in this guest (the system disk is 50 GB), using the command:
   iozone.exe -az -b c:\aaaa -g 30g -y 64k
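
For anyone scripting the reproduction, the command in step 2 can be assembled as below. This is only a sketch: the function names and defaults are hypothetical, and the flag descriptions in the comments come from the iozone documentation, not from this report. It assumes iozone.exe is on the guest's PATH.

```python
import subprocess


def build_iozone_cmd(exe="iozone.exe", xls_out=r"c:\aaaa",
                     max_file_size="30g", min_record_size="64k"):
    """Return the argument list for the iozone run described in step 2."""
    return [
        exe,
        "-az",                   # -a: full automatic mode; -z: test all record sizes
        "-b", xls_out,           # -b: write an Excel-compatible results file
        "-g", max_file_size,     # -g: maximum file size for automatic mode
        "-y", min_record_size,   # -y: minimum record size for automatic mode
    ]


def run_iozone():
    """Run the benchmark once inside the guest (hypothetical helper)."""
    return subprocess.run(build_iozone_cmd(), check=False).returncode


print(" ".join(build_iozone_cmd()))
```

Since the failure reproduced in only 2 of 5 runs, a single clean pass of this command does not show that the problem is absent.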

Actual results:
The guest BSODs with code 4E.


Expected results:
The guest finishes the iozone test without a BSOD.

Additional info:
I will upload the dump file later.

Comment 1 lijin 2013-09-02 09:08:37 UTC
The WinDbg output:

Microsoft (R) Windows Debugger Version 6.2.8400.0 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [D:\share\win2k8-32-lijin.DMP]
Kernel Summary Dump File: Only kernel address space is available

WARNING: Inaccessible path: 'C:\symbols\vioser.pdb'
Symbol search path is: C:\symbols;C:\symbols\vioser.pdb;SRV*c:\symbols\*http://msdl.microsoft.com/download/symbols
Executable search path is: 
Windows Server 2008/Windows Vista Kernel Version 6002 (Service Pack 2) UP Free x86 compatible
Product: Server, suite: TerminalServer DataCenter SingleUserTS
Built by: 6002.18881.x86fre.vistasp2_gdr.130707-1535
Machine Name:
Kernel base = 0x81810000 PsLoadedModuleList = 0x81927c70
Debug session time: Mon Sep  2 15:50:03.484 2013 (UTC + 8:00)
System Uptime: 2 days 9:51:10.875
Loading Kernel Symbols
...............................................................
....................................................
Loading User Symbols

Loading unloaded module list
....
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 4E, {99, 70141, 3, ffffffff}

Probably caused by : memory_corruption ( nt!MiBadShareCount+24 )

Followup: MachineOwner
---------

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PFN_LIST_CORRUPT (4e)
Typically caused by drivers passing bad memory descriptor lists (ie: calling
MmUnlockPages twice with the same list, etc).  If a kernel debugger is
available get the stack trace.
Arguments:
Arg1: 00000099, A PTE or PFN is corrupt
Arg2: 00070141, page frame number
Arg3: 00000003, current page state
Arg4: ffffffff, 0

Debugging Details:
------------------


BUGCHECK_STR:  0x4E_99

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

PROCESS_NAME:  System

CURRENT_IRQL:  2

LAST_CONTROL_TRANSFER:  from 818a6a85 to 818dd9bd

STACK_TEXT:  
88323960 818a6a85 0000004e 00000099 00070141 nt!KeBugCheckEx+0x1e
88323978 818c9fbe 00000000 83863410 84be9658 nt!MiBadShareCount+0x24
88323be4 81a55226 d4400000 946f20a0 00000000 nt!MmUnmapViewInSystemCache+0x585
88323c14 818c9556 83863410 00000000 00000001 nt!CcUnmapVacb+0x170
88323c58 818ca137 00000000 88323c01 84be9658 nt!CcUnmapVacbArray+0x1cc
88323c74 818ca284 88323c01 84647a98 84be969c nt!CcUnmapAndPurge+0x32
88323ca0 81848928 00000001 84647a98 81946148 nt!CcDeleteSharedCacheMap+0x107
88323cec 81845635 84be9658 88323d10 00000000 nt!CcWriteBehind+0x582
88323d44 818b5d4a 8388f9b0 00000000 8388bad0 nt!CcWorkerThread+0x11e
88323d7c 819e601c 8388f9b0 3bcd7c24 00000000 nt!ExpWorkerThread+0xfd
88323dc0 8184eeee 818b5c4d 00000000 00000000 nt!PspSystemThreadStartup+0x9d
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!MiBadShareCount+24
818a6a85 cc              int     3

SYMBOL_STACK_INDEX:  1

SYMBOL_NAME:  nt!MiBadShareCount+24

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

DEBUG_FLR_IMAGE_TIMESTAMP:  51da1840

IMAGE_NAME:  memory_corruption

FAILURE_BUCKET_ID:  0x4E_99_nt!MiBadShareCount+24

BUCKET_ID:  0x4E_99_nt!MiBadShareCount+24

Followup: MachineOwner
---------
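
The bugcheck summary line in the dump above packs the stop code and its four arguments in hexadecimal. The following is a minimal sketch for decoding it; the helper name is hypothetical, and the conversion from page frame number to physical address assumes the standard 4 KiB x86 page size.

```python
import re

PAGE_SIZE = 4096  # assumption: standard 4 KiB x86 pages


def parse_bugcheck(line):
    """Split a WinDbg 'BugCheck XX, {a, b, c, d}' line into (code, [args])."""
    m = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line.strip())
    if m is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


code, args = parse_bugcheck("BugCheck 4E, {99, 70141, 3, ffffffff}")
pfn = args[1]            # Arg2 is the page frame number per the analysis output
phys = pfn * PAGE_SIZE   # first byte of the affected physical page
print(f"bugcheck 0x{code:X}, subtype 0x{args[0]:X}, "
      f"PFN 0x{pfn:X} -> physical 0x{phys:X}")
# prints: bugcheck 0x4E, subtype 0x99, PFN 0x70141 -> physical 0x70141000
```

Under that page-size assumption, the corrupt PFN entry describes the physical page starting at 0x70141000, about 1.75 GiB into guest memory.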

Comment 9 lijin 2013-09-06 05:29:12 UTC
(In reply to Qunfang Zhang from comment #8)
> (In reply to lijin from comment #7)
> > (In reply to Qunfang Zhang from comment #6)
> > > Hi, lijin
> > > 
> > > Any news about the question on comment 3 (rhel6.5 behaviour) and comment 5?
> > 
> > Sorry for the late response; this test case runs for a few days.
> > We have upgraded the virtio-win driver to build 68, and all guests have
> > worked fine so far.
> 
> Thanks for the update. Is the host a RHEL 6.5 one? And are the guests
> running the same scenario as in comment 0?

The host is still RHEL 6.4; the guests are running the same scenario as in comment 0.
As I mentioned before, this test will last for a few days. After we finish testing build 68, we will upgrade the host to RHEL 6.5 and retest.

Comment 11 Vadim Rozenfeld 2013-09-06 10:04:25 UTC
Hi Qunfang,

Yes, absolutely, it is a virtio-win problem. I just didn't notice that
it was filed under the qemu category, so I am moving it to virtio-win.
This problem was fixed in build 68, which was released several days ago.


Best regards,
Vadim.

Comment 12 Qunfang Zhang 2013-09-06 10:07:37 UTC
(In reply to Vadim Rozenfeld from comment #11)
> Hi Qunfang,
> 
> Yes, absolutely, it is a virtio-win problem. I just didn't notice that
> it was filed under the qemu category, so I am moving it to virtio-win.
> This problem was fixed in build 68, which was released several days ago.
> 
> 
> Best regards,
> Vadim.

Okay, thanks for the confirmation.

Comment 14 Mike Cao 2013-09-12 05:33:12 UTC
Moving status to VERIFIED according to comment #9.