Bug 681440

Summary: xenpv-win drivers leak memory on (misaligned I/O && backend is busy)
Product: Red Hat Enterprise Linux 5
Component: xenpv-win
Version: 5.7
Hardware: Unspecified
OS: Windows
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Target Release: ---
Reporter: cshao <cshao>
Assignee: Paolo Bonzini <pbonzini>
QA Contact: Virtualization Bugs <virt-bugs>
CC: drjones, mshao, pbonzini, rwu, whuang
Fixed In Version: 1.3.4-6.el5
Doc Type: Bug Fix
Last Closed: 2011-06-08 08:15:39 UTC
Bug Depends On:
Bug Blocks: 518435
Attachments (description, flags):
disk stress process will disappear after run a few minutes (flags: none)
2k3-64-diskIO.wtl (flags: none)
disk Verification-diskIO-before.png (flags: none)
disk Verification-diskIO-disappear-after.png (flags: none)
disk stress-diskIO-after.png (flags: none)
disk stress-diskIO-before.png (flags: none)
win7 -32 wtl file (flags: none)
blk-win7-32-disk-stress (flags: none)

Description cshao 2011-03-02 07:25:43 UTC
Created attachment 481797 [details]
disk stress process will disappear after run a few minutes

Description of problem:
The Disk Verification (or Disk Stress) job process disappears after running for a few minutes in win2k3-32 and win2k3-64 guests.

Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-245.el5.x86_64.rpm
xen-3.0.3-123.el5.x86_64.rpm
xenpv-win-1.3.4-2.el5.noarch.rpm


How reproducible:
100%

Steps to Reproduce:
1. Set the DWORD value "EnumerateDevicesOverride" to 3 in the registry key "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\rhelscsi".
2. Run the Disk Stress or Disk Verification job with the following disk configuration:
disk = [ "file:/var/lib/xen/images/win2k3-64b-50G,hda,w",  "tap:aio:/var/lib/xen/images/disk2,xvda,w" ]
3. Please see attachment for more details.
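
For readers unfamiliar with the disk line in step 2, here is a minimal sketch that splits each entry into its parts. The names `DiskSpec` and `parse_disk` are hypothetical helpers, not part of any Xen tool, and the sketch assumes the three-field "backend:path,device,mode" form with an absolute path, as used in this report.

```python
from typing import NamedTuple


class DiskSpec(NamedTuple):
    backend: str  # backend prefix, e.g. "file" or "tap:aio"
    path: str     # image path on the host
    device: str   # device name given to the guest, e.g. "hda" or "xvda"
    mode: str     # access mode field, "w" in both entries of this report


def parse_disk(spec: str) -> DiskSpec:
    """Split one disk entry of the form 'backend:path,device,mode'."""
    target, device, mode = spec.split(",")
    # Assumption: the path is absolute, so it starts at the first "/".
    # Everything before it is the backend prefix (with a trailing colon).
    start = target.index("/")
    backend = target[:start].rstrip(":")
    return DiskSpec(backend, target[start:], device, mode)


# The two entries from step 2 of this report.
disks = [
    "file:/var/lib/xen/images/win2k3-64b-50G,hda,w",
    "tap:aio:/var/lib/xen/images/disk2,xvda,w",
]

for entry in disks:
    print(parse_disk(entry))
```

Read this way, the guest has a file-backed system disk on hda and a second disk on xvda served through the tap:aio backend.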
  
Actual results:
1. In win2k3-32 and win2k3-64 guests, the Disk Verification process disappears after running for a few minutes.
2. In the DTM Studio job queue, the job stays in Running status indefinitely, although it is in fact no longer running.

Expected results:
The Disk Verification job passes.
The Disk Stress job passes.


Additional info:
When I set the DWORD value "EnumerateDevicesOverride" to 1, I get the same result.
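
As an illustration, the registry setting can be written as a .reg file so the two values tried here (3 and 1) are easy to switch between. This is only a sketch: `dword_reg_file` is a hypothetical helper, and how the text is saved (encoding, line endings) and imported into the guest is left to the reader.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\rhelscsi"
NAME = "EnumerateDevicesOverride"


def dword_reg_file(key: str, name: str, value: int) -> str:
    """Return .reg file text that sets one DWORD value under the given key."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # DWORD values are written as eight hexadecimal digits.
        f'"{name}"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


# The two settings tried in this report.
for value in (3, 1):
    print(dword_reg_file(KEY, NAME, value))
```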

Comment 1 Paolo Bonzini 2011-03-02 08:06:45 UTC
Please attach the .wtl file from the job.  Also, it would be useful if you could attach a screenshot of Task Manager's "Processes" tab before and after the process disappears.

Thanks!

Comment 2 Paolo Bonzini 2011-03-02 08:07:43 UTC
(From the picture, it seems like diskio.exe is picking a runtime of 15 minutes or so, rather than the 3 days that are used for newer OSes).

Comment 3 cshao 2011-03-02 09:24:26 UTC
Created attachment 481814 [details]
2k3-64-diskIO.wtl

Comment 4 cshao 2011-03-02 09:27:57 UTC
Created attachment 481815 [details]
disk Verification-diskIO-before.png

Comment 5 cshao 2011-03-02 09:28:29 UTC
Created attachment 481817 [details]
disk Verification-diskIO-disappear-after.png

Comment 6 cshao 2011-03-02 09:29:16 UTC
Created attachment 481818 [details]
disk stress-diskIO-after.png

Comment 7 cshao 2011-03-02 09:29:47 UTC
Created attachment 481819 [details]
disk stress-diskIO-before.png

Comment 8 cshao 2011-03-02 09:35:45 UTC
(In reply to comment #2)
> (From the picture, it seems like diskio.exe is picking a runtime of 15 minutes
> or so, rather than the 3 days that are used for newer OSes).

Hi Paolo,

I get the same result when running the Disk Stress and Disk Verification jobs.
I have uploaded screenshots of Task Manager's "Processes" tab.
Please see the attachments.

Thanks!

Comment 9 Huang Wenlong 2011-03-04 07:10:30 UTC
I can also reproduce this bug in a Win7-32 guest when running the "Disk Stress" blk WHQL job. After a few minutes the job windows disappear, and the guest cannot reboot or shut down normally. If I try to run "Explorer", it shows the error message:
"There is not enough free memory to run this program. Exit one or more programs, and then try again."

I found that the job's processes are closed but the memory is not freed: there were 40 processes before the windows disappeared and 37 after, yet Physical Memory usage is still at 53%.


Version-Release number of selected component (if applicable):
kernel-xen-2.6.18-245.el5.x86_64.rpm
xen-3.0.3-123.el5.x86_64.rpm
xenpv-win-1.3.4-2.el5.noarch.rpm

Comment 10 Paolo Bonzini 2011-03-04 16:37:19 UTC
Is it possible to get the Win7-32 wtl file?

Comment 11 Paolo Bonzini 2011-03-04 16:59:32 UTC
(Forget about my comment 3, I was confusing "disk stress" and "disk verification".)

Comment 13 Huang Wenlong 2011-03-07 02:16:16 UTC
Created attachment 482582 [details]
win7 -32 wtl file

Attaching the win7-32 wtl file.

Comment 14 Paolo Bonzini 2011-03-07 22:29:58 UTC
This is not the WTL file for disk {stress,verification}.  If you prefer, just generate a .cpk and I'll get the right log myself (.cpk files are just cab files).

Comment 15 Huang Wenlong 2011-03-08 03:14:28 UTC
Created attachment 482828 [details]
blk-win7-32-disk-stress

Hi Paolo,

Here is my disk stress cpk file.

Wenlong

Comment 17 Rita Wu 2011-03-10 05:15:13 UTC
Verified with the following pkgs:

xenpv-win-1.3.4-6.el5
kernel-xen-2.6.18-245.el5.x86_64.rpm
xen-3.0.3-123.el5.x86_64.rpm
WLK1.5

Disk Stress passes on all Windows OSes (2k3-32/64, 2k8-32/64, 7-32/64, xp-32, 2k8-r2).

Comment 18 cshao 2011-03-15 09:07:39 UTC
Test version:
xenpv-win-1.3.4-6.el5
kernel-xen-2.6.18-245.el5.x86_64.rpm
xen-3.0.3-123.el5.x86_64.rpm
WLK1.5

Test result:
Disk Verification passes on all Windows OSes (2k3-32/64, 2k8-32/64, 7-32/64, xp-32, 2k8-r2).

So I am changing the bug status to VERIFIED.

Comment 19 errata-xmlrpc 2011-06-08 08:15:39 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0853.html