Bug 465116

Summary: Formatting a disk on a Windows Server 2008/Vista 32 guest causes Windows to panic.
Product: Red Hat Enterprise Linux 5 Reporter: Barry Donahue <bdonahue>
Component: xen Assignee: Rik van Riel <riel>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.3 CC: armbru, bburns, cward, czhao, jdenemar, llim, ltroan, sputhenp, syeghiay, xen-maint, yoyzhang
Target Milestone: rc   
Target Release: 5.4   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: xen-3.0.3-94.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 552573 Environment:
Last Closed: 2009-09-02 10:08:11 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 441627    
Bug Blocks: 450783, 477162, 480689, 483784, 490846    
Attachments:
Description | Flags
this is a screen shot of the crash | none
This is the configuration file for the guest. | none
The xend.log from Dom 0 during the time of the windows crash. | none
Screen shot from the vista crash | none
QEMU AIO infrastructure backport | none
QEMU AIO IDE backport | none
QEMU AIO SCSI backport | none

Description Barry Donahue 2008-10-01 16:47:29 UTC
Created attachment 319123 [details]
this is a screen shot of the crash

Description of problem: I installed Windows 2008 64-bit Standard Edition on a RHEL 5.3 Dom 0. Twice now, while formatting a disk, the guest crashed. One time I got a memory dump; the other time I did not.


Version-Release number of selected component (if applicable):
kernel: 2.6.18-116.el5xen #1 SMP
RHEL: RHEL5.3-Server-20080922.0
Windows guest: 64 bit Windows 2008 Server Standard Edition.

How reproducible:
I was able to reproduce this twice in about 10 tries.

Steps to Reproduce:
1. Install the guest.
2. Shut down the guest, add the attached storage devices, xm create guest.
3. Within disk manager, format a FAT32 partition of 1GB size.
  
Actual results:
Sometimes the guest crashes.

Expected results:
The disk should format and then come into service.

Additional info:
Attached is a screen shot of the blue screen. Also attached is the xen configuration file of the guest. I also have a memory dump file if anyone can read Windows dumps.

Comment 1 Barry Donahue 2008-10-01 16:48:42 UTC
Created attachment 319125 [details]
This is the configuration file for the guest.

Comment 2 Bill Burns 2008-10-01 18:48:16 UTC
Can you provide the system logs from the host dom0?
Do you know if this is a new failure, i.e., did it work on earlier builds?

Comment 3 Barry Donahue 2008-10-01 19:03:37 UTC
Created attachment 319154 [details]
The xend.log from Dom 0 during the time of the windows crash.

Comment 4 Barry Donahue 2008-10-01 19:04:52 UTC
RHEL 5.3 is the first RHEL release on which we have tested Windows 2008.

Comment 5 Barry Donahue 2008-10-03 15:09:13 UTC
I just hit this crash on a Vista 32-bit guest.

Comment 6 Barry Donahue 2008-10-03 15:10:19 UTC
Created attachment 319375 [details]
Screen shot from the vista crash

Comment 8 Don Dutile (Red Hat) 2008-10-03 19:45:26 UTC
Additional test info:

(1) Passes on 5.2 installation: 5.2 kernel-xen + 5.2 userspace 
(2) Fails  on 5.3 installation: 5.3 kernel-xen + 5.3 userspace
    (as stated above)
(3) Fails on 5.2 kernel-xen on 5.3 installation (5.3 userspace)


Thus, it appears that the problem is not kernel &/or hypervisor based,
but (xen) tools/userspace based.
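The reasoning behind that conclusion can be written out as a small check. This is only an illustrative sketch: the three runs are copied from the list above, while the helper name determining_components is made up for this example. It looks for the component whose version alone predicts pass or fail across the recorded runs.

```python
from collections import defaultdict

# (kernel-xen version, userspace version, passed?) for the three runs listed above
RUNS = [
    ("5.2", "5.2", True),   # (1) 5.2 kernel-xen + 5.2 userspace
    ("5.3", "5.3", False),  # (2) 5.3 kernel-xen + 5.3 userspace
    ("5.2", "5.3", False),  # (3) 5.2 kernel-xen + 5.3 userspace
]


def determining_components(runs):
    """Return the components whose version alone predicts the outcome.

    Hypothetical helper for illustration; not part of any xen tooling.
    """
    names = ("kernel-xen", "userspace")
    consistent = []
    for idx, name in enumerate(names):
        outcomes = defaultdict(set)
        for run in runs:
            outcomes[run[idx]].add(run[2])
        # A component "determines" the result if each of its versions
        # was only ever seen with a single outcome.
        if all(len(seen) == 1 for seen in outcomes.values()):
            consistent.append(name)
    return consistent


print(determining_components(RUNS))
```

With only three runs this is weak evidence for an intermittent failure, so it should be read as a pointer for where to look rather than a proof.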

Could this be (re-)assigned to a xen-tools master, Obi-Wan ?

- Don

Comment 12 Barry Donahue 2008-10-08 17:13:23 UTC
Further testing has revealed that you can hit this bug in 5.2 as well. It is a little harder to trigger, but possible. The guest needs to have at least 2 vcpus for the problem to occur. Well before the guest reaches the point of crashing, the user will experience a desktop freeze.

Comment 16 Rik van Riel 2009-03-26 17:32:26 UTC
I have backported AIO to qemu's IDE and SCSI emulation.  Could you try out the
test RPMs on http://people.redhat.com/riel/.xen-aio/ to see if the timer irq still gets locked out from one virtual CPU?

Comment 17 Barry Donahue 2009-03-26 18:54:40 UTC
I just formatted 3 drives on a 2 VCPU 64-bit Windows 2008 guest with no problems. It looks like this fixed the problem.

Comment 18 Rik van Riel 2009-03-26 20:56:51 UTC
Created attachment 336893 [details]
QEMU AIO infrastructure backport

Comment 19 Rik van Riel 2009-03-26 20:57:17 UTC
Created attachment 336894 [details]
QEMU AIO IDE backport

Comment 20 Rik van Riel 2009-03-26 20:57:42 UTC
Created attachment 336895 [details]
QEMU AIO SCSI backport

Comment 21 Rik van Riel 2009-03-27 17:55:20 UTC
Barry, there was a problem with the original xen-aio packages - specifically, they did not ensure data is always synced to disk.

Could you please try again with my .aio2 test RPMs from http://people.redhat.com/riel/.xen-aio/ to see if Windows still works fine, or if opening the disk images with O_DSYNC reintroduces the bug?

Thank you.
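For readers unfamiliar with the flag mentioned above: opening a file with O_DSYNC makes each write return only after the data has reached stable storage. The snippet below is a minimal sketch of that behaviour using Python's os module on a Unix-like host. It is not the qemu code from the test RPMs, and the function name write_synced and the file name are assumptions made for this example.

```python
import os
import tempfile


def write_synced(path, payload):
    """Write payload to path with synchronous data writes.

    Illustrative only: falls back to O_SYNC where O_DSYNC is not exposed.
    """
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag
    fd = os.open(path, flags, 0o600)
    try:
        # With O_DSYNC set, os.write() returns once the data is on disk.
        return os.write(fd, payload)
    finally:
        os.close(fd)


if __name__ == "__main__":
    scratch = os.path.join(tempfile.mkdtemp(), "a.img")  # hypothetical image name
    print(write_synced(scratch, b"\0" * 512))
```

The trade-off is that every write waits for the disk, which is why it was worth checking whether the synchronous mode brings the original stall back.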

Comment 22 Barry Donahue 2009-03-30 14:58:29 UTC
I retried the tests with the new RPMs and they passed. I formatted both basic and dynamic disks with NTFS and FAT32 formats.

Comment 23 Jiri Denemark 2009-05-21 15:40:02 UTC
Fix built into xen-3.0.3-86.el5

Comment 25 Chris Ward 2009-07-03 18:09:59 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or your customer or partner representative.

Comment 26 Chris Ward 2009-07-10 19:05:37 UTC
~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~

RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please notice that there should be a fix available now that addresses this particular request. Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching.

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue. 

Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.

Comment 27 zhanghaiyan 2009-07-30 05:32:07 UTC
Verified on xen-3.0.3-91.el5 PASS

Comment 28 Lawrence Lim 2009-08-04 19:00:45 UTC
Moving bug to ASSIGNED based on the following comment.

<https://bugzilla.redhat.com/show_bug.cgi?id=479339#c58>

Comment 29 Rik van Riel 2009-08-04 19:40:31 UTC
The bug appears to only be present in the emulated SCSI disks.  This means we can get away with reverting just the SCSI part of the AIO backport (xen-qemu-aio-scsi.patch):

diff -u -d -u -r1.287 xen.spec
--- xen.spec    3 Aug 2009 05:25:12 -0000       1.287
+++ xen.spec    4 Aug 2009 19:39:14 -0000
@@ -863,7 +863,7 @@
 # AIO backport for qemu
 %patch873 -p1
 %patch874 -p1
-%patch875 -p1
+# %patch875 -p1
 %patch876 -p1
 # Fix HVM time skew problems
 %patch877 -p1

Comment 32 Jiri Denemark 2009-08-05 14:14:42 UTC
Fix built into xen-3.0.3-94.el5

Comment 34 xingzhao 2009-08-06 08:06:06 UTC
Verified on xen-3.0.3-94.el5 both with windows guest and rhel guest

Comment 35 xingzhao 2009-08-06 08:08:04 UTC
Tested 4 groups in total: IDE/SCSI disks on Windows and RHEL guests

Group1:

Version:xen-3.0.3-94.el5
Host:RHEL5.4-i386-xen
Guest:32 bit Windows 2008

Tested 10/10 pass

Test steps:
1. Install the guest.
2. Shut down the guest, attach an IDE disk (XML as follows), xm create guest.
<disk type="file" device="disk">
	<source file="/var/lib/libvirt/images/a.img"/>
	<target dev="hdb" bus="ide"/>
</disk>
3. Within disk manager, format a FAT32 partition of 1GB size.
------->no panic.
4. After formatting, do some operations, such as opening the disk, creating a file, and deleting a file.
------->no panic.

Comment 36 xingzhao 2009-08-06 08:09:10 UTC
Group2:

Version:xen-3.0.3-94.el5
Host:RHEL5.4-i386-xen
Guest:32 bit Windows 2008

Tested 5/5 pass

Steps to Reproduce:
1. Install the guest.
2. Shut down the guest, attach a SCSI disk (XML as follows), xm create guest.
<disk type="file" device="disk">
	<source file="/var/lib/libvirt/images/a.img"/>
	<target dev="sdb" bus="scsi"/>
</disk>
3. Within disk manager, format a FAT32 partition of 1GB size.
------->no panic.
4. After formatting, do some operations, such as opening the disk, creating a file, and deleting a file.
------->no panic.

Comment 37 xingzhao 2009-08-06 08:10:27 UTC
Group3:

Version:xen-3.0.3-94.el5
Host:RHEL5.4-x86_64-xen
Guest:RHEL5.1-i386-kvm

Tested 10/10 pass

Steps to Reproduce:
1. Install the guest.
2. Shut down the guest, attach an IDE disk (XML as follows), xm create guest.
<disk type="file" device="disk">
	<source file="/var/lib/libvirt/images/a.img"/>
	<target dev="hdb" bus="ide"/>
</disk>
3. Within disk manager, format an EXT3 partition of 1GB size.
------->no panic.
4. After formatting, do some operations, such as opening the disk, creating a file, and deleting a file.
------->no panic.

Comment 38 xingzhao 2009-08-06 08:11:10 UTC
Group4:

Version:xen-3.0.3-94.el5
Host:RHELx86_64 2.6.18-159.el5xen
Guest:RHEL5.1-i386-kvm

Tested 5/5 pass

Steps to Reproduce:
1. Install the guest.
2. Shut down the guest, attach a SCSI disk (XML as follows), xm create guest.
<disk type="file" device="disk">
	<source file="/var/lib/libvirt/images/a.img"/>
	<target dev="sdb" bus="scsi"/>
</disk>
3. Within disk manager, format an EXT3 partition of 1GB size.
------->no panic.
4. After formatting, do some operations, such as opening the disk, creating a file, and deleting a file.
------->no panic.

Comment 39 xingzhao 2009-08-06 09:32:18 UTC
Step 3 in comments 37 and 38 should be:
format an ext3 partition of 1GB size with the command #mkfs.ext3 /dev/sdb
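The disk definitions used in the four groups differ only in the target device and bus. As a rough sketch (the helper name disk_xml is made up for this example; the image path, device names, and bus values are the ones quoted in comments 35 to 38), they could be generated like this:

```python
import xml.etree.ElementTree as ET


def disk_xml(image, target_dev, bus):
    """Build a file-backed <disk> element like the ones quoted above.

    Hypothetical helper for illustration; not part of libvirt or xen.
    """
    disk = ET.Element("disk", {"type": "file", "device": "disk"})
    ET.SubElement(disk, "source", {"file": image})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": bus})
    return ET.tostring(disk, encoding="unicode")


# One definition per bus type exercised in the test groups.
for dev, bus in (("hdb", "ide"), ("sdb", "scsi")):
    print(disk_xml("/var/lib/libvirt/images/a.img", dev, bus))
```

Generating both variants from one function makes it harder for a step description and its XML to drift apart when a test group is copied.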

Comment 41 errata-xmlrpc 2009-09-02 10:08:11 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html