Bug 688456 - BSOD while installing 64bit windows guest from iso image
Summary: BSOD while installing 64bit windows guest from iso image
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.7
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Michal Novotny
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 622763 699611
Reported: 2011-03-17 05:56 UTC by Yuyu Zhou
Modified: 2014-02-02 22:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-25 14:11:10 UTC
Target Upstream Version:


Attachments
Screenshot of guest (11.60 KB, image/png)
2011-03-17 05:57 UTC, Yuyu Zhou
xend log (12.94 KB, text/plain)
2011-03-17 05:58 UTC, Yuyu Zhou
xm dmesg log (16.00 KB, text/plain)
2011-03-17 05:58 UTC, Yuyu Zhou
Backport multi-profile DVD-ROM support commit to fix the issue (9.39 KB, patch)
2011-03-21 12:59 UTC, Michal Novotny
Patch containing fix that passed the testing in comment 36 (7.45 KB, patch)
2011-04-12 09:52 UTC, Michal Novotny

Description Yuyu Zhou 2011-03-17 05:56:47 UTC
Description of problem:

The fix for bug 622763 introduced a regression: 64-bit Windows guests hit a BSOD
during installation from an ISO image.

8cf5c58 xen-ioemu: Implement cdrom commands to handle drives correctly

Version-Release number of selected component (if applicable):
xen-3.0.3-124.el5 

How reproducible:
100%

Steps to Reproduce:
1. Boot a 64-bit Windows guest (any supported version) from an ISO image
  
Actual results:
BSOD after the guest boots up (before the installation configuration step).

Expected results:
No problem with installation from iso image.

Additional info:
(1) no issue with RHEL guests
(2) no issue with 32bit windows guests

Comment 1 Yuyu Zhou 2011-03-17 05:57:40 UTC
Created attachment 485918 [details]
Screenshot of guest

Comment 2 Yuyu Zhou 2011-03-17 05:58:11 UTC
Created attachment 485920 [details]
xend log

Comment 3 Yuyu Zhou 2011-03-17 05:58:43 UTC
Created attachment 485921 [details]
xm dmesg log

Comment 4 Michal Novotny 2011-03-21 08:23:17 UTC
I'm going to investigate this since this was apparently caused by my patch.

Michal

Comment 5 Michal Novotny 2011-03-21 10:30:10 UTC
I've been looking into this one and apparently some bits were set incorrectly, as I double-checked against the T10 Get Configuration command document, revision 0.01. I looked at upstream QEMU more closely (I found nothing there at first glance) and found commit 38cdea7c [1] by Carlo Marcelo Arenas Belon, so I guess it's better to backport this commit than to write the fix on my own.

I need to test it a little more; however, in my testing so far the Windows XP x64 installation boots up successfully. I still need to check Linux i386 and x86_64 guests and 32-bit Windows guests to confirm it won't introduce any other regression.

Michal

[1] http://git.qemu.org/qemu.git/commit/?id=38cdea7c

Comment 6 Michal Novotny 2011-03-21 12:59:38 UTC
Created attachment 486606 [details]
Backport multi-profile DVD-ROM support commit to fix the issue

Hi,
this is the backport of upstream QEMU git commit 38cdea7c, which
implements multi-profile DVD-ROM support. We have been seeing a
regression caused by the introduction of new SCSI commands for the
ATAPI CD-ROM driver, where some bits were set incorrectly (the
number of profiles was set to a bogus value; this worked fine for
all guests except Windows x64 guests). I started investigating
this on my own since I was unable to find an upstream commit
solving the issue at first; I found the upstream QEMU commit
later and it turned out to solve the issue for Windows x64 guests.
To confirm that this commit introduces no regression, I tested
both installation and CD/DVD-ROM drive usage (standard read
operations, since we don't support writes to the virtual CD-ROM)
for all supported guest types, i.e. Windows x86 and x64 and Linux
i386 and x86_64, and everything worked correctly.
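
For readers unfamiliar with the "number of profiles" issue mentioned above, here is a minimal Python sketch of how a GET CONFIGURATION reply carrying a Profile List feature can be laid out so that the advertised lengths match the number of profiles actually present. It is not the backported QEMU code: the function name and the constant names are hypothetical, and the layout (8-byte feature header, 4-byte feature descriptor header, 4 bytes per profile) is my reading of the MMC specification.

```python
import struct

# Profile numbers as defined by the MMC specification (names are illustrative).
MMC_PROFILE_CD_ROM = 0x0008
MMC_PROFILE_DVD_ROM = 0x0010


def build_profile_list_response(profiles, current_profile):
    """Return a feature header followed by a Profile List feature (code 0x0000).

    Hypothetical helper, not taken from the patch in this bug.
    """
    descriptors = b""
    for profile in profiles:
        is_current = 0x01 if profile == current_profile else 0x00
        # 2-byte profile number, 1 flag byte (bit 0 = current), 1 reserved byte.
        descriptors += struct.pack(">HBB", profile, is_current, 0)

    # Feature code 0x0000, flags byte (persistent | current), then the
    # additional length, which must equal 4 bytes per listed profile.
    feature = struct.pack(">HBB", 0x0000, 0x02 | 0x01, len(descriptors)) + descriptors

    # Header: 4-byte data length (number of bytes following that field),
    # 2 reserved bytes, 2-byte current profile.
    data_length = 4 + len(feature)
    header = struct.pack(">IHH", data_length, 0, current_profile)
    return header + feature
```

Deriving both length fields from the descriptors that were actually emitted avoids the mismatch between the advertised and the real number of profiles.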

Michal

Comment 7 Paolo Bonzini 2011-03-22 09:54:24 UTC
Please do not change DVD-ROM to CD-ROM, it's just a string and it's guest visible.  Things such as guests losing drive letters may happen.

Comment 11 Yuyu Zhou 2011-04-02 07:15:27 UTC
Results of installing HVM guests from CD-ROM for all distros

Three HVM guest versions failed to install from the DVD ISO:
RHEL5.5-64
Win2008-64
WinVista-64

Host:
xen-3.0.3-127.el5
kernel-xen-2.6.18-254.el5
Intel 64bit

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.5   |   RHEL6.0   |    RHEL6.1  |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+-------------+-------------+-------------+-------------+-------------+-------------+
| Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |   WinVista  |     Win7    |
+------+------+------+------+------+------+------+------+------+------+------+------+  
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+------+------+  
| PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+------+------+

Comment 12 Yuyu Zhou 2011-04-02 07:22:57 UTC
Please ignore comment 11; this update corrects "RHEL5.5" to "RHEL5.6".

Installing HVM guests from ISO images failed for RHEL5.6-64, Win2008-64 and WinVista-64.

Intel host with :
xen-3.0.3-127.el5
kernel-xen-2.6.18-254.el5

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |    RHEL6.1  |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+-------------+-------------+-------------+-------------+-------------+-------------+
| Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |   WinVista  |     Win7    |
+------+------+------+------+------+------+------+------+------+------+------+------+  
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+------+------+  
| PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+------+------+

1. For the RHEL5.6 x86_64 guest, we saw the guest crash at random stages, for example:
[1]. after skipping the installation number, it rebooted immediately and booted from the CD again
[2]. after the disk was formatted, qemu-dm crashed immediately:
$ ps aux | grep qemu-dm
root      9617  6.0  0.0      0     0 ?        Z    02:54   0:17 [qemu-dm] <defunct>
$ cat /var/log/xen/qemu-dm.9617.log 
domid: 99
qemu: the number of cpus is 1
Using file  in read-write mode
Using file  in read-only mode
Watching /local/domain/99/logdirty/next-active
Watching /local/domain/0/device-model/99/command
xs_read(): vncpasswd get error. /vm/7b8795f8-40a8-6d53-0d95-667c709a18ac/vncpasswd.
char device redirected to /dev/pts/3
qemu_map_cache_init nr_buckets = 10000
shared page at pfn 3ffff
buffered io page at pfn 3fffd
xs_read(/vm/7b8795f8-40a8-6d53-0d95-667c709a18ac/rtc/timeoffset): read error
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
Triggered log-dirty buffer switch
gpe_en_write: addr=0x1f6c, val=0x0.
gpe_sts_write: addr=0x1f68, val=0xff.
gpe_en_write: addr=0x1f6d, val=0x0.
gpe_sts_write: addr=0x1f69, val=0xff.
gpe_en_write: addr=0x1f6e, val=0x0.
gpe_sts_write: addr=0x1f6a, val=0xff.
gpe_en_write: addr=0x1f6f, val=0x0.
gpe_sts_write: addr=0x1f6b, val=0xff.
gpe_en_write: addr=0x1f6c, val=0x8.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
inp: bad size: 0 0

2. For Windows 2008 x86_64 and Vista x86_64, we hit the same BSOD during installation as originally reported.
You can also reproduce it by booting the Win2008 64-bit guest with the qemu IDE CD-ROM attached, then opening Server Manager -> Storage -> Disk Management (the guest crashes at this point).

Comment 13 Michal Novotny 2011-04-04 08:16:17 UTC
(In reply to comment #12)
> ignore the comment 11, update the info "RHEL5.5 -> RHEL5.6"
> 
> The test of installing HVM guests via iso images failed (RHEL5.6-64,
> Win2008-64, WinVista-64).
> 
> Intel host with :
> xen-3.0.3-127.el5
> kernel-xen-2.6.18-254.el5
> 
> Guest:
> Linux:
> +-------------+-------------+-------------+-------------+-------------+
> |   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |    RHEL6.1  |
> +------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+
> | PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+
> 


Well, that's strange since I didn't run into those issues on any RHEL guest, and I tried mainly RHEL-5 GA and RHEL-6 guests, both i386 and x86_64. The truth is that I didn't try installing directly from an ISO image when I was writing the patch; I only tried using the CD-ROM drive inside the guest. Now I tried to install a RHEL-5.6 x86_64 guest directly from the ISO and it froze right after skipping the installation number input. I'm investigating this right now.



> Windows:
> +-------------+-------------+-------------+-------------+-------------+-------------+
> | Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |   WinVista  |     Win7    |
> +------+------+------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+------+------+
> | PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+------+------+
> 


I did try it using Windows XP x64 but not Windows 2008 x64 and Vista x64.  The strange thing is that it's working fine for Windows 2008 R2 and Windows 7, and the only affected "type" of guest is Windows 2008/Vista (those two are very similar AFAIK).

I'm investigating this further.

Michal

Comment 15 Paolo Bonzini 2011-04-04 08:41:27 UTC
This commit looks interesting

091d055 (Fix ATAPI GET_CONFIGURATION function, 2008-06-02)

Also:

8114e9e (Fix ATAPI read drive structure command, 2008-07-03)

Comment 16 Miroslav Rezanina 2011-04-04 11:07:28 UTC
*** Bug 622763 has been marked as a duplicate of this bug. ***

Comment 17 Michal Novotny 2011-04-04 15:13:24 UTC
(In reply to comment #15)
> This commit looks interesting
> 
> 091d055 (Fix ATAPI GET_CONFIGURATION function, 2008-06-02)
> 
> Also:
> 
> 8114e9e (Fix ATAPI read drive structure command, 2008-07-03)

I backported both of them and that enabled installation of 64-bit Windows guests; however, I still need to investigate the RHEL-5.6 issue a little more since it's not fixed by those patches.

What I know for sure is that, before applying those 2 patches, changing buf[10] from "0x02 | 0x01" back to "0x10 | 0x01" fixed the issue for RHEL-5.6; however, I'm afraid it happens randomly. I need to spend a little more time on this one.
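
To make the two buf[10] values easier to compare, the following Python sketch decodes them. It rests on an assumption that is not stated in this bug: that buf[10] is the flags byte of the first feature descriptor after the 8-byte feature header, with the MMC layout of version in bits 5..2, persistent in bit 1 and current in bit 0. The function name is hypothetical.

```python
def decode_feature_flags(flags):
    """Split a feature descriptor flags byte into its fields.

    Assumed layout (MMC): bits 5..2 = version, bit 1 = persistent,
    bit 0 = current. Hypothetical helper for illustration only.
    """
    return {
        "version": (flags >> 2) & 0x0F,
        "persistent": bool(flags & 0x02),
        "current": bool(flags & 0x01),
    }


new_value = decode_feature_flags(0x02 | 0x01)  # value before the revert
old_value = decode_feature_flags(0x10 | 0x01)  # value after changing it back
```

Under that assumption, "0x02 | 0x01" marks the feature as persistent and current with version 0, while "0x10 | 0x01" clears the persistent bit and puts a nonzero value in the version field.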

Michal

Comment 18 Paolo Bonzini 2011-04-04 15:20:41 UTC
Then please split the bug in two (clone) and let's fix Windows first.

Comment 19 Miroslav Rezanina 2011-04-05 08:58:23 UTC
(In reply to comment #18)
> Then please split the bug in two (clone) and let's fix Windows first.

I disagree. If we are not able to provide a solution for all platforms, we will have to completely revert the original patch, and a Windows-only fix makes no sense.

Comment 20 Michal Novotny 2011-04-05 10:14:37 UTC
I've backported commit 8114e9e (Fix ATAPI read drive structure command) and I've been able to make it work for Windows 2008 x64, but in the case of the RHEL-5.6 x86_64 guest the device model was unable to reboot, so I guess it's better to revert the original patch instead.

Michal

Comment 23 Miroslav Rezanina 2011-04-06 10:05:13 UTC
Strange....can you please retest problematic Win guests with xen-3.0.3-120.el5?

Comment 24 Qixiang Wan 2011-04-06 10:26:16 UTC
(In reply to comment #23)
> Strange....can you please retest problematic Win guests with xen-3.0.3-120.el5?

Both Win2k8-64 and WinVista-64 work well with the 120 build.

Comment 28 Michal Novotny 2011-04-06 14:42:57 UTC
Well, I can see what the problem is there.

It's in the IDE CD-ROM emulation: the Windows guest sends the driver a packet with the following bytes:

PACKET: 46 01 00 00 00 00 00 00 0c 00

According to the T10 GET CONFIGURATION command document, the second byte (buf[1], which has the value 01) is the requested type (RT field). This value means "Indicate that the Feature Header and only those Feature Descriptors that have their Current bit set shall be returned". Handling of this is missing in the CD-ROM emulation, and this is where Windows x64 guests stop running and return a BSOD.
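
As an illustration of how the packet above breaks down, here is a small Python sketch that decodes a GET CONFIGURATION command block. The field offsets (RT in the low two bits of byte 1, starting feature number in bytes 2-3, allocation length in bytes 7-8) follow my reading of the MMC command layout; the function name and the dictionary keys are hypothetical and do not come from the emulation code.

```python
import struct

GPCMD_GET_CONFIGURATION = 0x46

# Meaning of the RT (requested type) values, paraphrased from the MMC spec.
RT_MEANING = {
    0: "all feature descriptors, whether current or not",
    1: "only feature descriptors with the Current bit set",
    2: "only the feature named by the starting feature number",
    3: "reserved",
}


def parse_get_configuration(packet):
    """Decode the fields of a 10-byte GET CONFIGURATION command block."""
    if len(packet) < 10 or packet[0] != GPCMD_GET_CONFIGURATION:
        raise ValueError("not a GET CONFIGURATION packet")
    rt = packet[1] & 0x03
    (starting_feature,) = struct.unpack(">H", packet[2:4])
    (allocation_length,) = struct.unpack(">H", packet[7:9])
    return {
        "rt": rt,
        "rt_meaning": RT_MEANING[rt],
        "starting_feature": starting_feature,
        "allocation_length": allocation_length,
    }


# The packet quoted in this comment.
info = parse_get_configuration(bytes.fromhex("46 01 00 00 00 00 00 00 0c 00"))
```

For the quoted packet this yields RT = 1, starting feature 0 and an allocation length of 12 bytes, so the guest asks for current features only and expects a short reply.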

Since I think I'm pretty close now I've been talking to Mirek in the morning and he told me that if I can make it work it's a good thing so I'm working on this one right now.

Michal

Comment 36 Yuyu Zhou 2011-04-12 09:47:03 UTC
Tested this build with all supported guests: PASS for all guests, no issues
found so far:

Host (Intel):
kernel : kernel-xen-2.6.18-256.el5
xen : xen-3.0.3-127.el5virttest02.g1ad0654

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |    RHEL6.1  |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+----------+-----------+-----------+------------+-----------+-----------+
|WindowsXP |  Win2003  |  Win2008  |  Win2008r2 |  WinVista |    Win7   |
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
| 32 |  64 | 32  |  64 | 32  |  64 |  32  |  64 | 32  |  64 | 32  |  64 |
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
|PASS| N/A | PASS| PASS| PASS| PASS| N/A  | PASS| PASS| PASS| PASS| PASS|
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+

Comment 37 Michal Novotny 2011-04-12 09:49:30 UTC
(In reply to comment #36)
> Tested this build with all supported guests, PASS for all the guests, no issue
> found by now:
> 
> Host (Intel):
> kernel : kernel-xen-2.6.18-256.el5
> xen : xen-3.0.3-127.el5virttest02.g1ad0654
> 
> Guest:
> Linux:
> +-------------+-------------+-------------+-------------+-------------+
> |   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |    RHEL6.1  |
> +------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+
> | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+
> 
> Windows:
> +----------+-----------+-----------+------------+-----------+-----------+
> |WindowsXP |  Win2003  |  Win2008  |  Win2008r2 |  WinVista |    Win7   |
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
> | 32 |  64 | 32  |  64 | 32  |  64 |  32  |  64 | 32  |  64 | 32  |  64 |
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
> |PASS| N/A | PASS| PASS| PASS| PASS| N/A  | PASS| PASS| PASS| PASS| PASS|
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+

Great, thanks a lot for testing. I'll attach patch in the new comment.

Michal

Comment 38 Michal Novotny 2011-04-12 09:52:19 UTC
Created attachment 491440 [details]
Patch containing fix that passed the testing in comment 36

Hi,
this is the patch to fix bug 688456, a regression when
installing Windows 2008/Vista x64. The fix for bug 622763
introduced this regression, so this is an add-on fix to be applied
on top of the patch for bug 622763.

The regression was caused by a buffer overflow of io_buffer in
the GPCMD_READ_DVD_STRUCTURE command implementation and by invalid bits
set in GPCMD_GET_CONFIGURATION, which caused a NULL pointer
dereference in the Windows 2008/Vista x64 ATAPI driver.
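
The general idea behind guarding against this kind of overflow can be sketched in a few lines of Python. This is not the attached patch: the function name is hypothetical, and the rule it applies (never write more than the buffer holds, never return more than the guest's allocation length) is an assumption about what a safe reply path looks like.

```python
def copy_reply(io_buffer, payload, allocation_len):
    """Copy a command reply into a fixed-size buffer without overrunning it.

    Hypothetical helper: io_buffer is a bytearray standing in for the
    emulation's io_buffer, payload is the full reply, allocation_len is
    the length the guest asked for. Returns the number of bytes written.
    """
    if len(payload) > len(io_buffer):
        raise ValueError("reply does not fit in io_buffer")
    # Truncate to what the guest requested; the rest is simply not sent.
    n = min(len(payload), allocation_len)
    io_buffer[:n] = payload[:n]
    return n


buf = bytearray(16)
written = copy_reply(buf, bytes(range(12)), allocation_len=8)
```

Here only the first 8 bytes of the 12-byte reply are written, and a reply longer than the buffer is rejected instead of being copied.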

Michal

Comment 41 Miroslav Rezanina 2011-04-13 06:18:50 UTC
Moving back to ASSIGNED and moving to 5.8, as it requires more work to comply with the specification and match upstream.

Comment 45 Miroslav Rezanina 2011-07-25 14:11:10 UTC
This bz was caused by the fix for 622763. As the problematic patch was reverted and there's no plan for its reuse, closing this bz.

