| Summary: | BSOD while installing 64bit windows guest from iso image | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Yuyu Zhou <yuzhou> |
| Component: | xen | Assignee: | Michal Novotny <minovotn> |
| Status: | CLOSED CANTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 5.7 | CC: | areis, drjones, leiwang, minovotn, mrezanin, pbonzini, qwan, xen-maint |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-07-25 14:11:10 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 622763, 699611 | | |
| Attachments: | | | |

Description
Yuyu Zhou
2011-03-17 05:56:47 UTC
Created attachment 485918 [details]
Screenshot of guest
Created attachment 485920 [details]
xend log
Created attachment 485921 [details]
xm dmesg log
I'm going to investigate this since this was apparently caused by my patch.
Michal

Well, I've been looking into this one and apparently some bits were set incorrectly, as I double-checked against the T10 Get Configuration command document, revision 0.01. I've been looking at the upstream QEMU a little more (since I found nothing there at first look) and I've been able to find commit 38cdea7c [1] by Carlo Marcelo Arenas Belon, so I guess it's better to backport this commit instead of writing it on my own. I need to test it a little bit more; however, according to my testing the installation of Windows XP x64 boots up successfully. I need to check against Linux i386 and x86_64 and also Windows 32-bit guests to confirm it won't introduce any other regression.
Michal

[1] http://git.qemu.org/qemu.git/commit/?id=38cdea7c

Created attachment 486606 [details]
Backport multi-profile DVD-ROM support commit to fix the issue
Hi,
this is the backport of upstream QEMU git commit 38cdea7c to
implement multi-profile DVD-ROM support. We've been
having a regression caused by the introduction of new SCSI
commands for the ATAPI CD-ROM driver, in which some
bits were set incorrectly (the number of profiles was set to
a bogus value, although this worked fine for all
guests except Windows x64 guests). I've been investigating
this on my own since I was unable to find the working commit
solving this issue at first however I've been able to find
the upstream QEMU commit later and it turned out to be
solving the issues for Windows x64 guests. To confirm
there will be no regression introduced by this commit
I've been doing testing of both installation and CD/DVD-ROM
drive usage (standard read operation since we don't support
write for the virtual CD-ROM) for all of the guests supported,
i.e. Windows x86 and x64 and Linux i386 and x86_64 and
everything was working fine and correctly.
Michal
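
As an illustration of the "number of profiles" point in the patch description above, here is a minimal sketch of how a GET CONFIGURATION response carrying a Profile List feature can be laid out so that its length fields are derived from the actual contents. It is not the backported QEMU code. The field layout and the profile numbers (0x0008 for CD-ROM, 0x0010 for DVD-ROM) follow my reading of the MMC specification and should be treated as assumptions, and all function names are hypothetical.

```python
import struct

# Profile numbers and the feature code as I read them in the MMC
# specification; treat them as assumptions, not as values from the patch.
PROFILE_CD_ROM = 0x0008
PROFILE_DVD_ROM = 0x0010
FEATURE_PROFILE_LIST = 0x0000


def profile_descriptor(profile, current):
    """4-byte profile descriptor: profile number, current bit, reserved byte."""
    return struct.pack(">HBB", profile, 0x01 if current else 0x00, 0x00)


def profile_list_feature(profiles, current_profile):
    """Profile List feature descriptor followed by its profile descriptors."""
    body = b"".join(
        profile_descriptor(p, p == current_profile) for p in profiles
    )
    flags = 0x02 | 0x01  # persistent | current, version 0
    # The additional length is computed from the body (4 bytes per profile)
    # instead of being hard-coded, so it cannot disagree with the contents.
    return struct.pack(">HBB", FEATURE_PROFILE_LIST, flags, len(body)) + body


def get_configuration_response(profiles, current_profile, alloc_len):
    """Feature header plus Profile List, truncated to the allocation length."""
    features = profile_list_feature(profiles, current_profile)
    # Data length counts the bytes that follow the 4-byte length field:
    # the remaining 4 header bytes plus all feature descriptors.
    data_length = 4 + len(features)
    header = struct.pack(">IHH", data_length, 0, current_profile)
    return (header + features)[:alloc_len]


resp = get_configuration_response(
    [PROFILE_DVD_ROM, PROFILE_CD_ROM], PROFILE_DVD_ROM, 512
)
```

With two profiles the additional-length byte of the feature descriptor (offset 11 of the response) comes out as 8, and the guest only ever receives as many bytes as it asked for in the allocation length.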
Please do not change DVD-ROM to CD-ROM, it's just a string and it's guest visible. Things such as guests losing drive letters may happen.

Result of installing HVM guests from CDROM for all distros

Three versions of HVM guests failed to install from DVD iso:
RHEL5.5-64
Win2008-64
WinVista-64

Host:
xen-3.0.3-127.el5
kernel-xen-2.6.18-254.el5
Intel 64bit

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.5   |   RHEL6.0   |   RHEL6.1   |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+-------------+-------------+-------------+-------------+-------------+-------------+
| Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |  WinVista   |    Win7     |
+------+------+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+------+------+
| PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+------+------+

ignore the comment 11, update the info "RHEL5.5 -> RHEL5.6"

The test of installing HVM guests via iso images failed (RHEL5.6-64, Win2008-64, WinVista-64).

Intel host with :
xen-3.0.3-127.el5
kernel-xen-2.6.18-254.el5

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |   RHEL6.1   |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+-------------+-------------+-------------+-------------+-------------+-------------+
| Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |  WinVista   |    Win7     |
+------+------+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+------+------+
| PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+------+------+

1. For the RHEL5.6 x86_64 guest, we saw the guest crash at random stages, for example:
[1]. after skipping the installation number, it reboots immediately and boots from the CD again
[2]. after the disk format, qemu-dm crashes immediately:

$ ps aux | grep qemu-dm
root 9617 6.0 0.0 0 0 ? Z 02:54 0:17 [qemu-dm] <defunct>

$ cat /var/log/xen/qemu-dm.9617.log
domid: 99
qemu: the number of cpus is 1
Using file in read-write mode
Using file in read-only mode
Watching /local/domain/99/logdirty/next-active
Watching /local/domain/0/device-model/99/command
xs_read(): vncpasswd get error. /vm/7b8795f8-40a8-6d53-0d95-667c709a18ac/vncpasswd.
char device redirected to /dev/pts/3
qemu_map_cache_init nr_buckets = 10000
shared page at pfn 3ffff
buffered io page at pfn 3fffd
xs_read(/vm/7b8795f8-40a8-6d53-0d95-667c709a18ac/rtc/timeoffset): read error
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
Triggered log-dirty buffer switch
gpe_en_write: addr=0x1f6c, val=0x0.
gpe_sts_write: addr=0x1f68, val=0xff.
gpe_en_write: addr=0x1f6d, val=0x0.
gpe_sts_write: addr=0x1f69, val=0xff.
gpe_en_write: addr=0x1f6e, val=0x0.
gpe_sts_write: addr=0x1f6a, val=0xff.
gpe_en_write: addr=0x1f6f, val=0x0.
gpe_sts_write: addr=0x1f6b, val=0xff.
gpe_en_write: addr=0x1f6c, val=0x8.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
ACPI PCI hotplug: read addr=0x10c1, val=0x0.
ACPI PCI hotplug: read addr=0x10c2, val=0x0.
inp: bad size: 0 0

2. For Windows 2008 x86_64 and Vista x86_64, we hit the same BSOD issue as originally reported during installation.
Or you can reproduce it by booting up the Win2008-64bit guest with the qemu ide cd-rom attached, then opening server-manager -> storage -> disk management (it crashes at this point).

(In reply to comment #12)
> ignore the comment 11, update the info "RHEL5.5 -> RHEL5.6"
>
> The test of installing HVM guests via iso images failed (RHEL5.6-64,
> Win2008-64, WinVista-64).
>
> Intel host with :
> xen-3.0.3-127.el5
> kernel-xen-2.6.18-254.el5
>
> Guest:
> Linux:
> +-------------+-------------+-------------+-------------+-------------+
> |   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |   RHEL6.1   |
> +------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+
> | PASS | PASS | PASS | PASS | PASS | FAIL | PASS | PASS | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+
>

Well, that's strange since I didn't run into those issues on any RHEL guest and I tried mainly RHEL-5 GA and RHEL-6 guests, both i386 and x86_64. The truth is that I didn't try installing directly from the ISO image at the time I was writing the patch, since I only tried using the CD-ROM drive in the guest. Now I tried to install a RHEL-5.6 x86_64 guest directly from the ISO and it froze right after skipping the installation number input. I'm investigating this right now.

> Windows:
> +-------------+-------------+-------------+-------------+-------------+-------------+
> | Windows XP  |   Win2003   |   Win2008   |  Win2008r2  |  WinVista   |    Win7     |
> +------+------+------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+------+------+
> | PASS | N/A  | PASS | PASS | PASS | FAIL | N/A  | PASS | PASS | FAIL | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+------+------+
>

I did try it using Windows XP x64 but not Windows 2008 x64 and Vista x64. The strange thing is that it's working fine for Windows 2008 R2 and Windows 7 and the only affected "type" of guest is Windows 2008/Vista (those 2 are very similar AFAIK). I'm investigating this further.
Michal

This commit looks interesting

091d055 (Fix ATAPI GET_CONFIGURATION function, 2008-06-02)

Also:

8114e9e (Fix ATAPI read drive structure command, 2008-07-03)

*** Bug 622763 has been marked as a duplicate of this bug. ***

(In reply to comment #15)
> This commit looks interesting
>
> 091d055 (Fix ATAPI GET_CONFIGURATION function, 2008-06-02)
>
> Also:
>
> 8114e9e (Fix ATAPI read drive structure command, 2008-07-03)

I backported both of them and it enabled installation of 64-bit Windows guests; however, I still need to investigate the RHEL-5.6 issue a little bit more since it's not fixed by those patches. What I know for sure is that, before applying those 2 patches, changing buf[10] from "0x02 | 0x01" back to "0x10 | 0x01" fixed the issue for RHEL-5.6; however, I'm afraid it happens randomly. I need to spend a little more time on this one.
Michal

Then please split the bug in two (clone) and let's fix Windows first.

(In reply to comment #18)
> Then please split the bug in two (clone) and let's fix Windows first.

I disagree.
If we are not able to provide a solution for all platforms, we will have to completely revert the original patch, and a Windows-only fix makes no sense.

I've backported the commit 8114e9e (Fix ATAPI read drive structure command) and I've been able to make it work for both Windows 2008 x64 but in the case of the RHEL-5.6 x86_64 guest the device model could not be rebooted, so I guess it's better to revert the original patch instead.
Michal

Strange....can you please retest problematic Win guests with xen-3.0.3-120.el5?

(In reply to comment #23)
> Strange....can you please retest problematic Win guests with xen-3.0.3-120.el5?

Both Win2k8-64 and WinVista-64 work well with the 120 build.

Well, I can see what the problem is there. It's in the IDE CD-ROM emulation implementation, since the Windows guest is giving the driver a packet of the following bytes:

PACKET: 46 01 00 00 00 00 00 00 0c 00

According to the T10 GET CONFIGURATION command document, the second byte (buf[1], which has the value 01) carries the requested type (RT field), here set to "Indicate that the Feature Header and only those Feature Descriptors that have their Current bit set shall be returned". Handling of this is missing in the implementation of the CD-ROM emulation, and this is where Windows x64 guests stop running and hit the BSOD. Since I think I'm pretty close now, I've been talking to Mirek in the morning and he told me that if I can make it work it's a good thing, so I'm working on this one right now.
Michal

Tested this build with all supported guests, PASS for all the guests, no issue found by now:

Host (Intel):
kernel : kernel-xen-2.6.18-256.el5
xen : xen-3.0.3-127.el5virttest02.g1ad0654

Guest:
Linux:
+-------------+-------------+-------------+-------------+-------------+
|   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |   RHEL6.1   |
+------+------+------+------+------+------+------+------+------+------+
|  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
+------+------+------+------+------+------+------+------+------+------+
| PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
+------+------+------+------+------+------+------+------+------+------+

Windows:
+----------+-----------+-----------+------------+-----------+-----------+
|WindowsXP |  Win2003  |  Win2008  | Win2008r2  | WinVista  |   Win7    |
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
| 32 | 64  | 32  | 64  | 32  | 64  |  32  | 64  | 32  | 64  | 32  | 64  |
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
|PASS| N/A | PASS| PASS| PASS| PASS| N/A  | PASS| PASS| PASS| PASS| PASS|
+----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+

(In reply to comment #36)
> Tested this build with all supported guests, PASS for all the guests, no issue
> found by now:
>
> Host (Intel):
> kernel : kernel-xen-2.6.18-256.el5
> xen : xen-3.0.3-127.el5virttest02.g1ad0654
>
> Guest:
> Linux:
> +-------------+-------------+-------------+-------------+-------------+
> |   RHEL3.9   |   RHEL4.8   |   RHEL5.6   |   RHEL6.0   |   RHEL6.1   |
> +------+------+------+------+------+------+------+------+------+------+
> |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |  32  |  64  |
> +------+------+------+------+------+------+------+------+------+------+
> | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
> +------+------+------+------+------+------+------+------+------+------+
>
> Windows:
> +----------+-----------+-----------+------------+-----------+-----------+
> |WindowsXP |  Win2003  |  Win2008  | Win2008r2  | WinVista  |   Win7    |
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
> | 32 | 64  | 32  | 64  | 32  | 64  |  32  | 64  | 32  | 64  | 32  | 64  |
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+
> |PASS| N/A | PASS| PASS| PASS| PASS| N/A  | PASS| PASS| PASS| PASS| PASS|
> +----+-----+-----+-----+-----+-----+------+-----+-----+-----+-----+-----+

Great, thanks a lot for testing. I'll attach the patch in a new comment.
Michal

Created attachment 491440 [details]
Patch containing fix that passed the testing in comment 36

Hi,
this is the patch to fix bug 688456, which is about a regression when installing Windows 2008/Vista x64. The fix for bug 622763 introduced this regression, so this is an add-on fix to be applied on top of the patch for bug 622763. The regression was caused by a buffer overflow of io_buffer in the GPCMD_READ_DVD_STRUCTURE command implementation and by invalid bits set in GPCMD_GET_CONFIGURATION, which caused a NULL pointer dereference in the Windows 2008/Vista x64 ATAPI driver.
Michal

Returning back to assigned and moving to 5.8, as it requires more work to comply with the specification and match upstream.

This bz was caused by the fix for 622763. As the problematic patch was reverted and there's no plan for its re-use, closing this bz.
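
For readers following the GET CONFIGURATION analysis earlier in the thread, the sketch below decodes the quoted packet (46 01 00 00 00 00 00 00 0c 00) and shows what honouring the RT field could look like. It is an illustration only and is not taken from the attached patches. The byte offsets and the RT meanings follow my reading of the MMC specification, and the feature list, the helper names, and the (code, is_current) model are hypothetical.

```python
import struct
from typing import NamedTuple

GPCMD_GET_CONFIGURATION = 0x46  # opcode, first byte of the quoted packet


class GetConfigurationCdb(NamedTuple):
    rt: int                 # requested type, low two bits of byte 1
    starting_feature: int   # bytes 2-3, big-endian
    allocation_length: int  # bytes 7-8, big-endian


def parse_get_configuration(cdb):
    """Decode the fields of a GET CONFIGURATION command packet."""
    if len(cdb) < 10 or cdb[0] != GPCMD_GET_CONFIGURATION:
        raise ValueError("not a GET CONFIGURATION packet")
    rt = cdb[1] & 0x03
    (starting_feature,) = struct.unpack_from(">H", cdb, 2)
    (allocation_length,) = struct.unpack_from(">H", cdb, 7)
    return GetConfigurationCdb(rt, starting_feature, allocation_length)


def select_features(features, req):
    """Pick the feature descriptors to return for a given request.

    `features` is a list of (feature_code, is_current) tuples, a
    hypothetical stand-in for the emulated drive's feature table.
    """
    candidates = [f for f in features if f[0] >= req.starting_feature]
    if req.rt == 0:   # all features, current or not
        return candidates
    if req.rt == 1:   # only features with the Current bit set
        return [f for f in candidates if f[1]]
    if req.rt == 2:   # only the feature named by the starting feature number
        return [f for f in candidates if f[0] == req.starting_feature]
    raise ValueError("reserved RT value")


# The packet quoted in the thread.
packet = bytes.fromhex("46 01 00 00 00 00 00 00 0c 00")
req = parse_get_configuration(packet)

# Illustrative feature table: two current features and one that is not.
drive_features = [(0x0000, True), (0x0001, True), (0x001E, False)]
selected = select_features(drive_features, req)
```

For the quoted packet this yields RT = 1, starting feature 0 and an allocation length of 12 bytes, so only the entries marked current are selected and the reply would still have to be cut to 12 bytes before being handed back to the guest.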