Bug 486863

Summary: 32-bit para-virt (RHEL v5.3) guest kernel panics on one hypervisor, functions correctly on another
Product: Red Hat Enterprise Linux 5
Reporter: Stephen Gardner <stephen-rhel>
Component: kernel-xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: low
Version: 5.3
CC: clalance, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-02-22 20:24:45 UTC
Attachments:
Para-virt kernel output / panic text (flags: none)

Description Stephen Gardner 2009-02-22 17:26:06 UTC
Created attachment 332861
Para-virt kernel output / panic text

Description of problem:
A 32-bit para-virt guest (RHEL v5.3) can be installed and runs correctly on a 64-bit (RHEL v5.3) HP ProLiant DL380-G5 (2x Xeon X5450, 12GB RAM), but the same VM fails to install, or fails to boot when the disk image is copied from the working server, on a 64-bit (RHEL v5.3) HP ProLiant DL580-G5 (4x Xeon X7350, 128GB RAM). At kernel startup, during either installation or boot of the para-virt guest on the non-working hypervisor, the guest reports "kernel BUG at include/linux/mm.h:310!" and the kernel panics very shortly afterwards (full output attached).

For reference, a 64-bit para-virt guest (RHEL v5.3) and HVM guests (32-bit, RHEL v4.6) operate fine on the hosting server that has issues with the 32-bit para-virt guest. I am able to reproduce this on two sets of matching hardware and software. I have tried raising and lowering the memory specified in the config files (or virt-install command lines) and dropping from 2 VCPUs to 1 VCPU, with no success. The panic occurs with both SAN-hosted volume groups and disk-image-based storage. It also occurs when the hosting DL580 has no other guests present. I can see from our RHN Proxy logs that virt-install is correctly retrieving the Xen kernel and initrd from the installation location provided.

Version-Release number of selected component (if applicable):

Para-virt guest kernel:           kernel-xen-2.6.18-128.el5 (also tried 2.6.18-128.1.1)
Hypervisor kernel (both servers): kernel-xen-2.6.18-128.1.1.el5
Xen version (both servers):       xen-3.0.3-80.el5

How reproducible:
Matching virt-install command lines carried out on 2x DL380-G5s (one with 12GB memory, one with 16GB memory, both with 2x Xeon 54xx processors) worked fine. The same command line tried on 2x DL580-G5s (both with 128GB memory, both with 4x Xeon 73xx processors) caused kernel panics on guest startup. The panic also occurred when a pre-installed, working guest was imported (i.e. its disk image was copied and booted) from the working systems.

Steps to Reproduce:
virt-install -n misap5u4 --nographics -r 4096 --vcpus=2 -f /dev/vms-vg1/misap5u4-hda -m '00:00:XX:XX:XX:XX' --bridge=xenbr3 -p -l 'http://XX.XX.XX.XX/install/software/rhel5-u3/i386/' -x 'ic=as,5,3,i386,mis,apvm, nofb text noipv6 ks=http://XX.XX.XX.XX/install/scripts/dev/rhel4-server/ks/as5-i386-u3.ks'

As you will gather from the description, reproducibility appears to depend purely on the hosting hardware (perhaps memory, or differences between the 54xx and 73xx CPUs).

Actual results:
A 32-bit para-virt guest can be installed and operated on a smaller hosting server but not a larger platform.

Expected results:
Matching 32-bit para-virt hosting on both (sets of) hosting servers.

Additional info:
The hosting servers have been booted with and without SELinux. The DL380s use the Xen parameter "dom0_mem=512M"; the DL580s use "dom0_mem=768M". I have been through the contents of xend.log, domain-builder-ng.log and the output from xm dmesg on both servers and found little to differentiate the systems. I can provide them on request.

Comment 1 Chris Lalancette 2009-02-22 20:24:45 UTC
As of 5.3, 32-on-64 guests are not supported, due to issues such as these.  That said, I'm pretty sure this is a known issue: on hosts with > 64GB of RAM, 32-on-64 guests will currently crash.  We have a patch pending for it; would you be able to test the patch in BZ 448511?  I would usually have it in the virttest kernels, but I made a mistake and dropped it from virttest9.  You can either build a test kernel yourself with the patch from 448511 applied, or you can wait for the next iteration of the virttest kernel (virttest10 will come out later this week), which will have the fix in it.

For now, I'm going to close this as a dup.

Chris Lalancette

*** This bug has been marked as a duplicate of bug 448115 ***
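
Editorial sketch: comment 1 ties the crash to hosts with more than 64GB of RAM, which fits the report (the 12GB and 16GB DL380s work, the 128GB DL580s do not). As a rough illustration only, the check below flags a host that crosses that threshold. It assumes the text comes from `xm info` and that it contains a `total_memory` line giving the value in MiB; the function names and sample values are hypothetical and are not taken from the reporter's systems.

```python
import re

# Threshold taken from comment 1: hosts with more than 64GB of RAM are affected.
THRESHOLD_MIB = 64 * 1024


def total_memory_mib(xm_info_text):
    """Return the total_memory value from `xm info`-style text.

    Assumption: the value is reported in MiB on a line of the form
    'total_memory : <number>'.
    """
    match = re.search(r"^total_memory[ \t]*:[ \t]*(\d+)[ \t]*$",
                      xm_info_text, re.MULTILINE)
    if match is None:
        raise ValueError("no total_memory line found")
    return int(match.group(1))


def host_exceeds_threshold(xm_info_text):
    """True if the host reports more than 64GB of RAM."""
    return total_memory_mib(xm_info_text) > THRESHOLD_MIB


# Hypothetical sample text, shaped like the assumed `xm info` output.
SAMPLE_SMALL = "nr_cpus : 8\ntotal_memory           : 12287\nfree_memory : 1024\n"
SAMPLE_LARGE = "nr_cpus : 16\ntotal_memory           : 131071\nfree_memory : 2048\n"

if __name__ == "__main__":
    for name, text in (("small host", SAMPLE_SMALL), ("large host", SAMPLE_LARGE)):
        print(name, "exceeds 64GB:", host_exceeds_threshold(text))
```

On a real hypervisor the text could be captured by running `xm info` and passing its output to `host_exceeds_threshold`. A True result only indicates that the host falls in the range described in comment 1, not that a given guest will crash.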

Comment 2 Stephen Gardner 2009-02-23 01:59:43 UTC
As requested I compiled virttest9 + the mach-xen/asm/page.h MASK_SHIFT patch (as referenced in 448511) and am pleased to report my 32-bit para-virt guest is now bootable on my 128GB DL580. This certainly was a dup and I appreciate the quick response.

I understand that this functionality is still a Tech Preview and is as yet unsupported.

Stephen

Comment 3 Chris Lalancette 2009-02-23 07:44:09 UTC
OK, great, the extra testing is wonderful, as always :).  The patch in 448511 is queued up for the next RHEL-5 release.

Thanks,
Chris Lalancette