Bug 207606 - Running multiple HVM guests causes hang of Dom0
Summary: Running multiple HVM guests causes hang of Dom0
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: xen
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Steven Rostedt
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-09-21 20:25 UTC by Daniel Berrangé
Modified: 2007-12-11 15:52 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-12-11 15:52:36 UTC
Type: ---
Embargoed:


Attachments
Serial console log showing all messages from HV & Dom0 (581.39 KB, text/plain)
2006-09-21 20:25 UTC, Daniel Berrangé

Description Daniel Berrangé 2006-09-21 20:25:02 UTC
Description of problem:
Create 2 HVM guests (with ACPI=1 and VCPUS=4), one running RHEL3 x86_64
and the other RHEL4 x86_64. At some point during initialization of the guest
kernels the Dom0 host locks up completely. It looks like it locks up while
a guest is bringing up its VCPUs, because the VCPU count for the guest never
reaches 4. Sometimes a guest can bring up 2 or 3 VCPUs before Dom0 hangs, but
it often hangs on the 2nd.

The serial console from the hypervisor / dom0 kernel shows:

(XEN) (GUEST: 3) HVMAssist BIOS, 1 cpu, $Revision: 1.138 $ $Date: 2005/05/07 15:55:26 $
(XEN) (GUEST: 3) 
(XEN) (GUEST: 3) ata0-0: PCHS=4063/16/63 translation=lba LCHS=1015/64/63
(XEN) (GUEST: 3) ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (2000 MBytes)
(XEN) (GUEST: 3) ata0  slave: Unknown device
(XEN) (GUEST: 3) 
(XEN) (GUEST: 3) Booting from Hard Disk...
(XEN) (GUEST: 3) int13_harddisk: function 41, unmapped device for ELDL=81
(XEN) (GUEST: 3) int13_harddisk: function 08, unmapped device for ELDL=81
(XEN) (GUEST: 3) *** int 15h function AX=00C0, BX=0000 not yet supported!
(XEN) (GUEST: 1) int13_harddisk: function 15, unmapped device for ELDL=81
(XEN) (GUEST: 1) *** int 15h function AX=EC00, BX=0002 not yet supported!
(XEN) (GUEST: 1) KBD: unsupported int 16h function 03
(XEN) (GUEST: 1) int13_harddisk: function 15, unmapped device for ELDL=81
(XEN) (GUEST: 1) int13_harddisk: function 02, unmapped device for ELDL=81
(XEN) (GUEST: 1) int13_harddisk: function 41, unmapped device for ELDL=81
(XEN) Local APIC Write to read-only register
(XEN) This hvm_vlapic is for P4, no work for De-assert init
(XEN) AP 1 bringup suceeded.
(XEN) malloc vlapic regs error for vcpu 1


These last four lines are always present immediately before Dom0 hangs solid.

Occasionally, the Dom0 kernel will also output:

BUG: spinlock lockup on CPU#1, swapper/0, ffff88000100e460 (Not tainted)
BUG: spinlock lockup on CPU#1, swapper/0, ffff88000100e460 (Not tainted)
BUG: spinlock recursion on CPU#1, swapper/0 (Not tainted)


This looks like a consequence of the hypervisor errors logged earlier, though.


Version-Release number of selected component (if applicable):
xen-3.0.2-33

How reproducible:
Often (sometimes requires creating 3 or 4 guests, but it always hangs eventually)

Steps to Reproduce:
1. Create 2 or more HVM guests with vcpus=4, acpi=1, apic=1, pae=1
2. Start the guests in quick succession
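
A guest definition matching step 1 might look roughly like the sketch below.
This is only an illustration, assuming the Python-syntax xm config format used
by Xen 3.0 HVM guests. The guest name and disk image path are hypothetical,
and the hvmloader and qemu-dm paths are assumed typical locations that may
differ on a given install. The memory value is the 500 MB mentioned under
"Additional info".

```python
# Sketch of an xm HVM guest config (Python syntax), NOT the reporter's actual file.
# Hypothetical values: name, disk image path. Assumed paths: kernel, device_model.
name = "hvm-guest-1"                      # hypothetical guest name
builder = "hvm"                           # fully virtualized (HVM) guest
kernel = "/usr/lib/xen/boot/hvmloader"    # assumed hvmloader location
device_model = "/usr/lib64/xen/bin/qemu-dm"  # assumed qemu-dm location on x86_64

memory = 500   # MB, as stated in the report
vcpus = 4      # from step 1
acpi = 1       # from step 1
apic = 1       # from step 1
pae = 1        # from step 1

# Hypothetical file-backed disk image exposed to the guest as hda.
disk = ["file:/var/lib/xen/images/hvm-guest-1.img,ioemu:hda,w"]
```

A second guest would use the same settings with a different name and disk
image.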
  
Actual results:
Dom0 hangs

Expected results:
Guests boot

Additional info:
The file which logs the final error message is

./xen/arch/x86/hvm/vlapic.c

The code is

    vlapic->regs_page = alloc_domheap_page(NULL);
    if ( vlapic->regs_page == NULL )
    {
        printk("malloc vlapic regs error for vcpu %x\n", v->vcpu_id);
        xfree(vlapic);
        return -ENOMEM;
    }


Now, the host has 4 GB of memory, and the two guests are only configured to have
500 MB each, so there is no way it should be out of memory at this point. There
should not be any significant memory fragmentation either, since this was done
immediately after booting Dom0.
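
As a rough back-of-the-envelope check of that reasoning, the sketch below adds
up the figures from this report. It assumes 4 KB pages and one vlapic register
page per VCPU (as the snippet above suggests), and it deliberately ignores
whatever memory Dom0 and the hypervisor themselves hold, which the report does
not state.

```python
# Back-of-the-envelope arithmetic using figures from the report.
# Assumptions: 4 KB pages, one vlapic regs page per VCPU,
# memory held by Dom0 and the hypervisor is NOT accounted for.
HOST_MEMORY_MB = 4 * 1024   # host has 4 GB
GUEST_MEMORY_MB = 500       # per guest
GUEST_COUNT = 2
VCPUS_PER_GUEST = 4
PAGE_SIZE_KB = 4            # assumed x86 page size

guest_total_mb = GUEST_MEMORY_MB * GUEST_COUNT
remaining_mb = HOST_MEMORY_MB - guest_total_mb

vlapic_pages = GUEST_COUNT * VCPUS_PER_GUEST
vlapic_kb = vlapic_pages * PAGE_SIZE_KB

print(f"guest memory total: {guest_total_mb} MB")
print(f"host memory not assigned to guests: {remaining_mb} MB")
print(f"vlapic register pages: {vlapic_pages} ({vlapic_kb} KB)")
```

On these numbers the vlapic pages amount to a few tens of kilobytes against
roughly 3 GB not assigned to the guests, which is why the allocation failure
looks unexpected.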

Comment 1 Daniel Berrangé 2006-09-21 20:25:03 UTC
Created attachment 136906
Serial console log showing all messages from HV & Dom0

Comment 3 Red Hat Bugzilla 2007-07-24 23:58:44 UTC
change QA contact

Comment 4 Daniel Berrangé 2007-12-11 15:52:36 UTC
Works fine in current Fedora


