Bug 234325 - Paravirtualized guest with 90GB of memory crashes system
Status: CLOSED DUPLICATE of bug 251353
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.0
Hardware: ia64 Linux
Priority: medium
Severity: urgent
Assigned To: Tetsu Yamamoto
Keywords: OtherQA
Depends On:
Blocks: 223107
Reported: 2007-03-28 10:44 EDT by Joseph Szczypek
Modified: 2009-06-19 12:23 EDT
CC List: 3 users

Doc Type: Bug Fix
Last Closed: 2007-09-04 10:47:25 EDT
Attachments: None

Description Joseph Szczypek 2007-03-28 10:44:32 EDT
Description of problem:
Booting a single large-memory (90GB) paravirtualized RHEL5 guest crashes Xen.
No other guests were running on the system.

Configuration information:
Dom0 memory was configured with either 1GB or 2GB
System has 4 sockets/8 cores with 96GB memory
Guest was configured with one VCPU
Guest was installed to a physical drive on the system.
Guest was configured to use 12 physical drives as targets for testing (drives
are in one MSA1000)

Version-Release number of selected component (if applicable):
kernel                 : 2.6.18-8.1.1el5xen
xen_major              : 3
xen_minor              : 0
xen_extra              : .3-rc5-8.1.1.el

How reproducible:
Every time I boot the guest

Steps to Reproduce:
1.  Install a paravirtualized RHEL5 guest with less memory (8GB)
2.  Change the memory entry in the guest config file to 92160 (MB)
3.  Boot the guest using 'xm create'
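
For reference, the value in step 2 is the 90GB guest size expressed in MB. A minimal sketch of that arithmetic, assuming the 96GB host and the 2GB Dom0 figure from the configuration information above. The function names are illustrative only and are not part of any Xen tool, and the headroom figure ignores whatever memory the hypervisor itself needs:

```python
MB_PER_GB = 1024

def guest_memory_entry(gigabytes):
    """Return the value for the guest config file's memory entry, in MB."""
    return gigabytes * MB_PER_GB

def nominal_headroom_mb(host_gb, dom0_gb, guest_gb):
    """Host memory left after Dom0 and the guest (hypervisor overhead ignored)."""
    return (host_gb - dom0_gb - guest_gb) * MB_PER_GB

print(guest_memory_entry(90))          # 92160
print(nominal_headroom_mb(96, 2, 90))  # 4096
```

On paper the guest leaves about 4GB unassigned, which is why either a successful boot or a clean failure is the expected result.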
  
Actual results:
System crashes

Expected results:
The guest starts, or 'xm create' exits gracefully if system resources are not
available to start the guest.

Additional info:
Output from console when doing 'xm create' of guest:

(XEN) (file=domain.c, line=416) arch_domain_create:416 domain 1 pervcpu_vhpt 1
(XEN) tlb_track_allocate_entries:68 allocated 256 num_entries 256 num_free 256
(XEN) tlb_track_create:114 hash 0xf0000100e4d08000 hash_size 512
(XEN) ### domain f000000007c3c080: rid=80000-c0000 mp_rid=2000
(XEN) arch_domain_create: domain=f000000007c3c080
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!

<<<<  many of these messages >>>

(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!
(XEN) Cannot handle page request order 0!

(XEN) lookup_domain_mpa: d 0xf000000007c3c080 id 1 current 0xf000000007d28000 id 0
(XEN) lookup_domain_mpa: non-allocated mpa 0x167fff4480 (< 0x1680000000)
(XEN) DomainU EFI build up: ACPI 2.0=0x1000
(XEN) dom mem: type=13, attr=0x8000000000000008, range=[0x0000000000000000-0x0000000000001000) (4KB)
(XEN) dom mem: type=10, attr=0x8000000000000008, range=[0x0000000000001000-0x0000000000002000) (4KB)
(XEN) dom mem: type= 6, attr=0x8000000000000008, range=[0x0000000000002000-0x0000000000003000) (4KB)
(XEN) dom mem: type= 7, attr=0x0000000000000008, range=[0x0000000000003000-0x000000167fff4000) (92159MB)
(XEN) dom mem: type=12, attr=0x0000000000000001, range=[0x00000ffffc000000-0x0000100000000000) (64MB)
BUG: soft lockup detected on CPU#0!

Call Trace:
 [<a00000010001c8a0>] show_stack+0x40/0xa0
                                sp=e00000001f61f7d0 bsp=e00000001f619610
 [<a00000010001c930>] dump_stack+0x30/0x60
                                sp=e00000001f61f9a0 bsp=e00000001f6195f0
 [<a0000001000f2120>] softlockup_tick+0x200/0x240
                                sp=e00000001f61f9a0 bsp=e00000001f6195a8
 [<a0000001000a3430>] run_local_timers+0x30/0x60
                                sp=e00000001f61f9a0 bsp=e00000001f619590
 [<a0000001000a34e0>] update_process_times+0x80/0x100
                                sp=e00000001f61f9a0 bsp=e00000001f619560
 [<a000000100040e70>] timer_interrupt+0x150/0x380
                                sp=e00000001f61f9a0 bsp=e00000001f619520
 [<a0000001000f2860>] handle_IRQ_event+0x160/0x240
                                sp=e00000001f61f9a0 bsp=e00000001f6194e0
 [<a0000001000f2c00>] __do_IRQ+0x2c0/0x420
                                sp=e00000001f61f9a0 bsp=e00000001f619498
 [<a0000001003c2400>] evtchn_do_upcall+0x160/0x260
                                sp=e00000001f61f9a0 bsp=e00000001f619408
 [<a000000100065340>] xen_leave_kernel+0x0/0x3b0
                                sp=e00000001f61f9a0 bsp=e00000001f619408
 [<a00000010006b7a0>] privcmd_hypercall+0x520/0x1780
                                sp=e00000001f61fb70 bsp=e00000001f619398
 [<a0000001003c9c50>] privcmd_ioctl+0xf0/0x920
                                sp=e00000001f61fdd0 bsp=e00000001f619330
 [<a0000001001853e0>] do_ioctl+0x140/0x180
                                sp=e00000001f61fe10 bsp=e00000001f6192f0
 [<a000000100185ca0>] vfs_ioctl+0x880/0x8e0
                                sp=e00000001f61fe10 bsp=e00000001f6192a8
 [<a000000100185dd0>] sys_ioctl+0xd0/0x140
                                sp=e00000001f61fe20 bsp=e00000001f619228
 [<a000000100065060>] xen_trace_syscall+0x100/0x140
                                sp=e00000001f61fe30 bsp=e00000001f619228
 [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400
                                sp=e00000001f620000 bsp=e00000001f619228
(XEN) Cannot handle page request order 0!
(XEN) lookup_domain_mpa: d 0xf000000007d50080 id 0 current 0xf000000007d28000 id 0
(XEN) lookup_domain_mpa: bad mpa 0x10104000000 (=> 0x40000000)
(XEN) lookup_domain_mpa: d 0xf000000007d50080 id 0 current 0xf000000007d28000 id 0
(XEN) lookup_domain_mpa: bad mpa 0x10104000000 (=> 0x40000000)
(XEN) Cannot handle page request order 0!
(XEN) ia64_fault, vector=0x4, ifa=0xf300000296990064, iip=0xf00000000406d1d0, ipsr=0x0000121008226018, isr=0x00000a0400000000
(XEN) Alt DTLB.
(XEN) d 0xf000000007d50080 domid 0
(XEN) vcpu 0xf000000007d28000 vcpu 0
(XEN)
(XEN) CPU 0
(XEN) psr : 0000121008226018 ifs : 800000000000048d ip  : [<f00000000406d1d1>]
(XEN) ip is at assign_domain_page_replace+0xc1/0x2e0
(XEN) unat: 0000000000000000 pfs : 000000000000048d rsc : 0000000000000003
(XEN) rnat: f000000007d2ffe8 bsps: 0000000300000006 pr  : 0000000000699aa9
(XEN) ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0  : f00000000406d150 b6  : f000000004048880 b7  : a0000001003c9b60
(XEN) f6  : 0ffff8000000000000000 f7  : 000000000000000000000
(XEN) f8  : 000000000000000000000 f9  : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1  : f00000000432f200 r2  : 000000000051e094 r3  : f000000007d2ffe8
(XEN) r8  : 0000000000000000 r9  : 0000000000000000 r10 : 0000000000000000
(XEN) r11 : 0009804c0270033f r12 : f000000007d2f940 r13 : f000000007d28000
(XEN) r14 : f300000000000000 r15 : 0000000052d3200a r16 : 000000001090a002
(XEN) r17 : f300000296990064 r18 : 000000000051a094 r19 : f000000004131ef8
(XEN) r20 : f0000000040f8200 r21 : f000000004137be0 r22 : 2000000009100000
(XEN) r23 : ffffffffffffffff r24 : f000000007d2fe20 r25 : f000000007d2fe28
(XEN) r26 : 0000000000000000 r27 : 0000000000000000 r28 : a0000002008c4010
(XEN) r29 : 0000000000000000 r30 : 0000000000000000 r31 : f00000000413c7d0
(XEN)
(XEN) Call Trace:
(XEN)  [<f0000000040a7ef0>] show_stack+0x80/0xa0
(XEN)                                 sp=f000000007d2f570 bsp=f000000007d29258
(XEN)  [<f000000004073400>] ia64_fault+0xa30/0xad0
(XEN)                                 sp=f000000007d2f740 bsp=f000000007d29220
(XEN)  [<f0000000040a4d80>] ia64_leave_kernel+0x0/0x310
(XEN)                                 sp=f000000007d2f740 bsp=f000000007d29220
(XEN)  [<f00000000406d1d0>] assign_domain_page_replace+0xc0/0x2e0
(XEN)                                 sp=f000000007d2f940 bsp=f000000007d291b0
(XEN) Cannot handle page request order 0!
(XEN) unwind.desc_label_state(): out of memory
(XEN)  [<f00000000406f2a0>] dom0vp_add_physmap+0x2b0/0x570
(XEN)                                 sp=f000000007d2f940 bsp=f000000007d29158
(XEN) Cannot handle page request order 0!
(XEN) unwind.desc_label_state(): out of memory
(XEN) unwind: failed to find state labeled 0x3
(XEN)  [<f000000004054350>] do_dom0vp_op+0x1c0/0x3c0
(XEN)                                 sp=f000000007d2f950 bsp=f000000007d29118
(XEN)  [<f00000000406d150>] assign_domain_page_replace+0x40/0x2e0
(XEN)                                 sp=f000000007d2f950 bsp=f000000007d290d8
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Fault in Xen.
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
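
The 'dom mem' ranges and the 'non-allocated mpa' message in the console output can be cross-checked against the configured guest size. The following is a minimal sketch, assuming the wrapped console lines are rejoined as shown in LOG; the helper name parse_ranges is illustrative. It only checks the arithmetic in the log and is not a diagnosis of the fault:

```python
import re

# 'dom mem' lines from the console output, with terminal line wraps rejoined.
LOG = """\
(XEN) dom mem: type=13, attr=0x8000000000000008, range=[0x0000000000000000-0x0000000000001000) (4KB)
(XEN) dom mem: type=10, attr=0x8000000000000008, range=[0x0000000000001000-0x0000000000002000) (4KB)
(XEN) dom mem: type= 6, attr=0x8000000000000008, range=[0x0000000000002000-0x0000000000003000) (4KB)
(XEN) dom mem: type= 7, attr=0x0000000000000008, range=[0x0000000000003000-0x000000167fff4000) (92159MB)
(XEN) dom mem: type=12, attr=0x0000000000000001, range=[0x00000ffffc000000-0x0000100000000000) (64MB)
"""

RANGE_RE = re.compile(r"type=\s*(\d+),.*range=\[(0x[0-9a-f]+)-(0x[0-9a-f]+)\)")

def parse_ranges(log):
    """Return (type, start, end) tuples for every 'dom mem' line."""
    return [(int(t), int(s, 16), int(e, 16)) for t, s, e in RANGE_RE.findall(log)]

ranges = parse_ranges(LOG)
for mem_type, start, end in ranges:
    print(f"type={mem_type:2d} size={(end - start) // 1024} KB")

# Values printed by lookup_domain_mpa in the same console output.
mpa = 0x167fff4480
limit = 0x1680000000
type7_end = ranges[3][2]

print(limit == 90 * 2**30)         # the limit equals the configured 90GB
print(type7_end <= mpa < limit)    # the address lies between the type=7 end and the limit
```

Under this reading, the address reported as non-allocated falls just past the end of the large type=7 range but still below the 90GB limit.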
Comment 2 RHEL Product and Program Management 2007-05-01 11:52:59 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 3 Brian Stein 2007-08-03 11:07:26 EDT
Please confirm this behavior with the current 5.1 beta.
Comment 4 Joseph Szczypek 2007-08-13 13:29:15 EDT
(In reply to comment #3)
> Please confirm this behavior with the current 5.1 beta.
If I try to boot a 1VCPU, 90GB PV guest (92160MB) using RHEL5.1 Beta 1
(2.6.18-36.el5xen) on an rx6600 I get the following:
pygrub(31875): unaligned access to 0x600000000015110e, ip=0x20000000043b9640
pygrub(31875): unaligned access to 0x6000000000151112, ip=0x20000000043b9150
pygrub(31875): unaligned access to 0x6000000000151114, ip=0x20000000043b9150
pygrub(31875): unaligned access to 0x6000000000151116, ip=0x20000000043b9150
pygrub(31875): unaligned access to 0x600000000015111a, ip=0x20000000043b9150
Error: (4, 'Out of memory', "xc_dom_boot_mem_init: can't allocate low memory for domain\n")

I find that I can boot a PV guest if maxmem=memory=65024MB. If I set them to
65536MB, I get the 'Out of memory' error. The system has 96GB of memory, so why
can't I create a larger memory guest?

Also note that via virt-manager, the maximum I could increase my PV guest memory
size to was 32000MB (the guest was originally installed as an 8GB guest). The
system has 96GB in it. I could increase memory beyond 32000MB only by editing
the config file myself.

Comment 5 Ronald Pacheco 2007-08-30 12:25:41 EDT
Please bear in mind that we officially support 64 GB per dom
Comment 6 Martine Silbermann 2007-08-30 12:42:24 EDT
Unfortunately, 65024 MB = 63.5 GB, which is not quite 64.
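
The sizes reported in comment 4 can be set against the 64 GB figure from comment 5. A minimal sketch, assuming 1 GB = 1024 MB; the names below are illustrative and the "observed" column simply restates what comment 4 reported on the 5.1 beta:

```python
MB_PER_GB = 1024
SUPPORTED_LIMIT_GB = 64  # figure quoted in comment 5

# (configured size in MB, result reported in comment 4)
observations = [
    (65024, "boots"),
    (65536, "Out of memory"),
    (92160, "Out of memory"),
]

def to_gb(mb):
    """Convert a size in MB to GB."""
    return mb / MB_PER_GB

for mb, result in observations:
    gb = to_gb(mb)
    print(f"{mb} MB = {gb} GB, below 64 GB: {gb < SUPPORTED_LIMIT_GB}, observed: {result}")
```

The largest size that booted is 512 MB short of 64 GB, and a guest of exactly 64 GB already fails.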
Comment 8 Chris Lalancette 2007-09-04 10:47:25 EDT
We already have a bug open for this, and a patch is available for 5.2.
Closing this as a DUP.

Chris Lalancette

*** This bug has been marked as a duplicate of 251353 ***
Comment 9 Martine Silbermann 2007-09-05 23:48:11 EDT
Chris,

I tried to access 251353 to look at the patch but I'm not authorized. Could 
you please add me and Joseph Szczypek to the cc list of 251353?

Thx - Martine
