Bug 131295 - Hugepages configured on kernel boot line causes x86_64 kernel boot to fail with OOM.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Larry Woodman
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 168424
 
Reported: 2004-08-30 19:11 UTC by Brian Baker
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-15 15:40:15 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2006:0144
Private: 0
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 7
Last Updated: 2006-03-15 05:00:00 UTC

Description Brian Baker 2004-08-30 19:11:08 UTC
Configuring a large number of hugepages with the "hugepages=" kernel
boot parameter causes the boot to fail with Out of Memory errors.

On a 64GB system, adding "hugepages=28160" to the kernel boot line 
(a request for 55GB of hugepages) causes the boot to fail, even though 
the remaining 9GB of free memory should be plenty for the kernel to 
boot.  The same 55GB allocation done interactively on the command 
line with "echo 28160 > /proc/sys/vm/nr_hugepages" works without 
problems.  However, the same command fails when it is run from 
rc.local.
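
The figures above can be checked with a few lines of arithmetic. This is
a minimal sketch that assumes a 2 MiB hugepage size, which is what makes
28160 pages come out to the 55GB quoted in this report; the function
names are illustrative only.

```python
# Assumption: 2 MiB huge pages, consistent with the report's 55GB figure.
HUGEPAGE_SIZE_MIB = 2


def hugepage_pool_gib(nr_hugepages, page_size_mib=HUGEPAGE_SIZE_MIB):
    """Return the size in GiB of a pool of nr_hugepages huge pages."""
    return nr_hugepages * page_size_mib / 1024


def remaining_gib(total_gib, nr_hugepages, page_size_mib=HUGEPAGE_SIZE_MIB):
    """Return the memory in GiB left over after reserving the pool."""
    return total_gib - hugepage_pool_gib(nr_hugepages, page_size_mib)


if __name__ == "__main__":
    print(hugepage_pool_gib(28160))   # 55.0
    print(remaining_gib(64, 28160))   # 9.0
```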

It appears that during the boot process memory in a 64GB x86_64 
environment is configured differently than it is once the system is 
fully booted.

It looks like the hugepage allocation problem is NUMA-related.  
Disabling NUMA features with "numa=off" on the kernel boot line gets 
rid of the problem, but performance suffers.

Comment 1 Brian Baker 2004-08-30 19:11:49 UTC
BTW, this is RH EL 3 Update 3

Comment 3 Jim Paradis 2004-09-14 15:09:11 UTC
This suggests that the pre-allocation of hugepages should also be made
NUMA-aware.  Will investigate.
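
As a rough illustration of why NUMA awareness could matter here, the toy
model below compares two ways of carving a 55GB pool out of per-node
free memory. Everything in it is an assumption made for illustration:
the report does not state the node count or node sizes (4 nodes of 16GB
are hypothetical), and neither function is the kernel's actual
allocation algorithm.

```python
def fill_sequential(node_free_gib, request_gib):
    """Take memory from node 0 first, then node 1, and so on.

    Returns the free memory left on each node.
    """
    remaining = request_gib
    left = []
    for free in node_free_gib:
        taken = min(free, remaining)
        remaining -= taken
        left.append(free - taken)
    return left


def fill_interleaved(node_free_gib, request_gib, step_gib=1):
    """Take memory round-robin across nodes, step_gib at a time.

    Returns the free memory left on each node.
    """
    left = list(node_free_gib)
    remaining = request_gib
    while remaining > 0 and any(free > 0 for free in left):
        for i in range(len(left)):
            if remaining <= 0:
                break
            taken = min(left[i], step_gib, remaining)
            left[i] -= taken
            remaining -= taken
    return left


if __name__ == "__main__":
    nodes = [16, 16, 16, 16]            # hypothetical topology
    print(fill_sequential(nodes, 55))   # [0, 0, 0, 9]
    print(fill_interleaved(nodes, 55))  # [2, 2, 2, 3]
```

In the sequential case three of the four hypothetical nodes end up with
no free memory at all, even though 9GB remains system-wide; the
interleaved case leaves some memory on every node.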


Comment 7 Larry Woodman 2004-11-29 19:36:27 UTC
*** Bug 130489 has been marked as a duplicate of this bug. ***

Comment 10 Ernie Petrides 2005-02-15 05:18:07 UTC
It has been decided that x86_64 RHEL3 kernels should continue to enable
NUMA by default.  However, if an OOM kill occurs on a NUMA system, an
extra message will be printed by the kernel suggesting that using the
"numa=off" boot option might be a good way to work around the issue.

The exact message is:

    OOM kill occurred on an x86_64 NUMA system!
    The numa=off boot option might help avoid this.

This change was committed to the RHEL3 U5 patch pool on 9-Feb-2005 (in
kernel version 2.4.21-27.12.EL).


Comment 11 Ernie Petrides 2005-11-30 07:47:51 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.12.EL).

To enable an improved NUMA-friendly page allocation policy, please
set /proc/sys/vm/numa_memory_allocator to 1 via the "sysctl" command
(or put "vm.numa_memory_allocator = 1" in /etc/sysctl.conf).
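
For anyone scripting the persistent setting, here is a minimal sketch
that adds or updates the line in the text of a sysctl.conf file. The
helper name is hypothetical, and it only transforms a string; reading
and writing /etc/sysctl.conf (as root) is left to the caller.

```python
def ensure_sysctl_setting(conf_text, key, value):
    """Return conf_text with `key = value` set exactly once.

    An existing entry for key is replaced in place (duplicates are
    dropped); if there is none, the setting is appended at the end.
    Comment lines (starting with '#' or ';') are left untouched.
    """
    new_line = f"{key} = {value}"
    out = []
    found = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", ";"))
        if stripped and not is_comment and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                if not found:
                    out.append(new_line)
                    found = True
                continue
        out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "# example file\nkernel.sysrq = 0\n"
    print(ensure_sysctl_setting(sample, "vm.numa_memory_allocator", 1), end="")
```

The tunable only exists on kernels that carry this fix, so on any other
kernel the setting would simply not be recognized.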


Comment 15 Red Hat Bugzilla 2006-03-15 15:40:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html


