Bug 130489

Summary: kernel kills db2 processes because of OOM error on RHEL Update2 and Update3
Product: Red Hat Enterprise Linux 3
Reporter: IBM Bug Proxy <bugproxy>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3.0
CC: anderson, peterm, petrides, riel, tburke
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Last Closed: 2006-03-15 15:38:12 UTC
Bug Blocks: 168424    
Attachments:
Description            Flags
var_log_messages.txt   none
meminfo.txt            none

Comment 7 Larry Woodman 2004-08-23 01:43:59 UTC
Glen, I think this is a NUMA issue. Please add "numa=off" to the kernel
command line, reboot, and let me know if the problem goes away.

Larry Woodman

Comment 8 IBM Bug Proxy 2004-08-23 23:02:26 UTC
----- Additional Comments From ysluk.com  2004-08-23 18:43 -------
The problem goes away after rebooting with "numa=off".
Thanks! 
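
To confirm that a rebooted machine really picked up the workaround, the running kernel's command line can be inspected. The following is a minimal sketch in Python, not something from the original report: it assumes the command line is readable at /proc/cmdline, and the helper names has_numa_off and running_with_numa_off are hypothetical.

```python
from pathlib import Path


def has_numa_off(cmdline: str) -> bool:
    """Return True if the kernel command line contains the numa=off option."""
    # Boot options are whitespace-separated tokens. Compare whole tokens so
    # that a different option that merely starts with "numa=off" is ignored.
    return "numa=off" in cmdline.split()


def running_with_numa_off(path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently running kernel.

    The default path is an assumption; pass another file to check a saved copy.
    """
    return has_numa_off(Path(path).read_text())
```

Calling running_with_numa_off() after the reboot should return True if the option took effect. If it returns False, the option was most likely added to a boot loader entry other than the one that was booted.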

Comment 9 IBM Bug Proxy 2004-08-24 15:07:55 UTC
----- Additional Comments From salina.com  2004-08-24 11:07 -------
Question for Red Hat:
Is numa=on the default on RHEL 3 U3? If NUMA does not work reliably, it would be
better to have it off by default, maybe until RHEL 4, where the 2.6 kernel has
more stable NUMA code. Thanks.

Comment 10 Larry Woodman 2004-08-24 15:51:26 UTC
For RHEL3-U3 NUMA is ON by default and it's too late to change it now.
For RHEL3-U4 we are investigating what changes would be necessary to
allow it to remain ON by default and still work correctly under high
memory load situations.

Larry Woodman


Comment 11 IBM Bug Proxy 2004-08-24 17:32:28 UTC
----- Additional Comments From salina.com  2004-08-24 13:29 -------
Ananda,
I have added you to the CC list.  Please see if you are hitting this on pSeries
with numa=on.  Thanks.

Comment 12 IBM Bug Proxy 2004-08-24 17:32:48 UTC
----- Additional Comments From AVenkat.com  2004-08-24 13:32 -------
Thanks for letting me know.  Will check and let you know.  --Ananda 

Comment 13 Larry Woodman 2004-08-24 17:42:28 UTC
I think this problem is restricted to Opteron.

Larry


Comment 14 IBM Bug Proxy 2004-08-24 18:12:22 UTC
----- Additional Comments From salina.com  2004-08-24 13:25 -------
Based on the response from Larry, I am going to set the target milestone to
RHEL 3 QU4.   We will leave the problem open until numa=on works.   Thanks.

Comment 15 Larry Woodman 2004-11-29 19:36:08 UTC

*** This bug has been marked as a duplicate of 131295 ***

Comment 16 IBM Bug Proxy 2004-12-02 03:28:18 UTC
----- Additional Comments From AVenkat.com  2004-12-01 22:19 EDT -------
I haven't used the NUMA feature on pSeries so far. I am probably not using
large pages, since the squadron boxes that I use have only 2GB (SF2) or 4GB
(L4).  However, I will try to borrow a box with more memory.

Comment 17 Ernie Petrides 2005-02-15 05:16:29 UTC
It has been decided that x86_64 RHEL3 kernels should continue to enable
NUMA by default.  However, if an OOM kill occurs on a NUMA system, an
extra message will be printed by the kernel suggesting that using the
"numa=off" boot option might be a good way to work around the issue.

The exact message is:

    OOM kill occurred on an x86_64 NUMA system!
    The numa=off boot option might help avoid this.

This change was committed to the RHEL3 U5 patch pool on 9-Feb-2005 (in
kernel version 2.4.21-27.12.EL).
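
Since the message text is fixed, a saved kernel log can be searched for it to see whether a machine has hit this case. Below is a minimal sketch in Python. The marker string is copied from the message above; the function name find_numa_oom_kills is hypothetical, and the log location is an assumption (the attachment name suggests /var/log/messages).

```python
# First line of the kernel message quoted above.
NUMA_OOM_MARKER = "OOM kill occurred on an x86_64 NUMA system!"


def find_numa_oom_kills(lines):
    """Return the 1-based numbers of the lines that contain the NUMA OOM message.

    `lines` can be any iterable of strings, for example an open log file.
    """
    return [
        number
        for number, line in enumerate(lines, start=1)
        if NUMA_OOM_MARKER in line
    ]
```

For example, "with open('/var/log/messages') as log: hits = find_numa_oom_kills(log)" gives the line numbers of every occurrence. An empty list only means the message is not in that file; it does not rule out OOM kills on kernels older than the one that added the message.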


Comment 18 Ernie Petrides 2005-11-30 07:47:13 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.12.EL).

To enable an improved NUMA-friendly page allocation policy, please
set /proc/sys/vm/numa_memory_allocator to 1 via the "sysctl" command
(or put "vm.numa_memory_allocator = 1" in /etc/sysctl.conf).
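
To check whether the setting has been made persistent, the contents of /etc/sysctl.conf can be parsed and the key looked up. The following is a minimal sketch in Python under stated assumptions: the key name and value come from the comment above, the parser handles only plain "key = value" lines and comment lines, and the function names parse_sysctl_conf and numa_allocator_persisted are hypothetical.

```python
def parse_sysctl_conf(text: str) -> dict:
    """Parse sysctl.conf-style text into a {key: value} dictionary."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        # Ignore anything that is not a "key = value" assignment.
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def numa_allocator_persisted(conf_text: str) -> bool:
    """Return True if the text sets vm.numa_memory_allocator to 1."""
    return parse_sysctl_conf(conf_text).get("vm.numa_memory_allocator") == "1"
```

Passing the text of /etc/sysctl.conf to numa_allocator_persisted() shows whether the setting will be applied at boot. It does not show the value currently in effect, which has to be read from /proc/sys/vm/numa_memory_allocator on a kernel that provides that tunable.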


Comment 22 Red Hat Bugzilla 2006-03-15 15:38:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html