Bug 159330

Summary: x86_64 kernel stops allocating memory too early when overcommit_memory set to strict
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Kevin Kruzich <kkruzich>
Assignee: Larry Woodman <lwoodman>
QA Contact: Brian Brock <bbrock>
CC: jbaron, jburke, jparadis, peterm, petrides, riel, tao, tburke, thomas.walker
Fixed In Version: RHSA-2005-663
Doc Type: Bug Fix
Last Closed: 2005-09-28 15:15:09 UTC
Bug Blocks: 156321
Attachments:
Description Flags
Walk /proc/<pid>/maps to determine mem usage (supplied by Red Hat) none

Description Kevin Kruzich 2005-06-01 18:09:37 UTC
+++ This bug was initially created as a clone of Bug #106503 +++

Description of problem:
-----------------------
During a compilation of OpenAFS, I see memory allocation errors some way through
the compile.  Depending on how the allocation failed, subsequent invocations of
make will fail, with most of the applications complaining of insufficient memory.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
Linux saias83 2.4.21-3.EL #1 SMP Fri Sep 19 13:59:46 EDT 2003 ia64 ia64 ia64
GNU/Linux

How reproducible:
-----------------
Set memory overcommit to strict
Compile something (like OpenAFS)

Steps to Reproduce:
-------------------
1. sysctl -w vm.overcommit_memory=2
2. ./configure --with-afs-sysname=ia64_linux24 --enable-transarc-paths
--with-linux-kernel-headers=/usr/src/linux-2.4.21-3.EL
3. make
    
Actual results:
---------------
bash: fork: Cannot allocate memory

Expected results:
-----------------
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
...

Additional info:
----------------
It seems that this was reported and fixed on i386 and x86_64 platforms.  I
looked at BugZilla bugs: 106010, 98413 and 104172.

To avoid this, we currently set vm.overcommit_memory to 0, but strict overcommit
is what we would like to use.

Comment 1 Kevin Kruzich 2005-06-01 18:55:43 UTC

*** This bug has been marked as a duplicate of 106503 ***

Comment 2 Ernie Petrides 2005-06-01 18:59:59 UTC
Kevin, the title suggests that this is an x86_64 bug, but the "Steps to
Reproduce" section indicates an ia64-specific compilation.  Which arch
is this for?

Comment 3 Eric Hagberg 2005-06-01 19:20:21 UTC
The duplicate bug above (106503) was opened as an ia64 bug. The same methods to
reproduce work on x86_64. And the problem still exists for x86_64 (as stated in
the details for BZ 106503).

There are numerous ways to reproduce the problem, but basically do anything that
allocates and releases memory often enough over some span of time (sometimes a
very short span) and you'll see leakage in the Committed_AS value in /proc/meminfo.
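
For anyone who wants to track the counter rather than eyeball grep output, here is a minimal Python sketch (not part of the original report). It assumes the usual "Name:   value kB" layout of /proc/meminfo; the function names are illustrative only.

```python
import re

# Matches the two commit-accounting lines of /proc/meminfo, e.g.
# "Committed_AS: 11271168 kB".
_COMMIT_LINE = re.compile(r"^(CommitLimit|Committed_AS):\s+(\d+)\s+kB")


def parse_commit_fields(meminfo_text):
    """Return the CommitLimit / Committed_AS values (in kB) found in the text."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = _COMMIT_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def growth_per_sample(samples_kb):
    """Differences between consecutive Committed_AS samples, in kB."""
    return [later - earlier for earlier, later in zip(samples_kb, samples_kb[1:])]


def commit_headroom_kb(fields):
    """kB that can still be committed before strict mode refuses allocations."""
    return fields["CommitLimit"] - fields["Committed_AS"]


def read_commit_fields(path="/proc/meminfo"):
    """Convenience wrapper: parse the live file on a Linux host."""
    with open(path) as handle:
        return parse_commit_fields(handle.read())
```

Sampling read_commit_fields() periodically and passing the Committed_AS values to growth_per_sample() would be expected to show deltas that hover around zero on a healthy, idle system. Deltas that stay positive while the workload only starts and stops short-lived processes point to the kind of leak described here.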

Comment 4 Eric Hagberg 2005-06-01 19:24:22 UTC
Also, this was reported originally on ia64 and fixed there... then another bug
was fixed (probably for all platforms, but we saw it on ia32). It was never
fixed for x86_64 (even though that's what's said above in the initial description).

Comment 5 Thomas Walker 2005-06-01 19:31:14 UTC
Created attachment 115060
Walk /proc/<pid>/maps to determine mem usage (supplied by Red Hat)

Yep, Kevin just clipped the very beginning of a long thread where we saw this
on ia64, then ia32, and finally on x86_64 (all but the last are fixed).

With the latest U5 errata kernel (2.4.21-32.0.1.EL) we even see this w/ a test
script used to try to determine the *real* memory usage by walking
/proc/<pid>/maps; we don't even need the reproducer app anymore.

root@saias78:walkert# uname -a
Linux saias78 2.4.21-32.0.1.EL #1 SMP Tue May 17 17:52:26 EDT 2005 x86_64
root@saias78:walkert# uptime
 11:20:16  up 2 min,  1 user,  load average: 0.29, 0.20, 0.08
root@saias78:walkert# sysctl -a | grep overcommit
vm.overcommit_ratio = 90
vm.overcommit_memory = 2
root@saias78:walkert# while true
> do
>  grep Committed_AS /proc/meminfo
> ~walkert/bin/procmem.sh > /dev/null
> done
Committed_AS:   258476 kB
Committed_AS:   353024 kB
Committed_AS:   447380 kB
Committed_AS:   541196 kB
Committed_AS:   635052 kB
Committed_AS:   728952 kB
Committed_AS:   823324 kB
Committed_AS:   916936 kB
Committed_AS:  1010488 kB
Committed_AS:  1104648 kB
Committed_AS:  1199292 kB
Committed_AS:  1293068 kB
Committed_AS:  1386588 kB
Committed_AS:  1480632 kB
Committed_AS:  1574084 kB

<...>

Until we implode...

Committed_AS: 11151536 kB
Committed_AS: 11248696 kB
awk: fatal: more_nodes: nextfree: can't allocate 6400 bytes of memory (Cannot allocate memory)
/ms/user/w/walkert/bin/procmem.sh: fork: Cannot allocate memory
root@saias78:walkert# sysctl -w vm.overcommit_memory=1
vm.overcommit_memory = 1
root@saias78:walkert# uptime
 11:34:58  up 17 min,  1 user,  load average: 1.19, 1.58, 1.05
root@saias78:walkert# grep Commit /proc/meminfo
CommitLimit:  11276400 kB
Committed_AS: 11271168 kB
root@saias78:walkert# ~walkert/bin/procmem.sh
<...>
Total process shared mem (x):         31560 K
Total process private mem (w):       212364 K
Total allocated (calc):              203844 K

Script attached (this is also in IT 28410)
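
The attached script itself is not reproduced in this report. As a rough idea of what walking /proc/<pid>/maps involves, the Python sketch below sums mapping sizes from the text of a maps file. It is an illustration only, not the attached procmem.sh: the function name, the two buckets, and the rule for sorting mappings into them are assumptions, and it measures mapped address space rather than resident memory.

```python
def mapping_sizes_kb(maps_text):
    """Sum mapping sizes (kB) from /proc/<pid>/maps text.

    Each line is expected to look like:
        start-end perms offset dev inode [path]
    Mappings that are writable and private (perms such as "rw-p") go into
    "private_writable"; everything else goes into "other".
    """
    totals = {"private_writable": 0, "other": 0}
    for line in maps_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or "-" not in parts[0]:
            continue  # skip blank or unexpected lines
        start_hex, end_hex = parts[0].split("-", 1)
        size_kb = (int(end_hex, 16) - int(start_hex, 16)) // 1024
        perms = parts[1]
        is_private_writable = len(perms) >= 4 and perms[1] == "w" and perms[3] == "p"
        totals["private_writable" if is_private_writable else "other"] += size_kb
    return totals


def process_mapping_sizes_kb(pid):
    """Convenience wrapper: read the maps file of a live process on Linux."""
    with open("/proc/%d/maps" % pid) as handle:
        return mapping_sizes_kb(handle.read())
```

Summing such per-process totals over all processes gives a figure that can be set against Committed_AS. A system-wide counter many times larger than anything the processes actually map, as in the transcript above, suggests the accounting rather than the workload is at fault.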

Comment 6 Ernie Petrides 2005-06-01 20:59:55 UTC
Okay, thanks for clarifying things.  I've reclosed bug 106503 (against ia64,
which was due to a different problem) and have removed the dependency.

Comment 7 Jeff Burke 2005-06-01 21:06:17 UTC
I ran the following test with the RHEL3 U5 x86_64 arch ia32e kernel:

 [root@testsystem tst3]# sysctl -w vm.overcommit_memory=2
 vm.overcommit_memory = 2
 [root@testsystem tst3]# ./pig 
 fork failed:: Cannot allocate memory
 fork failed:: Cannot allocate memory
 fork failed:: Cannot allocate memory
 fork failed:: Cannot allocate memory
 fork failed:: Cannot allocate memory
 [root@testsystem tst3]# child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 At this point the test application is virtually hung.
  It will never complete. It duplicates the result in
   Comment #28.

I then ran the following test: 

 [root@testsystem tst3]# sysctl -w vm.overcommit_memory=1
 vm.overcommit_memory = 1
 [root@testsystem tst3]# sysctl -a  | grep comm
 vm.overcommit_ratio = 50
 vm.overcommit_memory = 1
 [root@testsystem tst3]# ./pig 
 child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 child is done
 (Above message many more times........)
 parent is done

 The test runs to completion. 

 I believe that the summary on the BZ is in fact still true:
  "x86_64 kernel stops allocating memory too early when
  overcommit_memory set to strict"

Comment 8 Larry Woodman 2005-06-08 19:56:07 UTC
When /proc/sys/vm/overcommit_memory is set to 2, memory allocations will fail
prematurely.  The reason for this is that we are leaking vm_committed_space!
During exec(), setup_arg_pages() calls vm_enough_memory() for a vma without the
VM_ACCOUNT flag set (every other call to vm_enough_memory() either sets this flag
or ensures that it is already set before making the call). When the process
exits, exit_mmap() only calls vm_unacct_memory() if the vma has the VM_ACCOUNT
flag set, therefore we leak vm_committed_space.  This is benign by default
because it's ignored, but it causes incorrect memory allocation failures when
overcommit_memory is set to strict (2).

This patch fixes this problem:

------------------------------------------------------------------------
--- linux-2.4.21/include/asm-x86_64/page.h.orig
+++ linux-2.4.21/include/asm-x86_64/page.h
@@ -152,7 +152,8 @@ extern __inline__ int get_order(unsigned
 #define __VM_DATA_DEFAULT_FLAGS        (VM_READ | VM_WRITE | VM_EXEC | \
                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 #define __VM_STACK_FLAGS       (VM_GROWSDOWN | VM_READ | VM_WRITE | VM_EXEC | \
-                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC | \
+                                VM_ACCOUNT)
 
 #define VM_DATA_DEFAULT_FLAGS \
       ((current->thread.flags & THREAD_IA32) ? \
------------------------------------------------------------------------
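
To make the accounting mismatch concrete, here is a toy Python model of it (a sketch only, not kernel code). The method names mirror the kernel functions named in this comment, but the flag value, the vma representation, and the 32-page stack size are assumptions made for illustration.

```python
VM_ACCOUNT = 0x1  # hypothetical bit value, used only inside this model


class CommitCounter:
    """Toy stand-in for the kernel's vm_committed_space counter."""

    def __init__(self):
        self.committed_pages = 0

    def setup_arg_pages(self, pages, stack_flags):
        # Models the exec() path: the stack is charged unconditionally,
        # whether or not the vma carries VM_ACCOUNT.
        self.committed_pages += pages
        return {"pages": pages, "flags": stack_flags}

    def exit_mmap(self, vma):
        # Models process exit: the charge is returned only for vmas
        # that have VM_ACCOUNT set.
        if vma["flags"] & VM_ACCOUNT:
            self.committed_pages -= vma["pages"]


def leaked_pages(exec_exit_cycles, stack_flags, stack_pages=32):
    """Pages still charged after the given number of exec/exit cycles."""
    counter = CommitCounter()
    for _ in range(exec_exit_cycles):
        vma = counter.setup_arg_pages(stack_pages, stack_flags)
        counter.exit_mmap(vma)
    return counter.committed_pages
```

In this model, leaked_pages(1000, 0) leaves 32000 pages charged, while leaked_pages(1000, VM_ACCOUNT) returns to zero. That is the intent of adding VM_ACCOUNT to __VM_STACK_FLAGS: the charge made at exec time becomes one that exit_mmap() will give back.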


The test kernel binary rpms for x86_64 and EM64T are located here:

>>>http://people.redhat.com/~lwoodman/RHEL3

Comment 9 Ernie Petrides 2005-06-09 03:28:45 UTC
A fix for this problem has just been committed to the RHEL3 U6
patch pool this evening (in kernel version 2.4.21-32.7.EL).


Comment 18 Red Hat Bugzilla 2005-09-28 15:15:09 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html