Bug 246086 - Is accounting of Committed memory correct?
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Larry Woodman
QA Contact: Martin Jenner
Reported: 2007-06-28 09:45 EDT by Issue Tracker
Modified: 2010-10-22 11:58 EDT
Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Last Closed: 2007-11-15 11:29:36 EST

Attachments
Backport of upstream patch to fix Committed_AS leakage (2.24 KB, patch)
2007-06-28 09:48 EDT, Chris Lalancette
Description Issue Tracker 2007-06-28 09:45:38 EDT
Escalated to Bugzilla from IssueTracker
Comment 1 Issue Tracker 2007-06-28 09:45:41 EDT
Description of problem:
We have a problem understanding the value for Committed_AS from /proc/meminfo, and see malloc failing despite lots of free memory.
How reproducible:


Steps to Reproduce:

Actual results:

The value for Committed_AS appears to be very high even for a machine doing nothing. We are not able to determine which process commits the memory. 

Expected results:

The value for Committed_AS should match memory usage (?), and we should be able to determine which process commits the memory.

Additional info:

We have seen frequent OOM kills, and in an attempt to reduce these we set /proc/sys/vm/overcommit_memory to 2 and /proc/sys/vm/overcommit_ratio to 50. To our surprise we ended up with huge amounts of free memory and processes that were unable to allocate memory. We observe very high values of Committed_AS, which we cannot attribute to the processes on the machine or even to use as file cache. Can you please let us know:

- whether you are aware of a known issue with the accounting of committed memory;

- how we can determine which processes are responsible for the huge memory commits we observe;

- which value for the overcommit_ratio is reasonable for machines with 8GB RAM and 8GB swap.
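
For reference, a rough sketch of how the commit limit works out in strict mode (overcommit_memory set to 2). As the kernel's overcommit accounting is generally documented, the limit is the swap size plus overcommit_ratio percent of RAM. The function name commit_limit_kb is made up for illustration, and the sketch assumes no huge pages are reserved (the kernel excludes those from the RAM term).

```c
#include <assert.h>
#include <limits.h>

/* Approximate CommitLimit in kB for overcommit_memory=2.
 * Hypothetical helper, not a kernel API. Assumes no reserved huge pages. */
unsigned long long commit_limit_kb(unsigned long long ram_kb,
                                   unsigned long long swap_kb,
                                   unsigned int overcommit_ratio)
{
    /* Guard the multiplication below against overflow. */
    assert(overcommit_ratio == 0 || ram_kb <= ULLONG_MAX / overcommit_ratio);
    /* Swap counts in full; RAM counts only at overcommit_ratio percent. */
    return swap_kb + ram_kb * overcommit_ratio / 100;
}
```

With exactly 8 GiB of RAM, 8 GiB of swap and a ratio of 50, commit_limit_kb(8388608, 8388608, 50) comes to 12582912 kB, i.e. about 12 GB (MemTotal on a real machine is somewhat lower than the installed RAM). Once Committed_AS approaches that figure, new allocations are refused regardless of how much memory is actually free, which would be consistent with the failures described here.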

This event sent from IssueTracker by clalance  [Support Engineering Group]
 issue 124630
Comment 2 Issue Tracker 2007-06-28 09:45:43 EDT
It seems that other people have also come across this problem, and even found a
fix for it. Can you have a look at http://bugs.centos.org/view.php?id=1608
and check whether the proposed fix can be applied to the current 4.5 kernel?

Internal Status set to 'Waiting on Support'
Status set to: Waiting on Tech

This event sent from IssueTracker by clalance  [Support Engineering Group]
 issue 124630
Comment 3 Issue Tracker 2007-06-28 09:45:46 EDT
Customer seems to have hit a Committed_AS counter leakage which has been
addressed with the following upstream patch:


It doesn't seem to be included in latest 2.6.9-55.0.2. Escalating to
request inclusion in RHEL4 (RHEL5 already has it).

-- Navid

Issue escalated to Support Engineering Group by: navid.
Internal Status set to 'Waiting on SEG'

This event sent from IssueTracker by clalance  [Support Engineering Group]
 issue 124630
Comment 4 Chris Lalancette 2007-06-28 09:48:27 EDT
Created attachment 158124
Backport of upstream patch to fix Committed_AS leakage

This is a quick backport of the named upstream patch.  It should fix
Committed_AS leakage, although I haven't confirmed that 100% yet.

Chris Lalancette
Comment 7 Matthias Schroder 2007-06-29 10:51:46 EDT
I noticed that you have not included the changes to arch/x86_64/ia32/syscall32.c
in your version of the patch. Is that part not applicable?  
Comment 8 Chris Lalancette 2007-07-02 15:53:26 EDT
Hm, no, that part should still be applicable; I have no idea how it was missed
in the patch I did.  I'll redo the patch with that part back in.

Chris Lalancette
Comment 9 Matthias Schroder 2007-07-04 10:33:51 EDT
I did some more tests using a small Python scriptlet running under a 32-bit
Python (as this is what the users who suffered most from this issue do). There
I saw the following Committed_AS leaks:

2.6.9-55.EL.cernsmp   : 56 kB per run
2.6.9-55.EL.cern.2smp : 40 kB per run

So there are other areas where Committed_AS leaks that are not fixed by the patch...

The amount of leakage did not change whether the scriptlet allocated a little
or a lot of memory.
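
A per-run figure like the ones above can be obtained by snapshotting Committed_AS before and after a batch of runs and dividing the difference by the number of runs. The following is only a sketch of that arithmetic, not the scriptlet used for these tests; the helper names meminfo_kb and leak_per_run_kb are invented for illustration, and the parser assumes the usual "Key:   value kB" layout of /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the kB value of the "key:" line in /proc/meminfo-style text,
 * or -1 if the key is not present. Hypothetical helper. */
long long meminfo_kb(const char *text, const char *key)
{
    assert(text != NULL && key != NULL);
    size_t klen = strlen(key);
    const char *line = text;
    while (line != NULL && *line != '\0') {
        long long value;
        /* Match "key:" at the start of the line, then read the number. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
            sscanf(line + klen + 1, "%lld", &value) == 1)
            return value;
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Average growth of Committed_AS per run between two snapshots (kB). */
long long leak_per_run_kb(long long before_kb, long long after_kb, int runs)
{
    assert(runs > 0);
    return (after_kb - before_kb) / runs;
}
```

In use, the text would be the contents of /proc/meminfo read before and after the batch; other activity on the machine also moves Committed_AS, so the measurement is only meaningful on an otherwise idle system.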

Comment 10 Matthias Schroder 2007-07-05 11:37:35 EDT
Seems I am getting closer to a test case that shows the (or yet another?)
problem nicely: a small C program (talking compiler here, not style) with
recursion. Built on an i386 system, run on x86_64. Committed_AS leakage per run:

2.6.9-42.0.10.ELsmp   : 9768 kB / run
2.6.9-55.EL.cernsmp   : 9840 kB / run
2.6.9-55.EL.cern.2smp : 9768 kB / run

This amount of leakage does get serious...

The code:

#include <stdio.h>
#include <stdlib.h>

double fact(int num);

int main(int argc, char *argv[]){
  int num;
  if ( argc != 2 ){
    printf("Please specify one integer number as argument.\n");
    return (1);
  }
  num = atoi(argv[1]);
  if ( num > -1 )
    printf ("factorial of %d is about %lf\n", num, fact(num));
  else
    printf ("factorial of %d is not defined.\n", num);
  return (0);
}

/* Deliberately deep recursion: every call adds a stack frame,
   so the stack grows with num. */
double fact(int num){
  if ( num == 0 )
    return (1.);
  return ( num * fact( num-1 ) );
}

To build:
cc -o fact -g fact.c

To run:
./fact 250000

To see the effect (csh):

@ outer = 100
@ inner = 100
@ i = 0
@ j = 0
@ starting = `awk '/Committed_AS/ {print $2}' /proc/meminfo`
@ last = $starting
while ( $i < $outer )
    @ i++
    echo "iteration $i"
    grep Committed /proc/meminfo
    while ( $j < $inner )
        @ j++
        ./fact 250000 >& /dev/null
    end
    @ now = `awk '/Committed_AS/ {print $2}' /proc/meminfo`
    @ loss = `expr $now - $last`
    echo "leak since last outer loop: $loss"
    @ mean = `expr $loss / $inner`
    echo "leak per call since last outer loop: $mean"
    @ loss = `expr $now - $starting`
    echo "total leak since start: $loss"
    @ mean = `expr $loss / $inner / $i`
    echo "leak per call since start: $mean"
    @ j = 0
    @ last = $now
end
echo "Done."
grep Committed /proc/meminfo
Comment 11 Issue Tracker 2007-07-06 07:06:56 EDT

Thank you very much for that reproducer. I was able to obtain the same
kind of results using 2.6.9-55.0.2.ELsmp on an x86_64 system.


-- Navid

This event sent from IssueTracker by navid 
 issue 125299
Comment 13 RHEL Product and Program Management 2007-07-26 21:24:35 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 14 Chris Lalancette 2007-07-27 10:26:30 EDT
2 things here:

1)  The original patch I had actually was the right patch; we already had the
bit in syscall32.c from another patch.

2)  This patch was committed to the 4.6 tree.  The maintainer should be updating
this with MODIFIED pretty soon.

Chris Lalancette
Comment 15 Jason Baron 2007-07-27 13:00:24 EDT
committed in stream U6 build 55.24. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 19 errata-xmlrpc 2007-11-15 11:29:36 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

