Bug 224600 - running 32-bit executables on x86_64/ia64/s390x causes negative "vm_committed_space" value
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Ernie Petrides
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-01-26 16:30 UTC by Fabio Olive Leite
Modified: 2007-11-17 01:14 UTC

Fixed In Version: RHSA-2007-0436
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-06-11 17:58:31 UTC
Target Upstream Version:
Embargoed:



Links
System: Red Hat Product Errata
ID: RHSA-2007:0436
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages for Red Hat Enterprise Linux 3 Update 9
Last Updated: 2007-06-08 00:03:57 UTC

Description Fabio Olive Leite 2007-01-26 16:30:25 UTC
+++ This bug was initially created as a clone of Bug #218757 +++

Description of problem:
A CentOS user found problems with VM accounting and noted that it appeared to
be fixed upstream but not in CentOS (and, it seems, not in RHEL either).

GIT commit:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2fd4ef85e0db9ed75c98e13953257a967ea55e03

-- Additional comment from burt on 2006-12-26 12:18 EST --
We have seen this VM accounting bug on kernel 2.6.9-42.0.2.ELsmp for x86_64 (but
it does not seem to be fixed in the latest revision either).  In particular,
vm.overcommit_memory was set to 2, and the machines (which run a lot of short i386
binaries at high frequencies) refused to fork any more processes -- malloc() was
failing.  After killing all user processes, Committed_AS was still well over 8
GB, clearly an accounting leak.

Please make this a relatively high priority, as the overcommit features on
x86_64 are essentially broken without it (we have set it to 0 for now).
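
A leak of the kind described above can be spotted by sampling the
"Committed_AS" field of /proc/meminfo while the machine is idle. The following
is a minimal sketch, not a tool referenced in this bug: the function names are
hypothetical, and it assumes the usual "Name:  value kB" line layout.

```python
def parse_meminfo(text):
    """Return {field: integer value} for 'Name:   123 kB' style lines."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        # Accept a leading minus sign, since a bogus counter may go negative.
        if parts and parts[0].lstrip("-").isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def committed_as_gib(text):
    """Committed_AS converted from kB to GiB, or None if the field is absent."""
    kb = parse_meminfo(text).get("Committed_AS")
    return None if kb is None else kb / (1024 * 1024)


def read_committed_as_gib(path="/proc/meminfo"):
    """Convenience wrapper that reads the live file (Linux only)."""
    with open(path) as f:
        return committed_as_gib(f.read())
```

If read_committed_as_gib() still reports several GiB after all user processes
have been killed, that is consistent with the accounting leak reported here.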

-- Additional comment from alan on 2007-01-26 05:46 EST --
Also reported as affecting RHEL3


-- Additional comment from fleite on 2007-01-26 10:55 EST --
This is a fairly serious bug, if we consider that people turn on strict memory
overcommit exactly because of increased server stability in the long run.

Comment 5 Ernie Petrides 2007-01-30 23:49:48 UTC
Note that this problem (i.e., negative "vm_committed_space" value) is only
cosmetic (bogus /proc/meminfo "Committed_AS" value) when the system is booted
with the default setting for /proc/sys/vm/overcommit_memory.  It is only when
the "overcommit_memory" sysctl value has been set to 2 that the kernel will
alter its behavior based on the "vm_committed_space" value.

If a customer is experiencing ENOMEM errors in user-space programs on a system
with "overcommit_memory" set to 2, a work-around would be to use the default
setting instead.
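
To tell whether a given system is in the cosmetic case or the ENOMEM case, the
current "overcommit_memory" value can be read back from /proc. This is only a
sketch: the helper names and the advice strings are hypothetical, and it
treats the value 2 as the only enforcing mode, as stated in this comment.

```python
def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Read the current overcommit_memory sysctl value (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def accounting_is_enforced(mode):
    """Per the comment above, only mode 2 makes the kernel act on the counter."""
    return mode == 2


def advise(mode):
    """Summarize the expected impact of a bogus vm_committed_space value."""
    if accounting_is_enforced(mode):
        return "strict overcommit: a bogus counter can cause ENOMEM"
    return "counter may be bogus, but the effect is cosmetic"
```

For example, advise(read_overcommit_mode()) on an affected machine indicates
whether the work-around of reverting to the default setting applies.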


Comment 7 Ernie Petrides 2007-01-31 00:57:53 UTC
In the case of RHEL3 (unlike RHEL4), running a 32-bit executable on x86_64
results in decrementing the "Committed_AS" value by the initial stack size.

This is because no call to vm_enough_memory() was made in ia32_setup_arg_pages().

Since I'm working on the RHEL4 case, I'll reassign this RHEL3 bug to myself.
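
The imbalance described in this comment can be illustrated with a toy model.
This is not kernel code: the CommitCounter class, the run_exec() helper, and
the 2-page stack size are assumptions chosen only to show how an uncharge
without a matching charge drives the counter down.

```python
class CommitCounter:
    """Toy stand-in for the kernel's vm_committed_space counter (in pages)."""

    def __init__(self):
        self.pages = 0

    def charge(self, pages):
        self.pages += pages

    def uncharge(self, pages):
        self.pages -= pages


def run_exec(counter, stack_pages, charge_stack):
    """Model one exec and exit of a 32-bit binary."""
    if charge_stack:
        # The accounting a vm_enough_memory() call would have performed.
        counter.charge(stack_pages)
    # ... the process runs and exits ...
    # Teardown gives the initial stack back whether or not it was charged.
    counter.uncharge(stack_pages)


buggy = CommitCounter()
fixed = CommitCounter()
for _ in range(100):
    run_exec(buggy, stack_pages=2, charge_stack=False)
    run_exec(fixed, stack_pages=2, charge_stack=True)
```

In this model, 100 unbalanced execs leave the buggy counter 200 pages below
where it started, while the balanced counter returns to zero.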

Comment 9 RHEL Program Management 2007-01-31 22:44:52 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 Ernie Petrides 2007-01-31 23:52:20 UTC
I'm changing the subject line to reflect what the real problem is in RHEL3,
which exists on the x86_64, ia64, and s390x arches (in 32-bit exec support).

Note that the upstream changes indicated in the "GIT commit" link in the
initial comment of this bug report do not address the RHEL3 problem, in which
the kernel's "vm_committed_space" global variable goes backwards until it
wraps to a negative value (which is later interpreted as a very large
unsigned value).

Note that this problem does not exist in RHEL4.  (But RHEL4 has a different
problem in maintaining the "vm_committed_space" global variable.)
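
The effect of reading a negative counter as unsigned can be sketched as
follows. The helper names are hypothetical, and the counter width is left as a
parameter because the actual type of "vm_committed_space" in RHEL3 is not
stated in this report.

```python
def as_unsigned(value, bits=64):
    """Reinterpret a two's-complement signed value as unsigned of the same width."""
    return value & ((1 << bits) - 1)


def exceeds_limit(committed_pages, limit_pages, bits=64):
    """Hypothetical strict-overcommit style check done on the unsigned reading."""
    return as_unsigned(committed_pages, bits) > limit_pages
```

With these helpers, as_unsigned(-200, bits=32) evaluates to 4294967096, so a
counter that has drifted only slightly below zero compares as larger than any
realistic commit limit.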

Comment 11 Ernie Petrides 2007-02-01 00:57:05 UTC
Patch posted for internal review on 31-Jan-2007.

Comment 12 Ernie Petrides 2007-02-08 04:04:55 UTC
A fix for this problem was committed to the RHEL3 U9 patch pool
this evening (in kernel version 2.4.21-47.5.EL).


Comment 13 Jay Turner 2007-02-08 13:45:02 UTC
QE ack for RHEL3.9.

Comment 21 Red Hat Bugzilla 2007-06-11 17:58:31 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2007-0436.html


Comment 22 Issue Tracker 2007-06-14 16:51:02 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Client

This event sent from IssueTracker by yves.begrand 
 issue 111949

