Bug 859886

Summary: Process.exec memory bug and fixes
Product: Red Hat Enterprise Linux 5 Reporter: Levente Farkas <lfarkas>
Component: java-1.6.0-openjdk Assignee: Andrew John Hughes <ahughes>
Status: CLOSED WONTFIX QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: high Docs Contact:
Priority: high    
Version: 5.8 CC: dbhole
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: All   
URL: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7034935
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-03-21 21:05:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
the patch we use none

Description Levente Farkas 2012-09-24 09:58:45 UTC
a detailed description of the problem is available at this link:
http://stackoverflow.com/questions/1124771/how-to-solve-java-io-ioexception-error-12-cannot-allocate-memory-calling-run

the relevant part is:
----------------------------
This is solved in Java version 1.6.0_23 and upwards.

See more details at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7034935
----------------------------
and it seems that while java -version on rhel-6 gives:
java version "1.6.0_24"
on rhel-5 it's only:
java version "1.6.0_22" :-(((

it's a very serious bug on all machines with low memory where java programs frequently run external processes (shell scripts).

please fix it on 5.x as it's fixed on 6.x.

Comment 1 Levente Farkas 2012-09-24 15:27:00 UTC
anyway, here's a very good and detailed description of the problem from openjdk-1.7's source code (UNIXProcess_md.c), and to me it seems it's not even solved on rhel-6 :-(

/*
 * There are 3 possible strategies we might use to "fork":
 *
 * - fork(2).  Very portable and reliable but subject to
 *   failure due to overcommit (see the documentation on
 *   /proc/sys/vm/overcommit_memory in Linux proc(5)).
 *   This is the ancient problem of spurious failure whenever a large
 *   process starts a small subprocess.
 *
 * - vfork().  Using this is scary because all relevant man pages
 *   contain dire warnings, e.g. Linux vfork(2).  But at least it's
 *   documented in the glibc docs and is standardized by XPG4.
 *   http://www.opengroup.org/onlinepubs/000095399/functions/vfork.html
 *   On Linux, one might think that vfork() would be implemented using
 *   the clone system call with flag CLONE_VFORK, but in fact vfork is
 *   a separate system call (which is a good sign, suggesting that
 *   vfork will continue to be supported at least on Linux).
 *   Another good sign is that glibc implements posix_spawn using
 *   vfork whenever possible.  Note that we cannot use posix_spawn
 *   ourselves because there's no reliable way to close all inherited
 *   file descriptors.
 *
 * - clone() with flags CLONE_VM but not CLONE_THREAD.  clone() is
 *   Linux-specific, but this ought to work - at least the glibc
 *   sources contain code to handle different combinations of CLONE_VM
 *   and CLONE_THREAD.  However, when this was implemented, it
 *   appeared to fail on 32-bit i386 (but not 64-bit x86_64) Linux with
 *   the simple program
 *     Runtime.getRuntime().exec("/bin/true").waitFor();
 *   with:
 *     #  Internal Error (os_linux_x86.cpp:683), pid=19940, tid=2934639536
 *     #  Error: pthread_getattr_np failed with errno = 3 (ESRCH)
 *   We believe this is a glibc bug, reported here:
 *     http://sources.redhat.com/bugzilla/show_bug.cgi?id=10311
 *   but the glibc maintainers closed it as WONTFIX.
 *
 * Based on the above analysis, we are currently using vfork() on
 * Linux and fork() on other Unix systems, but the code to use clone()
 * remains.
 */
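
For readers who want to check the overcommit setting that the comment above refers to, here is a minimal sketch. It is not part of the OpenJDK sources; the class name OvercommitCheck and its methods are hypothetical. It assumes a Linux system for the /proc file and for /bin/true, and falls back to -1 when the file cannot be read. The three mode descriptions follow proc(5).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not taken from the JDK.
class OvercommitCheck {
    // Meaning of the three documented values of vm.overcommit_memory (proc(5)).
    static String describe(int mode) {
        switch (mode) {
            case 0: return "heuristic overcommit (kernel default)";
            case 1: return "always overcommit";
            case 2: return "strict accounting, no overcommit";
            default: return "unknown mode " + mode;
        }
    }

    // Returns the current mode, or -1 if the file is missing or unreadable
    // (for example on a non-Linux system).
    static int readMode() {
        Path p = Paths.get("/proc/sys/vm/overcommit_memory");
        try {
            String text = new String(Files.readAllBytes(p), StandardCharsets.UTF_8).trim();
            return Integer.parseInt(text);
        } catch (IOException | NumberFormatException | SecurityException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws Exception {
        int mode = readMode();
        System.out.println("overcommit_memory = " + mode + " (" + describe(mode) + ")");
        // The one-liner quoted in the source comment above.
        int status = Runtime.getRuntime().exec("/bin/true").waitFor();
        System.out.println("/bin/true exited with " + status);
    }
}
```

With fork(), a failure is most likely when the JVM is large relative to free memory plus swap; the sketch only reports the setting and does not try to predict a failure.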

Comment 2 Deepak Bhole 2012-09-24 15:34:18 UTC
Re-assigning to Andrew. Andrew, can you please take a look?

Comment 3 RHEL Program Management 2012-10-09 19:09:21 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 4 Andrew John Hughes 2012-10-24 14:34:09 UTC
There is no correspondence between the update number (uXX) of the proprietary Oracle JDK 6 and the build number (bYY) of OpenJDK6: https://dbhole.wordpress.com/2011/05/27/why-do-xx-and-yy-in-jdk6-uxx-and-openjdk-byy-differ/

7034935 seems to be an umbrella bug for a number of fixes we can look at backporting:

6850720: Use clone(CLONE_VM), not fork, on Linux to avoid swap exhaustion
6853336: (process) disable or remove clone-exec feature (6850720)

with the latter seemingly reverting the actual use of the former.

Are you actually seeing an issue due to this?  I can't reproduce the issue with the example given in the link, which seems terribly dated (it refers to Fedora 10 and IcedTea6 1.5; we are now on Fedora 18 and IcedTea6 1.11.5).

$ /usr/lib/jvm/icedtea-6/bin/java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.5) (Gentoo build 1.6.0_24-b24)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
$ /usr/lib/jvm/icedtea-6/bin/java Prova
$

Comment 5 Levente Farkas 2012-10-26 12:45:13 UTC
we see this on 32-bit rhel5 and rhel6 systems with 512 MB of RAM where our java process uses a lot of memory (~300-400 MB). in this case the system starts to swap, which causes huge load etc. the same happens even when we run very small shell scripts or commands from the java process (e.g. ls, ip addr, ifconfig).
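
A rough sketch of the scenario described in this comment, for anyone trying to reproduce it: hold a few hundred MB of heap, then start a small external command many times. Everything here is an assumption for illustration. The class and method names are hypothetical, /bin/true stands in for the small scripts, and the "error=12" check is based on the exception text quoted in the linked Stack Overflow title. Whether it fails depends on the machine's RAM, swap and overcommit settings.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reproduction helper, not the reporter's actual program.
class ExecUnderMemoryPressure {
    // True when the exception text carries errno 12 (ENOMEM).
    static boolean looksLikeEnomem(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("error=12");
    }

    // Allocates roughly `megabytes` MB and writes to every 4 KB step so the
    // memory is actually backed, not just reserved.
    static List<byte[]> allocateBallast(int megabytes) {
        List<byte[]> ballast = new ArrayList<byte[]>();
        for (int i = 0; i < megabytes; i++) {
            byte[] chunk = new byte[1024 * 1024];
            for (int j = 0; j < chunk.length; j += 4096) {
                chunk[j] = 1;
            }
            ballast.add(chunk);
        }
        return ballast;
    }

    public static void main(String[] args) throws InterruptedException {
        int megabytes = args.length > 0 ? Integer.parseInt(args[0]) : 300;
        List<byte[]> ballast = allocateBallast(megabytes);
        int enomem = 0;
        int otherFailures = 0;
        for (int i = 0; i < 100; i++) {
            try {
                Process p = Runtime.getRuntime().exec("/bin/true");
                p.waitFor();
                p.destroy(); // release the process's streams
            } catch (IOException e) {
                if (looksLikeEnomem(e)) {
                    enomem++;
                } else {
                    otherFailures++;
                }
            }
        }
        // Referencing ballast here keeps it reachable during the loop.
        System.out.println("held " + ballast.size() + " MB, ENOMEM failures: "
                + enomem + ", other failures: " + otherFailures);
    }
}
```

The heap limit must be large enough for the ballast, e.g. java -Xmx400m ExecUnderMemoryPressure 300 on a 512 MB machine.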

Comment 6 RHEL Program Management 2012-10-30 06:10:45 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 7 Andrew John Hughes 2012-10-30 13:18:16 UTC
Ok, reading this again, it just sounds like RHEL 5 needs upgrading to IcedTea6 1.11, which I think may be in progress anyway.  Deepak?

Comment 8 Deepak Bhole 2012-10-30 16:54:40 UTC
Yep, RHEL 5.9 will have IcedTea6 1.11.5

Comment 9 Levente Farkas 2012-10-30 19:00:44 UTC
i still don't see that it's fixed in 1.11.5

Comment 10 Andrew John Hughes 2012-10-31 10:36:33 UTC
You said "please fix it on 5.x as it's fixed on 6.x.".

Comment 11 Levente Farkas 2012-10-31 11:19:52 UTC
after looking at the source code i see it's only fixed in java7 on rhel6. my first sentence came from a comment i found somewhere, but that doesn't seem to be the case.

Comment 12 Andrew John Hughes 2012-10-31 16:40:40 UTC
Ok, reopening.

Online comments may be misleading as there is a proprietary JDK6 from Oracle and there is OpenJDK6 and the two are largely unrelated.  OpenJDK6 was created from an early OpenJDK7 snapshot while the proprietary JDK6 is maintained privately in a completely different version control system by Oracle.  

Oracle contribute little more to OpenJDK6 than reviewing/packaging so when they say they are backporting something "to 6", it's only to the proprietary version, even though they could add it to OpenJDK6 much more easily than us having to track it down and apply it later.

I'll have to try and track down what changes we want from 7 and backport them.

Comment 13 Levente Farkas 2012-11-07 14:32:41 UTC
a simple way to check this: does the current UNIXProcess_md.c use fork, vfork or clone?
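
One possible way to automate that check is sketched below. It strips C comments (the big comment in UNIXProcess_md.c mentions all three calls) and then lists which of fork, vfork and clone are actually called. The class name ForkStrategyScan is hypothetical, and this is only a text scan: it ignores #if/#ifdef blocks and string literals, so it shows which calls appear in the file, not which one is compiled in.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a plain text scan, not a C parser.
class ForkStrategyScan {
    // Block comments and line comments.
    private static final Pattern COMMENTS =
            Pattern.compile("/\\*.*?\\*/|//[^\\n]*", Pattern.DOTALL);
    // \b keeps "vfork(" from also being counted as "fork(".
    private static final Pattern CALLS =
            Pattern.compile("\\b(fork|vfork|clone)\\s*\\(");

    // Returns the names of the process-creation calls found outside comments.
    static Set<String> scan(String cSource) {
        String code = COMMENTS.matcher(cSource).replaceAll(" ");
        Set<String> found = new TreeSet<String>();
        Matcher m = CALLS.matcher(code);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ForkStrategyScan path/to/UNIXProcess_md.c");
            return;
        }
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("calls found: " + scan(source));
    }
}
```

If more than one call is reported, the surrounding preprocessor conditions still have to be read by hand to see which path is used on Linux.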

Comment 14 Andrew John Hughes 2012-11-07 19:20:12 UTC
Yeah I know how to tell, it's a case of finding the right backports to add the changes.  I'm also confused by the fact that they seem to add clone() support and then disable it.  If it's disabled, I'm not sure what the point is in backporting it.  Is it definitely enabled in 7?

Comment 15 Levente Farkas 2012-11-08 10:07:37 UTC
see comment 1: the clone failure is a glibc bug, but as usual ulrich has his "normal" attitude to the problem (anyway clone would be the best solution), so vfork is the second best. but as far as i can see in the source, the current rhel jdk uses the original form in 1.6 and vfork only in 1.7.

Comment 16 Levente Farkas 2012-11-08 10:11:37 UTC
Created attachment 640715 [details]
the patch we use

ok, actually i simply replace the files in this patch with the jdk7 files during the build, so this is not a real patch.

Comment 17 Deepak Bhole 2013-03-21 21:05:26 UTC
This change is too big to put into RHEL-5 at this point (given that it is in Phase 2).

Please re-open this against RHEL-6 if it is an issue there as well. Both RHEL-5 and RHEL-6 are now at 1.11.9.