128729 – Large java processes core dump on hugemem kernel

Keywords:
Status:	CLOSED DUPLICATE of bug 123253
Alias:	None
Product:	Red Hat Enterprise Linux 3
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	3.0
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Dave Anderson
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2004-07-28 19:04 UTC by Richard Homolka
Modified:	2007-11-30 22:07 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2004-11-08 19:25:00 UTC
Target Upstream Version:
Embargoed:


Description Richard Homolka 2004-07-28 19:04:35 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.1)
Gecko/20040707

Description of problem:
We have large WebLogic 8.1sp2 and 8.1sp3 processes that occasionally
but regularly crash and core dump on the hugemem kernel.  This happens
under Sun JVMs 1.4.2_04 and 1.4.2_05, and under jrockit 8.2.  BEA tech
support found a similar case and referred us to the "[PATCH] bogus
sigaltstack calls by rt_sigreturn" patch recently accepted into 2.6.7.
The customer in that case applied the patch (modified slightly because
the code in the 2.4 AS 3.0 kernel differs from stock 2.6), and the
problems seem to have gone away (no further tech support requests
to BEA).
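
For readers unfamiliar with the code path named in that patch title,
here is a minimal userspace sketch, assuming Linux with glibc.  It is
not a reproducer for this bug and is not taken from the patch; the
names run_demo, on_usr1 and alt_stack are hypothetical.  It registers
an alternate signal stack with sigaltstack(), runs an SA_SIGINFO
handler on it, and returns from the handler, which on Linux goes
through rt_sigreturn.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only. */
static char alt_stack[65536];              /* memory for the alternate stack */
static volatile sig_atomic_t handled = 0;
static volatile sig_atomic_t ran_on_alt = 0;

/* SA_SIGINFO handler: records whether it is executing on alt_stack. */
static void on_usr1(int sig, siginfo_t *info, void *ctx)
{
    char probe;                            /* lives on the current stack */
    uintptr_t p = (uintptr_t)&probe;
    uintptr_t lo = (uintptr_t)alt_stack;

    (void)sig; (void)info; (void)ctx;
    if (p >= lo && p < lo + sizeof alt_stack)
        ran_on_alt = 1;
    handled = 1;
    /* Returning from here enters the kernel's signal-return path. */
}

/* Returns 0 if the handler ran on the alternate stack and the
 * alternate stack is still registered afterwards, 1 if not,
 * and -1 if setup failed. */
int run_demo(void)
{
    stack_t ss, after;
    struct sigaction sa;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK; /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    if (raise(SIGUSR1) != 0)               /* handler runs before raise() returns */
        return -1;

    /* Query the alternate stack settings after the handler has returned. */
    if (sigaltstack(NULL, &after) != 0)
        return -1;

    return (handled && ran_on_alt &&
            after.ss_sp == (void *)alt_stack &&
            !(after.ss_flags & SS_DISABLE)) ? 0 : 1;
}
```

Calling run_demo() from main() and checking for a 0 return shows the
normal behaviour on a healthy kernel.  Whether a JVM actually exercises
this path in the crashes reported here is not established by this
report.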

Version-Release number of selected component (if applicable):
kernel-2.4.21-15.EL

How reproducible:
Sometimes

Steps to Reproduce:
1. Run large Java apps on the hugemem kernel with large amounts of RAM
(we have 12 GB in affected systems)
2. Wait

Actual Results:  After a while, the apps core dump with large call
stacks.

Expected Results:  The apps should run without issue.

Additional info:

Happens on both 12 GB and 36 GB machines.

Comment 1 Kevin Stussman 2004-08-02 19:41:23 UTC

I'd like to add myself to this bug. Our WebLogic server crashes "every so often" for no
apparent reason, leaving a large core file.

OS : 2.4.21-15.ELhugemem
MemTotal:      5931956 kB
GDB : 
Core was generated by `/usr/local/bea/jdk141_05/bin/java'.
Program terminated with signal 11, Segmentation fault.

Comment 2 Kevin Stussman 2004-08-02 21:02:07 UTC

FYI, here is the patch for those interested.

http://linux.bkbits.net:8080/linux-2.6/gnupatch@40b035f1haADUZ5Ujxb0PPoxPYHX_g

We will try to move off of the hugemem kernel before trying this patch. (fortunately that 
is an option for us right now)

Comment 3 Kevin Stussman 2004-08-06 16:14:54 UTC

After reading the patch and this bug, I felt pretty sure that our problem was related to
this hugemem kernel problem, but after switching to the smp kernel the problem is still
happening. The only noticeable differences are that the crash now produces this
message:

# HotSpot Virtual Machine Error, Internal Error
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Java VM: Java HotSpot(TM) Client VM (1.4.1_05-b01 mixed mode)
#
# Error ID: 43113F32554E54494D45110E4350500305
#
# Problematic Thread: prio=1 tid=0x0x83f6028 nid=0x7a0 runnable
#

and the crashes now happen at a regular time (around midnight) instead of at random 
times.

Comment 4 Richard Homolka 2004-11-08 18:51:19 UTC

This seems resolved.  The AS hugemem kernel has a fix as of kernel
update 18 (the beta for U3).  The U3 kernel has solved our problems.

Comment 5 Ernie Petrides 2004-11-08 19:25:00 UTC

Closing per above comment (seems to be dup of bug 123253).

*** This bug has been marked as a duplicate of 123253 ***
