Bug 198775
Description
Len DiMaggio
2006-07-13 14:05:18 UTC
Created attachment 132374 [details]
javacore and heapdump file from one core dump
Added Ryan Campbell (JBoss QE) to cc list.
Additional information - we're seeing this with the (32-bit) IBM 1.4.2 JDK as shipped in RHEL4/U4 - during the stacks project testing. We're seeing the core dumps repeatedly on both 686 and x86_64 servers - both running the 32-bit IBM 1.4.2 JDK. We have not been able to recreate the core dumping by running a single test from the test suite, but the most recent pattern has the JDK core dumping here:
tests-iiop:
[junit] Running org.jboss.test.bankiiop.test.BankStressTestCase
[junit] Tests run: 8, Failures: 0, Errors: 9, Time elapsed: 31.843 sec
[junit] Test org.jboss.test.bankiiop.test.BankStressTestCase FAILED
[junit] Running org.jboss.test.excepiiop.test.ExceptionTimingStressTestCase
[junit] Tests run: 6, Failures: 0, Errors: 7, Time elapsed: 26.615 sec
[junit] Test org.jboss.test.excepiiop.test.ExceptionTimingStressTestCase FAILED
[junit] Running org.jboss.test.helloiiop.test.HelloTimingStressTestCase
[junit] Tests run: 6, Failures: 0, Errors: 7, Time elapsed: 26.577 sec
[junit] Test org.jboss.test.helloiiop.test.HelloTimingStressTestCase FAILED
[junit] Running org.jboss.test.hellojrmpiiop.test.HelloTimingStressTestCase
[junit] Tests run: 7, Failures: 0, Errors: 7, Time elapsed: 45.492 sec
[junit] Test org.jboss.test.hellojrmpiiop.test.HelloTimingStressTestCase FAILED
Core dump files are:
http://torweb.toronto.redhat.com/~dbhole/QE/RH_stack_tests/StacksV1-re20060804.0/javacore.20060810.193426.27564.txt
http://torweb.toronto.redhat.com/~dbhole/QE/RH_stack_tests/StacksV1-re20060804.0/javacore.20060810.193407.16060.txt
Correct unintentional edit
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
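The JUnit summary lines quoted in the comment above can be tabulated mechanically when comparing runs. The following is only a sketch: the function name `parse_junit` and the result keys are made up for illustration, the sample text is two of the entries copied from this report, and it assumes each "Running" line is immediately followed by its "Tests run" line, as in the output shown here.

```python
import re

# Matches a "[junit] Running <class>" entry followed directly by its summary.
# \s+ lets the pattern work whether the log is one flattened line or many lines.
JUNIT_RE = re.compile(
    r"\[junit\] Running (\S+)\s+"
    r"\[junit\] Tests run: (\d+), Failures: (\d+), Errors: (\d+), "
    r"Time elapsed: ([\d.]+) sec"
)

# Sample copied from the output quoted in this bug report.
SAMPLE = (
    "[junit] Running org.jboss.test.bankiiop.test.BankStressTestCase "
    "[junit] Tests run: 8, Failures: 0, Errors: 9, Time elapsed: 31.843 sec "
    "[junit] Test org.jboss.test.bankiiop.test.BankStressTestCase FAILED "
    "[junit] Running org.jboss.test.helloiiop.test.HelloTimingStressTestCase "
    "[junit] Tests run: 6, Failures: 0, Errors: 7, Time elapsed: 26.577 sec "
    "[junit] Test org.jboss.test.helloiiop.test.HelloTimingStressTestCase FAILED"
)


def parse_junit(text):
    """Return one dict per test class found in the Ant/JUnit console output."""
    results = []
    for m in JUNIT_RE.finditer(text):
        results.append({
            "name": m.group(1),
            "run": int(m.group(2)),
            "failures": int(m.group(3)),
            "errors": int(m.group(4)),
            "seconds": float(m.group(5)),
        })
    return results


if __name__ == "__main__":
    for r in parse_junit(SAMPLE):
        status = "FAILED" if r["failures"] or r["errors"] else "ok"
        print(r["name"], r["run"], r["errors"], status)
```

Note that the error count exceeds the test count in every entry quoted here, which may be a sign of the JVM dying mid-run, although the report does not confirm this.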
Additional test results for java-1.4.2-ibm-1.4.2.6-1jpp.2.el4.i386 - build
1.4.2, J2RE 1.4.2 IBM build cxia32142-20060824 (SR6) (JIT enabled: jitc)
Results on 686 architecture include multiple, intermittent, but recurring core
dumps on this test in JBossAS 4.0.5GA test suite:
org.jboss.test.jbossmq.test.NackWithRollbackUnitTestCase
javacore file is attached - attachment #2 [details]
Created attachment 139229 [details]
Core file from 686 architecture
Additional test results for java-1.4.2-ibm-1.4.2.6-1jpp.2.el4.i386 - build
1.4.2, J2RE 1.4.2 IBM build cxia32142-20060824 (SR6) (JIT enabled: jitc)
Results on 686 architecture include multiple, intermittent, but recurring core
dumps on this test in JBossAS 4.0.5GA test suite:
org.jboss.test.jbossmq.test.NackWithRollbackUnitTestCase
Created attachment 139230 [details]
Core file from x86_64 architecture
Additional test results for java-1.4.2-ibm-1.4.2.6-1jpp.2.el4.i386 - build
1.4.2, J2RE 1.4.2 IBM build cxia32142-20060824 (SR6) (JIT enabled: jitc)
Results on x86_64 architecture include multiple, intermittent, but recurring core
dumps on this test in JBossAS 4.0.5GA test suite:
org.jboss.test.jbossmq.test.NackWithRollbackUnitTestCase
Results for tests on x86_64 architecture with 64-bit JVM:
IBM J9SE VM (build 2.2, J2RE 1.4.2 IBM J9 2.2 Linux amd64-64 j9xa64142-20060824 (JIT enabled)
J9VM - 20060802_1551_LHdSMr
JIT - r7_level20060707_1808ifx1)
Still seeing multiple core dumps - see the original note in this bz - sample files:
https://ldimaggi.108.redhat.com/files/documents/142/149/heapdump.20061024.115217.11308.phd
https://ldimaggi.108.redhat.com/files/documents/142/150/javacore.20061024.115216.11308.txt
This one won't be fixed by RHEL-4.5. Requesting for RHEL-4.6.
Created attachment 145614 [details]
Additional core dump - from running JBossAS 4.0.5 test suite (1 of 2)
Created attachment 145618 [details]
Additional core dump - from running JBossAS 4.0.5 test suite (2 of 2)
Still seeing core dumps with:
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM build cxia32142-20061124 (SR7) (JIT enabled: jitc))
java-1.4.2-ibm-1.4.2.7-1jpp.4.el4.i386.rpm
java-1.4.2-ibm-devel-1.4.2.7-1jpp.4.el4.i386.rpm
core file: https://ldimaggi.108.redhat.com/files/documents/142/263/core.24680.gz
javacore file: https://ldimaggi.108.redhat.com/files/documents/142/264/javacore.20070228.150354.24680.txt
This bugzilla had previously been approved for engineering consideration but Red Hat Product Management is currently reevaluating this issue for inclusion in RHEL4.6.
adding Stephanie Glass to the CC list.
----- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-04-25 16:28 EDT -------
I received the following update from the JTC. It was recommended that I open a separate problem report for the out-of-memory issue and they requested all the binaries to analyze the core dump.
"I see that you have reported two problems in the same problem report -
1. Problem with J9 build, this seems like a java heap out of memory problem as you have provided javacores and heapdumps.
Still seeing multiple core dumps - - sample files:
https://ldimaggi.108.redhat.com/files/documents/142/149/heapdump.20061024.115217.11308.phd
https://ldimaggi.108.redhat.com/files/documents/142/150/javacore.20061024.115216.11308.txt
2. Problem with classic 142 build - provided core dumps show that this is a crash issue.
I believe that these are two different issues and separate pmr should be opened for OM issue with J9 VM. Also for the crash issue, please provide us all the shared libraries used by the process."
Len, can you provide IBM with the shared libraries they're requesting?
----- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-05-02 22:10 EDT -------
Here's another comment from the Java folks (with a suggestion) regarding one of the latter javacore files:
"I found in the javacore file provided by you that Xmx size is "-Xmx128m", which is very less, so my first suggestion is to increase it to 750m. If we still see problem with that, then we will pursue analyzing memory leaks (if any)."
Response to comments 21 and 22 - yes - I'll try to recreate the problem with the new settings and will provide the shared libraries.
Reran the test with JBoss 4.0.5GA and this JVM:
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM build cxia32142-20070321 (SR8) (JIT enabled: jitc))
and -Xmx750m
Here is the resulting javacore file:
https://ldimaggi.108.redhat.com/files/documents/142/393/javacore.20070508.104025.4473.txt
And the shared libs:
https://ldimaggi.108.redhat.com/files/documents/142/394/sharedlibs.tar.gz
Please let me know if you'd like to see a 2nd issue opened to track a memory leak - thanks!
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
------- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-05-09 22:37 EDT -------
(In reply to comment #7)
> Please let me know if you'd like to see a 2nd issue opened to track a memory leak
Yes, let's work the Java 5 problem separately please. Let me know what the issue # is and I will reverse mirror it over to our bugzilla. Thanks.
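For anyone re-checking the heap setting discussed in the comments above (the -Xmx128m found in the javacore versus the suggested 750m), the option can be pulled out of any text that contains the JVM command line. This is a rough sketch, not part of the IBM tooling: `max_heap_bytes` is a made-up helper, the example command lines are hypothetical, and it only looks at the first -Xmx occurrence it finds.

```python
import re

# -Xmx<size>[k|m|g]; a bare number is taken as bytes.
XMX_RE = re.compile(r"-Xmx(\d+)([kKmMgG]?)")
UNIT = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def max_heap_bytes(text):
    """Return the first -Xmx value in text as a byte count, or None if absent."""
    m = XMX_RE.search(text)
    if m is None:
        return None
    return int(m.group(1)) * UNIT[m.group(2).lower()]


if __name__ == "__main__":
    # Hypothetical command lines; only the -Xmx values come from this report.
    before = max_heap_bytes("java -Xmx128m -jar run.jar")
    after = max_heap_bytes("java -Xmx750m -jar run.jar")
    print(before, after)
    print("heap grows by a factor of about %.1f" % (after / before))
```

The same function can be pointed at the contents of a javacore file, on the assumption that the file records the command-line options as plain text.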
The 1.5 JVM bugzilla is here:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=203682
------- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-05-14 12:20 EDT -------
(In reply to comment #10)
> ----- Additional Comments From ldimaggi 2007-05-10 10:08 EST -------
> The 1.5 JVM bugzilla is here:
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=203682
I get access denied trying to bring up that bug. Can you place bugproxy.com on the cc list so I can see it please?
----- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-05-20 22:14 EDT -------
I had an update Friday from Java L3:
I looked at the core file sent by you. Gdb was not able to stack trace for the crash. I used lcore and found that the crash happened in the compiled code, while executing the following method. I see that the core was generated with 142 SR7. As a couple of compiled code crash problems are fixed in 142 SR8, will it be possible to test with 142 SR8?
SIGSEGV raised in CompiledCode
systhread=09C23378 execmode=JIT jitlastf=F2A18B40 stackend=F2A1B4C8 jvmlastf=09FBFC30
mb=00000000 sp=F2A1A858 lastframe=F2A1B4C8
trace_back_stack (eip=F18CDA61 esp=F2A18A04 ebp=1120FED0) F2A18A04-F2A1B4C8
(cc)=F18CDA61 (mb)=08338C48 at java/util/Collections$SynchronizedMap.clear ()V ==>Failing method
unwinding (old_esp=F2A18A04 old_eip=F18CDA61 old_ebp=1120FED0)
*** check *** unknown pattern (6C 85 8C) at F18CDA68
unwinding (new_esp=F2A18A24 new_eip=1120FDD8 new_ebp=00000000) from prolog
*** check *** (ip)=1120FDD8 at data range
unwinding (old_esp=F2A18A24 old_eip=1120FDD8 old_ebp=00000000)
*** check *** unknown pattern (EC 12 08) at 1120FDD9
Action Plan: Please test with 142 SR8 and let me know the results. If the problem is recreated with SR8 please provide the must gather docs for crash -
1. Core
2. Shared libs
3. core.sdff (generated by running <sdk_home>/jre/bin/jextract <core file> on the machine where the failure happened)
SR8 is available from http://www-128.ibm.com/developerworks/java/jdk/linux/download.html
----- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-06-12 16:51 EDT -------
Len, any update regarding using SR8?
Sorry about that - the SR8 core file is here:
https://ldimaggi.108.redhat.com/files/documents/142/393/javacore.20070508.104025.4473.txt
J2RE 1.4.2 IBM build cxia32142-20070321 (SR8)
----- Additional Comments From chavez.com (prefers email at lnx1138.com) 2007-07-20 10:28 EDT -------
Requested an update from Java support on the SR8 log analysis.
------- Comment From chavez.com 2007-08-09 15:48 EDT-------
Still no update from Java support on the SR8 logs. Requested an update again.
Bridge attempted to add attachment ibm_core.tar.bz2 size = 10967 KB to https://bugzilla.linux.ibm.com/, but the limit is 2048 KB
This request was previously evaluated by Red Hat Product Management for inclusion in the current Red Hat Enterprise Linux release, but Red Hat was unable to resolve it in time. This request will be reviewed for a future Red Hat Enterprise Linux release.
Created attachment 197791 [details]
Additional core dump - from running JBossAS 4.0.5 test suite (2 of 2)
Created attachment 197801 [details]
Additional core dump - from running JBossAS 4.0.5 test suite (1 of 2)
Created attachment 197811 [details]
Core file from x86_64 architecture
Created attachment 197821 [details]
Core file from 686 architecture
------- Comment From chavez.com 2007-10-16 14:28 EDT-------
I had a response this week from Java support. In summary, they are indicating that the SR8 crash is different from the crashes seen on prior service releases. They also want to know how they can get the JBoss testsuite in order to see the problem first hand, and they would like a contact at Red Hat they can communicate with more directly.
"Apologies for the delay in feedback on this issue. While analysing the latest set of footprints, I do not seem to be getting a proper stack of crash occurrence. The instruction pointer address ::
NULL
3HPNATIVESTACK Native Stack of "RMI TCP Connection(311)-127.0.0.1" PID 12315
NULL -------------------------
3HPSTACKLINE 3EE00000
does not lie in the range of the jit library nor the jvm library:
2HPMEMMAPLINE b4395000-b45ea000 r-xp 00000000 fd:00 166124 /usr/lib/jvm/java-1.4.2-ibm-1.4.2.8/jre/bin/libjitc.so
2HPMEMMAPLINE b7dce000-b7fca000 r-xp 00000000 fd:00 166083 /usr/lib/jvm/java-1.4.2-ibm-1.4.2.8/jre/bin/classic/libjvm.so
nor does it look like compiled code of a particular method. The Java stack for this thread is unavailable. This particular thread is also holding the following locks:
Heap lock (0x08083678)
Monitor Cache lock (0x080835B8)
Thread queue lock (0x08076CD0)
The only time we grab these locks in sequence is before suspension in Garbage Collection. Since the native stack of the thread is largely unavailable, we are unable to clarify if Garbage Collection is actually on. However, in summary, this crash is not the same crash that was seen with pre-SR8 (SR7) build. The crash in compiled code looks to have been resolved with this level. The current footprints are not conclusive in diagnosing this further. It would be feasible to have another run with JVM signal handling turned off and a core file produced through that to have a complete look at the stack."
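The range check that Java support describes above (is the faulting address inside libjitc.so or libjvm.so?) is easy to repeat against any javacore. The sketch below is illustrative only: `parse_memmap` and `find_mapping` are made-up names, the two 2HPMEMMAPLINE entries and the 3EE00000 address are copied from the quoted analysis, and the end of each range is treated as exclusive on the assumption that these lines follow the /proc/<pid>/maps layout.

```python
import re

# The two memory map entries quoted in the comment above.
MEMMAP_LINES = [
    "2HPMEMMAPLINE b4395000-b45ea000 r-xp 00000000 fd:00 166124 "
    "/usr/lib/jvm/java-1.4.2-ibm-1.4.2.8/jre/bin/libjitc.so",
    "2HPMEMMAPLINE b7dce000-b7fca000 r-xp 00000000 fd:00 166083 "
    "/usr/lib/jvm/java-1.4.2-ibm-1.4.2.8/jre/bin/classic/libjvm.so",
]

# Fields: start-end, permissions, offset, device, inode, path.
MEMMAP_RE = re.compile(
    r"^2HPMEMMAPLINE\s+([0-9a-fA-F]+)-([0-9a-fA-F]+)"
    r"\s+\S+\s+\S+\s+\S+\s+\d+\s+(\S+)$"
)


def parse_memmap(lines):
    """Turn 2HPMEMMAPLINE entries into (start, end, path) tuples."""
    mappings = []
    for line in lines:
        m = MEMMAP_RE.match(line.strip())
        if m:
            mappings.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return mappings


def find_mapping(address, mappings):
    """Return the path of the mapping containing address, or None."""
    for start, end, path in mappings:
        if start <= address < end:  # end treated as exclusive (assumption)
            return path
    return None


if __name__ == "__main__":
    mappings = parse_memmap(MEMMAP_LINES)
    # Address from the 3HPSTACKLINE entry in the quoted javacore excerpt.
    print(find_mapping(0x3EE00000, mappings))
```

With only the two quoted mappings loaded, 3EE00000 falls in neither range, which is consistent with the conclusion in the quoted analysis. A full javacore lists many more mappings, so a real check should feed in every 2HPMEMMAPLINE entry from the file.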
------- Comment From chavez.com 2007-11-06 17:10 EDT-------
Hi Thomas, Any idea if you can provide the previously requested information? Thanks.
------- Comment From chavez.com 2007-11-12 15:29 EDT-------
Novell, Any update is appreciated.
------- Comment From chavez.com 2007-11-12 15:29 EDT-------
So sorry, I meant Red Hat.
Sorry about the slow response - the JBossAS source code and test suite are available for download here:
http://labs.jboss.com/jbossas/downloads/
Any news from IBM, now that the JBoss test suite has been provided?
------- Comment From chavez.com 2008-03-05 10:52 EDT-------
Tom, I apologize for such a late reply. I am not purposely ignoring you. The last thing I was able to do was download the JBoss code but higher priority bugs kept me constantly away from this bug. I will make an attempt to get back to this soon or re-assign to someone with less of a workload to make some progress.
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Tom, I have kept this bug on the backburner for far too long and with other releases and work going on, I don't think I will be able to get to it. I have no idea if this problem is still present in the latest release of RHEL 4 and how important it would be to get it fixed for RHEL 4.8. Please let me know if the priority or severity needs to be bumped or whether we can close this out. Again my apologies for holding onto this one for way too long without working it.
Len - can you verify that this bug still exists?
Hi Len, Can you let me know if this is still a problem in RHEL 4.8 you want to pursue? Thanks.
Sorry for the delay in responding - yes - this one can be closed.
Thanks Len for the quick reply. Closing...