Red Hat Bugzilla – Bug 163699
[RHEL3] JVM Crash
Last modified: 2007-11-30 17:07:07 EST
Escalated to Bugzilla from IssueTracker
Please report the results you get with the RHEL3 U5 kernel (2.4.21-32.EL)
or later. Thanks.
(In reply to comment #4)
> Please report the results you get with the RHEL3 U5 kernel (2.4.21-32.EL)
> or later. Thanks.
Ernie, I have tested on both the -15 and the 32.0.1 kernels with the same results:
(gdb) info reg
eax 0x0 0
ecx 0x2387 9095
edx 0x6 6
ebx 0x2387 9095
esp 0xbfffd160 0xbfffd160
ebp 0xbfffd170 0xbfffd170
esi 0x2387 9095
edi 0xb75a6a98 -1218811240
eip 0xb7499c0f 0xb7499c0f
eflags 0x206 518
cs 0x23 35
ss 0x2b 43
ds 0xc03f002b -1069613013
es 0x2b 43
fs 0xfff7 65527
gs 0x33 51
Thanks for the feedback, Johnray. I do recall a potentially relevant
data corruptor fix that we released in 2.4.21-32.0.1.EL, but since you
still encountered the failures with that kernel, this must be the result
of some other problem.
Moving to ASSIGNED state for Ingo to work on.
On rel3 I was able to reproduce the suspect value reported by gdb by:
(silly.c and Test.java are attached to issue 74303)
gcc silly.c -o silly
/usr/lib/jvm/java-1.4.2-ibm-18.104.22.168/bin/java Test 1
kill -6 N # N is the PID of the java process started above
gdb -c core.N /usr/lib/jvm/java-1.4.2-ibm-22.214.171.124/bin/java
(gdb) info reg ds
ds 0xc03f002b -1069613013
Worked okay on rel4
My suspicion was that the value for ds was incorrect in the corefile but the
cause of the problem was elsewhere, since ds is a 16-bit register.
However, his gdb session is puzzling (assuming it is as reported):
0xb6f488a3 <copy+19>: mov (%eax),%eax // eax SHOULD BE 1 (ds based load)
0xb6f488a5 <copy+21>: mov %eax,0xffffffc0(%ebp) // NO!!! eax contains 0x33343d6e
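As a side note on the 16-bit point about ds: the arithmetic can be checked with a small sketch. This is not taken from silly.c or Test.java; the helper names selector_of, upper_half and as_signed32 are hypothetical and exist only for illustration. It assumes the corefile stores the ds slot as a 32-bit word, so that only the low 16 bits are a meaningful selector.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Keep the low 16 bits: the architectural width of an x86 segment register. */
static inline uint16_t selector_of(uint32_t raw)
{
    return (uint16_t)(raw & 0xffffu);
}

/* Return the upper 16 bits, which a 16-bit register cannot hold. */
static inline uint16_t upper_half(uint32_t raw)
{
    return (uint16_t)(raw >> 16);
}

/* Reinterpret a 32-bit word as signed, matching the second column
 * that gdb prints in "info reg". Done arithmetically so the result
 * does not rely on implementation-defined conversion. */
static inline int32_t as_signed32(uint32_t raw)
{
    if (raw >= 0x80000000u)
        return (int32_t)((int64_t)raw - 4294967296LL);
    return (int32_t)raw;
}
```

With the reported value, selector_of(0xc03f002b) gives 0x2b, the same selector the dump shows for ss and es, while upper_half(0xc03f002b) gives 0xc03f. as_signed32(0xc03f002b) gives -1069613013, the decimal that gdb printed. So the low half looks like a sane selector and only the upper 16 bits of the stored word are suspect.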
The corefile results that Johnray and I noticed were on a R3U2 system. I just
tried on R3U4 and it worked okay.
Regarding comment #6, which indicates that failures have been reproduced
on the latest released RHEL3 kernel (3 months ago), i.e., 2.4.21-32.0.1.EL,
I'd like to reiterate that Larry fixed a pte_clear() race in that kernel
that was causing very similar symptoms.
Could someone please reconfirm that the 2.4.21-32.0.1.EL kernel still
exhibits the problem reported in this bugzilla?
Thanks in advance. -ernie
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
*** This bug has been marked as a duplicate of 141394 ***