Bug 138306

Summary: Heap corruption occurs during call to JNI_CreateJavaVM
Product: Red Hat Enterprise Linux 3
Component: glibc
Version: 3.0
Hardware: i686
OS: Linux
Status: CLOSED NOTABUG
Severity: high
Priority: medium
Reporter: Matthew Gregan [:kinetik] <kinetik>
Assignee: Jakub Jelinek <jakub>
QA Contact: Brian Brock <bbrock>
CC: shillman
Doc Type: Bug Fix
Last Closed: 2004-11-17 16:58:16 EST
Attachments: Testcase (flags: none)

Description Matthew Gregan [:kinetik] 2004-11-07 19:58:26 EST
Description of problem:
This bug report concerns an interaction between some part of RHEL 3AS
and the Sun Java runtime 1.4.2_06 (and at least some earlier 1.4
versions).  I'm not entirely sure that the problem is within glibc, but
the investigation I have done so far suggests that there is a good
chance it is.

I have been observing random crashes, either during JVM startup or
after an application has been running within the JVM for some time.
After reviewing my code, I determined that heap corruption appears to
occur when the Java VM is created via the JNI_CreateJavaVM invocation
API.  As far as I can tell, I am using this API correctly and as
documented.

When the attached testcase is executed multiple times, errant
behaviour occurs fairly regularly.  This includes Hotspot JVM crashes
and segmentation faults before the JVM is fully initialized.  Enabling
strict malloc() checking, or using Electric Fence, makes the crashes
100% reproducible.

Note that this problem does not occur on SuSE Linux Enterprise 9.0
with the same JDK (installed from the same set of binaries), so the
problem may be: a) specific to RHEL 3AS, b) a bug in RHEL 3AS, c) an
incompatible interaction between the JDK and RHEL 3AS, or d) a
legitimate bug in the JDK that does not manifest on some systems.

I have reproduced this problem with an earlier version of the 1.4 JDK
that was installed on the RHEL 3AS machines.  I also have a single
RHEL 3AS machine (an x86_64, so it is quite a different machine) with
an older glibc (glibc-2.3.2-95.20) installed, but otherwise the same
software versions, and this problem does NOT occur on that machine.
This is what leads me to believe that the problem may be within glibc.
I've also filed a bug at the Sun Developer Connection site.


Version-Release number of selected component (if applicable):
glibc-2.3.2-95.27
j2sdk1.4.2_06 (Sun)


How reproducible:
100% if strict malloc() checking is enabled


Steps to Reproduce:
1.  Compile attached testcase as follows:

gcc -pthread -D_REENTRANT -g3 -ggdb -W -Wall -o jnitest \
    -I/usr/java/j2sdk1.4.2_06/include \
    -I/usr/java/j2sdk1.4.2_06/include/linux \
    jnitest.c \
    -L/usr/java/j2sdk1.4.2_06/jre/lib/i386/client -ljvm

2.  Set up the execution environment:

export LD_LIBRARY_PATH=/usr/java/j2sdk1.4.2_06/jre/lib/i386/client:/usr/java/j2sdk1.4.2_06/jre/lib/i386

export MALLOC_CHECK_=3

(or, instead of using malloc() checking, do)

export LD_PRELOAD=libefence.so

3.  Execute the testcase and observe the errant behaviour.
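
Because the failure is intermittent when neither MALLOC_CHECK_ nor Electric Fence is set, step 3 can be repeated in a loop to get a rough failure rate. The following is only a minimal sketch and is not part of the attached testcase: the helper name count_failures and the run count of 20 are hypothetical, and it assumes that jnitest exits with a non-zero status (or is killed by a signal) whenever JVM creation fails.

```shell
#!/bin/sh
# Hypothetical helper (not from the attached testcase): run a command
# N times and print how many of those runs failed.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # A non-zero status covers ordinary error exits as well as
        # death by signal (e.g. SIGSEGV or SIGABRT).
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Only run the testcase if it has been built in the current directory.
if [ -x ./jnitest ]; then
    echo "failed runs: $(count_failures 20 ./jnitest) of 20"
fi
```

Run it from the directory containing jnitest, after setting LD_LIBRARY_PATH as in step 2. Comparing the count with and without MALLOC_CHECK_=3 exported gives a quick view of how much the strict checking changes the failure rate.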


Actual results:
Without strict malloc() checking, JVM creation fails intermittently,
or the program executing in the JVM crashes at a later time.  With
strict malloc() checking or Electric Fence enabled, JVM creation fails
100% of the time.

Expected results:
JVM should be created successfully, and no heap corruption should occur.

Additional info:
Using Electric Fence, the following stack trace appears to occur
every time (out of the times I've tried, at least):

#0  0x0079f6d1 in pthread_mutex_init () from /lib/tls/libpthread.so.0
#1  0x00a6e29c in ObjectMonitor::ObjectMonitor () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#2  0x00a2f5d7 in CreateRawMonitor () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#3  0x00522872 in JVM_OnLoad () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/libjdwp.so
#4  0x00a27aee in JvmdiInternal::post_event () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#5  0x00a31ace in jvmdi::post_vm_initialized_event () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#6  0x00abd51c in Threads::create_vm () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#7  0x009e3468 in JNI_CreateJavaVM () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#8  0x08048703 in main (argc=1, argv=0xbfffa6d4) at jnitest.c:62

Using MALLOC_CHECK_=3, the backtrace always seems to be:

#0  0x00138cdf in raise () from /lib/tls/libc.so.6
#1  0x0013a4e5 in abort () from /lib/tls/libc.so.6
#2  0x00184729 in malloc_check () from /lib/tls/libc.so.6
#3  0x001824fd in calloc () from /lib/tls/libc.so.6
#4  0x008925cf in ZIP_Close () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#5  0x00892bf7 in ZIP_GetEntry () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#6  0x00893135 in ZIP_FindEntry () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#7  0x003f6620 in ClassPathZipEntry::open_stream () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#8  0x003f7525 in ClassLoader::load_classfile () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#9  0x00534a97 in SystemDictionary::load_instance_class () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#10 0x005345ac in SystemDictionary::resolve_instance_class_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#11 0x005385d2 in SystemDictionary::resolve_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#12 0x005337d4 in SystemDictionary::resolve_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#13 0x00533bf8 in SystemDictionary::resolve_super_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#14 0x003f40d7 in ClassFileParser::parseClassFile () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#15 0x003f75ed in ClassLoader::load_classfile () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#16 0x00534a97 in SystemDictionary::load_instance_class () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#17 0x005345ac in SystemDictionary::resolve_instance_class_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#18 0x005385d2 in SystemDictionary::resolve_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#19 0x005337d4 in SystemDictionary::resolve_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#20 0x00533bf8 in SystemDictionary::resolve_super_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#21 0x003f40d7 in ClassFileParser::parseClassFile () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#22 0x003f75ed in ClassLoader::load_classfile () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#23 0x00534a97 in SystemDictionary::load_instance_class () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#24 0x005345ac in SystemDictionary::resolve_instance_class_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#25 0x005385d2 in SystemDictionary::resolve_or_null () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#26 0x005337d4 in SystemDictionary::resolve_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#27 0x00533942 in SystemDictionary::resolve_or_fail () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#28 0x005367cb in SystemDictionary::initialize_preloaded_classes () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#29 0x005362a5 in SystemDictionary::initialize () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#30 0x00558140 in Universe::genesis () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#31 0x00559369 in universe2_init () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#32 0x00433fc0 in init_globals () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#33 0x0054cf6b in Threads::create_vm () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#34 0x00473468 in JNI_CreateJavaVM () from /usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#35 0x08048703 in main (argc=1, argv=0xbfff9fe4) at jnitest.c:62

Comment 1 Matthew Gregan [:kinetik] 2004-11-07 19:59:40 EST
Created attachment 106265: Testcase

Comment 2 Jakub Jelinek 2004-11-17 15:56:18 EST
The chances that this is a bug in the Sun JDK are far greater.
With LD_PRELOAD=libefence.so.0, I always see a crash in what is apparently
JIT-generated code that does:
0x00fdfa1e:     mov    %eax,0xffffd000(%esp)
This is invalid: on i386 it is never allowed to access memory below the stack
pointer, and the kernel rightfully kills the process with SIGSEGV.
This code is not created by glibc (glibc never creates executable code in
malloced memory) but by the JDK, therefore the bug is in there.

With MALLOC_CHECK_=3 I see:
#0  0x00562cdf in raise () from /lib/tls/libc.so.6
#1  0x005644e5 in abort () from /lib/tls/libc.so.6
#2  0x005ae729 in malloc_check () from /lib/tls/libc.so.6
#3  0x005ac4fd in calloc () from /lib/tls/libc.so.6
#4  0x006a95cf in ZIP_Close () from /tmp/usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#5  0x006a9bf7 in ZIP_GetEntry () from /tmp/usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#6  0x006aa135 in ZIP_FindEntry () from /tmp/usr/java/j2sdk1.4.2_06/jre/lib/i386/libzip.so
#7  0x0025d620 in ClassPathZipEntry::open_stream () from /tmp/usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
#8  0x0025e525 in ClassLoader::load_classfile () from /tmp/usr/java/j2sdk1.4.2_06/jre/lib/i386/client/libjvm.so
...

Comment 3 Jakub Jelinek 2004-11-17 16:58:16 EST
Even the MALLOC_CHECK_=3 failures look like bugs in the JDK.
But it is really hard to debug this without the JDK's source, so it is
something that Sun should debug and fix.