Bug 222288

Summary: Sun's JVM 1.6.0 crashes after upgrading glibc to 2.5-10
Product: Fedora
Component: glibc
Version: 6
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Reporter: Oded Arbel <oded>
Assignee: Jakub Jelinek <jakub>
QA Contact: Brian Brock <bbrock>
Doc Type: Bug Fix
Last Closed: 2007-01-11 19:53:45 UTC
Attachments:
  Test case which causes JVM to crash with glibc 2.5-10 (flags: none)
  Java error report (flags: none)
  Another java crash report (flags: none)

Description Oded Arbel 2007-01-11 14:30:28 UTC
Description of problem:
After upgrading glibc to 2.5-10 from current Fedora 6 updates, Sun's JVM 1.6.0 
receives a segmentation fault in every program that loads libnet.so.

Version-Release number of selected component (if applicable):
2.5-10

How reproducible:
Always.

Steps to Reproduce:
1. Upgrade to glibc 2.5-10
2. Compile the attached test case using Sun's JDK
3. Run the resulting class file using Sun's JVM 1.6.0
  
Actual results:
JVM receives SIGSEGV and outputs:
#
# An unexpected error has been detected by Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00002aaaaaab4202, pid=7276, tid=1076017472
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.6.0-b105 mixed mode)
# Problematic frame:
# C  [ld-linux-x86-64.so.2+0x9202]
#
# An error report file with more information is saved as hs_err_pid7276.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#


Expected results:
"OK" should be printed and no segmentation fault should occur.

Additional info:
This bug was also reported to Sun with internal review ID 885210.

Comment 1 Oded Arbel 2007-01-11 14:30:28 UTC
Created attachment 145346
Test case which causes JVM to crash with glibc 2.5-10

Comment 2 Oded Arbel 2007-01-11 14:31:48 UTC
Created attachment 145347
Java error report

Comment 3 Oded Arbel 2007-01-11 14:33:08 UTC
This problem did not occur before upgrading to the latest Fedora update (with 
glibc 2.5-10). However, downgrading to the original Fedora Core 6 glibc 
(2.5-3) did not fix it.


Comment 4 Jakub Jelinek 2007-01-11 15:20:27 UTC
This doesn't correspond to the dynamic linker in glibc-2.5-10.x86_64.rpm.
The dump says:
Instructions: (pc=0x00002aaaaaab4202)
0x00002aaaaaab41f2:   ff 02 00 00 48 8b 8d 20 ff ff ff 31 d2 45 31 d2
0x00002aaaaaab4202:   48 8b 01 48 85 c0 0f 84 47 02 00 00 48 8b 9d 20
...
2aaaaaaab000-2aaaaaac5000 r-xp 00000000 fd:00 31653891 /lib64/ld-2.5.so

So the faulting pc is at offset 0x9202 within ld-2.5.so.  But glibc-2.5-10's ld.so has:
    91ef:       4c 8b 69 10             mov    0x10(%rcx),%r13
    91f3:       48 8b 85 38 ff ff ff    mov    0xffffffffffffff38(%rbp),%rax
    91fa:       48 8d 0d 96 ce 00 00    lea    52886(%rip),%rcx        # 16097 <curwd.8774+0x70>
    9201:       48 89 b5 40 ff ff ff    mov    %rsi,0xffffffffffffff40(%rbp)
    9208:       48 8d 35 9e ce 00 00    lea    52894(%rip),%rsi        # 160ad <curwd.8774+0x86>
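
The mismatch can be checked with a little arithmetic. The following is a minimal sketch in Python that uses only the addresses and bytes quoted in this comment; the variable names are illustrative. It subtracts the start of the ld-2.5.so mapping from the faulting pc, then compares the byte the dump shows at that pc with the byte the glibc-2.5-10 listing has at the same offset.

```python
# Addresses copied from the crash report quoted in this comment.
crash_pc = 0x00002AAAAAAB4202   # pc reported with the SIGSEGV
mapping_base = 0x2AAAAAAAB000   # start of the /lib64/ld-2.5.so r-xp mapping (file offset 0)

# Offset of the faulting instruction inside ld-2.5.so.
offset = crash_pc - mapping_base
print(hex(offset))  # 0x9202

# Bytes the crash dump shows starting at the faulting pc.
dump_bytes = bytes.fromhex("48 8b 01 48 85 c0")

# Instruction that the glibc-2.5-10 listing places at 0x9201.
listing_start = 0x9201
listing_bytes = bytes.fromhex("48 89 b5 40 ff ff ff")

# In that listing, 0x9202 is the second byte of the instruction at 0x9201,
# so it is not even an instruction boundary.
byte_in_listing = listing_bytes[offset - listing_start]
matches = byte_in_listing == dump_bytes[0]
print(hex(byte_in_listing), matches)  # 0x89 False
```

The dump has 0x48 at the faulting pc while the listing has 0x89 at offset 0x9202, which is consistent with the crash report and the disassembly coming from two different ld.so builds.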


Comment 5 Oded Arbel 2007-01-11 16:30:44 UTC
Created attachment 145365
Another java crash report

I had downgraded back to 2.5-3 to test, so the earlier error report might have
been generated after the downgrade. This report should contain traces that match
glibc-2.5-10.fc6.

Comment 6 Jakub Jelinek 2007-01-11 19:53:45 UTC
Oops, sorry, I was looking at glibc-2.5-10 rather than glibc-2.5-10.fc6.

This sounds like the same problem as #210748/#215377.  There is a race
condition in the dynamic linker (one that has been in glibc forever) that is
hit when a threaded program calls dlopen on many libraries in some threads
while other threads call functions for the first time with lazy binding on.
Unfortunately, it is quite a difficult problem to solve without slowing down
dynamic linking too much.  rawhide has some initial attempts to solve it, but
they still contain ABBA deadlock possibilities.
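
For readers unfamiliar with the term, an ABBA deadlock arises when two threads take the same two locks in opposite order. The sketch below is a generic Python illustration of that pattern, not glibc's code: `lock_a`, `lock_b` and `worker` are hypothetical names, and a timeout replaces the indefinite wait so that the script terminates instead of hanging.

```python
import threading

lock_a = threading.Lock()        # hypothetical lock "A"
lock_b = threading.Lock()        # hypothetical lock "B"
barrier = threading.Barrier(2)   # keeps the two threads in lock step
got_second = {}                  # thread name -> whether the second lock was obtained


def worker(name, first, second):
    with first:
        barrier.wait()           # both threads now hold their first lock
        # Without the timeout this acquire would block forever: the other
        # thread holds `second` and is itself waiting for `first`.
        acquired = second.acquire(timeout=0.2)
        got_second[name] = acquired
        barrier.wait()           # keep holding `first` until both attempts are over
        if acquired:
            second.release()


t1 = threading.Thread(target=worker, args=("A-then-B", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("B-then-A", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(got_second)  # both values are False
```

Neither thread obtains its second lock. With plain blocking acquires, both would wait forever, which is why acquiring locks in a single consistent order is the usual way to rule this out.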

As a workaround, you can run the application with LD_BIND_NOW=1.
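
Setting LD_BIND_NOW to a non-empty value makes the dynamic linker resolve all symbols when a library is loaded rather than lazily on first call, which avoids the lazy-binding path involved in the race. A minimal sketch of a launcher that applies the workaround, assuming it is started from Python; `bind_now_env` and `run_with_bind_now` are hypothetical helper names:

```python
import os
import subprocess


def bind_now_env(base_env=None):
    """Return a copy of the environment with LD_BIND_NOW=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_BIND_NOW"] = "1"
    return env


def run_with_bind_now(argv):
    """Run argv with eager symbol binding and return its exit status."""
    return subprocess.run(argv, env=bind_now_env()).returncode
```

For example, `run_with_bind_now(["java", "TestCase"])` would start the JVM with the variable set, where `TestCase` is a placeholder for the class compiled from the attached test case. From a shell, prefixing the command with `LD_BIND_NOW=1` has the same effect.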

*** This bug has been marked as a duplicate of 215377 ***