Bug 1688841

Summary: glibc's free() crashes with ulimit -s unlimited when exiting from java -version
Product: [Fedora] Fedora
Reporter: Jitka Plesnikova <jplesnik>
Component: java-1.8.0-openjdk
Assignee: Severin Gehwolf <sgehwolf>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 31
CC: ahughes, aoliva, arjun, codonell, dbhole, dj, fweimer, jerboaa, jvanek, law, loganjerry, mfabian, msrb, mvala, pfrankli, rth, sgehwolf, siddhesh, spacewar, yann
Target Milestone: ---
Target Release: ---
Hardware: i686
OS: Linux
Last Closed: 2020-11-24 20:13:06 UTC
Type: Bug

Description Jitka Plesnikova 2019-03-14 15:02:47 UTC
Package cvc4 fails to build from source in Fedora rawhide on i686 with the following error:

javac -classpath /builddir/build/BUILD/cvc4-1.6/builds/i686-redhat-linux-gnu/production-abc-proof/src/bindings/CVC4.jar -d . ../../../../../test/system/CVC4JavaTest.java
BUILDSTDERR: pure virtual method called
BUILDSTDERR: terminate called without an active exception
BUILDSTDERR: make[5]: *** [Makefile:1251: CVC4JavaTest.class] Aborted (core dumped)
BUILDSTDERR: make[5]: *** Deleting file 'CVC4JavaTest.class'
make[5]: Leaving directory '/builddir/build/BUILD/cvc4-1.6/builds/i686-redhat-linux-gnu/production-abc-proof/test/system'
BUILDSTDERR: make[5]: Target 'test-suite.log' not remade because of errors.
BUILDSTDERR: make[4]: *** [Makefile:1008: check-TESTS] Error 2
make[4]: Leaving directory '/builddir/build/BUILD/cvc4-1.6/builds/i686-redhat-linux-gnu/production-abc-proof/test/system'
BUILDSTDERR: make[3]: *** [Makefile:1116: check-am] Error 2
BUILDSTDERR: make[3]: Target 'check' not remade because of errors.

For the complete build log, see the Koji build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=33482436

Comment 1 Jerry James 2019-03-14 19:41:36 UTC
I can reproduce in mock, but notice the following:
- This didn't happen during the last build, just over 3 weeks ago, nor any previous build stretching back over months, which is evidence that there is nothing wrong with the CVC4 java code.
- The failure still doesn't happen on x86_64, more evidence of the same.
- The failure happens during an invocation of javac, therefore during compilation, not while running CVC4 code.

In addition, after reproducing with mock, if I mock --shell and run the javac command by hand, it succeeds.  Running the compiled Java test afterwards succeeds, too.  I tried running the javac step, followed by running the test itself, in a loop.  I never got a failure.

I'm open to theories about what might be happening here.

Comment 2 Jerry James 2019-06-13 17:21:08 UTC
Reassigning to the openjdk component, since cvc4 built correctly during the mass rebuild and only failed subsequently, and only fails on i386.  Here is what I know so far.  If I put a breakpoint on __cxa_pure_virtual, GDB shows this backtrace:

#0  0x296ec150 in __cxa_pure_virtual () from /lib/libstdc++.so.6
#1  0x29fec796 in outputStream::print_cr (this=0x16ae281c, 
    format=0x2a226af2 "#")
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:145
#2  0x2a1d8ec4 in VMError::report (this=0x16ae28f8, st=0x16ae281c)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:387
#3  0x2a1da946 in VMError::report_and_die (this=0x16ae28f8)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:987
#4  0x29fe8984 in JVM_handle_linux_signal (sig=<optimized out>, 
    info=<optimized out>, ucVoid=<optimized out>, 
    abort_if_unrecognized=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#5  0x29fdab81 in signalHandler (sig=11, info=0x16ae2a0c, uc=0x16ae2a8c)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4555
#6  <signal handler called>
#7  __GI___libc_free (mem=0x85008b00) at malloc.c:3102
#8  0x2a9270e1 in __GI__IO_wsetb (f=0x2aa5ee00 <_IO_stdout_>, b=0x0, eb=0x0, 
    a=0) at wgenops.c:97
#9  0x2a93283a in _IO_unbuffer_all () at genops.c:820
#10 _IO_cleanup () at genops.c:867
#11 0x2a8e9d32 in __run_exit_handlers (status=0, 
    listp=0x2aa5e3bc <__exit_funcs>, run_list_atexit=true, run_dtors=true)
    at exit.c:130
#12 0x2a8e9df7 in __GI_exit (status=0) at exit.c:139
#13 0x29d8055b in vm_direct_exit (code=0)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/java.cpp:605
#14 0x2a1e274b in VM_Exit::doit (this=0x2956de90)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/vm_operations.cpp:462
#15 0x2a1e233a in VM_Operation::evaluate (this=0x2956de90)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/vm_operations.cpp:62
#16 0x2a1e038e in VMThread::evaluate_operation (op=0x2956de90, 
    this=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:377
#17 0x2a1e0d93 in VMThread::loop (this=0x294a4800)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:502
#18 0x2a1e1064 in VMThread::run (this=0x294a4800)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
#19 0x29fdcce5 in java_start (thread=0x294a4800)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:846
#20 0x2a87867e in start_thread (arg=<optimized out>) at pthread_create.c:479
#21 0x2a9a55aa in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:108

Also:

(gdb) up
#1  0x29fec796 in outputStream::print_cr (this=0x16ae281c, 
    format=0x2a226af2 "#")
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:145
145	  write(str, len);
(gdb) print *this
$1 = {<ResourceObj> = {<No data fields>}, 
  _vptr.outputStream = 0x2a3eb41c <vtable for staticBufferStream+8>, 
  _indentation = 0, _width = 80, _position = 0, _newlines = 0, _precount = 0, 
  _stamp = {_counter = 1}}

So we are calling the write method on a staticBufferStream object, but are calling it as a plain outputStream.  The staticBufferStream object is created on the stack in frame 3:

    staticBufferStream sbs(buffer, sizeof(buffer), &out);
    first_error->report(&sbs);

Then in frame 2 we do this:

void VMError::report(outputStream* st) {
    ...
    st->print_cr("#");

But then, in frame 1, we are not in staticBufferStream::print_cr (ostream.cpp:1408).  We are in outputStream::print_cr (ostream.cpp:139), which leads directly to the "pure virtual method called" error.  For some reason, virtual dispatch on the print_cr method failed.  That is bug #1.

Note that javac was done and was trying to exit.  The exit cleanup handlers are called in frame 11, and we are in a handler that makes all existing FILE objects switch to unbuffered mode, so that the output buffers can be freed.  It looks like _IO_wsetb has determined that some stream has not used wide characters, and so attempts to free the wide character buffer for that stream.  However, the value of f->_wide_data->_IO_buf_base (0x85008b00) is bogus, as can be seen from this entry in the process memory map:

82040000-96600000 ---p 00000000 00:00 0 

We apparently have a corrupted f->_wide_data object for some FILE *f.  That is bug #2.

If anyone has any idea why this only happens on i386, and why it only started happening after the mass rebuild finished, I would like to hear from you.

Comment 3 Jerry James 2019-06-13 17:23:15 UTC
If anybody wants to try building cvc4 for themselves, you will need to change cvc4-abc.patch first.  Find the line that reads:

+             PATHS /usr/lib64

and change it to:

+             PATHS /usr/lib64 /usr/lib

I will commit that change tonight.

Comment 4 Jerry James 2019-06-13 17:29:05 UTC
I just discovered why, in comment 1, I noted that I could run mock --shell and get a good javac run.  It is because I did not execute "ulimit -s unlimited" first.  That is at the top of %check for the benefit of s390x, which needs it.  However, it appears to have some kind of bad effect on i386's javac.

For now, I am going to change cvc4.spec so that it only does "ulimit -s unlimited" on s390x, which will hopefully get cvc4 building again.  But somebody needs to figure out why that is making i386 javac go haywire.

Comment 5 Jerry James 2019-06-13 18:06:42 UTC
Here's a small reproducer:

$ mock -r fedora-rawhide-i386 --install java-1.8.0-openjdk
$ mock -r fedora-rawhide-i386 --shell
<mock-chroot> sh-5.0# su - mockbuild
[mockbuild@5d02690a764741e1ba386e04c0abf8ba ~]$ cat > Useless.java << EOF
> class Useless {
>     public static void main(String[] args) {
>         System.out.println("This class is useless.");
>     }
> }
> EOF
[mockbuild@5d02690a764741e1ba386e04c0abf8ba ~]$ javac Useless.java
[mockbuild@5d02690a764741e1ba386e04c0abf8ba ~]$ ulimit -s unlimited
[mockbuild@5d02690a764741e1ba386e04c0abf8ba ~]$ javac Useless.java
pure virtual method called
terminate called without an active exception
Aborted (core dumped)

Comment 6 Severin Gehwolf 2019-06-18 17:21:14 UTC
Here is an even smaller reproducer:

$ mock -r fedora-rawhide-i386 --init
$ mock -r fedora-rawhide-i386 --install java-1.8.0-openjdk
$ mock -r fedora-rawhide-i386 --shell
<mock-chroot> sh-5.0# ulimit -s unlimited
<mock-chroot> sh-5.0# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK Server VM (build 25.212-b04, mixed mode)
pure virtual method called
terminate called without an active exception
Aborted

Comment 7 Severin Gehwolf 2019-06-18 17:39:02 UTC
<mock-chroot> sh-5.0# gdb -ex 'handle SIGSEGV nostop noprint pass' -ex 'handle SIGABRT stop print pass' -ex 'break JavaCalls::call' -ex 'break src/share/vm/utilities/ostream.cpp:145' -ex 'break __libc_start_main' -ex 'run' --args java -version
GNU gdb (GDB) Fedora 8.3.50.20190601-15.fc31
Reading symbols from java...
Reading symbols from /usr/lib/debug/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/jre/bin/java-1.8.0.212.b04-4.fc31.i386.debug...
Signal        Stop	Print	Pass to program	Description
SIGSEGV       No	No	Yes		Segmentation fault
Signal        Stop	Print	Pass to program	Description
SIGABRT       Yes	Yes	Yes		Aborted
Function "JavaCalls::call" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (JavaCalls::call) pending.
No source file named src/share/vm/utilities/ostream.cpp.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (src/share/vm/utilities/ostream.cpp:145) pending.
Breakpoint 3 at 0x1080
Starting program: /usr/bin/java -version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".

Breakpoint 3, __libc_start_main (main=0x565560a0 <main>, argc=2, argv=0xffffd714, init=0x56556250 <__libc_csu_init>, fini=0x565562c0 <__libc_csu_fini>, rtld_fini=0x2aa90bd0 <_dl_fini>, stack_end=0xffffd70c)
    at ../csu/libc-start.c:137
137	{
Missing separate debuginfos, use: dnf debuginfo-install zlib-1.2.11-15.fc30.i686
(gdb) list 342
337	#else
338	  /* Nothing fancy, just call the function.  */
339	  result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
340	#endif
341	
342	  exit (result);
343	}
(gdb) break 342
Breakpoint 4 at 0x2a8d4f6c: file ../csu/libc-start.c, line 342.
(gdb) c
Continuing.
[New Thread 0x29572b40 (LWP 246)]
warning: Error reading shared library list entry at 0x6b0
warning: Error reading shared library list entry at 0x1a70
warning: Error reading shared library list entry at 0x2580
warning: Error reading shared library list entry at 0x7770
warning: Corrupted shared library list: 0x29447770 != 0x0
[New Thread 0x19c59b40 (LWP 247)]
[New Thread 0x19bd8b40 (LWP 248)]
[New Thread 0x199ffb40 (LWP 249)]
[New Thread 0x196ffb40 (LWP 250)]
[New Thread 0x194ffb40 (LWP 251)]
[New Thread 0x192ffb40 (LWP 252)]
[New Thread 0x190ffb40 (LWP 253)]
[New Thread 0x18effb40 (LWP 254)]
[New Thread 0x162e3b40 (LWP 255)]
[Switching to Thread 0x29572b40 (LWP 246)]

Thread 2 "java" hit Breakpoint 1, JavaCalls::call (result=0x29571e3c, method=..., args=0x29571e48, __the_thread__=0x29447c00)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/runtime/javaCalls.cpp:301
301	void JavaCalls::call(JavaValue* result, methodHandle method, JavaCallArguments* args, TRAPS) {
Missing separate debuginfos, use: dnf debuginfo-install libgcc-9.1.1-2.fc31.i686 libstdc++-9.1.1-2.fc31.i686


Here we are running OpenJDK code.

(gdb) disable 1
(gdb) c
Continuing.
[New Thread 0x1947eb40 (LWP 256)]
[New Thread 0x1927eb40 (LWP 257)]
[New Thread 0x18e7eb40 (LWP 258)]
[New Thread 0x15b8cb40 (LWP 259)]
[New Thread 0x159ffb40 (LWP 260)]
[New Thread 0x156ffb40 (LWP 261)]
[New Thread 0x1567eb40 (LWP 262)]
[New Thread 0x16262b40 (LWP 263)]
[New Thread 0x155fdb40 (LWP 264)]
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK Server VM (build 25.212-b04, mixed mode)
[Thread 0x155fdb40 (LWP 264) exited]
[Thread 0x162e3b40 (LWP 255) exited]
[Thread 0x29572b40 (LWP 246) exited]
[Switching to Thread 0x2a872700 (LWP 242)]

Thread 1 "java" hit Breakpoint 4, __libc_start_main (main=0x565560a0 <main>, argc=2, argv=0xffffd714, init=0x56556250 <__libc_csu_init>, fini=0x565562c0 <__libc_csu_fini>, rtld_fini=0x2aa90bd0 <_dl_fini>, 
    stack_end=0xffffd70c) at ../csu/libc-start.c:342
342	  exit (result);
(gdb) p result
$1 = 0


OK. So we're done with OpenJDK code. exit(0) is being called next.


(gdb) c
Continuing.

Thread 1 "java" hit Breakpoint 2, outputStream::print_cr (this=0xffffcd9c, format=0x2a22aaf2 "#")
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:145
145	  write(str, len);
(gdb) bt
#0  outputStream::print_cr (this=0xffffcd9c, format=0x2a22aaf2 "#") at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/ostream.cpp:145
#1  0x2a1dcec4 in VMError::report (this=0xffffce78, st=0xffffcd9c) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:387
#2  0x2a1de946 in VMError::report_and_die (this=0xffffce78) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:987
#3  0x29fec984 in JVM_handle_linux_signal (sig=<optimized out>, info=<optimized out>, ucVoid=<optimized out>, abort_if_unrecognized=<optimized out>)
    at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
#4  0x29fdeb81 in signalHandler (sig=11, info=0xffffcf8c, uc=0xffffd00c) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.212.b04-4.fc31.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4555
#5  <signal handler called>
#6  __GI___libc_free (mem=0x85008b00) at malloc.c:3102
#7  0x2a929de1 in __GI__IO_wsetb (f=0x2aa60e00 <_IO_stdout_>, b=0x0, eb=0x0, a=0) at wgenops.c:97
#8  0x2a93553a in _IO_unbuffer_all () at genops.c:820
#9  _IO_cleanup () at genops.c:867
#10 0x2a8ecb42 in __run_exit_handlers (status=0, listp=0x2aa603bc <__exit_funcs>, run_list_atexit=true, run_dtors=true) at exit.c:130
#11 0x2a8ecc07 in __GI_exit (status=0) at exit.c:139
#12 0x2a8d4f75 in __libc_start_main (main=0x565560a0 <main>, argc=2, argv=0xffffd714, init=0x56556250 <__libc_csu_init>, fini=0x565562c0 <__libc_csu_fini>, rtld_fini=0x2aa90bd0 <_dl_fini>, stack_end=0xffffd70c)
    at ../csu/libc-start.c:342
#13 0x56556145 in _start ()
(gdb) frame 6
#6  __GI___libc_free (mem=0x85008b00) at malloc.c:3102
3102	  p = mem2chunk (mem);
(gdb) frame 7
#7  0x2a929de1 in __GI__IO_wsetb (f=0x2aa60e00 <_IO_stdout_>, b=0x0, eb=0x0, a=0) at wgenops.c:97
97	    free (f->_wide_data->_IO_buf_base);
(gdb) 


In summary, here is what's happening:
1. __libc_start_main runs main() for "java -version".
2. main() returns with an exit status of 0, and glibc then calls exit(0).
3. That exit call seems to end up in a weird state where, in __GI__IO_wsetb(), it calls
   free() on f->_wide_data->_IO_buf_base (0x85008b00).
4. That mem2chunk(0x85008b00) call seems to trigger a SIGSEGV (sig=11), and since
   the OpenJDK runtime had installed its signal handler previously, it gets called
   again, even though the runtime has already exited. That's why we end up in this
   weird state in outputStream::print_cr().
5. Since we end up in the JVM's signal handler, it tries to create an error report, but
   that fails with SIGABRT due to the pure virtual method call issue.


So the "pure virtual method called" is a red herring. The issue seems to be somewhere in glibc code which seems to act strangely if run with 'ulimit -s unlimited'. The same OpenJDK code in an F29 i386 mock or F30 i386 mock (1.8.0_212-b04) works fine with ulimit -s unlimited.

Comment 8 Severin Gehwolf 2019-06-18 17:41:10 UTC
Re-assigning to glibc as per comment 7. Reproducer is in comment 8.

Comment 9 Severin Gehwolf 2019-06-18 17:45:00 UTC
Err..

Reproducer is in comment 6.

Comment 10 DJ Delorie 2019-06-18 17:55:00 UTC
mem2chunk doesn't dereference anything; the segfault must be in the surrounding code.  The next line is:

  if (chunk_is_mmapped (p))                       /* release mmapped memory. */

which simply indirects the pointer as passed to free().

There's not much glibc can do if free() is passed an invalid pointer (0x85008b00 looks like wide chars?) and segfaulting is acceptable here.

Comment 11 Yann Droneaud 2019-06-18 20:24:22 UTC
(In reply to Severin Gehwolf from comment #6)
> Here is an even smaller reproducer:
> 
> $ mock -r fedora-rawhide-i386 --init
> $ mock -r fedora-rawhide-i386 --install java-1.8.0-openjdk
> $ mock -r fedora-rawhide-i386 --shell
> <mock-chroot> sh-5.0# ulimit -s unlimited
> <mock-chroot> sh-5.0# java -version
> openjdk version "1.8.0_212"
> OpenJDK Runtime Environment (build 1.8.0_212-b04)
> OpenJDK Server VM (build 25.212-b04, mixed mode)
> pure virtual method called
> terminate called without an active exception
> Aborted

Instead of using ulimit -s, one can run java through valgrind.
Through valgrind, java hit the same error:

[... lots of invalid access outside the thread stack ...]
  openjdk version "1.8.0_212"
  OpenJDK Runtime Environment (build 1.8.0_212-b04)
  OpenJDK Server VM (build 25.212-b04, mixed mode)
  ==35== Thread 1:
  ==35== Invalid free() / delete / delete[] / realloc()
  ==35==    at 0x4836729: free (vg_replace_malloc.c:540)
  ==35==    by 0x48CCDE0: _IO_wsetb (in /usr/lib/libc-2.29.9000.so)
  ==35==    by 0x48D8539: _IO_cleanup (in /usr/lib/libc-2.29.9000.so)
  ==35==    by 0x488FB41: __run_exit_handlers (in /usr/lib/libc-2.29.9000.so)
  ==35==    by 0x488FC06: exit (in /usr/lib/libc-2.29.9000.so)
  ==35==    by 0x4877F74: (below main) (in /usr/lib/libc-2.29.9000.so)
  ==35==  Address 0x85008b00 is not stack'd, malloc'd or (recently) free'd
[... valgrind allows the program to continue ...]
  ==35== 
  pure virtual method called
  terminate called without an active exception
  ==35== 
[... abort + segfault ...]

Comment 12 Yann Droneaud 2019-06-19 06:59:32 UTC
On my system, java starts crashing (almost reliably) when the stack soft limit is set to around 2086440. Unfortunately, there is no exact threshold; it varies between runs.
(I was expecting java to start crashing when an odd stack size was set, but this doesn't seem to be the case.)

Comment 13 Florian Weimer 2019-06-19 11:29:19 UTC
The immediate cause of this bug is a linker script for /usr/bin/java which does not whitelist _IO_stdin_used.  This symbol is defined by crt1.o, and it must be visible to the dynamic loader.  It is a historic compatibility mechanism which is rather brittle, unfortunately.  I would like to fix that, but it's only i386, so …

Without the _IO_stdin_used export, glibc switches to a legacy implementation of libio because your main program looks like a historic binary from 1998 or so.  Since the legacy implementation has fewer features and JNI libraries might rely on them, fixing this is worthwhile in its own right.

The crash is a bug in glibc which we have to fix separately.  It's a type-mismatch resulting in an out-of-bounds read.  ulimit -s affects the bit pattern of pointer values, and that merely exposes the bug.

Comment 14 Severin Gehwolf 2019-06-19 14:16:07 UTC
(In reply to Florian Weimer from comment #13)
> The immediate cause of this bug is a linker script for /usr/bin/java which
> does not whitelist _IO_stdin_used.  This symbol is defined by crt1.o, and it
> must be visible to the dynamic loader.  It is a historic compatibility
> mechanism which is rather brittle, unfortunately.  I would like to fix that,
> but it's only i386, so …

I'm not entirely clear on which arches need the linker script fixed. Are you saying
we need to use global scope for _IO_stdin_used on i386 only or should we do this
for all arches?

That is:

<mock-chroot> sh-5.0# readelf -s /usr/bin/java | grep _IO
    75: 00002004     4 OBJECT  LOCAL  DEFAULT   19 _IO_stdin_used

Should become:

<mock-chroot> sh-5.0# readelf -s /usr/bin/java | grep _IO
    75: 00002004     4 OBJECT  GLOBAL  DEFAULT   19 _IO_stdin_used

on i386 only or not?

Comment 15 Florian Weimer 2019-06-19 14:46:26 UTC
(In reply to Severin Gehwolf from comment #14)
> (In reply to Florian Weimer from comment #13)
> > The immediate cause of this bug is a linker script for /usr/bin/java which
> > does not whitelist _IO_stdin_used.  This symbol is defined by crt1.o, and it
> > must be visible to the dynamic loader.  It is a historic compatibility
> > mechanism which is rather brittle, unfortunately.  I would like to fix that,
> > but it's only i386, so …
> 
> I'm not entirely clear on which arches need the linker script fixed. Are you
> saying
> we need to use global scope for _IO_stdin_used on i386 only or should we do
> this for all arches?

The symbol needs to show up in “eu-readelf --symbols .dynsym” as GLOBAL.

For Fedora, this only concerns i386.

For OpenJDK upstream, it should be safe to export the symbol unconditionally if it exists, with a tweak to make/hotspot/lib/JvmMapfile.gmk.  I'm trying to figure out the right syntax for the version script for that (you don't want to associate it with a symbol version, although current glibc would be fine with that too).

Comment 16 Florian Weimer 2019-06-19 15:29:04 UTC
Simply adding

  global: _IO_stdin_used;

before the

  local: *;

line should fix this.  It's safe to do this unconditionally on all architectures.

Comment 17 Severin Gehwolf 2019-06-19 15:41:26 UTC
(In reply to Florian Weimer from comment #16)
> Simply adding
> 
>   global: _IO_stdin_used;
> 
> before the
> 
>   local: *;
> 
> line should fix this.  It's safe to do this unconditionally on all
> architectures.

Yes, that's what I have. It looks like an OpenJDK 8 only problem. JDK 11+ seems to work for me:

<mock-chroot> sh-5.0# uname -m
i686
<mock-chroot> sh-5.0# ulimit -s
unlimited
<mock-chroot> sh-5.0# /usr/lib/jvm/java-11-openjdk-11.0.3.7-5.fc31.i386/bin/java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment 18.9 (build 11.0.3+7)
OpenJDK Server VM 18.9 (build 11.0.3+7, mixed mode)


Recent'ish JDK head:

$ eu-readelf --symbols .dynsym /disk/openjdk/upstream-sources/openjdk-head/build/linux-x86_64-server-release/images/jdk/bin/java | grep _IO
   73: 0000000000402000      4 OBJECT  GLOBAL DEFAULT       16 _IO_stdin_used

JDK 11.0.4+8 ea:

$ eu-readelf --symbols .dynsym /home/sgehwolf/Documents/openjdk/portable-linux-build/openjdk-11.0.4+8/bin/java | grep _IO
   57: 0000000000400cc8      4 OBJECT  GLOBAL DEFAULT       16 _IO_stdin_used

The reason for this is that JDK 8 used map files which seem gone for JDK 11+.

Link command line from a JDK 11 build:

/usr/bin/gcc -Wl,--hash-style=both -Wl,-z,defs -Wl,-O1 -m64 -Wl,--allow-shlib-undefined -Wl,--exclude-libs,ALL -Wl,-rpath,\$ORIGIN/../lib/jli -Wl,-rpath,\$ORIGIN/../lib -L/disk/openjdk/upstream-sources/openjdk-11/build/linux-x86_64-normal-server-release/support/modules_libs/java.base/jli -o /disk/openjdk/upstream-sources/openjdk-11/build/linux-x86_64-normal-server-release/support/native/java.base/java_objs/java /disk/openjdk/upstream-sources/openjdk-11/build/linux-x86_64-normal-server-release/support/native/java.base/java/main.o -lz -lpthread -ljli -ldl

Comment 18 Florian Weimer 2019-06-19 15:51:42 UTC
Sorry, I mis-typed.  It should be “eu-readelf --symbols=.dynsym”, with the equal sign.  The symbols you quoted are from an x86-64 build, and those are in .symtab, not .dynsym.

Any suggestions how to perform a 32-bit build from an x86-64 host?  I keep running into problems with GCC 4.8.

Comment 19 Severin Gehwolf 2019-06-19 16:05:03 UTC
(In reply to Florian Weimer from comment #18)
> Sorry, I mis-typed.  It should be “eu-readelf --symbols=.dynsym”, with the
> equal sign.  The symbols you quoted are from an x86-64 build, and those are
> in .symtab, not .dynsym.

Yes. From an x86_64 host, but the result is the same. There are no arch-specific settings for the launchers in 11+ AFAIK.

<mock-chroot> sh-5.0# /usr/lib/jvm/java-11-openjdk-11.0.3.7-5.fc31.i386/bin/java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment 18.9 (build 11.0.3+7)
OpenJDK Server VM 18.9 (build 11.0.3+7, mixed mode)
<mock-chroot> sh-5.0# eu-readelf --symbols=.dynsym /usr/lib/jvm/java-11-openjdk-11.0.3.7-5.fc31.i386/bin/java | grep _IO
   16: 00002004      4 OBJECT  GLOBAL DEFAULT       18 _IO_stdin_used

> Any suggestions how to perform a 32-bit build from an x86-64 host?  I keep
> running into problems with GCC 4.8.

I usually do it in an i386 mock on an x86_64 host.

Comment 20 Florian Weimer 2019-06-19 16:43:47 UTC
I was now able to build current tip on i386 and I can confirm that the export is there (checked javac so far).  So presumably only 8 needs fixing at this point.

We'll also finally fix the glibc bug.

Comment 21 Severin Gehwolf 2019-06-19 17:01:49 UTC
https://src.fedoraproject.org/rpms/java-1.8.0-openjdk/pull-request/74

<mock-chroot> sh-5.0# ulimit -s
unlimited
<mock-chroot> sh-5.0# java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
OpenJDK Server VM (build 25.212-b04, mixed mode)
<mock-chroot> sh-5.0# rpm -q java-1.8.0-openjdk
java-1.8.0-openjdk-1.8.0.212.b04-5.fc31.i686
<mock-chroot> sh-5.0# eu-readelf --symbols=.dynsym /usr/bin/java | grep _IO
    8: 00002004      4 OBJECT  GLOBAL DEFAULT       19 _IO_stdin_used@@SUNWprivate_1.1


I'll work on getting this upstream while it's in review for Fedora.

Comment 22 Severin Gehwolf 2019-06-21 08:46:21 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=35633895

This is a scratch build with the patch. Testers are welcome to try it. Note that the build artefacts will go away after some time (a week or so).

i686 link is:

https://koji.fedoraproject.org/koji/taskinfo?taskID=35633897

Comment 23 Florian Weimer 2019-06-21 10:44:55 UTC
glibc-2.29.9000-29.fc31 is in the rawhide buildroot and offers better compatibility for legacy binaries (or incorrectly linked ones).

Comment 24 Ben Cotton 2019-08-13 16:49:24 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 25 Ben Cotton 2020-11-03 15:11:19 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 26 Ben Cotton 2020-11-24 20:13:06 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.