Bug 1572811 - java crash under gdb
Summary: java crash under gdb
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: java-openjdk
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Deepak Bhole
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-27 23:45 UTC by Dmitri A. Sergatskov
Modified: 2019-01-07 14:09 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-30 15:36:24 UTC


Attachments
test code (486 bytes, text/x-csrc)
2018-04-27 23:45 UTC, Dmitri A. Sergatskov

Description Dmitri A. Sergatskov 2018-04-27 23:45:22 UTC
Created attachment 1427887: test code

Description of problem:
Running the attached program under gdb crashes on i7 and newer CPUs.

Version-Release number of selected component (if applicable):
9.0.4.11-6.fc28

How reproducible:
100%

Steps to Reproduce:
1. Compile the attached program as:
g++ -ggdb3 -I/usr/lib/jvm/java-9/include -I/usr/lib/jvm/jre-9/include/linux tjava.cc -L/usr/lib/jvm/java-9-openjdk/lib/server -ljvm
2. Run:
LD_LIBRARY_PATH=/usr/lib/jvm/java-9/lib/server ./a.out
(This runs fine; the output is: VM = 0x7f806ee4fd20)

3. Run under gdb as:
LD_LIBRARY_PATH=/usr/lib/jvm/java-9/lib/server gdb ./a.out

Actual results:
On an older Xeon (W3530) and a Core 2 Duo it runs fine. On an i7 it crashes:
LD_LIBRARY_PATH=/usr/lib/jvm/java-9/lib/server gdb ./a.out 
GNU gdb (GDB) Fedora 8.1-11.fc28
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /home/dima/scratch/tjava/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffdc60a4f3 in ?? ()
(gdb) bt
#0  0x00007fffdc60a4f3 in ?? ()
#1  0x0000000000000246 in ?? ()
#2  0x00007fffdc60a280 in ?? ()
#3  0x00007ffff7d8cf04 in Abstract_VM_Version::_vm_major_version () from /usr/lib/jvm/java-9/lib/server/libjvm.so
#4  0x00007fffffffc9d0 in ?? ()
#5  0x00007ffff7885bc9 in VM_Version::get_processor_features ()
    at /usr/src/debug/java-9-openjdk-9.0.4.11-6.fc28.x86_64/openjdk/hotspot/src/cpu/x86/vm/vm_version_x86.cpp:530
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) quit
A debugging session is active.

	Inferior 1 [process 5031] will be killed.

Quit anyway? (y or n) y


Expected results:


Additional info:

Comment 1 Dmitri A. Sergatskov 2018-04-27 23:47:44 UTC
Line 530 of /usr/src/debug/java-9-openjdk-9.0.4.11-6.fc28.x86_64/openjdk/hotspot/src/cpu/x86/vm/vm_version_x86.cpp is 

get_cpu_info_stub(&_cpuid_info);

Comment 2 jiri vanek 2018-04-30 09:33:34 UTC
Java 9 is dead.

Please replace it with the java-openjdk (JDK 10) package.
If the issue is still reproducible, please change the component. Otherwise, close the bug.
Thank you!

Comment 3 Dmitri A. Sergatskov 2018-04-30 12:56:28 UTC
It is the same with 8 or 10, except that I cannot install the debuginfos for 10 for
some reason:

<<<<
LD_LIBRARY_PATH=/usr/lib/jvm/java-10/lib/server gdb ./a.out
GNU gdb (GDB) Fedora 8.1-11.fc28
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /home/dima/scratch/tjava/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fffdc6f6520 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install java-openjdk-headless-10.0.0.46-10.fc28.x86_64
(gdb) 
>>>>>

yet:

dnf debuginfo-install java-openjdk-headless-10.0.0.46-10.fc28.x86_64
enabling updates-debuginfo repository
enabling fedora-debuginfo repository
enabling rpmfusion-free-updates-debuginfo repository
enabling rpmfusion-free-debuginfo repository
enabling rpmfusion-nonfree-updates-debuginfo repository
enabling rpmfusion-nonfree-debuginfo repository
Last metadata expiration check: 0:02:48 ago on Mon 30 Apr 2018 07:49:46 AM CDT.
Dependencies resolved.
Nothing to do.
Complete!


I also could not figure out how to change component. Shall I re-file the bug?

Dmitri.

Comment 4 Dmitri A. Sergatskov 2018-04-30 12:58:26 UTC
OK, I assume java-openjdk is the correct component.

Comment 5 jiri vanek 2018-04-30 13:05:50 UTC
The debuginfo problem is strange. Passing the bug on to the JDK developers.

Comment 6 Dmitri A. Sergatskov 2018-04-30 13:34:28 UTC
I also tried the slowdebug build (java-openjdk-headless-slowdebug),
and the result is the same:

<<<<
Program received signal SIGSEGV, Segmentation fault.
0x00007fffdbc08540 in ?? ()
Missing separate debuginfos, use: dnf debuginfo-install java-openjdk-headless-slowdebug-10.0.0.46-10.fc28.x86_64
(gdb) bt
#0  0x00007fffdbc08540 in ?? ()
#1  0x0000000000000246 in ?? ()
#2  0x00007fffdbc082a0 in ?? ()
#3  0x0000000000633000 in ?? ()
#4  0x00007fffffffc8f0 in ?? ()
#5  0x00007ffff722a5df in VM_Version::get_processor_features() () from /usr/lib/jvm/java-10/lib/server/libjvm.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 
>>>>

And I still cannot install the debug symbols:

dnf debuginfo-install java-openjdk-headless-slowdebug-10.0.0.46-10.fc28.x86_64
enabling updates-debuginfo repository
enabling fedora-debuginfo repository
enabling rpmfusion-free-updates-debuginfo repository
enabling rpmfusion-free-debuginfo repository
enabling rpmfusion-nonfree-updates-debuginfo repository
enabling rpmfusion-nonfree-debuginfo repository
Last metadata expiration check: 0:13:58 ago on Mon 30 Apr 2018 08:14:31 AM CDT.
Dependencies resolved.
Nothing to do.
Complete!

Comment 7 Severin Gehwolf 2018-04-30 13:51:53 UTC
This isn't an actual java crash. In fact, one can reproduce this even with "java -version" on a server JVM.

$ gdb -batch -ex run -ex bt -ex kill -ex quit --args /usr/lib/jvm/java-1.8.0-openjdk/bin/java -version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fc9700 (LWP 90426)]

Thread 2 "java" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fc9700 (LWP 90426)]
0x00007fffe10002b4 in ?? ()
#0  0x00007fffe10002b4 in ?? ()
#1  0x0000000000000246 in ?? ()
#2  0x00007fffe1000160 in ?? ()
#3  0x00007ffff71d848c in Abstract_VM_Version::_reserve_for_allocation_prefetch () from /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-1.b10.fc27.x86_64/jre/lib/amd64/server/libjvm.so
#4  0x00007ffff7fc8990 in ?? ()
#5  0x00007ffff6cf28b4 in VM_Version::get_cpu_info_wrapper () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-1.b10.fc27.x86_64/openjdk/hotspot/src/cpu/x86/vm/vm_version_x86.cpp:395
#6  VM_Version::get_processor_features () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-1.b10.fc27.x86_64/openjdk/hotspot/src/cpu/x86/vm/vm_version_x86.cpp:417
#7  0x49656e696c65746e in ?? ()
#8  0x07100800000506e3 in ?? ()
#9  0xbfebfbff7ffafbff in ?? ()
#10 0x01c0003f1c004121 in ?? ()
#11 0x000000000000003f in ?? ()
#12 0x0000000000000000 in ?? ()
Kill the program being debugged? (y or n) [answered Y; input not from terminal]

This happens because the JVM uses the SIGSEGV signal for stack overflow detection (stack banging), while gdb stops on SIGSEGV by default. I suggest using something like this in gdb when debugging the JVM:

handle SIGSEGV nostop noprint pass
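
To illustrate why a SIGSEGV inside a process is not necessarily a crash, here is a minimal C++ sketch of the general technique: a program that touches a protected guard page on purpose and recovers in its own signal handler. It is not HotSpot code. The names guard_page_fault_is_recoverable, on_fault and g_resume_point are hypothetical, and the guard page is a plain mmap'd page rather than a real thread stack guard.

```cpp
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

// Jump target the handler uses to resume after the deliberate fault.
// (Hypothetical name, chosen for this sketch.)
static sigjmp_buf g_resume_point;

// Signal handler: jump back to the sigsetjmp() call instead of dying.
static void on_fault(int) {
    siglongjmp(g_resume_point, 1);
}

// Maps one inaccessible page, writes to it on purpose, and reports whether
// the resulting fault was caught and recovered from.
bool guard_page_fault_is_recoverable() {
    const long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) return false;
    const size_t len = static_cast<size_t>(page_size);

    // PROT_NONE: any access to this page raises a fault.
    void* guard = mmap(nullptr, len, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard == MAP_FAILED) return false;

    struct sigaction action = {};
    struct sigaction old_segv = {};
    struct sigaction old_bus = {};
    action.sa_handler = on_fault;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, &old_segv);
    // Some systems report protection faults as SIGBUS, so cover both.
    sigaction(SIGBUS, &action, &old_bus);

    bool recovered;
    if (sigsetjmp(g_resume_point, 1) == 0) {
        // Deliberate fault: the store below hits the PROT_NONE page.
        *static_cast<volatile char*>(guard) = 1;
        recovered = false;  // only reached if the store did not fault
    } else {
        recovered = true;   // reached via siglongjmp() from the handler
    }

    // Put the previous handlers back and release the page.
    sigaction(SIGSEGV, &old_segv, nullptr);
    sigaction(SIGBUS, &old_bus, nullptr);
    munmap(guard, len);
    return recovered;
}
```

Run normally, a program calling guard_page_fault_is_recoverable() continues as if nothing happened. Under gdb with default settings, the debugger stops at the faulting store and reports SIGSEGV before the handler gets a chance to run, which is the same effect seen in the backtraces in this bug.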


$ gdb -ex 'handle SIGSEGV nostop noprint pass' -ex run -ex bt -ex kill -ex quit --args /usr/lib/jvm/java-1.8.0-openjdk/bin/java -version
GNU gdb (GDB) Fedora 8.0.1-36.fc27
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/jvm/java-1.8.0-openjdk/bin/java...Reading symbols from /usr/lib/debug/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-1.b10.fc27.x86_64/jre/bin/java-1.8.0.171-1.b10.fc27.x86_64.debug...done.
done.
Signal        Stop	Print	Pass to program	Description
SIGSEGV       No	No	Yes		Segmentation fault
Starting program: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-1.b10.fc27.x86_64/bin/java -version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fc9700 (LWP 90631)]
[New Thread 0x7ffff4275700 (LWP 90632)]
[New Thread 0x7ffff4174700 (LWP 90633)]
[New Thread 0x7fffdf652700 (LWP 90634)]
[New Thread 0x7fffdf551700 (LWP 90635)]
[New Thread 0x7fffdf450700 (LWP 90636)]
[New Thread 0x7fffdf34f700 (LWP 90637)]
[New Thread 0x7fffdf24e700 (LWP 90638)]
[New Thread 0x7fffdf14d700 (LWP 90639)]
[New Thread 0x7fffcf9f9700 (LWP 90640)]
[New Thread 0x7fffcf8f8700 (LWP 90641)]
[New Thread 0x7fffcf7f7700 (LWP 90642)]
[New Thread 0x7fffcf6f6700 (LWP 90643)]
[New Thread 0x7fffcf5f5700 (LWP 90644)]
[New Thread 0x7fffcf4f4700 (LWP 90645)]
[New Thread 0x7fffcf3f3700 (LWP 90646)]
[New Thread 0x7fffcf2f2700 (LWP 90647)]
[New Thread 0x7fffcf1f1700 (LWP 90648)]
[New Thread 0x7fffcf0f0700 (LWP 90649)]
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-b10)
OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
[Thread 0x7fffcf0f0700 (LWP 90649) exited]
[Thread 0x7fffcf9f9700 (LWP 90640) exited]
[Thread 0x7ffff7fc9700 (LWP 90631) exited]
[Thread 0x7fffcf1f1700 (LWP 90648) exited]
[Thread 0x7fffcf2f2700 (LWP 90647) exited]
[Thread 0x7fffcf3f3700 (LWP 90646) exited]
[Thread 0x7fffcf4f4700 (LWP 90645) exited]
[Thread 0x7fffcf5f5700 (LWP 90644) exited]
[Thread 0x7fffcf6f6700 (LWP 90643) exited]
[Thread 0x7fffcf8f8700 (LWP 90641) exited]
[Thread 0x7fffdf14d700 (LWP 90639) exited]
[Thread 0x7fffdf24e700 (LWP 90638) exited]
[Thread 0x7fffdf34f700 (LWP 90637) exited]
[Thread 0x7fffdf450700 (LWP 90636) exited]
[Thread 0x7fffdf551700 (LWP 90635) exited]
[Thread 0x7fffdf652700 (LWP 90634) exited]
[Thread 0x7ffff4174700 (LWP 90633) exited]
[Thread 0x7ffff4275700 (LWP 90632) exited]
[Thread 0x7ffff7fcafc0 (LWP 90627) exited]
[Inferior 1 (process 90627) exited normally]
No stack.
The program is not being run.

Does that help?

Comment 8 Dmitri A. Sergatskov 2018-04-30 14:26:56 UTC
Thanks. This is helpful. I am also puzzled that the "crash" occurs only on newer CPUs (Sandy Bridge and newer), and not on, say, the Core 2 Duo or the Nehalem W3530 Xeon I tried.

Dmitri.
--

Comment 9 Severin Gehwolf 2018-04-30 15:36:24 UTC
Closing as not a bug. Feel free to re-open if you think there is a real issue in the JVM/JDK. Thanks!

Comment 10 Stefan Brüns 2019-01-05 20:04:32 UTC
The code forcing the segfault is there to detect a linux kernel bug:

https://github.com/JetBrains/jdk8u_hotspot/commit/d2dc33eab19e26360bbaa15450b0686b69cb282c#diff-ade79d50bf46023a5dd26524c1a7945b

    // Some OSs have a bug when upper 128bits of YMM
    // registers are not restored after a signal processing.
    // Generate SEGV here (reference through NULL)
    // and check upper YMM bits after it.
    //

The bug is actually only present when running a 32-bit application on a 32-bit Linux kernel on an x86_64 machine.

The bug has been fixed in Linux 4.4-rc2:
https://lore.kernel.org/patchwork/patch/616662/

The code should probably be disabled when the JVM is compiled as an x86_64 application.
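
As a rough sketch of that suggestion, the decision whether to run the probe could be expressed as a small predicate. This is an illustration only, not a patch against HotSpot: the function name ymm_probe_needed and its parameters are hypothetical, pointer width stands in for "compiled as x86_64", and the version cutoff simply encodes the claim above that the kernel fix landed in Linux 4.4 (distribution backports are ignored).

```cpp
// True when this translation unit is built as a 64-bit program.
// Pointer width is used here as a portable stand-in for an x86_64 check.
constexpr bool kIs64BitBuild = sizeof(void*) == 8;

// Hypothetical policy function: decides whether the deliberate-SEGV probe
// for the YMM restore bug is worth running.
//   is_64bit_build : whether the JVM binary is a 64-bit application
//   kernel_major/kernel_minor : running kernel version, e.g. 4 and 4 for 4.4
constexpr bool ymm_probe_needed(bool is_64bit_build,
                                int kernel_major,
                                int kernel_minor) {
    // Assumption taken from this comment: kernels from 4.4 on carry the fix.
    const bool kernel_has_fix =
        kernel_major > 4 || (kernel_major == 4 && kernel_minor >= 4);
    // Assumption taken from this comment: only 32-bit builds are affected.
    return !is_64bit_build && !kernel_has_fix;
}

// Convenience wrapper that uses the width of the current build.
constexpr bool ymm_probe_needed_for_this_build(int kernel_major,
                                               int kernel_minor) {
    return ymm_probe_needed(kIs64BitBuild, kernel_major, kernel_minor);
}
```

A caller could obtain the kernel version from uname(2) and skip the probe whenever the predicate returns false, so that a 64-bit JVM would no longer raise this particular SIGSEGV during startup.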

Comment 11 Severin Gehwolf 2019-01-07 14:09:40 UTC
(In reply to Stefan Brüns from comment #10)
> The code forcing the segfault is there to detect a linux kernel bug:

Thanks, yes. My earlier comment referred to HotSpot's own signal handler. It handles SIGSEGV, among others, for various cases. One such case is the bug you mentioned:
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/b484b18b9f14/src/os_cpu/linux_x86/vm/os_linux_x86.cpp#l215

The most common case when debugging Java applications natively is to get a SIGSEGV from the stack overflow code, though. Consider debugging the JVM with "java -Xcomp -version", or with -Xcomp and a program that dereferences null, like this one:

$ cat TestNull.java
public class TestNull {
	public static void main(String[] args) {
		Foo foo = new Foo();
		System.out.println(foo.bar.toString());
	}

	private static class Foo {
		private Integer bar;
	}
}

Then, one can observe even more SIGSEGVs in gdb. Either way, the point was that some SEGVs are expected and need to be handled in gdb.
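
A native analogue of the TestNull case can be sketched in C++. The idea, under the assumption that compiled code reads through the reference without an explicit check and leaves the null case to the signal handler, is shown below. All names (Foo, read_bar_faults, on_segv, last_fault_address) are made up for this sketch and do not correspond to HotSpot internals.

```cpp
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>

// Where to resume after a fault, and the address that caused it.
// (Hypothetical names, chosen for this sketch.)
static sigjmp_buf g_after_fault;
static volatile uintptr_t g_fault_address = 0;

// SA_SIGINFO-style handler: record the faulting address, then resume
// at the sigsetjmp() call instead of terminating the process.
static void on_segv(int, siginfo_t* info, void*) {
    g_fault_address = reinterpret_cast<uintptr_t>(info->si_addr);
    siglongjmp(g_after_fault, 1);
}

// Mirrors the Java example: a holder whose field is null by default.
struct Foo {
    int* bar = nullptr;
};

// Reads *foo.bar without checking for null first. Returns true if the read
// faulted and the fault was handled; otherwise stores the value in *out.
bool read_bar_faults(const Foo& foo, int* out) {
    struct sigaction action = {};
    struct sigaction previous = {};
    action.sa_sigaction = on_segv;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, &previous);

    bool faulted;
    if (sigsetjmp(g_after_fault, 1) == 0) {
        // Going through volatile keeps the compiler from reasoning about
        // the pointer value and forces the load to be emitted.
        int* volatile opaque = foo.bar;
        *out = *static_cast<volatile int*>(opaque);
        faulted = false;
    } else {
        faulted = true;  // reached via siglongjmp() from on_segv
    }

    sigaction(SIGSEGV, &previous, nullptr);
    return faulted;
}

// Address recorded by the most recent handled fault.
uintptr_t last_fault_address() {
    return g_fault_address;
}
```

Calling read_bar_faults() on a default-constructed Foo returns true when run normally, while the same call under gdb stops with SIGSEGV at the load unless "handle SIGSEGV nostop noprint pass" has been set.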

