Bug 1712370

Summary: [F30] java-1.8.0-openjdk SEGVs on i686 in G1 code due to a race condition
Product: [Fedora] Fedora
Reporter: Severin Gehwolf <sgehwolf>
Component: java-1.8.0-openjdk
Assignee: Severin Gehwolf <sgehwolf>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 30
CC: ahughes, dbhole, extras-qa, jerboaa, jvanek, msrb, mvala, omajid
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Unspecified   
Whiteboard:
Fixed In Version: java-1.8.0-openjdk-1.8.0.222.b10-0.fc30 java-1.8.0-openjdk-1.8.0.222.b10-0.fc29
Clone Of: 1683095
Last Closed: 2019-08-11 01:12:34 UTC
Attachments:
Description | Flags
hs_err file produced with JDK 8u212-b04 and reproducer_jdk8.sh | none
JDK 8u compatible patch for TestGCBasher in openjdk-11 sources | none

Description Severin Gehwolf 2019-05-21 11:53:18 UTC
+++ This bug was initially created as a clone of Bug #1683095 +++

$ cat /openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/scratch/0/hs_err_pid137.log
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xf73fd94f, pid=137, tid=165
#
# JRE version: OpenJDK Runtime Environment (11.0.3+7) (build 11.0.3+7)
# Java VM: OpenJDK Server VM (11.0.3+7, mixed mode, tiered, g1 gc, linux-x86)
# Problematic frame:
# V  [libjvm.so+0x5b194f]  HeapRegion::block_size(HeapWord const*) const+0x7f
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e" (or dumping to /openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/scratch/0/core.137)
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  S U M M A R Y ------------

Command Line: -Dtest.class.path.prefix=/openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/classes/0/gc/stress/gcbasher/TestGCBasherWithG1.d:/openjdk-11/test/hotspot/jtreg/gc/stress/gcbasher -Dtest.src=/openjdk-11/test/hotspot/jtreg/gc/stress/gcbasher -Dtest.src.path=/openjdk-11/test/hotspot/jtreg/gc/stress/gcbasher -Dtest.classes=/openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/classes/0/gc/stress/gcbasher/TestGCBasherWithG1.d -Dtest.class.path=/openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/classes/0/gc/stress/gcbasher/TestGCBasherWithG1.d -Dtest.vm.opts=-XX:MaxRAMPercentage=6 -XX:OnError=gdb -p $$ -Dtest.tool.vm.opts=-J-XX:MaxRAMPercentage=6 -Dtest.compiler.opts= -Dtest.java.opts= -Dtest.jdk=/openjdk-11/build/linux-x86-normal-server-release/images/jdk -Dcompile.jdk=/openjdk-11/build/linux-x86-normal-server-release/images/jdk -Dtest.timeout.factor=4.0 -Dtest.nativepath=/openjdk-11/build/linux-x86-normal-server-release/images/test/hotspot/jtreg/native -XX:MaxRAMPercentage=6 -Djava.library.path=/openjdk-11/build/linux-x86-normal-server-release/images/test/hotspot/jtreg/native -Xlog:gc*=info -Xmx256m -XX:+UseG1GC com.sun.javatest.regtest.agent.MainWrapper /openjdk-11/build/linux-x86-normal-server-release/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcbasher_TestGCBasherWithG1_java/gc/stress/gcbasher/TestGCBasherWithG1.d/main.0.jta 120000

Host: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 8 cores, 31G, Fedora release 30 (Rawhide)
Time: Tue May  7 14:01:00 2019 CEST elapsed time: 3 seconds (0d 0h 0m 3s)

---------------  T H R E A D  ---------------

Current thread (0xd5c3f800):  GCTaskThread "GC Thread#5" [stack: 0xd3978000,0xd39f8000] [id=165]

Stack: [0xd3978000,0xd39f8000],  sp=0xd39f6ec0,  free space=507k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x5b194f]  HeapRegion::block_size(HeapWord const*) const+0x7f
V  [libjvm.so+0x563edf]  G1ContiguousSpace::block_start(void const*)+0x9f
V  [libjvm.so+0x5ad4f9]  G1RemSet::refine_card_during_gc(signed char*, G1ScanObjsDuringUpdateRSClosure*) [clone .part.0]+0xf9
V  [libjvm.so+0x5b2531]  G1RefineCardClosure::do_card_ptr(signed char*, unsigned int)+0x21
V  [libjvm.so+0x51854a]  DirtyCardQueueSet::apply_closure_to_completed_buffer(CardTableEntryClosure*, unsigned int, unsigned int, bool)+0x13a
V  [libjvm.so+0x51862c]  DirtyCardQueueSet::apply_closure_during_gc(CardTableEntryClosure*, unsigned int)+0x1c
V  [libjvm.so+0x559b2a]  G1CollectedHeap::iterate_dirty_card_closure(CardTableEntryClosure*, unsigned int)+0x3a
V  [libjvm.so+0x5af03e]  G1RemSet::update_rem_set(G1ParScanThreadState*, unsigned int)+0x11e
V  [libjvm.so+0x5af1be]  G1RemSet::oops_into_collection_set_do(G1ParScanThreadState*, unsigned int)+0x1e
V  [libjvm.so+0x568def]  G1ParTask::work(unsigned int)+0x12f
V  [libjvm.so+0xba1620]  GangWorker::loop()+0x70
V  [libjvm.so+0xb07ca7]  Thread::call_run()+0x157
V  [libjvm.so+0x98afe2]  thread_native_entry(Thread*)+0x112
C  [libpthread.so.0+0x7595]  start_thread+0x105

[...]

--- Additional comment from Severin Gehwolf on 2019-05-09 11:52:06 CEST ---

(In reply to Severin Gehwolf from comment #27)
> In an i686 chroot, running GCBasher with G1 seems to reproduce quite nicely:

It needs to be run in a loop with a bound of ~50 iterations, since the crash does not occur on every run. It passes with Parallel GC (-XX:+UseParallelGC).

--- Additional comment from Severin Gehwolf on 2019-05-10 12:10:23 CEST ---

Using -fno-tree-ch seems to fix the issue.

--- Additional comment from Severin Gehwolf on 2019-05-10 16:51:21 CEST ---

Candidate fix, at least to get past the random failures:

https://src.fedoraproject.org/rpms/java-11-openjdk/pull-request/45

I'll continue investigating what exactly is causing this (a GCC bug
or an OpenJDK issue), starting by figuring out which object file causes the problem.

--- Additional comment from Severin Gehwolf on 2019-05-13 10:32:09 CEST ---

The bad object file seems to be:

g1CollectedHeap.o

--- Additional comment from Severin Gehwolf on 2019-05-13 10:34:13 CEST ---

(In reply to Severin Gehwolf from comment #29)
> Using -fno-tree-ch seems to fix the issue.

Commit of the hotspot-tools-find-compile-flag repo which was used to find this:
https://github.com/jerboaa/hotspot-tools-find-compile-flag/commit/69f9eebe17ee723b862966a39eff59f88bb2b015
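
The linked tool is not reproduced here. As a rough sketch of the general idea only, one could try candidate -fno-* flags one at a time and keep those for which the crash no longer reproduces. The check command and the candidate list are assumptions: the caller supplies a hypothetical command that rebuilds HotSpot with the extra flag and runs the reproducer loop, and candidates could be taken from the optimizations that "g++ -Q -O3 --help=optimizers" reports as enabled.

```bash
#!/bin/bash
# Usage: find_fixing_flags CHECK_CMD FLAG...
# CHECK_CMD is hypothetical (not part of the linked tool): it receives one
# flag, rebuilds HotSpot with that flag appended, runs the reproducer loop,
# and returns 0 if no crash occurred.
# Prints every candidate flag for which CHECK_CMD succeeded.
find_fixing_flags() {
  local check="$1" flag
  shift
  for flag in "$@"; do
    if "$check" "$flag"; then
      printf '%s\n' "$flag"
    fi
  done
}
```

Because the crash is intermittent, the check is only meaningful if it runs enough iterations (the ~50 mentioned earlier) before declaring a flag good.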

--- Additional comment from Severin Gehwolf on 2019-05-13 15:18:08 CEST ---

(In reply to Severin Gehwolf from comment #31)
> The bad object file seems to be:
> 
> g1CollectedHeap.o

Commit of the hotspot-tools-find-bad-object repo which was used to find this:
https://github.com/jerboaa/hotspot-tools-find-bad-object/commit/a002bebc26897d8bddf754648a12f7b3931a04a0
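
Again, the linked tool is not reproduced here. A minimal sketch of how such a search could work is a bisection over the object files: take half of them from the suspect build and the rest from a known-good build (for example one compiled with -fno-tree-ch), relink, and test. The check command is hypothetical and supplied by the caller, and the sketch assumes that exactly one object file is responsible.

```bash
#!/bin/bash
# Usage: find_bad_object CHECK_CMD OBJ...
# CHECK_CMD is hypothetical: it receives the object files to take from the
# suspect build (all others come from the known-good build), relinks
# libjvm.so, runs the reproducer loop, and returns 0 if no crash occurred.
# Assumes exactly one object file is responsible for the crash.
find_bad_object() {
  local check="$1"
  shift
  local objs=("$@") half
  while [ "${#objs[@]}" -gt 1 ]; do
    half=$(( ${#objs[@]} / 2 ))
    if "$check" "${objs[@]:0:half}"; then
      objs=("${objs[@]:half}")      # first half is clean; culprit is in the rest
    else
      objs=("${objs[@]:0:half}")    # crash reproduced; culprit is in the first half
    fi
  done
  printf '%s\n' "${objs[0]}"
}
```

Each step halves the candidate set, so even a few hundred object files need only a handful of relink-and-test rounds.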

--- Additional comment from Severin Gehwolf on 2019-05-13 16:48 CEST ---



--- Additional comment from Severin Gehwolf on 2019-05-13 17:21:17 CEST ---

For the "bad" case, compile the pre-processed file from comment 34 (g1CollectedHeap.o.cpp) with:

/usr/bin/g++ -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -D_REENTRANT -pipe -fno-rtti -fno-exceptions -fvisibility=hidden -fno-strict-aliasing -fno-omit-frame-pointer -fcheck-new -fstack-protector -std=gnu++98 -DSUPPORTS_CLOCK_MONOTONIC -DLINUX -Wpointer-arith -Wsign-compare -Wunused-function -Wundef -Wformat=2 -Wunused-value -Woverloaded-virtual -Wreturn-type -fPIC -DVM_LITTLE_ENDIAN -march=i586 -fno-delete-null-pointer-checks -fno-lifetime-dse -Wno-format-zero-length -Wtype-limits -Wuninitialized -m32 -DPRODUCT -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_linux -DINCLUDE_SUFFIX_CPU=_x86 -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DIA32 -DHOTSPOT_LIB_ARCH='"i386"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED -DINCLUDE_JVMCI=0 -DINCLUDE_AOT=0 -DINCLUDE_ZGC=0 -m32 -g -O3 -fno-PIC -DTHIS_FILE='""' -c -MMD -MF /openjdk-11/build/linux-x86-normal-server-release/hotspot/variant-server/libjvm/objs/g1CollectedHeap.d -o g1CollectedHeap.o g1CollectedHeap.o.cpp

For the "good" case, compile the pre-processed file from comment 34 with (-fno-tree-ch added):

/usr/bin/g++ -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -D_REENTRANT -pipe -fno-rtti -fno-exceptions -fvisibility=hidden -fno-strict-aliasing -fno-omit-frame-pointer -fcheck-new -fstack-protector -std=gnu++98 -DSUPPORTS_CLOCK_MONOTONIC -DLINUX -Wpointer-arith -Wsign-compare -Wunused-function -Wundef -Wformat=2 -Wunused-value -Woverloaded-virtual -Wreturn-type -fPIC -DVM_LITTLE_ENDIAN -march=i586 -fno-delete-null-pointer-checks -fno-lifetime-dse -Wno-format-zero-length -Wtype-limits -Wuninitialized -m32 -DPRODUCT -DTARGET_ARCH_x86 -DINCLUDE_SUFFIX_OS=_linux -DINCLUDE_SUFFIX_CPU=_x86 -DINCLUDE_SUFFIX_COMPILER=_gcc -DTARGET_COMPILER_gcc -DIA32 -DHOTSPOT_LIB_ARCH='"i386"' -DCOMPILER1 -DCOMPILER2 -DDTRACE_ENABLED -DINCLUDE_JVMCI=0 -DINCLUDE_AOT=0 -DINCLUDE_ZGC=0 -m32 -g -O3 -fno-tree-ch -fno-PIC -DTHIS_FILE='""' -c -MMD -MF /openjdk-11/build/linux-x86-normal-server-release/hotspot/variant-server/libjvm/objs/g1CollectedHeap.d -o g1CollectedHeap.o g1CollectedHeap.o.cpp
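
The two command lines are long enough that the single difference is easy to miss. A small helper along these lines (the function name is made up for illustration) prints the arguments present in one command line but not in the other, assuming arguments contain no embedded whitespace:

```bash
#!/bin/bash
# Usage: extra_args FIRST_CMDLINE SECOND_CMDLINE
# Prints every argument of the second command line that is absent from the
# first. Assumes arguments contain no embedded whitespace.
extra_args() {
  local first=" $1 " arg
  set -f                        # no filename expansion while word-splitting
  for arg in $2; do
    case "$first" in
      *" $arg "*) ;;            # present in both command lines
      *) printf '%s\n' "$arg" ;;
    esac
  done
  set +f                        # note: re-enables globbing unconditionally
}
```

Called with the "bad" command line first and the "good" one second, it should print only -fno-tree-ch; with the arguments swapped it should print nothing.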

--- Additional comment from Severin Gehwolf on 2019-05-14 15:32:23 CEST ---

This reproduces with GCC 9.1.1 as well.

Comment 1 Severin Gehwolf 2019-05-21 11:55:52 UTC
# cat reproducer_jdk8.sh 
#!/bin/bash
JDK=/usr/lib/jvm/java-1.8.0-openjdk
GC_BASHER_CLASSES=/gc-basher-classes-jdk8
for i in $(seq 50); do
  ${JDK}/bin/java \
    -cp $GC_BASHER_CLASSES \
    -Xmx256m \
    -XX:+UseG1GC TestGCBasherWithG1 120000
  retval=$?
  echo "iteration $i"
  if [ $retval -ne 0 ]; then
    exit $retval
  fi
done
exit $retval

<mock-chroot> sh-5.0# bash reproducer_jdk8.sh 
iteration 1
iteration 2
iteration 3
iteration 4
iteration 5
iteration 6
iteration 7
iteration 8
iteration 9
iteration 10
iteration 11
iteration 12
iteration 13
iteration 14
iteration 15
iteration 16
iteration 17
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xf71b79e6, pid=850, tid=0xe66ffb40
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b04) (build 1.8.0_212-b04)
# Java VM: OpenJDK Server VM (25.212-b04 mixed mode linux-x86 )
# Problematic frame:
# V  [libjvm.so+0x4349e6]  G1BlockOffsetArrayContigSpace::block_start_unsafe(void const*)+0x76
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# //hs_err_pid850.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
reproducer_jdk8.sh: line 9:   850 Aborted                 (core dumped) ${JDK}/bin/java -cp $GC_BASHER_CLASSES -Xmx256m -XX:+UseG1GC TestGCBasherWithG1 120000
iteration 18

Comment 2 Severin Gehwolf 2019-05-21 11:57:54 UTC
Created attachment 1571596 [details]
hs_err file produced with JDK 8u212-b04 and reproducer_jdk8.sh

Comment 3 Severin Gehwolf 2019-05-21 12:02:06 UTC
Created attachment 1571597 [details]
JDK 8u compatible patch for TestGCBasher in openjdk-11 sources

Apply this patch to the JDK 11 gcbasher sources in test/hotspot/jtreg/gc/stress/gcbasher of jdk-updates/jdk11u, revision e321dd1f202a (tag jdk-11.0.4+3). Then compile the gcbasher sources with JDK 8u, and you should be able to run the JDK 8 reproducer.

Comment 4 Severin Gehwolf 2019-06-13 15:36:21 UTC
The root cause of this problem is JDK-8225716.

Comment 5 Severin Gehwolf 2019-06-26 16:48:50 UTC
OpenJDK version 8u222-b07 (an EA tag) or later will have the root cause fixed.
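
To check a given build against that threshold, a sketch like the following could be used. It assumes the version string has the form 1.8.0_<update>-b<build>, as in the "build 1.8.0_212-b04" line of the hs_err output above; the function name is made up for illustration, and other formats (for example EA strings with an extra component) are reported as unrecognised rather than guessed at.

```bash
#!/bin/bash
# Usage: has_root_cause_fix VERSION
# Returns 0 if VERSION (form 1.8.0_<update>-b<build>) is 8u222-b07 or later,
# 1 if it is older, and 2 if the string is not in the expected form.
has_root_cause_fix() {
  local v="$1" update build
  case "$v" in
    1.8.0_*-b*) ;;
    *) return 2 ;;
  esac
  update="${v#1.8.0_}"; update="${update%%-b*}"
  build="${v##*-b}"
  # Reject empty or non-numeric parts instead of guessing.
  if [ -z "$update" ] || [ -z "$build" ]; then return 2; fi
  case "$update$build" in *[!0-9]*) return 2 ;; esac
  # Force base 10 so that builds such as "08" are not parsed as octal.
  update=$((10#$update)); build=$((10#$build))
  if [ "$update" -gt 222 ] || { [ "$update" -eq 222 ] && [ "$build" -ge 7 ]; }; then
    return 0
  fi
  return 1
}
```

For example, has_root_cause_fix 1.8.0_212-b04 returns 1 (the crashing build from comment 1), while has_root_cause_fix 1.8.0_222-b10 returns 0.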

Comment 6 Fedora Update System 2019-08-01 08:55:33 UTC
FEDORA-2019-3854a1727e has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-3854a1727e

Comment 7 Fedora Update System 2019-08-01 08:55:38 UTC
FEDORA-2019-146b81efba has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-146b81efba

Comment 8 Fedora Update System 2019-08-02 00:55:06 UTC
java-1.8.0-openjdk-1.8.0.222.b10-0.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-146b81efba

Comment 9 Fedora Update System 2019-08-02 01:24:24 UTC
java-1.8.0-openjdk-1.8.0.222.b10-0.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-3854a1727e

Comment 10 Fedora Update System 2019-08-11 01:12:34 UTC
java-1.8.0-openjdk-1.8.0.222.b10-0.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Comment 11 Fedora Update System 2019-08-11 01:41:59 UTC
java-1.8.0-openjdk-1.8.0.222.b10-0.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.