Your package java-1.8.0-openjdk failed to build from source in current rawhide. https://koji.fedoraproject.org/koji/taskinfo?taskID=17726221 For details on mass rebuild see https://fedoraproject.org/wiki/Fedora_26_Mass_Rebuild
Created attachment 1252603 [details] root.log
Created attachment 1252604 [details] state.log
Ouch. That's a SIGSEGV.
99% GCC 7 related. Is OpenJDK even supposed to build with that?
Not sure. Good question for aph. Adding him to cc.
OpenJDK hasn't much been tested with GCC 7, but it should work. We add a bunch of command line switches, which ought to be enough: -fno-delete-null-pointer-checks -fno-lifetime-dse -fno-strict-aliasing I guess I could try debugging this...
(In reply to Andrew Haley from comment #6) > OpenJDK hasn't much been tested with GCC 7, but it should work. We add a > bunch of command line switches, which ought to be enough: > > -fno-delete-null-pointer-checks -fno-lifetime-dse -fno-strict-aliasing > > I guess I could try debugging this... Unfortunately, all of those are already in use. The SIGSEGV happens 100% of the time.
Created attachment 1255630 [details] hs err
(In reply to Andrew Haley from comment #6) > OpenJDK hasn't much been tested with GCC 7, but it should work. We add a > bunch of command line switches, which ought to be enough: > > -fno-delete-null-pointer-checks -fno-lifetime-dse -fno-strict-aliasing > > I guess I could try debugging this... The first two of those were added for GCC 6. -fno-strict-aliasing has been applied for years. This is the first build with GCC 7, which adds new optimisations: https://gcc.gnu.org/gcc-7/changes.html Maybe a start would be building with -fno-store-merging -fno-code-hoisting -fno-ipa-cp -fno-ipa-vrp -fno-split-loops to turn them off?
For reference, the current configure flags (x86_64) are: '--disable-zip-debug-info --with-milestone=fcs --with-update-version=121 --with-build-number=b14 --with-boot-jdk=/usr/lib/jvm/java-openjdk --with-debug-level=release --enable-unlimited-crypto --enable-system-nss --with-zlib=system --with-libjpeg=system --with-giflib=system --with-libpng=system --with-lcms=bundled --with-stdc++lib=dynamic --with-extra-cxxflags=-g -pipe -Wformat -Wno-cpp -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -std=gnu++98 -fno-delete-null-pointer-checks -fno-lifetime-dse --with-extra-cflags=-g -pipe -Wformat -Wno-cpp -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic -std=gnu++98 -Wno-error -fno-delete-null-pointer-checks -fno-lifetime-dse --with-extra-ldflags=-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld --with-num-cores=48' armv7hl built successfully, which suggests it's failing in the JIT.
> > https://gcc.gnu.org/gcc-7/changes.html > > Maybe a start would be building with -fno-store-merging -fno-code-hoisting > -fno-ipa-cp -fno-ipa-vrp -fno-split-loops to turn them off? Those helped. With all of them, the build "passes" on my local machine (it just died with not enough space on the device, but much later than originally). I will try to find out which one is the crucial one.
(In reply to jiri vanek from comment #11) > > > > https://gcc.gnu.org/gcc-7/changes.html > > > > Maybe a start would be building with -fno-store-merging -fno-code-hoisting > > -fno-ipa-cp -fno-ipa-vrp -fno-split-loops to turn them off? > > Those helped. With all four build "passes" on my local machine (just died > with not enouh space on device:/, but much later then origianlly) > > I will try to try which one is crucial one. Good. We should be able to discover where the actual bug in HotSpot is: while it is possible this is a GCC bug, I greatly doubt it.
So the magic switch necessary for GCC 7 is "-fno-split-loops". I will include it in the RPMs.
(In reply to jiri vanek from comment #13) > So the magical swithc necessary for gcc7 is " -fno-split-loops" > > I will incude in RPMS. Thanks. I had a feeling it might be that one. It's the one only enabled by O3, while the rest are O2.
(In reply to Andrew John Hughes from comment #14) > (In reply to jiri vanek from comment #13) > > So the magical swithc necessary for gcc7 is " -fno-split-loops" > > > > I will incude in RPMS. > > Thanks. I had a feeling it might be that one. It's the one only enabled by > O3, while the rest are O2. That's only papering over the bug. There's no reason that this should break the VM, and it indicates a serious bug in either GCC or HotSpot. We must not simply disable this optimization and let it go.
The bug cannot be in the Shenandoah integration forest, as we are using it on Intel only, but all arches (except arm32) are failing. It may be in the aarch64 integration. I will try that one, then vanilla jdk8 and then jdk9. If the bug is in both vanillas, I will file an upstream bug.
I'll try to narrow down the problem of which object when compiled with the split-loops optimization causes this as I've had some experience with doing this already.
(In reply to Severin Gehwolf from comment #17) > I'll try to narrow down the problem of which object when compiled with the > split-loops optimization causes this as I've had some experience with doing > this already. OK! Pure aarch64/8u fails too. Now I'm running pure jdk8u. I wanted to try pure jdk9 after that, and if both "pure upstreams" fail, submit the upstream bug. Maybe in the meantime you will find the fix for all of them!
(In reply to jiri vanek from comment #18) > > I wonted to try pure jdk9 after that, and if both "pure upstreams" fails > submit the upstream bug. Submit the bug to which upstream, though?
For jdk8 and 9, is there anything other than https://bugs.openjdk.java.net?
> pure aarch64/8u fails too. > Now I'm running pure jdk8u. > yup. jdk8u fails too.
Jdk9 dies in much more terrible way. Most repeated errors are: ++ /usr/bin/tee /builddir/build/BUILD/openjdk/build/linux-x86_64-normal-server-release/bootcycle-build/hotspot/variant-server/libjvm/objs/macroAssembler_x86_exp.o.log g++: internal compiler error: Segmentation fault (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See <http://bugzilla.redhat.com/bugzilla> for instructions. g++: internal compiler error: Killed (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See <http://bugzilla.redhat.com/bugzilla> for instructions. gmake[5]: *** [lib/CompileJvm.gmk:203: /builddir/build/BUILD/openjdk/build/linux-x86_64-normal-server-release/bootcycle-build/hotspot/variant-server/libjvm/objs/escape.o] Error 4 gmake[5]: *** Deleting file '/builddir/build/BUILD/openjdk/build/linux-x86_64-normal-server-release/bootcycle-build/hotspot/variant-server/libjvm/objs/escape.o' gmake[5]: *** [lib/CompileJvm.gmk:203: /builddir/build/BUILD/openjdk/build/linux-x86_64-normal-server-release/bootcycle-build/hotspot/variant-server/libjvm/objs/ad_x86_clone.o] Error 4 + exitcode=4 === Output from failing command(s) repeated here === make/Init.gmk:320: Building on-failure /usr/bin/printf "* For target hotspot_variant-server_libjvm_objs_ad_x86_clone.o:\n" + /usr/bin/printf '* For target hotspot_variant-server_libjvm_objs_ad_x86_clone.o:\n' * For target hotspot_variant-server_libjvm_objs_ad_x86_clone.o: make/Init.gmk:320: Building on-failure (/usr/bin/grep -v -e "^Note: including file:" < /builddir/build/BUILD/openjdk/build/linux-x86_64-normal-server-release/bootcycle-build/make-support/failure-logs/hotspot_variant-server_libjvm_objs_ad_x86_clone.o.log || true) | /usr/bin/head -n 12 + /usr/bin/grep -v -e '^Note: including file:' + /usr/bin/head -n 12 {standard input}: Assembler messages: {standard input}:15680: Warning: end of file not at end of a line; newline inserted {standard input}:15706: Error: unknown 
pseudo-op: `.cfi' {standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive g++: internal compiler error: Killed (program cc1plus) Please submit a full bug report, with preprocessed source if appropriate. See <http://bugzilla.redhat.com/bugzilla> for instructions. make/Init.gmk:320: Building on-failure It went so badly that it even shut down my session. I guess the compilation somehow deadlocked or overflowed and my chroot ran out of RAM. I will postpone the bug report until Severin makes his decision.
-fno-split-loops does not solve the build issue for JDK 9.
fyi: https://bugs.openjdk.java.net/browse/JDK-8175296
According to my script[1], the bad object seems to be psParallelCompact.o I'll try to dig a bit deeper, next. [1] https://github.com/jerboaa/hotspot-tools-find-bad-object
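A search for the one bad object can be pictured as a bisection over the list of object files. The following is only a minimal sketch of that idea and makes no claim about how the linked script actually works; `find_bad_object` and the `fails` predicate are hypothetical names, and the sketch assumes exactly one object is responsible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: bisect a list of object files to find the single one
// that breaks the VM when it is built with the suspect optimization.
// `fails(subset)` stands in for "rebuild with only `subset` compiled with the
// suspect flag, relink, run the reproducer, report whether it crashed".
std::string find_bad_object(
    const std::vector<std::string>& objects,
    const std::function<bool(const std::vector<std::string>&)>& fails) {
    assert(!objects.empty());
    std::size_t lo = 0;
    std::size_t hi = objects.size();  // the culprit is somewhere in [lo, hi)
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        // Only the first half of the current range gets the suspect flag.
        const std::vector<std::string> first_half(objects.begin() + lo,
                                                  objects.begin() + mid);
        if (fails(first_half)) {
            hi = mid;  // crash reproduced: culprit is in the first half
        } else {
            lo = mid;  // no crash: culprit is in the second half
        }
    }
    return objects[lo];
}
```

Each round halves the candidate range, so a few hundred objects need fewer than ten rebuild-and-test cycles under the single-culprit assumption.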
(In reply to Severin Gehwolf from comment #27) > According to my script[1], the bad object seems to be psParallelCompact.o > > I'll try to dig a bit deeper, next. > > [1] https://github.com/jerboaa/hotspot-tools-find-bad-object Compiling the file with -fsanitize=undefined might be instructive.
I've compiled psParallelCompact.o with -fsanitize=undefined -fno-sanitize-recover then relinked the JVM with the same options and ran the reproducer: $ bash reproducer.sh $(pwd)/build/jdk8.build/hotspot/linux_amd64_compiler2/fastdebug/ $(pwd)/build/jdk8.build/images/j2sdk-image/ /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:230:53: runtime error: reference binding to misaligned address 0x7f7c551fdb02 for type 'const struct oop', which requires 8 byte alignment 0x7f7c551fdb02: note: pointer points here 24 08 49 be 00 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] ^
(In reply to Severin Gehwolf from comment #29) > I've compiled psParallelCompact.o with -fsanitize=undefined > -fno-sanitize-recover then relinked the JVM with the same options and ran > the reproducer: > > $ bash reproducer.sh > $(pwd)/build/jdk8.build/hotspot/linux_amd64_compiler2/fastdebug/ > $(pwd)/build/jdk8.build/images/j2sdk-image/ > /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/ > hotspot/src/share/vm/oops/oop.inline.hpp:230:53: runtime error: reference > binding to misaligned address 0x7f7c551fdb02 for type 'const struct oop', > which requires 8 byte alignment > 0x7f7c551fdb02: note: pointer points here > 24 08 49 be 00 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] > ^ Gosh. That's not at all good. I don't think there's anywhere in HotSpot where an OOP should be misaligned, so this is worthy of investigation.
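For readers checking the sanitizer message by hand: an address meets an 8-byte alignment requirement only if its low three bits are zero. The helpers below are a small illustrative sketch (`is_aligned` and `misalignment` are hypothetical names, not HotSpot functions) that can be applied to the address from the report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `addr` is a multiple of `alignment`, which must be a power of two.
inline bool is_aligned(std::uint64_t addr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

// Number of bytes by which `addr` overshoots the previous aligned boundary.
inline std::size_t misalignment(std::uint64_t addr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<std::size_t>(addr & (alignment - 1));
}
```

Applied to 0x7f7c551fdb02 with an alignment of 8, `misalignment` yields 2, which matches the sanitizer's complaint.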
Removing blocked bug since there is a successful rawhide build[1]. The eventual fix might be different, but it's at least self-building with the -fno-split-loops work-around. Root cause investigation is on-going. [1] https://koji.fedoraproject.org/koji/buildinfo?buildID=860652
Created attachment 1256501 [details] Pre-processed psParallelCompact.cpp
(In reply to Andrew Haley from comment #30) > (In reply to Severin Gehwolf from comment #29) > > I've compiled psParallelCompact.o with -fsanitize=undefined > > -fno-sanitize-recover then relinked the JVM with the same options and ran > > the reproducer: > > > > $ bash reproducer.sh > > $(pwd)/build/jdk8.build/hotspot/linux_amd64_compiler2/fastdebug/ > > $(pwd)/build/jdk8.build/images/j2sdk-image/ > > /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/ > > hotspot/src/share/vm/oops/oop.inline.hpp:230:53: runtime error: reference > > binding to misaligned address 0x7f7c551fdb02 for type 'const struct oop', > > which requires 8 byte alignment > > 0x7f7c551fdb02: note: pointer points here > > 24 08 49 be 00 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] > > ^ > > Gosh. That's not at all good. I don't think there's any where in HotSpot > where an OOP should be misaligned, so this is worthy of investigation. Note that this behaves the same for compilations with "-fno-split-loops" and without that flag. Might be a red herring. Not sure. Something to look into either way.
(In reply to Severin Gehwolf from comment #33) > Note that this behaves the same for compilations with "-fno-split-loops" and > without that flag. Might be a red herring. Not sure. Something to look into > either way. Yes. IME it's always worth looking at very unexpected things like this.
Thread 11 "java" hit Breakpoint 3, 0x00007ffff45111f4 in __ubsan_handle_type_mismatch () from /lib64/libubsan.so.0 /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:230:53: runtime error: reference binding to misaligned address 0x7fffe09a78c2 for type 'const struct oop', which requires 8 byte alignment 0x7fffe09a78c2: note: pointer points here 24 08 49 be 10 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] ^ 0x00007ffff68287e9 in oopDesc::load_heap_oop (p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:230 230 inline oop oopDesc::load_heap_oop(oop* p) { return *p; } (gdb) where #0 0x00007ffff68287e9 in oopDesc::load_heap_oop (p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:230 #1 PSParallelCompact::adjust_pointer<oop> (p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp:1372 #2 PSParallelCompact::AdjustPointerClosure::do_oop (this=this@entry=0x7ffff7196d30 <PSParallelCompact::_adjust_pointer_closure>, p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:827 #3 0x00007ffff66f996e in nmethod::oops_do (this=this@entry=0x7fffe09a7650, f=0x7ffff7196d30 <PSParallelCompact::_adjust_pointer_closure>, allow_zombie=allow_zombie@entry=false) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/nmethod.cpp:2249 #4 0x00007ffff63744f7 in nmethod::oops_do (f=<optimized out>, this=0x7fffe09a7650) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/nmethod.hpp:630 #5 
CodeBlobToOopClosure::do_nmethod (nm=0x7fffe09a7650, this=0x7fffd7ffe530) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/memory/iterator.cpp:51 #6 CodeBlobToOopClosure::do_code_blob (this=0x7fffd7ffe530, cb=0x7fffe09a7650) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/memory/iterator.cpp:60 #7 0x00007ffff5f973cc in CodeCache::blobs_do (f=f@entry=0x7fffd7ffe530) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/codeCache.cpp:328 #8 0x00007ffff6820950 in PSParallelCompact::adjust_roots () at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2476 #9 0x00007ffff6835229 in PSParallelCompact::invoke_no_policy (maximum_heap_compaction=<optimized out>) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2089 #10 0x00007ffff6839a39 in PSParallelCompact::invoke_no_policy (maximum_heap_compaction=<optimized out>) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2253 #11 0x00007ffff6853635 in PSScavenge::invoke () at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psScavenge.cpp:249 #12 0x00007ffff679514a in ParallelScavengeHeap::failed_mem_allocate (this=this@entry=0x7ffff001dd30, size=1697) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp:448 #13 0x00007ffff6aaad0d in VM_ParallelGCFailedAllocation::doit (this=0x7ffff7fdfd90) at 
/builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/vmPSOperations.cpp:48 #14 0x00007ffff6ad0796 in VM_Operation::evaluate (this=this@entry=0x7ffff7fdfd90) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vm_operations.cpp:62 #15 0x00007ffff6acd830 in VMThread::evaluate_operation (this=this@entry=0x7ffff010f800, op=0x7ffff7fdfd90) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:377 #16 0x00007ffff6ace59d in VMThread::loop (this=this@entry=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:502 #17 0x00007ffff6acea96 in VMThread::run (this=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276 #18 0x00007ffff6757b22 in java_start (thread=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:782 #19 0x00007ffff7bbe5bd in start_thread () from /lib64/libpthread.so.0 #20 0x00007ffff72cec3f in clone () from /lib64/libc.so.6 Backtrace when one of the misaligned accesses happens.
another one: Thread 11 "java" hit Breakpoint 3, 0x00007ffff45111f4 in __ubsan_handle_type_mismatch () from /lib64/libubsan.so.0 /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oopsHierarchy.hpp:92:36: runtime error: member call on misaligned address 0x7fffe09a78c2 for type 'const struct oop', which requires 8 byte alignment 0x7fffe09a78c2: note: pointer points here 24 08 49 be 10 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] ^ 0x00007ffff68287d5 in oop::oop (o=..., this=0x7fffd7ffe300) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oopsHierarchy.hpp:92 92 oop(const oop& o) { set_obj(o.obj()); } (gdb) bt #0 0x00007ffff68287d5 in oop::oop (o=..., this=0x7fffd7ffe300) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oopsHierarchy.hpp:92 #1 oopDesc::load_heap_oop (p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:230 #2 PSParallelCompact::adjust_pointer<oop> (p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp:1372 #3 PSParallelCompact::AdjustPointerClosure::do_oop (this=this@entry=0x7ffff7196d30 <PSParallelCompact::_adjust_pointer_closure>, p=0x7fffe09a78c2) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:827 #4 0x00007ffff66f996e in nmethod::oops_do (this=this@entry=0x7fffe09a7650, f=0x7ffff7196d30 <PSParallelCompact::_adjust_pointer_closure>, allow_zombie=allow_zombie@entry=false) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/nmethod.cpp:2249 #5 0x00007ffff63744f7 in nmethod::oops_do (f=<optimized out>, this=0x7fffe09a7650) 
at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/nmethod.hpp:630 #6 CodeBlobToOopClosure::do_nmethod (nm=0x7fffe09a7650, this=0x7fffd7ffe530) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/memory/iterator.cpp:51 #7 CodeBlobToOopClosure::do_code_blob (this=0x7fffd7ffe530, cb=0x7fffe09a7650) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/memory/iterator.cpp:60 #8 0x00007ffff5f973cc in CodeCache::blobs_do (f=f@entry=0x7fffd7ffe530) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/code/codeCache.cpp:328 #9 0x00007ffff6820950 in PSParallelCompact::adjust_roots () at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2476 #10 0x00007ffff6835229 in PSParallelCompact::invoke_no_policy (maximum_heap_compaction=<optimized out>) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2089 #11 0x00007ffff6839a39 in PSParallelCompact::invoke_no_policy (maximum_heap_compaction=<optimized out>) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp:2253 #12 0x00007ffff6853635 in PSScavenge::invoke () at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/psScavenge.cpp:249 #13 0x00007ffff679514a in ParallelScavengeHeap::failed_mem_allocate (this=this@entry=0x7ffff001dd30, size=1697) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp:448 #14 0x00007ffff6aaad0d in 
VM_ParallelGCFailedAllocation::doit (this=0x7ffff7fdfd90) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/vmPSOperations.cpp:48 #15 0x00007ffff6ad0796 in VM_Operation::evaluate (this=this@entry=0x7ffff7fdfd90) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vm_operations.cpp:62 #16 0x00007ffff6acd830 in VMThread::evaluate_operation (this=this@entry=0x7ffff010f800, op=0x7ffff7fdfd90) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:377 #17 0x00007ffff6ace59d in VMThread::loop (this=this@entry=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:502 #18 0x00007ffff6acea96 in VMThread::run (this=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276 #19 0x00007ffff6757b22 in java_start (thread=0x7ffff010f800) at /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:782 #20 0x00007ffff7bbe5bd in start_thread () from /lib64/libpthread.so.0 #21 0x00007ffff72cec3f in clone () from /lib64/libc.so.6
(In reply to Severin Gehwolf from comment #33) > (In reply to Andrew Haley from comment #30) > > (In reply to Severin Gehwolf from comment #29) > > > I've compiled psParallelCompact.o with -fsanitize=undefined > > > -fno-sanitize-recover then relinked the JVM with the same options and ran > > > the reproducer: > > > > > > $ bash reproducer.sh > > > $(pwd)/build/jdk8.build/hotspot/linux_amd64_compiler2/fastdebug/ > > > $(pwd)/build/jdk8.build/images/j2sdk-image/ > > > /builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/ > > > hotspot/src/share/vm/oops/oop.inline.hpp:230:53: runtime error: reference > > > binding to misaligned address 0x7f7c551fdb02 for type 'const struct oop', > > > which requires 8 byte alignment > > > 0x7f7c551fdb02: note: pointer points here > > > 24 08 49 be 00 00 00 ca 06 00 00 00 4c 89 74 24 38 4c [...] > > > ^ > > > > Gosh. That's not at all good. I don't think there's any where in HotSpot > > where an OOP should be misaligned, so this is worthy of investigation. > > Note that this behaves the same for compilations with "-fno-split-loops" and > without that flag. Might be a red herring. Not sure. Something to look into > either way. Do you get the same thing with GCC 6?
(In reply to Andrew John Hughes from comment #37) > Do you get the same thing with GCC 6? This is a rawhide mock I'm working with and there is only gcc 7. I don't know whether this is also happening with GCC 6. It would take me a while to find out. I might find out later.
(In reply to Severin Gehwolf from comment #38) > (In reply to Andrew John Hughes from comment #37) > > Do you get the same thing with GCC 6? AFAIK it's not reproducible on any other (GCC 6) Fedora => GCC 7 only.
(In reply to jiri vanek from comment #39) > (In reply to Severin Gehwolf from comment #38) > > (In reply to Andrew John Hughes from comment #37) > > > Do you get the same thing with GCC 6? > > AFAIK its not reproducible in any other (gcc6) fedora => gcc7 only. Have you actually tried this? He asked about the runtime error when compiled with -fsanitize=undefined on GCC 6. Not whether the bootcycle build segfaults. There is a difference.
I've narrowed it down to function PSParallelCompact::decrement_destination_counts in psParallelCompact.cpp. When I make gcc not do the split loops optimization for that function the reproducer passes.
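GCC's `optimize` function attribute is one way to switch off a single optimization for a single function. The sketch below is an assumption about how such a change could look and is not the content of the attached patch; `NO_SPLIT_LOOPS` and `weighted_sum` are hypothetical names. The loop has the shape that loop splitting targets: a condition that is true for one part of the iteration space and false for the rest.

```cpp
#include <cassert>
#include <cstddef>

// -fsplit-loops exists from GCC 7 onwards; other compilers (and older GCC)
// get an empty macro so the code still compiles unchanged.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7
#  define NO_SPLIT_LOOPS __attribute__((optimize("no-split-loops")))
#else
#  define NO_SPLIT_LOOPS
#endif

// Hypothetical stand-in for a function whose loop GCC might split at `split`.
// The attribute asks GCC to compile just this function as if
// -fno-split-loops had been given.
NO_SPLIT_LOOPS
long weighted_sum(const int* values, std::size_t n, std::size_t split) {
    assert(values != nullptr || n == 0);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i < split) {
            total += values[i];      // first part of the iteration space
        } else {
            total += 2L * values[i]; // remainder of the iteration space
        }
    }
    return total;
}
```

The GCC manual describes the `optimize` attribute as a debugging aid rather than something for production code, which fits its use here: narrowing a miscompilation down to one function.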
Created attachment 1256869 [details] Patch for decrement_destination_counts to disable split-loops optimization
Created attachment 1256871 [details] Disassembly of decrement_destination_counts that's broken by split-loops
Created attachment 1256872 [details] Disassembly of decrement_destination_counts without split-loops (working)
Andrew, I'm out of my depth here :-/ What now? Does this give us enough info to tell what the actual problem is?
(In reply to Severin Gehwolf from comment #40) > (In reply to jiri vanek from comment #39) > > (In reply to Severin Gehwolf from comment #38) > > > (In reply to Andrew John Hughes from comment #37) > > > > Do you get the same thing with GCC 6? > > > > AFAIK its not reproducible in any other (gcc6) fedora => gcc7 only. > > Have you actually tried this? He asked about the runtime error when compiled > with -fsanitize=undefined on GCC 6. Not whether the bootcycle build > segfaults. There is a difference.
--- a/java-1.8.0-openjdk.spec
+++ b/java-1.8.0-openjdk.spec
@@ -1328,8 +1328,8 @@ export CFLAGS="$CFLAGS -mieee"
 # We use ourcppflags because the OpenJDK build seems to
 # pass EXTRA_CFLAGS to the HotSpot C++ compiler...
 # Explicitly set the C++ standard as the default has changed on GCC >= 6
-EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error -fno-delete-null-pointer-checks -fno-lifetime-dse"
-EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks -fno-lifetime-dse"
+EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error -fno-delete-null-pointer-checks -fno-lifetime-dse -fsanitize=undefined"
+EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks -fno-lifetime-dse -fsanitize=undefined"
 %ifarch %{power64} ppc
 # fix rpmlint warnings
 EXTRA_CFLAGS="$EXTRA_CFLAGS -fno-strict-aliasing"
@@ -1372,7 +1372,7 @@ bash ../../configure \
 --with-stdc++lib=dynamic \
 --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \
 --with-extra-cflags="$EXTRA_CFLAGS" \
- --with-extra-ldflags="%{ourldflags}" \
+ --with-extra-ldflags="%{ourldflags} -fsanitize=undefined" \
 --with-num-cores="$NUM_PROC"
 
 cat spec.gmk
RPMs for F24 and F25 build fine.
(In reply to jiri vanek from comment #47) > (In reply to Severin Gehwolf from comment #40) > > (In reply to jiri vanek from comment #39) > > > (In reply to Severin Gehwolf from comment #38) > > > > (In reply to Andrew John Hughes from comment #37) > > > > > Do you get the same thing with GCC 6? > > > > > > AFAIK its not reproducible in any other (gcc6) fedora => gcc7 only. > > > > Have you actually tried this? He asked about the runtime error when compiled > > with -fsanitize=undefined on GCC 6. Not whether the bootcycle build > > segfaults. There is a difference. > > - a/java-1.8.0-openjdk.spec > +++ b/java-1.8.0-openjdk.spec > @@ -1328,8 +1328,8 @@ export CFLAGS="$CFLAGS -mieee" > # We use ourcppflags because the OpenJDK build seems to > # pass EXTRA_CFLAGS to the HotSpot C++ compiler... > # Explicitly set the C++ standard as the default has changed on GCC >= 6 > -EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error > -fno-delete-null-pointer-checks -fno-lifetime-dse" > -EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks > -fno-lifetime-dse" > +EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error > -fno-delete-null-pointer-checks -fno-lifetime-dse -fsanitize=undefined" > +EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks > -fno-lifetime-dse -fsanitize=undefined" > %ifarch %{power64} ppc > # fix rpmlint warnings > EXTRA_CFLAGS="$EXTRA_CFLAGS -fno-strict-aliasing" > @@ -1372,7 +1372,7 @@ bash ../../configure \ > --with-stdc++lib=dynamic \ > --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \ > --with-extra-cflags="$EXTRA_CFLAGS" \ > - --with-extra-ldflags="%{ourldflags}" \ > + --with-extra-ldflags="%{ourldflags} -fsanitize=undefined" \ > --with-num-cores="$NUM_PROC" > > cat spec.gmk > > > > > rpms for f24 and f25 builds fine. How about -fsanitize=undefined -fno-sanitize-recover ?
HTH
@@ -1329,7 +1329,7 @@ export CFLAGS="$CFLAGS -mieee"
 # pass EXTRA_CFLAGS to the HotSpot C++ compiler...
 # Explicitly set the C++ standard as the default has changed on GCC >= 6
 EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error -fno-delete-null-pointer-checks -fno-lifetime-dse"
-EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks -fno-lifetime-dse"
+EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks -fno-lifetime-dse -fsanitize=undefined -fno-sanitize-recover"
 %ifarch %{power64} ppc
 # fix rpmlint warnings
 EXTRA_CFLAGS="$EXTRA_CFLAGS -fno-strict-aliasing"
@@ -1372,7 +1372,7 @@ bash ../../configure \
 --with-stdc++lib=dynamic \
 --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \
 --with-extra-cflags="$EXTRA_CFLAGS" \
- --with-extra-ldflags="%{ourldflags}" \
+ --with-extra-ldflags="%{ourldflags} -fsanitize=undefined -fno-sanitize-recover" \
 --with-num-cores="$NUM_PROC"
 
 cat spec.gmk
450MB build log: http://raven.brq.redhat.com/openjdk/rebuildFor1423751/jdk8FsanitizeUndefined.log.tar.xz
(In reply to jiri vanek from comment #47) > > - a/java-1.8.0-openjdk.spec > +++ b/java-1.8.0-openjdk.spec > @@ -1328,8 +1328,8 @@ export CFLAGS="$CFLAGS -mieee" > # We use ourcppflags because the OpenJDK build seems to > # pass EXTRA_CFLAGS to the HotSpot C++ compiler... > # Explicitly set the C++ standard as the default has changed on GCC >= 6 > -EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error > -fno-delete-null-pointer-checks -fno-lifetime-dse" > -EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks > -fno-lifetime-dse" > +EXTRA_CFLAGS="%ourcppflags -std=gnu++98 -Wno-error > -fno-delete-null-pointer-checks -fno-lifetime-dse -fsanitize=undefined" > +EXTRA_CPP_FLAGS="%ourcppflags -std=gnu++98 -fno-delete-null-pointer-checks > -fno-lifetime-dse -fsanitize=undefined" > %ifarch %{power64} ppc > # fix rpmlint warnings > EXTRA_CFLAGS="$EXTRA_CFLAGS -fno-strict-aliasing" > @@ -1372,7 +1372,7 @@ bash ../../configure \ > --with-stdc++lib=dynamic \ > --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \ > --with-extra-cflags="$EXTRA_CFLAGS" \ > - --with-extra-ldflags="%{ourldflags}" \ > + --with-extra-ldflags="%{ourldflags} -fsanitize=undefined" \ > --with-num-cores="$NUM_PROC" I think that would build everything with -fsanitize=undefined. Not a good idea.
(In reply to Andrew John Hughes from comment #37) > Do you get the same thing with GCC 6? FWIW, finding the needle in a haystack (from Jiri's log): $ grep -n 'runtime error' jdk8FsanitizeUndefined.log | wc -l 820676
Created attachment 1256969 [details] Reproducer (works after a mock rebuild)
Run as:
1. copy the reproducer into the rawhide mock (into dir builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/)
2. mock -r fedora-rawhide-x86_64 --shell
3. cd builddir/build/BUILD/java-1.8.0-openjdk-1.8.0.121-5.b14.fc26.x86_64/openjdk/
4. bash reproducer.sh $(pwd)/build/jdk8.build/hotspot/linux_amd64_compiler2/fastdebug/ $(pwd)/build/jdk8.build/images/j2sdk-image/
The crash is in the parallel compaction routines where an object is split across two regions. What happens here is that different threads process the two regions, so each copies part of the object. These do not have their references updated until the end of collection: instead, each region has a pointer to any object that straddles the region boundary, and they get updated at the end in a cleanup pass. In this case, the pointer is pointing to a partial object.
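The notion of an object straddling a region boundary can be made concrete with a little arithmetic. This is a deliberately simplified model and not HotSpot's code: `Split`, `split_at_region_boundary`, and the idea of measuring everything in words from the start of the space are assumptions for illustration, and the region size is left as a parameter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of how an object is divided by a region boundary.
struct Split {
    bool straddles;          // true if the object crosses a region boundary
    std::size_t head_words;  // words that lie in the region where it starts
    std::size_t tail_words;  // words that spill past that region's end
};

// `start_word` is the object's offset from the start of the space, in words;
// `size_words` is its length; `region_words` is the (assumed fixed) region size.
Split split_at_region_boundary(std::size_t start_word,
                               std::size_t size_words,
                               std::size_t region_words) {
    assert(region_words > 0);
    // Words remaining in the starting region, counted from the object's start.
    const std::size_t room = region_words - (start_word % region_words);
    if (size_words <= room) {
        return {false, size_words, 0};  // fits entirely in one region
    }
    return {true, room, size_words - room};
}
```

In this model, an object of 20 words starting at word 500 with 512-word regions has 12 words in the first region and 8 in the next. When different threads handle those two regions, each one copies only its own part, which is the situation the comment above describes.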
And: this fault is in the middle of a 189721-character string which (I think) holds an entire Java source file.
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle. Changing version to '26'.
(In reply to Andrew Haley from comment #59) > The crash is in the parallel compaction routines where an object is split > across two regions. What happens here is that different threads process the > two regions, so each copies part of the object. These do not have their > references updated until the end of collection: instead, each region has a > pointer to any object that straddles the region boundary, and they get > updated at the end in a cleanup pass. In this case, the pointer is pointing > to a partial object. I'm not sure why the split-loops optimization exhibits this bug. Does this mean it's OK to use -fno-split-loops and this needs to get upstream?
(In reply to Severin Gehwolf from comment #62) > > I'm not sure why the split-loops optimization exhibits this bug. Does this > mean it's OK to use -fno-split-loops and this needs to get upstream? No. I've been debugging it for some time. It's either a bug in gcc or a bug in hotspot, and we need to know which. If it's desperately urgent to get a working build into Fedora, then we can use -fno-split-loops as a temporary workaround.
(In reply to Andrew Haley from comment #63) > (In reply to Severin Gehwolf from comment #62) > > > > I'm not sure why the split-loops optimization exhibits this bug. Does this > > mean it's OK to use -fno-split-loops and this needs to get upstream? > > No. I've been debugging it for some time. It's either a bug in gcc or a > bug in hotspot, and we need to know which. > > If it's desperately urgent to get a working build into Fedora, then we can > use -fno-split-loops as a temporary workaround. Thanks! Current F26+ builds use -fno-split-loops work-around so it's not urgent.
Xref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79943
Since we now know it was a GCC bug, -fno-split-loops was the correct fix. The bug was in that optimization code. Even so, we should get this flag removed again once a newer GCC is in rawhide. The following build should have the fix (once it completes): https://koji.fedoraproject.org/koji/buildinfo?buildID=866355 Jiri, would you please take care of that?
Sure. TY!
All work here is done. GCC is fixed, and RPMs built with it have been submitted as updates.