The following piece of code produce incorrect result on java -version openjdk version "1.8.0_60" OpenJDK Runtime Environment (build 1.8.0_60-b16) OpenJDK 64-Bit Server VM (build 25.60-b16, mixed mode) If I uncomment first System.err.println or print k (currently commented out) inside the loop in the second System.err.println - no such problem arise. java -Xint also produce correct results. It is hard for me to present the unit case (code base is huge), I am presenting the relevant part: ---------------- public double [] getX(final double [] moms) { final double [] r=new double [moms.length-1]; final double [] rmoms=new double [moms.length]; //System.err.println("INIT:rmoms="+rmoms+" moms="+moms+" r="+r); for(int k=0;k<r.length;k++) { final double [] tmp=new double [k+1]; tmp[k]=1; setxP(rmoms,tmp); System.err.println("k="/*+k*/+" rmoms="+rmoms+" moms="+moms+" r="+r); r[k]=TMatr.mulVV(k+2,rmoms,moms); // Line 544 !!! Arguments are null !!! } return r; } k= rmoms=[D@4784013e moms=[D@3a91d146 r=[D@6f952d6c k= rmoms=[D@5965844d moms=[D@3a095ec0 r=[D@6d4a65c6 k= rmoms=null moms=null r=null Exception in thread "main" java.lang.NullPointerException at com.polytechnik.utils.TMatr.mulVV(TMatr.java:269) at com.polytechnik.utils.TMatr.mulVV(TMatr.java:113) at YY.getX(YY.java:544) .............. ---------------------------------------
In java 1.7 this worked OK. Also, other issues in java 1.7 I was able to solve using options -XX:-UseLoopPredicate -XX:+UseSerialGC -Xshare:off In this java 1.8 these options do not help.
The bug is that hotspot made the arguments rmoms,moms on line 544 being equal to null.
Option -XX:InlineSmallCode=0 DOES HELP and unit test passes OK Option -XX:FreqInlineSize=0 DOES HELP and unit test passes OK Option -XX:LoopUnrollLimit=0 DOES NOT HELP, same problem.
Created attachment 1040981 [details] working code I managed to reproduce the problem --------- passes OK java -XX:InlineSmallCode=0 YY y=6405.9478662121455 ------------ fails java YY Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88) ------------------
Created attachment 1040983 [details] .class file .class file
With uncommented line 32 System.err.println("k="/*+k*/+" rmoms="+rmoms+" moms="+moms+" r="+r); java YY ............ k= rmoms=[D@1efe439d moms=[D@be68757 r=[D@7d446ed1 k= rmoms=[D@12d2ce03 moms=[D@7e5c856f r=[D@413f69cc k= rmoms=null moms=null r=null Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88)
I tried to report this bug to http://openjdk.java.net/ - no luck, they do not accept postings from outside. Can somebody escalate this issue upstream.
Assigning to Andrew to investigate
On my computers: java -XX:InlineSmallCode=927 passes OK java -XX:InlineSmallCode=928 FAILS
Tried on a completely different computer with different java8 version java -version openjdk version "1.8.0_45" OpenJDK Runtime Environment (build 1.8.0_45-b14) OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode) same thing k= rmoms=[D@4b1d6571 moms=[D@1b835480 r=[D@3549bca9 k= rmoms=[D@4b1d6571 moms=[D@1b835480 r=[D@3549bca9 k= rmoms=[D@4b1d6571 moms=[D@1b835480 r=[D@3549bca9 k= rmoms=[D@4b1d6571 moms=[D@1b835480 r=[D@3549bca9 k= rmoms=[D@4f25b795 moms=[D@6fb365ed r=[D@6e950bcf k= rmoms=null moms=null r=null Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88) $ java -version openjdk version "1.8.0_45" OpenJDK Runtime Environment (build 1.8.0_45-b14) OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)
After some effort tried to post this bug to http://bugs.java.com , got a response (Review ID: JI-9021717) - hotspot incorrect inline optimization From my past experience at sun/oracle almost nobody watches bug reports. In this sense redhat bug trucking system is better, especially if people can escalate the bug report upstream.
oracle java /usr/local/jdk1.8.0_45/bin/java -version java version "1.8.0_45" Java(TM) SE Runtime Environment (build 1.8.0_45-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) Also has a problem, but it is harder to trigger (you need to comment line 32: //System.err.println("k="/*+k*/+" rmoms="+rmoms+" moms="+moms+" r="+r); then the problem does show up, but not always. sometimes: /usr/local/jdk1.8.0_45/bin/java YY Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88) but sometimes the test passes. /usr/local/jdk1.8.0_45/bin/java YY2 y=6405.9478662121455 While openjdk always trigger the problem (and no need to comment line 32)
Tried rpm -q java-1.8.0-openjdk java-1.8.0-openjdk-1.8.0.45-40.b14.fc21.x86_64 The problem exists, but it is harder to trigger now. Run the test as (set -ex ; while [ 1 ]; do java YY ; done ) ...... eventually you would get .... + '[' 1 ']' + java YY y=6405.9478662121455 + '[' 1 ']' + java YY Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88) java -version openjdk version "1.8.0_45" OpenJDK Runtime Environment (build 1.8.0_45-b14) OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)
I am still investigating the behaviour when running the reproducer, trying to work out what is going wrong. The error appears to happen when the C2 compiler compiles code for YY::getX which includes inline code for the calls to mulVV(I[D[D) and mulvv(I[DI[DI[D). I was suspicious that this might be a specific bug that has been fixed in JDK9. I ran the reproducer on JDK9 and it never generated an exception. However, it also never seems to C2 compile YY::getX. So,the problem may be the biug that has been fixed in JDK9 or it may exist in JDK9 and just have been bypassed. I am able to print the generated code for YY:getX and am still examining it to try to understand in what circumstances it might hit an NPE. It may be a problem with loop unrolling dropping off then end of the loop range but this is hard to tell since I have not been able to obtain much info regarding what the compiler is doing. Enabling tracing of the C2 compiler's IR also seems to inhibit C2 compilation of YY:getX. I will persevere.
I am suspicious that this problem relates to deoptimization of methods. When I run with -XX:+PrintCompilation I always seem to see this output when the exception occurs . . . 654 19 4 YY::mulVV (39 bytes) made not entrant Exception in thread "main" 665 23 3 java.util.Random::nextInt (74 bytes) made not entrant java.lang.NullPointerException at YY.mulVV(YY.java:79) at YY.mulVV(YY.java:71) at YY.getX(YY.java:34) When the error occur this happens just after getX is compiled which is why I thought the problem related to that method. Looking at the x86 code generated for both getX and the inner mulVV method I cannot see anything wrong with the code and certainly nothing which might account for an NPE. If there is a problem at deopt then this might lead to one or more slots in the interpreted frame being populated with a NULL pointer in place of the array pointer held in the C2 frame. I cannot account for this sudden change in the value of both moms and rmoms by any other rational path. Unfortunately I cannot reproduce the error consistently enough on my debug build of jdk8 to be able to investigate this hypothesis. Yesterday the program was throwing an NPE about half the time. Today it did so for a few runs this morning and then has not done so ever since. I'm going to have to put this on hold until we can get a more reliable reproducer.
1. After long testing I got an error even with -XX:InlineSmallCode=0 (hard to trigger, but happens) 2. Tried to diag compilation this sequence works OK 78 16 3 YY::setxP (9 bytes) made not entrant 79 36 4 java.util.Random::nextInt (74 bytes) 80 26 3 java.util.Random::nextInt (74 bytes) made not entrant 85 37 4 YY::getX (64 bytes) 93 30 % 3 YY::runTests @ -2 (97 bytes) made not entrant 94 38 % 4 YY::runTests @ 32 (97 bytes) this sequence fails 60 29 4 YY::getX (64 bytes) 63 30 4 YY::setaxbP (107 bytes) 68 31 % 3 YY::runTests @ 67 (97 bytes) 69 22 3 YY::getX (64 bytes) made not entrant Exception in thread "main" java.lang.NullPointerException 70 32 3 java.lang.String::getChars (62 bytes) ========== Original Compilation info: ==================================== ----------- Works OK java -XX:InlineSmallCode=100000 -XX:+PrintCompilation YY 47 1 3 java.lang.String::hashCode (55 bytes) 50 2 3 java.lang.String::charAt (29 bytes) 51 3 3 java.lang.String::equals (81 bytes) 51 4 3 java.lang.Object::<init> (1 bytes) 51 5 3 java.lang.String::length (6 bytes) 52 6 3 java.lang.String::indexOf (70 bytes) 55 7 n 0 java.lang.System::arraycopy (native) (static) 55 8 3 java.lang.Math::min (11 bytes) 55 9 1 java.lang.Object::<init> (1 bytes) 55 4 3 java.lang.Object::<init> (1 bytes) made not entrant 56 10 3 java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes) 56 11 1 java.lang.ref.Reference::get (5 bytes) 59 12 n 0 sun.misc.Unsafe::compareAndSwapLong (native) 60 13 3 java.util.Random::next (47 bytes) 60 15 3 java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes) 60 14 3 java.util.concurrent.atomic.AtomicLong::get (5 bytes) 60 18 3 YY::mulVV (39 bytes) 60 19 3 YY::setaxbP (107 bytes) 61 16 3 YY::setxP (9 bytes) 61 17 3 java.util.Random::nextDouble (24 bytes) 61 20 3 YY::setxP (14 bytes) 61 21 3 YY::mulVV (9 bytes) 61 23 4 YY::mulVV (39 bytes) 61 22 3 YY::getX (64 bytes) 61 24 4 YY::setaxbP (107 bytes) 61 25 4 java.util.Random::next (47 bytes) 62 26 3 java.util.Random::nextInt (74 bytes) 62 13 3 java.util.Random::next (47 bytes) made not entrant 62 27 4 java.util.Random::nextDouble (24 bytes) 63 17 3 java.util.Random::nextDouble (24 bytes) made not entrant 65 28 4 YY::getX (64 bytes) 66 18 3 YY::mulVV (39 bytes) made not entrant 66 19 3 YY::setaxbP (107 bytes) made not entrant 66 24 4 YY::setaxbP (107 bytes) made not entrant 66 29 4 YY::setaxbP (107 bytes) 69 30 % 3 YY::runTests @ 67 (97 bytes) 70 31 3 YY::runTests (97 bytes) 72 32 % 4 YY::runTests @ 67 (97 bytes) 73 22 3 YY::getX (64 bytes) made not entrant 73 28 4 YY::getX (64 bytes) made not entrant 73 33 % 4 YY::getX @ 15 (64 bytes) 73 34 4 YY::mulVV (9 bytes) 75 21 3 YY::mulVV (9 bytes) made not entrant 75 35 4 YY::setxP (9 bytes) 78 16 3 YY::setxP (9 bytes) made not entrant 79 36 4 java.util.Random::nextInt (74 bytes) 80 26 3 java.util.Random::nextInt (74 bytes) made not entrant 85 37 4 YY::getX (64 bytes) 93 30 % 3 YY::runTests @ -2 (97 bytes) made not entrant 94 38 % 4 YY::runTests @ 32 (97 bytes) 10124 32 % 4 YY::runTests @ -2 (97 bytes) made not entrant 10125 39 1 java.lang.ThreadLocal::access$400 (5 bytes) 10126 40 3 sun.misc.FDBigInteger::mult (64 bytes) 10126 41 3 sun.misc.FDBigInteger::<init> (30 bytes) y=6405.9478662121455 ------------------- -------------- FAILS --------------- java -XX:InlineSmallCode=100000 -XX:+PrintCompilation YY 42 1 3 java.lang.String::hashCode (55 bytes) 44 2 3 java.lang.String::charAt (29 bytes) 45 3 3 java.lang.String::equals (81 bytes) 45 4 3 java.lang.Object::<init> (1 bytes) 45 5 3 java.lang.String::length (6 bytes) 45 6 3 java.lang.String::indexOf (70 bytes) 48 7 n 0 java.lang.System::arraycopy (native) (static) 48 8 3 java.lang.Math::min (11 bytes) 48 9 1 java.lang.Object::<init> (1 bytes) 48 4 3 java.lang.Object::<init> (1 bytes) made not entrant 48 10 3 java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes) 48 11 1 java.lang.ref.Reference::get (5 bytes) 51 12 n 0 sun.misc.Unsafe::compareAndSwapLong (native) 51 13 3 java.util.Random::next (47 bytes) 52 15 3 java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes) 52 14 3 java.util.concurrent.atomic.AtomicLong::get (5 bytes) 52 17 3 java.util.Random::nextDouble (24 bytes) 52 18 3 YY::mulVV (39 bytes) 52 19 3 YY::setaxbP (107 bytes) 52 16 3 YY::setxP (9 bytes) 53 20 3 YY::setxP (14 bytes) 53 21 3 YY::mulVV (9 bytes) 53 23 4 YY::mulVV (39 bytes) 53 22 3 YY::getX (64 bytes) 53 24 4 YY::setaxbP (107 bytes) 53 25 4 java.util.Random::next (47 bytes) 54 26 3 java.util.Random::nextInt (74 bytes) 54 13 3 java.util.Random::next (47 bytes) made not entrant 54 27 4 java.util.Random::nextDouble (24 bytes) 56 17 3 java.util.Random::nextDouble (24 bytes) made not entrant 57 18 3 YY::mulVV (39 bytes) made not entrant 57 19 3 YY::setaxbP (107 bytes) made not entrant 58 24 4 YY::setaxbP (107 bytes) made not entrant 58 28 % 4 YY::setaxbP @ 18 (107 bytes) 60 29 4 YY::getX (64 bytes) 63 30 4 YY::setaxbP (107 bytes) 68 31 % 3 YY::runTests @ 67 (97 bytes) 69 22 3 YY::getX (64 bytes) made not entrant Exception in thread "main" java.lang.NullPointerException 70 32 3 java.lang.String::getChars (62 bytes) at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88) 70 33 3 java.lang.AbstractStringBuilder::append (50 bytes)
Created attachment 1045164 [details] nore arguments added I added one more argument to public double [] getX(final double [] moms,final double zzDouble) and uncommented System.err.println("k="/*+k*/+" rmoms="+rmoms+" moms="+moms+" r="+r+" zzDouble="+zzDouble); After some time the program failed (see below) note the value of zzDouble=0. I tried adding more arguments of different type: int zzInt & String zzString I did not manage to reproduce the problem with zzInt & zzString, but do managed (after some time) to reproduce with double type. So for some reason the stack in getX is probably gets 0 cleared. k= rmoms=[D@632ceb35 moms=[D@1c93f6e1 r=[D@1800a575 zzDouble=2.6711 k= rmoms=[D@632ceb35 moms=[D@1c93f6e1 r=[D@1800a575 zzDouble=2.6711 k= rmoms=[D@632ceb35 moms=[D@1c93f6e1 r=[D@1800a575 zzDouble=2.6711 k= rmoms=[D@1458ed9c moms=[D@10a9d961 r=[D@130e116b zzDouble=2.6711 k= rmoms=null moms=null r=null zzDouble=0.0 Exception in thread "main" java.lang.NullPointerException at YY.mulVV(YY.java:78) at YY.mulVV(YY.java:70) at YY.getX(YY.java:33) at YY.runTests(YY.java:14) at YY.main(YY.java:88)
Also, looks like the problem is easier to reproduce under light CPU/disk load. For high CPU/disk load looks it is much harder to trigger the problem.
Andrew, is it possible to escalate the issue to oracle hotspot developers? What I always liked with Linux - I was always being able to reach Linus Torvalds with reasonable bug report. With Sun/Oracle - I have never being able to reach anyone remotely knowledgeable. Can you escalate the issue? The problem can be reproduced for example on Oracle JDK /usr/local/jdk1.8.0_45/bin/java -version java version "1.8.0_45" Java(TM) SE Runtime Environment (build 1.8.0_45-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) This way you would save your time - and let oracle people dig their shit.
Ok, I have found the answer here. It not only explains your problem but also why I originally saw the error being reproduced but stopped seeing it yesterday. I first reproduced the problem with an Oracle jdk8u, 1.8.0_31-b13, and then with a debug version built from my local tree, 1.8.0_60-bxx where I thought I had built from the latest code (xx=21) but clearly that was not the case (unfortunately private builds don't clearly report the _bxx value). When I rebuilt the latest tree yesterday I ended up with 1.8.0_60-b21 and that stopped reproducing the error. Winding the hotspot code back to 1.8.0_60-b17 causes the bug to reproduce once again. That also explains why the error no longer shows on 9 -- the problem was fixed in 1.8.0_60_b18 and also in 9. The relevant JIRA is this https://bugs.openjdk.java.net/browse/JDK-8060036 In summary, an error in the way the compiler manages type information causes the C2 compiled code for the inner version of mulVV to erroneously fail an array range check and deoptimise the compiled code -- I would guess at the point where the array loop index hits -1. The code reverts to running that call in the interpreter. Unfortunately, the local variables in the C2 compiled frame -- icluding the array references -- get lost and end up being replaced with NULL or zero values in the interpreter frame. The loop continues iterating and trips over the NULL array references. That also squares with your extra local variable z being reset to 0.0. So, what do you need to do? Once you upgrade to a version of OpenJDK >= 1.8.0_60_b18 this problem should go away.
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle. Changing version to '23'. (As we did not run this process for some time, it could affect also pre-Fedora 23 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23
Tried java -version openjdk version "1.8.0_60" OpenJDK Runtime Environment (build 1.8.0_60-b24) OpenJDK 64-Bit Server VM (build 25.60-b23, mixed mode) seems to be working fine java YY y=6405.9478662121455
This message is a reminder that Fedora 23 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 23. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '23'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 23 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged change the 'version' to a later Fedora version prior this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora is now on u111 so this should have been fixed a long time ago.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days