Bug 690262 - jdk loop unrolling is wrong
Summary: jdk loop unrolling is wrong
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: java-1.6.0-openjdk
Version: 15
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Deepak Bhole
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-03-23 18:11 UTC by Need Real Name
Modified: 2012-08-07 15:56 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-07 15:56:06 UTC
Type: ---


Attachments (Terms of Use)
Unit test to see incorrect hotspot loop unrolling. (2.59 KB, text/plain)
2011-03-23 18:14 UTC, Need Real Name
no flags Details
updated test (3.50 KB, text/x-java)
2011-03-24 07:55 UTC, Need Real Name
no flags Details
proper latest test case (3.50 KB, text/x-java)
2011-03-24 07:58 UTC, Need Real Name
no flags Details
Latest test (3.55 KB, text/plain)
2011-03-24 08:09 UTC, Need Real Name
no flags Details
test results (11.45 KB, text/plain)
2011-03-24 08:11 UTC, Need Real Name
no flags Details
Same problem does exist with long (3.44 KB, text/plain)
2011-03-24 12:05 UTC, Need Real Name
no flags Details
the results with longs instead of doublea (25.43 KB, text/plain)
2011-03-24 12:08 UTC, Need Real Name
no flags Details
simplified test (3.04 KB, text/plain)
2011-03-24 12:33 UTC, Need Real Name
no flags Details
errors (22.01 KB, text/plain)
2011-03-24 17:59 UTC, Need Real Name
no flags Details
New TF.java file (5.43 KB, text/plain)
2011-03-24 19:31 UTC, Deepak Bhole
no flags Details
the failure with your version (47.71 KB, text/plain)
2011-03-24 21:10 UTC, Need Real Name
no flags Details
New TF.java file (5.89 KB, text/x-java)
2011-03-24 21:32 UTC, Deepak Bhole
no flags Details
updated TF.java (5.86 KB, text/plain)
2011-03-25 05:44 UTC, Need Real Name
no flags Details
TFsimple1.java (3.21 KB, text/plain)
2011-03-25 05:47 UTC, Need Real Name
no flags Details
test with extra diag (5.76 KB, text/plain)
2011-03-25 07:39 UTC, Need Real Name
no flags Details
the failure with TFo.java (98.61 KB, text/plain)
2011-03-25 07:40 UTC, Need Real Name
no flags Details
where the problem started (846 bytes, text/plain)
2011-03-25 07:56 UTC, Need Real Name
no flags Details
repeated sa calculations (1.31 KB, text/plain)
2011-03-25 08:23 UTC, Need Real Name
no flags Details
failure 1 (assembler dump) (331.73 KB, application/x-gzip)
2011-03-26 17:06 UTC, Need Real Name
no flags Details
second failure (332.94 KB, application/x-gzip)
2011-03-26 17:07 UTC, Need Real Name
no flags Details
no failure, java terminated by kill (302.56 KB, application/x-gzip)
2011-03-26 17:08 UTC, Need Real Name
no flags Details
TFO.java (5.80 KB, application/zip)
2011-03-26 17:14 UTC, Need Real Name
no flags Details
The results with TFsimple.java (100.56 KB, application/zip)
2011-03-26 20:55 UTC, Need Real Name
no flags Details
new reproducer (hopefully) (3.96 KB, text/plain)
2011-03-28 16:36 UTC, Denis Lila
no flags Details
new reproducer results. (65.85 KB, text/plain)
2011-03-28 18:49 UTC, Need Real Name
no flags Details
hotspot failure (22.30 KB, text/plain)
2011-03-28 19:53 UTC, Need Real Name
no flags Details
an error with -XX:-Inline (155.54 KB, application/x-gzip)
2011-03-28 21:20 UTC, Need Real Name
no flags Details
a failure with LoopUnrollLimit=0 (688.31 KB, application/zip)
2011-03-29 11:18 UTC, Need Real Name
no flags Details
TFSsimple.java (3.04 KB, text/plain)
2011-04-05 04:33 UTC, Need Real Name
no flags Details
proper TFSsimple.java (2.89 KB, text/plain)
2011-04-05 04:35 UTC, Need Real Name
no flags Details

Description Need Real Name 2011-03-23 18:11:26 UTC
JDK loop unrolling is wrong on all JDKs I tried,
including , but not limited to 
java-1.6.0-openjdk-devel-1.6.0.0-50.1.8.7.fc13.x86_64

java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)

java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

ava version "1.7.0-ea"
Java(TM) SE Runtime Environment (build 1.7.0-ea-b134)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b04, mixed mode)




start the unit test attached TF.java
start several instances (there was no failures with 1 instance)
shell command to run:
for x in A B C D E F G H I J K L ; do java TF & done

the code would eventually randomly fail:

Exception in thread "main" java.lang.RuntimeException: TEST=3080 Elements [3][3] differ
m1[0][0]=70.80493904142715	m2[0][0]=70.80493904142715
m1[0][1]=5.505800716979796	m2[0][1]=5.505800716979796
m1[0][2]=1.6437479313186212	m2[0][2]=1.6437479313186212
m1[0][3]=1.4019663061342367	m2[0][3]=1.4019663061342367
m1[0][4]=1.534046915882465	m2[0][4]=1.534046915882465
m1[1][0]=67.71546800405181	m2[1][0]=67.71546800405181
m1[1][1]=18.46031213801821	m2[1][1]=18.46031213801821
m1[1][2]=15.461729052867444	m2[1][2]=15.461729052867444
m1[1][3]=16.116727575946555	m2[1][3]=16.116727575946555
m1[1][4]=17.944598246769843	m2[1][4]=17.944598246769843
m1[2][0]=139.25078042329778	m2[2][0]=139.25078042329778
m1[2][1]=46.3761808781519	m2[2][1]=46.3761808781519
m1[2][2]=40.648023996975034	m2[2][2]=40.648023996975034
m1[2][3]=42.57028104530857	m2[2][3]=42.57028104530857
m1[2][4]=47.41650732453458	m2[2][4]=47.41650732453458
m1[3][0]=342.5093615611222	m2[3][0]=342.5093615611222
m1[3][1]=96.6730861908069	m2[3][1]=96.6730861908069
m1[3][2]=81.46054748408591	m2[3][2]=81.46054748408591
m1[3][3]=84.98153472770015	m2[3][3]=84.89837472770016
m1[3][4]=94.62787551705485	m2[3][4]=94.62787551705485

	at TF.checkm(TF.java:99)
	at TF.runTest(TF.java:119)
	at TF.main(TF.java:131)
Exception in thread "main" java.lang.RuntimeException: TEST=25440 Elements [1][0] differ
m1[0][0]=70.80493904142715	m2[0][0]=70.80493904142715
m1[0][1]=5.505800716979796	m2[0][1]=5.505800716979796
m1[0][2]=1.6437479313186212	m2[0][2]=1.6437479313186212
m1[0][3]=1.4019663061342367	m2[0][3]=1.4019663061342367
m1[0][4]=1.534046915882465	m2[0][4]=1.534046915882465
m1[1][0]=67.71546800405181	m2[1][0]=67.70929080334516
m1[1][1]=18.46031213801821	m2[1][1]=18.459956042302434
m1[1][2]=15.461729052867444	m2[1][2]=15.461711457635538
m1[1][3]=16.116727575946555	m2[1][3]=16.11672681521524
m1[1][4]=17.944598246769843	m2[1][4]=17.94459821753404
m1[2][0]=139.25078042329778	m2[2][0]=139.25155700955827
m1[2][1]=46.3761808781519	m2[2][1]=46.376225645847406
m1[2][2]=40.648023996975034	m2[2][2]=40.64802620901508
m1[2][3]=42.57028104530857	m2[2][3]=42.5702811409463
m1[2][4]=47.41650732453458	m2[2][4]=47.416507328210045
m1[3][0]=342.5093615611222	m2[3][0]=342.50926279905303
m1[3][1]=96.6730861908069	m2[3][1]=96.67308049749185
m1[3][2]=81.46054748408591	m2[3][2]=81.46054720277053
m1[3][3]=84.98153472770015	m2[3][3]=84.98153471553745
m1[3][4]=94.62787551705485	m2[3][4]=94.62787551658742

	at TF.checkm(TF.java:99)
	at TF.runTest(TF.java:119)
	at TF.main(TF.java:131)
Exception in thread "main" java.lang.RuntimeException: TEST=77816 Elements [1][0] differ
m1[0][0]=70.80493904142715	m2[0][0]=70.80493904142715
m1[0][1]=5.505800716979796	m2[0][1]=5.505800716979796
m1[0][2]=1.6437479313186212	m2[0][2]=1.6437479313186212
m1[0][3]=1.4019663061342367	m2[0][3]=1.4019663061342367
m1[0][4]=1.534046915882465	m2[0][4]=1.534046915882465
m1[1][0]=67.71546800405181	m2[1][0]=67.70929080334516
m1[1][1]=18.46031213801821	m2[1][1]=18.459956042302434
m1[1][2]=15.461729052867444	m2[1][2]=15.461711457635538
m1[1][3]=16.116727575946555	m2[1][3]=16.11672681521524
m1[1][4]=17.944598246769843	m2[1][4]=17.94459821753404
m1[2][0]=139.25078042329778	m2[2][0]=139.25155700955827
m1[2][1]=46.3761808781519	m2[2][1]=46.376225645847406
m1[2][2]=40.648023996975034	m2[2][2]=40.64802620901508
m1[2][3]=42.57028104530857	m2[2][3]=42.5702811409463
m1[2][4]=47.41650732453458	m2[2][4]=47.416507328210045
m1[3][0]=342.5093615611222	m2[3][0]=342.50926279905303
m1[3][1]=96.6730861908069	m2[3][1]=96.67308049749185
m1[3][2]=81.46054748408591	m2[3][2]=81.46054720277053
m1[3][3]=84.98153472770015	m2[3][3]=84.98153471553745
m1[3][4]=94.62787551705485	m2[3][4]=94.62787551658742

	at TF.checkm(TF.java:99)
	at TF.runTest(TF.java:119)
	at TF.main(TF.java:131)

[10]   Exit 1                  java TF
[12]   Exit 1                  java TF
[15]   Exit 1                  java TF

Comment 1 Need Real Name 2011-03-23 18:14:45 UTC
Created attachment 487107 [details]
Unit test to see incorrect hotspot loop unrolling.

This is a test case to see incorrect loop unrolling by hotspot.
The matrices m1 & m2 inside runTest must be identical,
if at some point they are not - an exception is thrown

Comment 2 Need Real Name 2011-03-23 18:16:29 UTC
This is hotspot problem, generated .class files are OK

Comment 3 Need Real Name 2011-03-23 20:40:55 UTC
From observing the structure of matrix differences (single element differ)
 - the most likely reason 
is that during a switch from one type of loop execution to another
hotspot either executed first operator of the loop body twice 
or not executed it at all.

Comment 4 Deepak Bhole 2011-03-23 20:55:07 UTC
I am still analysing the source, but given how close the numbers are, this looks more like a bug related to double handling than loop unrolling.

Also, how long does it take to crash usually? I had it running for 10+ minutes and it didn't crash. Had to stop because laptop cpu got to 98C though :)

Comment 5 Deepak Bhole 2011-03-23 21:43:27 UTC
Looking at more of the code, this does look like a double comparison issue, and not related to loop unrolling. Specifically, in the code where you do:

        if(m1[i][j]!=m2[i][j])
        {
            iif=i;
            jif=j;
            break oloop;
        }

Can you please change it to:

        if(Double.compare(m1[i][j]!=m2[i][j]) != 0)
        {
            iif=i;
            jif=j;
            break oloop;
        }

and try again? == is not an equivalence operator for double values. Please see Joe Darcy's blog for a more detailed description of this:
http://blogs.sun.com/darcy/entry/notions_of_floating_point_equality

Comment 6 Deepak Bhole 2011-03-23 21:44:40 UTC
(In reply to comment #5)
> 
>         if(Double.compare(m1[i][j]!=m2[i][j]) != 0)

Sorry, that should be:

if(Double.compare(m1[i][j], m2[i][j]) != 0)

Comment 7 Deepak Bhole 2011-03-24 02:27:16 UTC
On further thought, this may be more than a rounding error. Nonetheless, can you please try the Double.compare() to eliminate that possibility and see if it still errors out? I've had the code (original with "!=") running on my desktop box (q-core i7 920 @ 3Ghz) for the past 3+ hours and so far it hasn't thrown any error.

Comment 8 Need Real Name 2011-03-24 07:55:30 UTC
Created attachment 487241 [details]
updated test

I am attaching updated TF.java where the problem is 
1. easier to reproduce
2. numbers difference is not small , for example

m1[1][1]=40377.57397780233	m2[1][1]=36281.57397780233


the code is started as 
n=0 ; while [ $n -lt 20 ]; do java TF &  n=$[ $n + 1 ] ;    done

the attached test supersedes old TF.java


Exception in thread "main" java.lang.RuntimeException: TEST=29469 Elements [1][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=38329.57397780235
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=30044 Elements [1][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=38329.57397780235
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=31890 Elements [0][0] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.888586395702
m1[0][1]=5716.6127269548415	m2[0][1]=5716.5952705629015
m1[0][2]=5624.415186604344	m2[0][2]=5624.397987441957
m1[0][3]=5751.072547006512	m2[0][3]=5751.054968098871
m1[0][4]=6057.8359619079365	m2[0][4]=6057.817467518348
m1[1][0]=43190.181781667496	m2[1][0]=43190.040840718015
m1[1][1]=40377.57397780233	m2[1][1]=40377.45114596819
m1[1][2]=39371.04889321945	m2[1][2]=39370.93072927575
m1[1][3]=40845.525454463925	m2[1][3]=40845.40313214674
m1[1][4]=44402.20207354617	m2[1][4]=44402.06916947626
m1[2][0]=103991.08438810863	m2[2][0]=103990.76428588785
m1[2][1]=97000.16811187661	m2[2][1]=96999.8766918958
m1[2][2]=94491.77180358718	m2[2][2]=94491.4892293681
m1[2][3]=98165.6040876451	m2[2][3]=98165.31074992764
m1[2][4]=107029.22874182602	m2[2][4]=107028.90898282987
m1[3][0]=192961.576069273	m2[3][0]=192960.99331829284
m1[3][1]=179643.57896832644	m2[3][1]=179643.04135665382
m1[3][2]=174920.49439552703	m2[3][2]=174919.97190336738
m1[3][3]=181800.08676866323	m2[3][3]=181799.543903905
m1[3][4]=198405.9217295142	m2[3][4]=198405.32933814573

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=33184 Elements [3][3] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.57397780233
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=165416.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=34764 Elements [0][0] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.888586395702
m1[0][1]=5716.6127269548415	m2[0][1]=5716.5952705629015
m1[0][2]=5624.415186604344	m2[0][2]=5624.397987441957
m1[0][3]=5751.072547006512	m2[0][3]=5751.054968098871
m1[0][4]=6057.8359619079365	m2[0][4]=6057.817467518348
m1[1][0]=43190.181781667496	m2[1][0]=43190.040840718015
m1[1][1]=40377.57397780233	m2[1][1]=40377.45114596819
m1[1][2]=39371.04889321945	m2[1][2]=39370.93072927575
m1[1][3]=40845.525454463925	m2[1][3]=40845.40313214674
m1[1][4]=44402.20207354617	m2[1][4]=44402.06916947626
m1[2][0]=103991.08438810863	m2[2][0]=103990.76428588785
m1[2][1]=97000.16811187661	m2[2][1]=96999.8766918958
m1[2][2]=94491.77180358718	m2[2][2]=94491.4892293681
m1[2][3]=98165.6040876451	m2[2][3]=98165.31074992764
m1[2][4]=107029.22874182602	m2[2][4]=107028.90898282987
m1[3][0]=192961.576069273	m2[3][0]=192960.99331829284
m1[3][1]=179643.57896832644	m2[3][1]=179643.04135665382
m1[3][2]=174920.49439552703	m2[3][2]=174919.97190336738
m1[3][3]=181800.08676866323	m2[3][3]=181799.543903905
m1[3][4]=198405.9217295142	m2[3][4]=198405.32933814573

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=35158 Elements [1][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40373.57397780233
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=41942 Elements [0][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.608241494241
m1[0][2]=5624.415186604344	m2[0][2]=5624.414714771186
m1[0][3]=5751.072547006512	m2[0][3]=5751.07251441535
m1[0][4]=6057.8359619079365	m2[0][4]=6057.835960128156
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.56976564972
m1[1][2]=39371.04889321945	m2[1][2]=39371.04845013602
m1[1][3]=40845.525454463925	m2[1][3]=40845.525423858606
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207187484
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16217862988
m1[2][2]=94491.77180358718	m2[2][2]=94491.77117945894
m1[2][3]=98165.6040876451	m2[2][3]=98165.60404453438
m1[2][4]=107029.22874182602	m2[2][4]=107029.22873947177
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.56782488193
m1[3][2]=174920.49439552703	m2[3][2]=174920.4932233293
m1[3][3]=181800.08676866323	m2[3][3]=181800.08668769544
m1[3][4]=198405.9217295142	m2[3][4]=198405.92172509263

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=44978 Elements [1][4] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.57397780233
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20205538556
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=55166 Elements [1][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=38329.57397780235
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=63456 Elements [1][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=38329.57397780235
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=64653 Elements [0][2] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5496.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.57397780233
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=181800.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=76351 Elements [0][1] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.608241494241
m1[0][2]=5624.415186604344	m2[0][2]=5624.414714771186
m1[0][3]=5751.072547006512	m2[0][3]=5751.07251441535
m1[0][4]=6057.8359619079365	m2[0][4]=6057.835960128156
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.56976564972
m1[1][2]=39371.04889321945	m2[1][2]=39371.04845013602
m1[1][3]=40845.525454463925	m2[1][3]=40845.525423858606
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207187484
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16217862988
m1[2][2]=94491.77180358718	m2[2][2]=94491.77117945894
m1[2][3]=98165.6040876451	m2[2][3]=98165.60404453438
m1[2][4]=107029.22874182602	m2[2][4]=107029.22873947177
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.56782488193
m1[3][2]=174920.49439552703	m2[3][2]=174920.4932233293
m1[3][3]=181800.08676866323	m2[3][3]=181800.08668769544
m1[3][4]=198405.9217295142	m2[3][4]=198405.92172509263

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)
Exception in thread "main" java.lang.RuntimeException: TEST=25721 Elements [3][3] differ
m1[0][0]=7021.774830987034	m2[0][0]=7021.774830987034
m1[0][1]=6671.004497011981	m2[0][1]=6671.004497011981
m1[0][2]=6562.907481207491	m2[0][2]=6562.907481207491
m1[0][3]=6711.11205485301	m2[0][3]=6711.11205485301
m1[0][4]=7070.144759580944	m2[0][4]=7070.144759580944
m1[1][0]=50533.65411709927	m2[1][0]=50533.65411709927
m1[1][1]=47238.60662216227	m2[1][1]=47238.60662216227
m1[1][2]=46060.29654012994	m2[1][2]=46060.29654012994
m1[1][3]=47785.9864299652	m2[1][3]=47785.9864299652
m1[1][4]=51948.73401721851	m2[1][4]=51948.73401721851
m1[2][0]=121696.16159646476	m2[2][0]=121696.16159646476
m1[2][1]=113509.99686878055	m2[2][1]=113509.99686878055
m1[2][2]=110574.13409857947	m2[2][2]=110574.13409857947
m1[2][3]=114873.9804417375	m2[2][3]=114873.9804417375
m1[2][4]=125248.01961714865	m2[2][4]=125248.01961714865
m1[3][0]=225825.05076021465	m2[3][0]=225825.05076021465
m1[3][1]=210234.96043019896	m2[3][1]=210234.96043019896
m1[3][2]=204708.61835611783	m2[3][2]=204708.61835611783
m1[3][3]=212760.64466901263	m2[3][3]=196376.64466901263
m1[3][4]=232196.23583147067	m2[3][4]=232196.23583147067

	at TF.checkm(TF.java:138)
	at TF.runTest(TF.java:18)
	at TF.multipleTests(TF.java:149)
	at TF.main(TF.java:160)

Comment 9 Need Real Name 2011-03-24 07:58:42 UTC
Created attachment 487244 [details]
proper latest test case

proper latest test case

Comment 10 Need Real Name 2011-03-24 08:09:27 UTC
Created attachment 487246 [details]
Latest test

The test is started as 
n=0 ; while [ $n -lt 20 ]; do java TF &  n=$[ $n + 1 ] ;    done
the results are in test.txt
the number differences are not small 

m1[0][2]=5624.415186604344      m2[0][2]=5368.415186604342

comparison code - as you suggested.
As a side note: there was no error reporting for long time,
then, in another window I started the 
find /  
command - errors immediately came out

Comment 11 Need Real Name 2011-03-24 08:11:01 UTC
Created attachment 487247 [details]
test results

test results for a test started as 
n=0 ; while [ $n -lt 20 ]; do java TF &  n=$[ $n + 1 ] ;    done

Comment 12 Need Real Name 2011-03-24 12:05:41 UTC
Created attachment 487295 [details]
Same problem does exist with long

This is a much wider & much stronger problem.
It does exist not only with double, but with long also.
(it takes much more time to trigger the problem with longs)
Test code & results are attached.

Comment 13 Need Real Name 2011-03-24 12:08:50 UTC
Created attachment 487304 [details]
the results with longs instead of doublea

the results with longs instead of double

Comment 14 Need Real Name 2011-03-24 12:33:13 UTC
Created attachment 487310 [details]
simplified test

This is TFsimple.java a much simplified TF.java 
which nevertheless, show the same problem with incorrect calculations.
But it takes much longer to trigger.

the function calc is just this:

 static double [][] calc(final int a,final int b,final F [] f,
			    final double xx,final double yy,
			    final double [][] mx)	
    {
	final double [][] m=new double [a][b];
	double wa=1;
	for(int k=0;k<a;k++)
	{
	    double wb=wa;
	    for(int l=0;l<b;l++)
	    {
		m[k][l]+=wb*mx[k][l];
		wb*=(l+0.7822332)*2;
	    }
	    wa*=(k+0.67221333);
	}
	return m;
    }

Comment 15 Deepak Bhole 2011-03-24 14:31:01 UTC
I've run the code for over 12 hours now, and still no error. I am trying the latest test case now (so far no errors there either).

What sort of hardware are you running on test on? And have you been able to reproduce this on other hardware?

Comment 16 Need Real Name 2011-03-24 17:07:44 UTC
I am running Fedora on 
AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
stepping	: 3
cpu MHz		: 3007.708
cache size	: 1024 KB

8Gb memory, 64 bit java last tested on 
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
but various other versions are also tried.

I just run memtest for about 2 hours - no problem found.

I just started the TFsimple.java on a very different computer - 
20 instances at once:

n=0 ; while [ $n -lt 20 ]; do java TFsimple &  n=$[ $n + 1 ] ;    done >diag_1.txt 2>&1


Not that many for the computer: 
Debian, 8 way Xeon 32 Gb memory
Linux forum 2.6.26-2-amd64 #1 SMP Tue Jan 25 05:59:43 UTC 2011 x86_64 GNU/Linux

java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)


The system got destabilized & became non-responding, 
so I did power reset.

diag1 show nothing special. I just started 2 instances instead of 20.

Comment 17 Deepak Bhole 2011-03-24 17:32:00 UTC
I've been running the last version posted version of the test (with double) for over 3 hours now and it has still not error'd (8 instances). 

Any other machine you can try on to eliminate hardware as the problem? Another one of my co-workers (Denis Lila, on cc) tried it for 2.5+ hours on his machine too -- there were no errors.

Comment 18 Need Real Name 2011-03-24 17:59:47 UTC
Created attachment 487401 [details]
errors

My computer (AMX athlon) got destabilized with 3 instances of TFsimple running 
after 1 hour. 

Debian is still running 3 instances.
looks like TF.java - is more likely give just an error and
TFsimple.java is more likely to give 
destabilization after an error.

attached a trace for 
 java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)


cat /tmp/diag_simple.txt 
[1] 2748
[2] 2749
[3] 2750
[4] 2751
[5] 2752
[6] 2753
[7] 2754
[8] 2755
[9] 2756
[10] 2757
[11] 2758
[12] 2759
[13] 2760
[14] 2761
[15] 2762
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
Running single thread code until test failure
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f62a7d6d5ea, pid=2749, tid=140061692933888
#
# JRE version: 6.0_24-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x6295ea]
#
# An error report file with more information is saved as:
# /home/mal/hs_err_pid2749.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
Exception in thread "main" java.lang.RuntimeException: TEST=62725 Elements [2][3] differ
m1[0][0]=0.1700774336283186	m2[0][0]=0.1700774336283186
m1[0][1]=0.11166978133704736	m2[0][1]=0.11166978133704736
m1[0][2]=0.2518757394821027	m2[0][2]=0.2518757394821027
m1[0][3]=1.0251160550185991	m2[0][3]=1.0251160550185991
m1[0][4]=6.11267581491528	m2[0][4]=6.11267581491528
m1[1][0]=1.043826838481748	m2[1][0]=1.043826838481748
m1[1][1]=0.6853579121009847	m2[1][1]=0.6853579121009847
m1[1][2]=1.5458526814816618	m2[1][2]=1.5458526814816618
m1[1][3]=6.291508684952188	m2[1][3]=6.291508684952188
m1[1][4]=37.51570643105274	m2[1][4]=37.51570643105274
m1[2][0]=3.2998209696571212	m2[2][0]=3.2998209696571212
m1[2][1]=2.1666030482227314	m2[2][1]=2.1666030482227314
m1[2][2]=4.886861408711588	m2[2][2]=4.886861408711588
m1[2][3]=19.88917272867012	m2[2][3]=17.70913952378951
m1[2][4]=118.59736711957807	m2[2][4]=118.59736711957807
m1[3][0]=12.971299713493547	m2[3][0]=12.971299713493547
m1[3][1]=8.516721894032319	m2[3][1]=8.516721894032319
m1[3][2]=19.209813069734487	m2[3][2]=19.209813069734487
m1[3][3]=78.18255077754401	m2[3][3]=78.18255077754401
m1[3][4]=466.1955931200476	m2[3][4]=466.1955931200476

	at TFsimple.checkm(TFsimple.java:108)
	at TFsimple.runTest(TFsimple.java:18)
	at TFsimple.multipleTests(TFsimple.java:119)
	at TFsimple.main(TFsimple.java:130)

Comment 19 Deepak Bhole 2011-03-24 18:12:05 UTC
The SIGSEGV is a different issue as it implies a crash in the VM. Additionally it is in closed source code so we cannot do anything about it (although feel free to open a bug if you see it with OpenJDK).

Is the error posted in comment 18 on the same AMD 6000+ box where the rest have been or a different AMD box?

Comment 20 Denis Lila 2011-03-24 18:17:55 UTC
Hi.

Could you try modifying your test (preferably the long version) to keep track of the last F[] (just pass f as an argument to the checker and on failure do
for (F ff : f) {
    out.printf("new F(%dL, %dL, %dL),%n", ff.x, ff.y, ff.z);
}
This will make it easy to initialize the failing array - just copy what's printed into TF.java)? The F[] is the only random part of the program, and if the specific numbers produced are a problem (perhaps they reveal a bad assumption about runtime ranges in hotspot...?) running the test over and over again with that same F[] should be enough to reproduce it. If the numbers aren't what's causing it, then removing the randomness shouldn't be a problem*

Also, try running with -XX:+PrintCompilation. I suspect OSR compilation may be to blame here, and that should tell us whether any of that is going on.

Another thing that would be very helpful would be if you ran with -XX:+PrintAssembly, but this won't just work. You'll have to build openjdk7 and then follow the instructions here: http://wikis.sun.com/display/HotSpotInternals/PrintAssembly

* this is very important: if you do remove the randomness and go with a constant F[] that crashed on a previous run, make sure to run the test with -XX:CompileCommand=exclude,TF,multipleTests
If you don't do that, hotspot will notice that F[] is constant, and will do a whole lot of constant propagation throughout the program, and it will almost certainly prevent the crash. Right now, I don't think it can propagate constants from interpreted methods to compiled methods.

Comment 21 Need Real Name 2011-03-24 18:31:44 UTC
Is there any place I can download the latest pre-build openjdk 7 with PrintAssembly compiled-in. preferrably in .tar.gz so I can try it on Debian.
I think this would testing simplify a lot.

Comment 22 Denis Lila 2011-03-24 18:35:13 UTC
Hi.

Before you do any of what I suggested, could you please clarify whether you were able to reproduce the wrong computation on more than 1 machine?

I don't think there are any pre built jdks with PrintAssembly, but PrintCompilation should be available on all of them.

Comment 23 Need Real Name 2011-03-24 18:47:36 UTC
Till now I observed the problem on 1 computer with various jdks
(openjdk, Oracle/Sun, various versions).
I tested on 2 computers 

Fedora - observer error messages & crashes.

Debian - in 10 minutes with just 15-20 TFsimple.java started 
(nothing for 8way 32Gb Debian)
the computer become completely non-responding & requires remote reboot.
No error messages observed, but they may be missed, the computer is remote.
It is running OK with 3 instances of TFsimple.java

One more Fedora 15 is available to me, but this is a production server,
I cannot run experiments there.

Comment 24 Deepak Bhole 2011-03-24 18:53:16 UTC
If the debian machine is rebootable, can you try with a Fedora Livedisk? It would be really nice to be able to eliminate hardware as the problem before attempting deeper and more cumbersome diagnosis.

Comment 25 Deepak Bhole 2011-03-24 18:56:24 UTC
Ooops, scratch that. Just noticed that you mentioned that the computer is remote :(

Comment 26 Need Real Name 2011-03-24 18:57:36 UTC
This is the results with TF.java (in contrast with TFsimle.java 
the TF.java does not cause operating system destabilization).
(some messages from error format are from the code code inserted to checker 

 for(int i=0;i<f1.length;i++)
{
final F ff1=f1[i],ff2=f2[i];
System.out.printf("new F(%dL, %dL, %dL),%n", ff1.x, ff1.y, ff1.z);
System.out.printf("new F(%dL, %dL, %dL),%n", ff2.x, ff2.y,ff2.z);
}


cat /tmp/hh_A 
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  2       java.lang.Double::doubleToLongBits (33 bytes)
  3       TF::checkm (418 bytes)
---   n   java.lang.Object::clone
  2%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
---   n   sun.misc.Unsafe::compareAndSwapLong
  3   made not entrant  TF::checkm (418 bytes)
new F(Exception in thread "main" java.util.IllegalFormatConversionException: d != java.lang.Double
	at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
	at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
	at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
	at java.util.Formatter.format(Formatter.java:2433)
	at java.io.PrintStream.format(PrintStream.java:920)
	at java.io.PrintStream.printf(PrintStream.java:821)
	at TF.checkm(TF.java:143)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:157)
	at TF.main(TF.java:168)
cat  /tmp/hh_C 
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  2%      TF::calc @ 218 (359 bytes)
  1       TF::calc (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (418 bytes)
---   n   java.lang.Object::clone
  3%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (418 bytes)
new F(Exception in thread "main" java.util.IllegalFormatConversionException: d != java.lang.Double
	at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
	at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
	at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
	at java.util.Formatter.format(Formatter.java:2433)
	at java.io.PrintStream.format(PrintStream.java:920)
	at java.io.PrintStream.printf(PrintStream.java:821)
	at TF.checkm(TF.java:143)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:157)
	at TF.main(TF.java:168)

cat  /tmp/hh_D
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2%      TF::calc @ 218 (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (418 bytes)
---   n   java.lang.Object::clone
  3%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (418 bytes)
new F(Exception in thread "main" java.util.IllegalFormatConversionException: d != java.lang.Double
	at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
	at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
	at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
	at java.util.Formatter.format(Formatter.java:2433)
	at java.io.PrintStream.format(PrintStream.java:920)
	at java.io.PrintStream.printf(PrintStream.java:821)
	at TF.checkm(TF.java:143)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:157)
	at TF.main(TF.java:168)
cat  /tmp/hh_B 
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (418 bytes)
---   n   java.lang.Object::clone
  2%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (418 bytes)
new F(Exception in thread "main" java.util.IllegalFormatConversionException: d != java.lang.Double
	at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:3999)
	at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2709)
	at java.util.Formatter$FormatSpecifier.print(Formatter.java:2661)
	at java.util.Formatter.format(Formatter.java:2433)
	at java.io.PrintStream.format(PrintStream.java:920)
	at java.io.PrintStream.printf(PrintStream.java:821)
	at TF.checkm(TF.java:143)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:157)
	at TF.main(TF.java:168)

Comment 27 Need Real Name 2011-03-24 18:59:56 UTC
with commented System.out code:
cat  /tmp/hh_A 
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  2%      TF::calc @ 218 (359 bytes)
  1       TF::calc (359 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  2       java.lang.Double::doubleToLongBits (33 bytes)
  3       TF::checkm (326 bytes)
---   n   java.lang.Object::clone
  3%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (326 bytes)
Exception in thread "main" java.lang.RuntimeException: TEST=7162 Elements [1][1] differ
m1[0][0]=7706.7278164451545	m2[0][0]=7706.7278164451545
m1[0][1]=7320.609551430136	m2[0][1]=7320.609551430136
m1[0][2]=7201.232432903037	m2[0][2]=7201.232432903037
m1[0][3]=7363.6815207085365	m2[0][3]=7363.6815207085365
m1[0][4]=7757.530582167053	m2[0][4]=7757.530582167053
m1[1][0]=55438.31850924297	m2[1][0]=55438.31850924297
m1[1][1]=51824.39619900315	m2[1][1]=47728.39619900315
m1[1][2]=50530.755773340454	m2[1][2]=50530.755773340454
m1[1][3]=52423.69762827249	m2[1][3]=52423.69762827249
m1[1][4]=56990.29847098225	m2[1][4]=56990.29847098225
m1[2][0]=133505.03622530942	m2[2][0]=133505.03622530942
m1[2][1]=124527.3917768878	m2[2][1]=124527.3917768878
m1[2][2]=121304.10831809939	m2[2][2]=121304.10831809939
m1[2][3]=126020.68723619048	m2[2][3]=126020.68723619048
m1[2][4]=137401.1744967179	m2[2][4]=137401.1744967179
m1[3][0]=247738.9864061055	m2[3][0]=247738.9864061055
m1[3][1]=230644.9616565764	m2[3][1]=230644.9616565764
m1[3][2]=224572.93003040878	m2[3][2]=224572.93003040878
m1[3][3]=233404.58633299996	m2[3][3]=233404.58633299996
m1[3][4]=254725.65094484514	m2[3][4]=254725.65094484514

	at TF.checkm(TF.java:147)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:158)
	at TF.main(TF.java:169)

 cat  /tmp/hh_B
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (326 bytes)
---   n   java.lang.Object::clone
  2%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (326 bytes)
Exception in thread "main" java.lang.RuntimeException: TEST=50827 Elements [0][0] differ
m1[0][0]=7706.7278164451545	m2[0][0]=7706.716008346415
m1[0][1]=7320.609551430136	m2[0][1]=7320.598202052335
m1[0][2]=7201.232432903037	m2[0][2]=7201.221250764572
m1[0][3]=7363.6815207085365	m2[0][3]=7363.670091676419
m1[0][4]=7757.530582167053	m2[0][4]=7757.518557928863
m1[1][0]=55438.31850924297	m2[1][0]=55438.22686371967
m1[1][1]=51824.39619900315	m2[1][1]=51824.31721228213
m1[1][2]=50530.755773340454	m2[1][2]=50530.67911095596
m1[1][3]=52423.69762827249	m2[1][3]=52423.61812134418
m1[1][4]=56990.29847098225	m2[1][4]=56990.21206504163
m1[2][0]=133505.03622530942	m2[2][0]=133504.82810230204
m1[2][1]=124527.3917768878	m2[2][1]=124527.20303750761
m1[2][2]=121304.10831809939	m2[2][2]=121303.92473653423
m1[2][3]=126020.68723619048	m2[2][3]=126020.49653899744
m1[2][4]=137401.1744967179	m2[2][4]=137400.96660552503
m1[3][0]=247738.9864061055	m2[3][0]=247738.60752429755
m1[3][1]=230644.9616565764	m2[3][1]=230644.61273405756
m1[3][2]=224572.93003040878	m2[3][2]=224572.59044236693
m1[3][3]=233404.58633299996	m2[3][3]=233404.2334013237
m1[3][4]=254725.65094484514	m2[3][4]=254725.2657997465

	at TF.checkm(TF.java:147)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:158)
	at TF.main(TF.java:169)

cat  /tmp/hh_C
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2%      TF::calc @ 218 (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (326 bytes)
---   n   java.lang.Object::clone
  3%      TF::runTest @ 26 (77 bytes)
  4       TF::runTest (77 bytes)
  3   made not entrant  TF::checkm (326 bytes)
Exception in thread "main" java.lang.RuntimeException: TEST=24262 Elements [3][3] differ
m1[0][0]=7612.943180925791	m2[0][0]=7612.943180925791
m1[0][1]=7232.091931343198	m2[0][1]=7232.091931343198
m1[0][2]=7114.651836443036	m2[0][2]=7114.651836443036
m1[0][3]=7275.303415709812	m2[0][3]=7275.303415709812
m1[0][4]=7664.617050258952	m2[0][4]=7664.617050258952
m1[1][0]=54797.25164657019	m2[1][0]=54797.25164657019
m1[1][1]=51222.70137181502	m2[1][1]=51222.70137181502
m1[1][2]=49944.34143399005	m2[1][2]=49944.34143399005
m1[1][3]=51815.50758683226	m2[1][3]=51815.50758683226
m1[1][4]=56329.43295526017	m2[1][4]=56329.43295526017
m1[2][0]=131964.4910261504	m2[2][0]=131964.4910261504
m1[2][1]=123086.10360065899	m2[2][1]=123086.10360065899
m1[2][2]=119901.11262812998	m2[2][2]=119901.11262812998
m1[2][3]=124563.47117926647	m2[2][3]=124563.47117926647
m1[2][4]=135812.6917897872	m2[2][4]=135812.6917897872
m1[3][0]=244880.80619995148	m2[3][0]=244880.80619995148
m1[3][1]=227974.9084076055	m2[3][1]=227974.9084076055
m1[3][2]=221977.9207674185	m2[3][2]=221977.9207674185
m1[3][3]=230708.52176553395	m2[3][3]=214324.52176553395
m1[3][4]=251783.71798452607	m2[3][4]=251783.71798452607

	at TF.checkm(TF.java:147)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:158)
	at TF.main(TF.java:169)

Comment 28 Deepak Bhole 2011-03-24 19:31:27 UTC
Created attachment 487413 [details]
New TF.java file

Okay, let us first eliminate loop unrolling being the issue. Can you try running the attached file?

It is the same code as in comment 10, but with some extra print statements that get printed if an error occurs.

We know that the value at a given index is modified 640 times in total, in 5 sets of 128. If a loop is being skipped, the print statement won't have it.

Furthermore, the debug statement can be analysed to see if there was an addition error.

Comment 29 Need Real Name 2011-03-24 19:42:36 UTC
OK, i will run your code now.

answering your question whether the failure occur on the same F [] ?
no - I propagated the index k from   
static void multipleTests(final java.util.Random r)
to checker - the failure occurs at different k
below two failures at k=0 & k=3, the Fs would be different.



cat  /tmp/hh_C
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2%      TF::calc @ 156 (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (336 bytes)
---   n   java.lang.Object::clone
  3%      TF::runTest @ 26 (79 bytes)
  4       TF::runTest (79 bytes)
  3   made not entrant  TF::checkm (336 bytes)
Exception in thread "main" java.lang.RuntimeException: F try=3 calculation try=29260 Elements [0][1] differ
m1[0][0]=7021.774830987034	m2[0][0]=7021.774830987034
m1[0][1]=6671.004497011981	m2[0][1]=6671.002964197445
m1[0][2]=6562.907481207491	m2[0][2]=6562.9073157314315
m1[0][3]=6711.11205485301	m2[0][3]=6711.112043123023
m1[0][4]=7070.144759580944	m2[0][4]=7070.144758923587
m1[1][0]=50533.65411709927	m2[1][0]=50533.65411709927
m1[1][1]=47238.60662216227	m2[1][1]=47238.60533837808
m1[1][2]=46060.29654012994	m2[1][2]=46060.29640153812
m1[1][3]=47785.9864299652	m2[1][3]=47785.98642014094
m1[1][4]=51948.73401721851	m2[1][4]=51948.73401666795
m1[2][0]=121696.16159646476	m2[2][0]=121696.16159646476
m1[2][1]=113509.99686878055	m2[2][1]=113509.9952559613
m1[2][2]=110574.13409857947	m2[2][2]=110574.13392446646
m1[2][3]=114873.9804417375	m2[2][3]=114873.98042939528
m1[2][4]=125248.01961714865	m2[2][4]=125248.01961645696
m1[3][0]=225825.05076021465	m2[3][0]=225825.05076021465
m1[3][1]=210234.96043019896	m2[3][1]=210234.95772861727
m1[3][2]=204708.61835611783	m2[3][2]=204708.61806446672
m1[3][3]=212760.64466901263	m2[3][3]=212760.64464833855
m1[3][4]=232196.23583147067	m2[3][4]=232196.23583031207

	at TF.checkm(TF.java:147)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:158)
	at TF.main(TF.java:169)


cat  /tmp/hh_A
Running single thread code until test failure
  1%      TF::calc @ 233 (359 bytes)
  1       TF::calc (359 bytes)
  2       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  3       TF::checkm (336 bytes)
---   n   java.lang.Object::clone
  2%      TF::runTest @ 26 (79 bytes)
  3   made not entrant  TF::checkm (336 bytes)
Exception in thread "main" java.lang.RuntimeException: F try=0 calculation try=75033 Elements [3][3] differ
m1[0][0]=6015.906748342842	m2[0][0]=6015.906748342842
m1[0][1]=5716.6127269548415	m2[0][1]=5716.6127269548415
m1[0][2]=5624.415186604344	m2[0][2]=5624.415186604344
m1[0][3]=5751.072547006512	m2[0][3]=5751.072547006512
m1[0][4]=6057.8359619079365	m2[0][4]=6057.8359619079365
m1[1][0]=43190.181781667496	m2[1][0]=43190.181781667496
m1[1][1]=40377.57397780233	m2[1][1]=40377.57397780233
m1[1][2]=39371.04889321945	m2[1][2]=39371.04889321945
m1[1][3]=40845.525454463925	m2[1][3]=40845.525454463925
m1[1][4]=44402.20207354617	m2[1][4]=44402.20207354617
m1[2][0]=103991.08438810863	m2[2][0]=103991.08438810863
m1[2][1]=97000.16811187661	m2[2][1]=97000.16811187661
m1[2][2]=94491.77180358718	m2[2][2]=94491.77180358718
m1[2][3]=98165.6040876451	m2[2][3]=98165.6040876451
m1[2][4]=107029.22874182602	m2[2][4]=107029.22874182602
m1[3][0]=192961.576069273	m2[3][0]=192961.576069273
m1[3][1]=179643.57896832644	m2[3][1]=179643.57896832644
m1[3][2]=174920.49439552703	m2[3][2]=174920.49439552703
m1[3][3]=181800.08676866323	m2[3][3]=165416.08676866323
m1[3][4]=198405.9217295142	m2[3][4]=198405.9217295142

	at TF.checkm(TF.java:147)
	at TF.runTest(TF.java:19)
	at TF.multipleTests(TF.java:158)
	at TF.main(TF.java:169)

Comment 30 Denis Lila 2011-03-24 19:53:26 UTC
> answering your question whether the failure occur on the same F [] ?
> no - I propagated the index k from   
> static void multipleTests(final java.util.Random r)
> to checker - the failure occurs at different k
> below two failures at k=0 & k=3, the Fs would be different.

I didn't mean "same F within the F array". I meant "same F array". You create many random F arrays and I wanted to print every element of the one that fails and try using that array for all the tests.

Comment 31 Need Real Name 2011-03-24 20:04:03 UTC
1. Your code is running. no error yet. Hopefully there will be some by the morning.

2. Answering your question from 2011-03-24 15:53:26 EDT 
I meant exactly this. I propagated "k" from multipleTests to checker.
Array elements depend on k. 
The failures occurred at k=0 & k=3, i.e. the failures occurred 
at different elements of the F[] ,
i.e. the failure does not depend on F[] elements specifics.

Comment 32 Need Real Name 2011-03-24 21:10:25 UTC
Created attachment 487427 [details]
the failure with your version

This is the first failure report of your code.
other 4 instances are still running.

Comment 33 Deepak Bhole 2011-03-24 21:17:44 UTC
The value was changed 640 times according to the log, which means that loops are not being skipped, nor are extra ones being added. I am trying to verify the additions now.

Comment 34 Need Real Name 2011-03-24 21:23:06 UTC
I modified your code to save debugStr corresponding to matrix m1.
in case of an error two debugStr (corresponding to m1 & m2) 
should be now printed.
This will simplify the comparison.

Comment 35 Deepak Bhole 2011-03-24 21:32:35 UTC
Created attachment 487428 [details]
New TF.java file

Thanks for the quick responses! The calculations add up as well. Unfortunately that does not eliminate the possibility of hw error during arithmetic. 

Can you please try the file attached with this post? In addition to printing the calculations done for the new array, it also prints the original set of calculations.

Comment 36 Deepak Bhole 2011-03-24 21:34:09 UTC
Doh! Please ignore comment 35.. I didn't see comment 34.. you are already thinking along the same lines as me :)

Comment 37 Need Real Name 2011-03-25 05:11:04 UTC
The modified code, left to calculate for entire night lead the computer 
non-responsive & power rest was required. 
Debian box also also went non-responding to ssh, only to ping, exactly as it was last time. So no results produced.

What else can be the problem?
evidently memory allocator of axb array.
final double [][] m=new double [a][b];

This would explain:
1. Destabilisation of my computer & Debian.
2. Difficulty of reproduction on other people computers.
3. problem with long instead of double.
   it is unlikely to have the same problem in both long & double arithmetic.
4. Random nature of the problem.

On my workstation and on Debian I modified TFSimple.java 
to replace all 
new double [a][b] by a call to:

static double [][] newdouble2(final int a, final int b)
{
	final double [][] m=new double [a][];
	for(int i=a;--i>=0;)
	{
	    m[i]=new double [b];
	}
	return m;
}

and started 20 instances of this modified TFSimple1.java on my workstation.
previously 20 instances would destabilise my workstation 
in less than 1 minute and
and a reboot would required. 
Now for more than 40 minutes it is running OK.

but one TFsimle1.java failed anyway.

Exception in thread "main" java.lang.RuntimeException: TEST=26728 Elements [0][4] differ
m1[0][0]=0.1700774336283186	m2[0][0]=0.1700774336283186
m1[0][1]=0.11166978133704736	m2[0][1]=0.11166978133704736
m1[0][2]=0.2518757394821027	m2[0][2]=0.2518757394821027
m1[0][3]=1.0251160550185991	m2[0][3]=1.0251160550185991
m1[0][4]=6.11267581491528	m2[0][4]=5.654225128796633
m1[1][0]=1.043826838481748	m2[1][0]=1.043826838481748
m1[1][1]=0.6853579121009847	m2[1][1]=0.6853579121009847
m1[1][2]=1.5458526814816618	m2[1][2]=1.5458526814816618
m1[1][3]=6.291508684952188	m2[1][3]=6.291508684952188
m1[1][4]=37.51570643105274	m2[1][4]=37.51570643105274
m1[2][0]=3.2998209696571212	m2[2][0]=3.2998209696571212
m1[2][1]=2.1666030482227314	m2[2][1]=2.1666030482227314
m1[2][2]=4.886861408711588	m2[2][2]=4.886861408711588
m1[2][3]=19.88917272867012	m2[2][3]=19.88917272867012
m1[2][4]=118.59736711957807	m2[2][4]=118.59736711957807
m1[3][0]=12.971299713493547	m2[3][0]=12.971299713493547
m1[3][1]=8.516721894032319	m2[3][1]=8.516721894032319
m1[3][2]=19.209813069734487	m2[3][2]=19.209813069734487
m1[3][3]=78.18255077754401	m2[3][3]=78.18255077754401
m1[3][4]=466.1955931200476	m2[3][4]=466.1955931200476

	at TFsimple1.checkm(TFsimple1.java:118)
	at TFsimple1.runTest(TFsimple1.java:28)
	at TFsimple1.multipleTests(TFsimple1.java:129)
	at TFsimple1.main(TFsimple1.java:140)

[2]   Exit 1                  java TFsimple1

Comment 38 Need Real Name 2011-03-25 05:44:30 UTC
Created attachment 487461 [details]
updated TF.java

this is the TF.java I just started

Comment 39 Need Real Name 2011-03-25 05:47:16 UTC
Created attachment 487462 [details]
TFsimple1.java

this TFsimple1.java does not destabilize my workstation
(20 instances of the TFsimple.java would destabilize in less than 1 minute)

but this TFsimple1.java did produce one error anyway.

The difference between TFsimple.java & TFsimple1.java is in 
new double [a][b]

Comment 40 Need Real Name 2011-03-25 07:39:07 UTC
Created attachment 487493 [details]
test with extra diag

the test with extra diag

Comment 41 Need Real Name 2011-03-25 07:40:25 UTC
Created attachment 487494 [details]
the failure with TFo.java

the failure results with TFo.java

Comment 42 Need Real Name 2011-03-25 07:56:15 UTC
Created attachment 487495 [details]
where the problem started

This is where the problem started:


m1: (original)
Loop 5 Added 95.7778766848038 at [2][1]. sa=2.036801632444434 (waq+1.1)=2.612425807401708 xx=0.101 yy=0.072 New value=60665.69720562991


m2: (repeated)
Loop 5 Added 95.77787662549497 at [2][1]. sa=2.0368016311831787 (waq+1.1)=2.612425807401708 xx=0.101 yy=0.072 New value=60665.6972055706

m2 is incorrect:
java T1
 sa=2.036801632444434 RES=95.7778766848038

Comment 43 Need Real Name 2011-03-25 08:23:57 UTC
Created attachment 487503 [details]
repeated sa calculations

The code in T1.java does repeated calculation,
it was optimized away, I will look later on that.
But anyway correct SA=2.036801632444434,
which sometimes incorrectly calculated.

java  -XX:+PrintCompilation  T1
SA=2.036801632444434
  1       T1::getsa (139 bytes)
  2       java.util.Random::next (47 bytes)
---   n   sun.misc.Unsafe::compareAndSwapLong
  3       java.util.concurrent.atomic.AtomicLong::get (5 bytes)
  4       java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)
---   n   java.lang.Double::doubleToRawLongBits (static)
  5       java.lang.Double::doubleToLongBits (33 bytes)
---   n   java.lang.Object::clone
  6       java.lang.Double::compare (54 bytes)
  1%      T1::runTests @ 53 (131 bytes)

Comment 44 Deepak Bhole 2011-03-25 13:21:03 UTC
Ah, so it is an arithmetic issue. It is very improbable that a compiler optimization may cause such an error. This is looking more like an hw issue.

Can you try running it again, this time disabling the JIT compiler with:

-Djava.compiler=NONE

That will make it a lot slower, but it will eliminate the compiler as being the issue. Once we know it can happen with interpreter only, we can perhaps retry with something like gij.

Comment 45 Need Real Name 2011-03-25 16:10:32 UTC
I just started 5 instances with 
java -Djava.compiler=NONE -XX:+PrintCompilation TFo
But I expect no problem when JIT is not enabled.

I think this issue is very unlikely to be an arithmetic.
Reasons - the issue is 1) random and strongly linked to loops 2) same problem does also happen with longs, not only with doubles.

the value of "sa" is obtained as a sum over two loop : ks & ls (which are easily optimizable by jit),
and most likely reason - jit compiler did something bad with loop 
unrolling/optimization, generating wrong machine code which in some 
instances give just wrong values result, in other may cause 
system destabilization via memory or CPU overuse (like in that tests on Debian).


To see that I tried to see assembly, by 
trying to build openjdk-7-ea-src-b135-24_mar_2011.zip
 with -XX:+PrintAssembly but it gives me an error 
BUILD FAILED
/rdata/o1/openjdk/jaxp/build-defs.xml:70: ERROR: Cannot find source for project jaxp.
but jaxp the source is there in openjdk7

Comment 46 Deepak Bhole 2011-03-25 17:59:51 UTC
The reason I am disinclined to believe that this is a compiler issue is because we have been unable to reproduce this on any other hardware. I had the test running overnight last night (total of another 20+ hours on top of previous 12+) and I was still unable to trigger it.

Of course that does not preclude the possibility that it is a bug introduced by the compiler when it optimizes specifically for the hw you have. That's why I'd like to see what happens with -Djava.compiler=NONE.

Comment 47 Need Real Name 2011-03-25 18:10:33 UTC
The case 
-Djava.compiler=NONE
is still running still no problem.
I am compiling openjdk-7-ea-src-b135-24_mar_2011.zip
 now, trying to get -XX:+PrintAssembly 
option working

Comment 48 Deepak Bhole 2011-03-25 18:11:31 UTC
Great! I take it you figured out the jaxp error? I was just looking into it -- you need to do 'make ALLOW_DOWNLOADS=true' so that it will download it.

Comment 49 Need Real Name 2011-03-26 17:06:32 UTC
Created attachment 487902 [details]
failure 1 (assembler dump)

failure with assemble dump.
The problem, as we identified earlier, is incorrect calculation 
of sa inside the loop:
            double sa=0,wa=1;
            for(int ks=k;ks>=0;ks--)
            {          
            double sb=0,wb=1;
            for(int ls=l;ls>=0;ls--)
            {
                sb+=mx[ks][ls]*wb;
                wb*=xx;
                wb*=ls+1;
            }
            sa+=sb*wa;
            wa*=yy;
            wa*=ks+1;
            }
// HERE sa may be incorrect !!!
            m[k][l]+=sa*(l+2)*(k+4)*(waq+1.1);

I am attaching the assembler dump.

Comment 50 Need Real Name 2011-03-26 17:07:27 UTC
Created attachment 487903 [details]
second failure

second failure

Comment 51 Need Real Name 2011-03-26 17:08:16 UTC
Created attachment 487906 [details]
no failure, java terminated by kill

no failure, java terminated by kill

Comment 52 Need Real Name 2011-03-26 17:14:07 UTC
Created attachment 487916 [details]
TFO.java

This is TFO.java & class files.
assembler dump is attached in previous posts.

The goal, as I see now, is to identify the likehood of hardware failure
against hotspot bug.

Same code with JIT disabled was running for >12 hour without a problem.

I attached assembler bug, we probably know where the problem is
(double loop calculating sa) now one familiar with the hotspot 
need to check the asm dump.

Comment 53 Need Real Name 2011-03-26 17:29:18 UTC
Regarding hh_A.txt
the problem is with sa calculation:

CORRECT:--Loop 5 Added 1197.8708043516913 at [3][2]. sa=2.3885752977674133 (waq+1.1)=17.910718659530968 xx=0.101 yy=0.072 New value=27577.43591104506
vs
WRONG:-Loop 5 Added 1197.8708008520678 at [3][2]. sa=2.388575290789103 (waq+1.1)=17.910718659530968 xx=0.101 yy=0.072 New value=27577.435907545434

The correct sa=2.3885752977674133

this code is called many times inside other loops,
and once in 3-4 hours give wrong sa.



            double sa=0,wa=1;
            for(int ks=k;ks>=0;ks--)
            {          
            double sb=0,wb=1;
            for(int ls=l;ls>=0;ls--)
            {
                sb+=mx[ks][ls]*wb;
                wb*=xx;
                wb*=ls+1;
            }
            sa+=sb*wa;
            wa*=yy;
            wa*=ks+1;
            }

the matrix mx does not depend on anything, fixed for all tests.
and the value of sa depend on outside loop indices k & l only,
sa does not depend on anything else.

The problem: sa sometimes is calculated INCORRECTLY.
in interpreter mode (no JIT) I did not catch a failure yet.

Comment 54 Need Real Name 2011-03-26 17:40:22 UTC
One more reason why this is unlikely hardware bug:
I did run hundreds of unit tests for internal java code -
no problem at all. 
The only issue arise when there are two loops one inside another & JIT is enabled.
as in sa calculation above. No problem in any other situation.

Comment 55 Need Real Name 2011-03-26 17:43:46 UTC
Forgot to tell: 
the last tests are run with:

java -version
openjdk version "1.7.0-internal"
OpenJDK Runtime Environment (build 1.7.0-internal-root_2011_03_26_02_25-b00)
OpenJDK 64-Bit Server VM (build 21.0-b05, mixed mode)


as 


 for x in A B C D E ; do java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly   TFo >hh_$x 2>&1 & done

Comment 56 Need Real Name 2011-03-26 17:44:55 UTC
and the openjdk source used: 
openjdk-7-ea-src-b135-24_mar_2011.zip

Comment 57 Need Real Name 2011-03-26 18:21:16 UTC
looking at hh_A.txt output at the end - we see that 
the last run before failure (the data saved into debugStr printed at the end)
the sa at[k=3][l=2] was calculated 128 times total.
only the 27-th time the sa was calculated incorrectly to 
[3][2]. sa=2.388575290789103
all other times it was equal to correct value 
[3][2]. sa=2.3885752977674133

I expect there should be different asm used to calculate sa 
at k=3,k=2 at 27-th time, probably some loop optimization triggered,
and the transition between loop execution strategies 
may be broken.

Comment 58 Need Real Name 2011-03-26 18:47:04 UTC
By the way - the last JIT code generated related to function F.calc is asm inlining 
4 loops at once:
hh_A.txt:58122: ; - TFo::calc@87 (line 86)


openjdk did loop inline, generated new asm,
which did loop unrolling near hh_A.txt:58122
and this asm fails in some situation, not always.

once this loop inline was done - the 27th calculation 
of sa  at [k=3][l=2] became incorrect & the program failed,
right after last loop inline JIT newest generated asm.

Comment 59 Need Real Name 2011-03-26 19:04:22 UTC
Given the fact that loop unrolled code calculates sa incorrectly quite seldom 
and the problem is hard to reproduce, as well as that it popped up 
right after new asm unrolling was generated -  
I think the problem is that loop unrolling code is RACY in some sense. 
It probably has synchronization wrong and variable/registry
got clobbered in the middle of asm generation or execution.

Comment 60 Need Real Name 2011-03-26 20:47:05 UTC
By the way in other test results: hh_D.txt.gz
exactly the same pattern - unroll loops 
TFo::calc@87 (line 86) and then failure.

I do not think that just plain incorrect unrolling would not be caught by other people.
This is a race almost for sure.
locking is broken in either in 
1. JIT generator itself.
or 
2. The machine code generated by JIT generator.

Comment 61 Need Real Name 2011-03-26 20:55:06 UTC
Created attachment 487958 [details]
The results with TFsimple.java

Below are the results with TFsimple.java 
(much simpler code (no strings used), much less to analyse, but less data collected).

Same pattern - loop unroll and then test failure.

Comment 62 Denis Lila 2011-03-28 16:36:08 UTC
Created attachment 488194 [details]
new reproducer (hopefully)

Hi.

Can you please try running the attached file?
It is the same as TFsimple.java with a few modifications. I removed the F array and the F class because they weren't being used, and I made mx a static double[][]. Also, when you run it, please add the -XX:CompileOnly=TF::runTest,TF::calc option.

It will be interesting to see whether this fails, and also the assembler printouts should be shorter.

Comment 63 Need Real Name 2011-03-28 18:46:11 UTC
I just started it about 30 minutes ago as
for x in A B C D E F ; do java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:CompileOnly=TFD1::runTest,TFD1::calc   TFD1 >TFD1_$x 2>&1 & sleep 4 ; done

(renamed to TFD1 to avoid confusion with other versions ).
No problem yet. The TFsimple.java did give an error faster.
Your code is very unlikely to give an error because JIT does not compile 
the function "calc" regulary (and asm output does not grow).

In failed versions JIT did regular compilation of calc,
eventually some of them did fail. and the asm output did grow permanently.

I think this is a race almost for sure.
It is unlikely that plain wrong JIT output would not be caught by other people.

From my experience with linux - fixing a race is the organizational problem.
One need to reach the maintainer, who is familiar with locking in the specific
module. Any other method is just plain ineffective and may take weeks.

This is why I never liked SUN - I never being able to reach the mantainer
to fix the problem.

Comment 64 Need Real Name 2011-03-28 18:49:25 UTC
Created attachment 488225 [details]
new reproducer results.

Unlikely to give an error because asm output does not grow.
In failed versions JIT did compile the calc() regularly, 
what eventually produced an error. In this version no new asm generated
for long time.

Comment 65 Need Real Name 2011-03-28 19:05:41 UTC
I think if we tell hotspot JIT to be much more aggressive in loop unrolling
e.g. force it to recompile & unroll every 10-th call of calc.
In this case JIT compilations would go permanently and there are would be much higher chances to catch a race.

Once we identified this is a race -
for successful race fixing we need somehow to reach 
the maintainer/author of loop unrolling code at openjdk or oracle.
Anything else is just plain ineffective in case of race fixing.

Comment 66 Denis Lila 2011-03-28 19:14:16 UTC
(In reply to comment #64)
> Created attachment 488225 [details]
> new reproducer results.
> 
> Unlikely to give an error because asm output does not grow.
> In failed versions JIT did compile the calc() regularly, 

I'm not sure what you mean by "compiled regularly". I looked at every attachment you posted. In TFsimple.java, calc() is JIT compiled exactly once, and in the others it's compiled exactly twice.

You say that when it crashes the asm output grows. How does it grow? Does it output a lot of asm early, then stop for a while, and then it outputs more near the crash?

> what eventually produced an error. In this version no new asm generated
> for long time.

That's what I expected. Can you try running what I posted again, but this time without the -XX:CompileOnly... command?


Also, your hypothesis that it's a race sounded very reasonable at first, but now I'm not so sure. As far as I understand this (and correct me if I'm wrong), the code is getting compiled, it's running for a long time correctly (1 hour+), and then it breaks. If there was some race in the compilation and wrong code was produced, we'd be seeing a mistake immediately after the compilation. And I don't see how there can be a race in generated code, since there's no shared data here.

Comment 67 Need Real Name 2011-03-28 19:32:27 UTC
I just started with -XX:CompileOnly.

take for example TFo.java, I posted logs above.
>it's running for a long time correctly (1 hour+)
10-30 minutes correctly, then failure.
right before the failure where was always loop unrolling:

https://bugzilla.redhat.com/attachment.cgi?id=487902
file line 58122: ; - TFo::calc@87 (line 86)

https://bugzilla.redhat.com/attachment.cgi?id=487903
file 57868: ; - TFo::calc@87 (line 86)

after thing compilation a failure occur (probably because of a race),
all other asm is fininshing the output.

Comment 68 Need Real Name 2011-03-28 19:39:32 UTC
Most likely race I see -
the code produced by JIT (compiled code) may be racy by itself.
It breaks eventually. 

In TFo.java it was easy to catch, because the failure occured 
very soon after the JIT loop unrollng.
But it did produce some number of correct result before the failure.

for example first 
26 calculations at [k=3][l-2]. give correct sa=2.3885752977674133
then the 27-th calculation at [k=3][l=2]. give wrong sa=2.388575290789103

Comment 69 Need Real Name 2011-03-28 19:53:19 UTC
Created attachment 488237 [details]
hotspot failure

One of TFD1 processes produced hotspot failure 
in _multianewarray2_Java 
But this may be unrelated to problem of different apparence of the race.



tack: [0x00007fb206c21000,0x00007fb206d22000],  sp=0x00007fb206d206a0,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x7d75bf]  typeArrayKlass::allocate(int, Thread*)+0x7f
V  [libjvm.so+0x69664f]  objArrayKlass::multi_allocate(int, int*, Thread*)+0x15f
V  [libjvm.so+0x72c67a]  OptoRuntime::multianewarray2_C(klassOopDesc*, int, int, JavaThread*)+0x3a

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_multianewarray2_Java
J  TFD1.calc(IIDD[[DLTFD1$ComputationLog;)[[D
J  TFD1.runTest(III)V
j  TFD1.multipleTests()V+12
j  TFD1.main([Ljava/lang/String;)V+4
v  ~StubRoutines::call_stub

Comment 70 Denis Lila 2011-03-28 20:23:08 UTC
(In reply to comment #69)
> Created attachment 488237 [details]
> hotspot failure
> 
> One of TFD1 processes produced hotspot failure 
> in _multianewarray2_Java 
> But this may be unrelated to problem of different apparence of the race.

Now this is definitely a bug. You're probably right that it's unrelated.

Anyway, some VM options that may be of interest to you:

-XX:+CompileTheWorld
-XX:+TraceIterativeGVN
-XX:+DeoptimizeALot
-XX:+PrintInlining
-XX:-Inline // this will turn off inlining. If you can't reproduce it with this, that means something might be wrong with inlining.
-XX:+VerifyLoopOptimizations

Comment 71 Need Real Name 2011-03-28 20:48:04 UTC
Started TFo (most easy to reproduce) with -XX:-Inline no problem yet (15 minutes). Typically they fail in 5-20 minutes. I will leave TFo running until tomorrow.

Our internal unit tests (where I originally found the problem)
passed OK with options -XX:-Inline.

options -XX:+CompileTheWorld -XX:+TraceIterativeGVN -XX:+DeoptimizeALot -XX:+PrintInlining -XX:+VerifyLoopOptimizations 
are not recognized by openjdk7

Comment 72 Denis Lila 2011-03-28 21:00:30 UTC
(In reply to comment #71)
...
> options -XX:+CompileTheWorld -XX:+TraceIterativeGVN -XX:+DeoptimizeALot
> -XX:+PrintInlining -XX:+VerifyLoopOptimizations 
> are not recognized by openjdk7

That's because you need a debug build of openjdk7. For that, just do "make debug" instead of "make".

Comment 73 Denis Lila 2011-03-28 21:05:22 UTC
Also, -XX:+VerifyIterativeGVN is very good for spotting JIT mistakes.

Comment 74 Need Real Name 2011-03-28 21:20:56 UTC
Created attachment 488276 [details]
an error with -XX:-Inline

A failure with:
java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:-Inline TFo

Comment 75 Need Real Name 2011-03-28 21:34:40 UTC
In this failure sa is always correct sa=0.1700774336283186

error is in waq calculated

incorrect:
-Loop 3 Added 8.597971821578138 at [0][0]. New value=15.580084180784024

correct:
+Loop 3 Added 9.295708174139355 at [0][0]. New value=16.27782053334524


all other waq are correct:

-Loop 3 Added 0.0025153796731071836 at [0][0]. New value=29.080013812172588
+Loop 3 Added 0.0025153796731071836 at [0][0]. New value=30.727116245989492

Comment 76 Need Real Name 2011-03-28 21:38:33 UTC
Strangely, the pattern is always: for loop and then two indices array += something

for(....)
{
 m[k][l]+=wb;

also, is function inline & loop unroll ins the same -XX:-Inline 
or different jdk options?

Comment 77 Deepak Bhole 2011-03-28 22:14:32 UTC
What is the corresponding source code for the output in comment 74?

Also, can you try again, this time only supply -XX:LoopUnrollLimit=0

That should prevent unrolling

Comment 78 Need Real Name 2011-03-29 06:02:16 UTC
TFO.java is Comment 52  2011-03-26 13:14:07 EDT 

https://bugzilla.redhat.com/attachment.cgi?id=487493

Comment 79 Need Real Name 2011-03-29 11:18:28 UTC
Created attachment 488415 [details]
a failure with LoopUnrollLimit=0

a failure with LoopUnrollLimit=0

Comment 80 Need Real Name 2011-03-29 12:10:59 UTC
My second guess about the problem is multiarray memory allocator.
because:
1. there was a failure in ~RuntimeStub::_multianewarray2_Java ( comment 69 )
This may be unrelated, but may be a hint.
2. The problem started to show when the 2D & 3D arrays were added to the code.
3. All failing sums are the sums over a double array.

Also note my comment 37 above which did not produce any changes in behavior.

Comment 81 Deepak Bhole 2011-03-29 16:54:17 UTC
That is odd. Can you try again, this time printing the value of sa each time it is modified in any of the loops, and what components make it up? I'd do it myself if it was reproducible here :/

e.g.

            double sa=0,wa=1;
            for(int ks=k;ks>=0;ks--)
            {          
            double sb=0,wb=1;
            for(int ls=l;ls>=0;ls--)
            {
                sb+=mx[ks][ls]*wb;
                wb*=xx;
                wb*=ls+1;
            }
            sa+=sb*wa;
            wa*=yy;
            wa*=ks+1;
            }

becomes:
            double sa=0,wa=1;
            for(int ks=k;ks>=0;ks--)
            {          
            double sb=0,wb=1;
            for(int ls=l;ls>=0;ls--)
            {
                sb+=mx[ks][ls]*wb;
                wb*=xx;
                wb*=ls+1;
            }

            debugStr[k][l].append("Incrementing sa=" + sa + " by " + sb*wa + " to be " + sa+sb*wa);

            sa+=sb*wa;
            wa*=yy;
            wa*=ks+1;
            }


This will tell us which specific part is incorrect, and then we can try to narrow it down from there by printing when sb, wa, etc. are modified. If you want, you can also print the subcomponents initially too.. it will make the log bigger, but then it wont have to be narrowed down later

Comment 82 Need Real Name 2011-03-29 19:32:57 UTC
I will do this tomorrow.

Comment 83 Need Real Name 2011-03-29 19:46:37 UTC
One more thing: I just looked at that my old workstation (where failures occur)
which for >4 hours was running 8 instances of 

java -Djava.compiler=NONE TFo 


Kernel crashed in one of java processes.
uname -a
Linux comp90 2.6.34.8-68.fc13.x86_64 #1 SMP Thu Feb 17 15:03:58 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I do not have a serial port nulmodem cable handy to catch the trace.

but anyway - can this problem be Linux smp race?
this would explain strange behaviour.

Interestingly - there was no single sign of problem with that 
old workstation, besides java.

I think then the most straighforward way for me - trying to update
it to F15, then try to reproduce java failure.

Comment 84 Need Real Name 2011-03-29 19:49:46 UTC
By the way - Debian (8core 32Gb)  
uname -a
Linux forum 2.6.26-2-amd64 #1 SMP Tue Jan 25 05:59:43 UTC 2011 x86_64 GNU/Linux

java -version
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)


locks when trying to run 20 instances of TFSimple
only pressing "reset" helps.

Comment 85 Deepak Bhole 2011-03-29 20:06:18 UTC
I am not sure why the Debian box is locking up just on something like high load. It is unfortunate that the error cannot be verified there.

As for the F15 update -- I am curious to see that too now. Hopefully it doesn't freeze, and it doesn't trigger there either.

Is the box capable of KVM virtualization? You might want to give that a shot too.

Comment 86 Need Real Name 2011-04-04 17:37:45 UTC
I was very busy with other things, so I did not at this much more.

But I some kind managed to get some kind abnormal (but not exact failure)  on a 
4 cores, 24GB, Fedora (old, but this does not matter)
Linux cc 2.6.27.41-170.2.117.fc10.x86_64 #1 SMP Thu Dec 10 10:36:29 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

with TFsimple.java from Comment 14 2011-03-24 08:33:13 EDT 

The results:
start 20 instances of TFsimple

n=0 ; while [ $n -lt 20 ]; do java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly TFsimple >diag_$n &  n=$[ $n + 1 ] ;    done

it works perfectly fine for about 10-15 minutes (just high load), then the computer goes out of memory freezes, messages about >120 sec in interrupt in console, out of memory, etc.
No reaction on the keyboard, only pressing "reset" button helps.

The question is: why after about 10 minutes java processes start consuming that much memory on a such a non-memory intensive code as TFsimple.java

Comment 87 Need Real Name 2011-04-05 04:33:37 UTC
Created attachment 489908 [details]
TFSsimple.java

Code simplification on the same computer where all other tests were run,
the OS is not updated yet.
[][] arrays are removed, only [] left.
The test was run with hotspot disabled
java -XX:+PrintAssembly -Djava.compiler=NONE

The result (after few hours)

OpenJDK 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output
VM option '+PrintAssembly'
Running single thread code until test failure
Exception in thread "main" java.lang.RuntimeException: TEST=38917 Elements [3] differ
m1[0]=287.5015945415851	m2[0]=287.5015945415851
m1[1]=193.26240424710878	m2[1]=193.26240424710878
m1[2]=323.1759685698639	m2[2]=323.1759685698639
m1[3]=863.5951311480515	m2[3]=863.5951311480515

	at TFSsimple.checkm(TFSsimple.java:102)
	at TFSsimple.runTest(TFSsimple.java:19)
	at TFSsimple.multipleTests(TFSsimple.java:113)
	at TFSsimple.main(TFSsimple.java:124)

strange thing - m1[3] & m2[3] are identical.
they were checked at 
line 87: if(Double.compare(m1[i], m2[i]) != 0)
for some reason the value of array element were was taken wrong (most probably)
at line 87, but it was OK on lines 99 & 100.

Does this make any sense?

Comment 88 Need Real Name 2011-04-05 04:35:24 UTC
Created attachment 489909 [details]
proper TFSsimple.java

proper file TFSsimple.java

Comment 89 Denis Lila 2011-04-05 13:12:17 UTC
Hi.

I think what you have above is definitely a case of 2 different doubles seeming identical but having the same string representation. To confirm this, please run the test again, but replace

s.append("m1["+i+"]="+m1[i]+"\t");
s.append("m2["+i+"]="+m2[i]+"\n");

with

s.append("m1["+i+"]=" + Double.doubleToRawLongBits(m1[i]) + "\t");
s.append("m2["+i+"]=" + Double.doubleToRawLongBits(m2[i]) + "\n");


If I'm right and this inconsistency is due to simply the imprecise nature of finite space binary->decimal rational number conversion, then this last test conclusively rules out the optimizer or code generator as the problem. Since you can reproduce it with both the compiler and interpreter, and you can only reproduce it on one machine, the problem is almost certainly your hardware.

Perhaps you could try running some other computationally intensive task, like prime95 for a few hours?

Comment 90 Fedora End Of Life 2012-08-07 15:56:08 UTC
This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping


Note You need to log in before you can comment on or make changes to this bug.