Bug 520518

Summary: dlamch is miscompiled
Product: Fedora
Component: lapack
Version: 11
Hardware: i386
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: low
Reporter: Dmitri A. Sergatskov <dasergatskov>
Assignee: Tom "spot" Callaway <tcallawa>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: alex, pasteur, tcallawa
Fixed In Version: 3.2.1-3.fc10
Doc Type: Bug Fix
Last Closed: 2009-09-06 20:38:01 UTC
Bug Blocks: 510841

Description Dmitri A. Sergatskov 2009-08-31 21:29:01 UTC
Description of problem:
dlamch.f is compiled with -Os, and that seems to cause problems.
It should be compiled with -O0 (or even -ffloat-store -O0, if
that is not redundant).

Version-Release number of selected component (if applicable):
lapack-3.1.1

How reproducible:
100%

Steps to Reproduce:
1. Compile the dlamchtst.f program from the lapack distribution:
gfortran -o dlamchtst dlamchtst.f -llapack
2. Run ./dlamchtst
  
Actual results:

  Epsilon                      =   2.22044604925031308E-016
  Safe minimum                 =   2.22507385850720138E-308
  Base                         =    2.0000000000000000     
  Precision                    =   4.44089209850062616E-016
  Number of digits in mantissa =    53.000000000000000     
  Rounding mode                =    0.0000000000000000     
  Minimum exponent             =   -1021.0000000000000     
  Underflow threshold          =   2.22507385850720138E-308
  Largest exponent             =    1024.0000000000000     
  Overflow threshold           =   1.79769313486231571E+308


Expected results:

  Epsilon                      =   1.11022302462515654E-016
  Safe minimum                 =   2.22507385850720138E-308
  Base                         =    2.0000000000000000     
  Precision                    =   2.22044604925031308E-016
  Number of digits in mantissa =    53.000000000000000     
  Rounding mode                =    1.0000000000000000     
  Minimum exponent             =   -1021.0000000000000     
  Underflow threshold          =   2.22507385850720138E-308
  Largest exponent             =    1024.0000000000000     
  Overflow threshold           =   1.79769313486231571E+308
  Reciprocal of safe minimum   =   4.49423283715578977E+307

(Note the Epsilon, Precision, and Rounding mode)
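
The three differing fields are linked rather than independent. As far as the reference dlamch.f goes, Epsilon is derived from the detected rounding flag: with base b and t mantissa digits, eps = b**(1-t)/2 when rounding is detected and eps = b**(1-t) otherwise, and Precision is eps*b. A minimal Python sketch of that relation follows; the helper name dlamch_eps_prec is made up for illustration, and this is not the Fortran routine itself.

```python
def dlamch_eps_prec(base, digits, rounds):
    """Hypothetical helper: derive (Epsilon, Precision) the way dlamch does."""
    eps = float(base) ** (1 - digits)  # spacing of doubles just above 1.0
    if rounds:
        eps /= 2                       # halved when proper rounding is detected
    return eps, eps * base             # Precision = Epsilon * Base


good = dlamch_eps_prec(2, 53, True)    # Rounding mode = 1
bad = dlamch_eps_prec(2, 53, False)    # Rounding mode = 0
print("rounding detected:     eps=%.17E prec=%.17E" % good)
print("rounding not detected: eps=%.17E prec=%.17E" % bad)
```

So a single wrong answer in the rounding test (Rounding mode 0 instead of 1) is enough to double both Epsilon and Precision, which matches the pattern in the two listings.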

Additional info:

This error propagates to the atlas-sse and atlas-sse2 packages,
since they are built against lapack.

Comment 1 Dmitri A. Sergatskov 2009-08-31 21:35:21 UTC
The same bug is present in Fedora 12 Alpha.

Comment 2 Tom "spot" Callaway 2009-08-31 21:43:04 UTC
Hmm. Looking at what is in rawhide:
[spot@pterodactyl INSTALL]$ rpm -q lapack
lapack-3.2.1-2.fc12.x86_64

[spot@pterodactyl INSTALL]$ gfortran -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -o dlamchtst dlamchtst.f -llapack
[spot@pterodactyl INSTALL]$ ./dlamchtst 
  Epsilon                      =   1.11022302462515654E-016
  Safe minimum                 =   2.22507385850720138E-308
  Base                         =    2.0000000000000000     
  Precision                    =   2.22044604925031308E-016
  Number of digits in mantissa =    53.000000000000000     
  Rounding mode                =    1.0000000000000000     
  Minimum exponent             =   -1021.0000000000000     
  Underflow threshold          =   2.22507385850720138E-308
  Largest exponent             =    1024.0000000000000     
  Overflow threshold           =   1.79769313486231571E+308
  Reciprocal of safe minimum   =   4.49423283715578977E+307

That all looks good to me. (Note that I used the optflags that were used to compile lapack). Try updating to the newest lapack update (the alpha didn't get the 3.2.1 update in time).

Comment 3 Dmitri A. Sergatskov 2009-08-31 21:52:13 UTC
This bug is i*86 specific; I do not see the problem on x86_64 (Fedora 11) either.

See https://bugzilla.redhat.com/show_bug.cgi?id=520275
for related info.

Comment 4 Dmitri A. Sergatskov 2009-08-31 21:55:45 UTC
Actually this bug report stemmed from the investigation into:
https://bugzilla.redhat.com/show_bug.cgi?id=510841
(issues with building octave and octave-forge on rawhide)

Comment 5 Tom "spot" Callaway 2009-08-31 22:03:21 UTC
Update to modern CPU architectures? :)

Okay, I'll get an i686 F12 virt instance up and see what I can do.

Comment 6 Dmitri A. Sergatskov 2009-08-31 22:09:31 UTC
"Update to modern CPU architectures? :)"

I did :). I had one hell of a time trying to find an i686
computer to reproduce this bug. Perhaps Fedora will
deprecate this x86 POS soon.

Cheers.

Dmitri.
--

Comment 7 Tom "spot" Callaway 2009-09-04 13:02:30 UTC
Hmm. I can reproduce this now on i686, but in my testing, using -O0 doesn't fix it.

Comment 8 Tom "spot" Callaway 2009-09-04 19:17:17 UTC
So, I'm not making any real progress trying to fix this issue. No matter what I change the compilation flags on dlamch.f to, the shared library tests return different values than the static test binaries. Even more confusing, the shared library dlamchtst results were apparently correct at one point; see:

 https://bugzilla.redhat.com/show_bug.cgi?id=138447

Any suggestions on how to proceed here would be greatly appreciated.

Comment 9 Dmitri A. Sergatskov 2009-09-04 19:39:59 UTC
Are you sure you are linking to the right library, and not the one from atlas
(which is also broken, since it is compiled against lapack's)?
It worked for me on F11. Perhaps we need to add -ffloat-store as well.
Jakub Jelinek will know for sure.

Comment 10 Dmitri A. Sergatskov 2009-09-04 19:44:29 UTC
OK, the numbers quoted by Terje Røsten (terjeros.no)
are incorrect -- this is probably the reason -Os was used in
lapack all along. The correct IEEE 754 numbers are:

  Epsilon                      =   1.11022302462515654E-016
  Safe minimum                 =   2.22507385850720138E-308
  Base                         =    2.0000000000000000     
  Precision                    =   2.22044604925031308E-016
  Number of digits in mantissa =    53.000000000000000     
  Rounding mode                =    1.0000000000000000     
  Minimum exponent             =   -1021.0000000000000     
  Underflow threshold          =   2.22507385850720138E-308
  Largest exponent             =    1024.0000000000000     
  Overflow threshold           =   1.79769313486231571E+308
  Reciprocal of safe minimum   =   4.49423283715578977E+307
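
These values can be cross-checked without LAPACK. The sketch below assumes that the Python interpreter's float is an IEEE 754 binary64 double (true on common platforms) and maps sys.float_info onto the dlamch field names; the mapping is my reading of the fields, not something taken from the lapack sources.

```python
import sys

fi = sys.float_info  # parameters of the platform's C double

# Assumed mapping from float_info attributes to the dlamch report fields.
values = {
    "Epsilon": fi.epsilon / 2,  # half the spacing at 1.0 (rounding case)
    "Safe minimum": fi.min,
    "Base": float(fi.radix),
    "Precision": fi.epsilon,
    "Number of digits in mantissa": float(fi.mant_dig),
    "Minimum exponent": float(fi.min_exp),
    "Underflow threshold": fi.min,
    "Largest exponent": float(fi.max_exp),
    "Overflow threshold": fi.max,
    "Reciprocal of safe minimum": 1.0 / fi.min,
}

for name, value in values.items():
    print("%-29s= %.17E" % (name, value))
```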

Comment 11 Tom "spot" Callaway 2009-09-04 20:54:42 UTC
Sorry about that. atlas-sse got dragged in by numpy and was skewing my testing.

Once I confirm that -O0 is all that is necessary, I'll make updates. Thanks for your help and patience.

Comment 12 Fedora Update System 2009-09-04 22:04:06 UTC
lapack-3.2.1-3.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/lapack-3.2.1-3.fc10

Comment 13 Fedora Update System 2009-09-04 22:04:11 UTC
lapack-3.2.1-3.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/lapack-3.2.1-3.fc11

Comment 14 Dmitri A. Sergatskov 2009-09-04 23:55:50 UTC
This bug is actually a burning issue for rawhide: it prevents
octave-3.2.2 and later (the baseline version in rawhide) from building
correctly.
Octave 3.0.5 (the baseline in F11) uses its own heuristic
(rather than dlamch) to find out the FP math parameters, so
it is less affected by this bug.

Comment 15 Tom "spot" Callaway 2009-09-05 00:13:25 UTC
Rawhide is already fixed, I did it at the same time.

Comment 16 Fedora Update System 2009-09-06 20:37:56 UTC
lapack-3.2.1-3.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2009-09-06 20:45:32 UTC
lapack-3.2.1-3.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 18 Alex Lancaster 2009-09-07 04:42:28 UTC
I'm not 100% clear on this, but wasn't this lapack build supposed to fix the warnings (which are soon to be treated as errors) in octave-forge, such as:

warning: lo_ieee_init: unrecognized floating point format!

If so, it doesn't appear that it has, see task here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=1659097

and full build log here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=1659097

for build of octave-forge against new lapack and recompiled octave (against new lapack).

Comment 19 Dmitri A. Sergatskov 2009-09-07 04:54:33 UTC
Are you sure atlas's lapack did not get linked in?
The atlas has to be rebuilt against the fixed lapack.

I also do not understand why it has -mtune=atom switch in there...

Comment 20 Alex Lancaster 2009-09-07 05:09:56 UTC
(In reply to comment #19)
> Are you sure atlas's lapack did not get linked in?

I'm not sure whether it did; I haven't changed anything in octave/octave-forge related to atlas, as I wasn't entirely sure what needs to happen.

> The atlas has to be rebuilt against the fixed lapack.

I will do that now.  Once atlas is rebuilt, does octave then need to be rebuilt as per bug #513381 comment #3 or is that a separate issue?

> I also do not understand why it has -mtune=atom switch in there...  

Where, in octave-forge or in octave?

Comment 21 Dmitri A. Sergatskov 2009-09-07 05:24:13 UTC
If you do "ldd /usr/bin/octave" does it
show /usr/lib/atlas(-sse)/liblapack ?

I am looking at root.log, and it shows atlas-sse as a
dependency.

I see mtune=atom in build.log

+ '[' -f configure ']'
+ CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables'

Perhaps it is benign (since march is still i686).

Comment 22 Alex Lancaster 2009-09-07 05:31:46 UTC
(In reply to comment #21)
> If you do "ldd /usr/bin/octave" does it
> show /usr/lib/atlas(-sse)/liblapack ?

Running on my usual F-11 with the old octave-3.0.5-1.fc11.x86_64 (not the new rawhide package as I don't have access to a rawhide machine to install it) it shows:

$ ldd /usr/bin/octave|grep lapack
	liblapack.so.3 => /usr/lib64/atlas/liblapack.so.3 (0x000000363b000000)
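
The same check can be scripted when many binaries need inspecting. This is a rough sketch only: the function name lapack_from_ldd and the "/atlas" path heuristic are assumptions made for illustration, and the sample line is the ldd output quoted just above.

```python
import re


def lapack_from_ldd(ldd_output):
    """Return the resolved path of liblapack in `ldd` output, or None."""
    for line in ldd_output.splitlines():
        # ldd prints lines of the form "name => /resolved/path (address)".
        match = re.match(r"\s*(liblapack\.so\S*)\s+=>\s+(\S+)", line)
        if match:
            return match.group(2)
    return None


sample = "\tliblapack.so.3 => /usr/lib64/atlas/liblapack.so.3 (0x000000363b000000)"
path = lapack_from_ldd(sample)
# Heuristic (assumption): ATLAS copies live under a directory named atlas*.
uses_atlas = path is not None and "/atlas" in path
print(path, "(ATLAS copy)" if uses_atlas else "(not under an atlas directory)")
```

In practice the text passed to the function would be the captured output of ldd for the binary in question.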

> I am looking at root.log, and it shows atlas-sse as a
> dependency.
> 
> I see mtune=atom in build.log
> 
> + '[' -f configure ']'
> + CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i686 -mtune=atom
> -fasynchronous-unwind-tables'
> 
> Perhaps it is benign (since march is still i686).  

If this isn't a problem that the lapack package can or needs to solve, perhaps we should move this discussion back to bug #510841.

Comment 23 Dmitri A. Sergatskov 2009-09-07 05:34:23 UTC
I filed a bug report against ATLAS:

https://bugzilla.redhat.com/show_bug.cgi?id=521579

Comment 24 Alex Lancaster 2009-09-07 05:42:33 UTC
(In reply to comment #23)
> I filed a bug report against ATLAS:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=521579  

rawhide build in progress:

http://koji.fedoraproject.org/koji/buildinfo?buildID=130835