Bug 164153

Summary: paranoia.c reports floating point problems
Product: Red Hat Enterprise Linux 3 Reporter: gpk
Component: glibcAssignee: Jakub Jelinek <jakub>
Status: CLOSED WONTFIX QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0CC: benl, drepper.fsp, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-02-09 17:51:27 EST Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Description Flags
glibc-i386-pow.patch none

Description gpk 2005-07-25 09:35:15 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616

Description of problem:
Paranoia.c is a venerable and respected test program for floating
point operations.  It can be found at

Compiling it as follows and running it yields:

Diagnosis resumes after milestone Number 40          Page: 6
Checking rounding on multiply, divide and add/subtract.
* is neither chopped nor correctly rounded.
/ is neither chopped nor correctly rounded.
Addition/Subtraction neither rounds nor chops.
Sticky bit used incorrectly or not at all.
FLAW:  lack(s) of guard digits or failure(s) to correctly round or chop
(noted above) count as one flaw in the final tally below.


Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
DEFECT:  Calculated 7.38905609548934539e+00 for
        (1 + (-1.11022302462515654e-16) ^ (-1.80143985094819840e+16);
        differs from correct value by -3.44130679508225512e-09 .
        This much error may spoil financial
        calculations involving tiny interest rates.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.download from http://www.netlib.org/paranoia/paranoia.c
2. gcc -o paranoia paranoia.c -lm

3. ./paranoia

Actual Results:  Much text, including above error message.

Expected Results:  Much text,without above error message.

Additional info:

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix
gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-52)

The computer is single processor

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 7
cpu MHz         : 2405.522
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4797.23
Comment 1 Jakub Jelinek 2005-09-08 03:34:36 EDT
*** Bug 167781 has been marked as a duplicate of this bug. ***
Comment 2 Jakub Jelinek 2005-09-08 03:37:54 EDT
I agree the error from pow (0.99999999999999989, -18014398509481984.0)
on i?86 is big even when using 80-bit precision, but libm is always a compromise
between accurracy and speed.  If accurracy is the only factor, then users should
use GMP with MPFR.  The C double implementation of pow is < 1ulp at least on
the above, but the i?86 assembly implementation is on the other side on the
above input 50% faster.

The above 50% are btw. comparing i386 .S pow speed with x86_64 .c pow speed
on the same machine, so i386 .c would likely be even slower (as it can't use
SSE/SSE2 unlike x86-64 version, unless there is SSE2 specific libm).
Comment 3 Jakub Jelinek 2005-09-16 07:11:55 EDT
The arguments used by paranoia for
pow (0.99999999999999989, -18014398509481984.0)
very likely will not show up in any real-world app, the exponent is
really to high for the result to make any sense.

Some libm functions (e.g. most of double functions originating
from IBM accurate precision library, but a few others as well) attempt
to have <=0.5 accuracy for all inputs, others are within a few ulps
accurate on all inputs and others are accurate within a few ulps only
for reasonable arguments (e.g. pow/i386 in this case).  It all depends
on which exact function on which exact architecture.  It is really
hard to generalize.

Some information about libm precision can be found in libm-test-ulps
and some e.g. in:

As for calling libm functions with rounding precision other than
3 on i?86, simply avoid that.  That is not supported, pow is not the
only function that relies on the excess precision for accurate
results.  Just use -mfpmath=sse -msse2 to compile code that
relies on no excess precision, or temporarily switch rounding precision
to 3 and back around calling all libm functions.

That said, from my testing so far it seems sysdeps/i386/fpu/e_pow.S
gives consistantly less accurate results when using the algorithm
for integral y than when using the general algorithm for non-integral
y if y is about 2^31 or higher.  If that's the case, perhaps a patch like
the following (untested) one could be applied.  This wouldn't slow down
the usual cases (pow called either with integral y, |y| < 2^31, or
non-integral y (and neither |y| >= 2^63)), actually would
even make the case of |y| < 2^31 case faster.

--- sysdeps/i386/fpu/e_pow.S.jj 2005-05-04 20:12:50.000000000 +0200
+++ sysdeps/i386/fpu/e_pow.S    2005-09-16 12:08:49.000000000 +0200
@@ -49,9 +49,9 @@ one:  .double 1.0
 limit: .double 0.29
-       ASM_TYPE_DIRECTIVE(p63,@object)
-p63:   .byte 0, 0, 0, 0, 0, 0, 0xe0, 0x43
+       ASM_TYPE_DIRECTIVE(p31,@object)
+p31:   .byte 0, 0, 0, 0, 0, 0, 0xe0, 0x41

 #ifdef PIC
 #define MO(op) op##@GOTOFF(%ecx)
@@ -99,10 +99,10 @@ ENTRY(__ieee754_pow)

        fxch                    // y : x

-       /* fistpll raises invalid exception for |y| >= 1L<<63.  */
+       /* fistpl raises invalid exception for |y| >= 1L<<31.  */
        fld     %st             // y : y : x
        fabs                    // |y| : y : x
-       fcompl  MO(p63)         // y : x
+       fcompl  MO(p31)         // y : x
        jnc     2f
@@ -110,8 +110,8 @@ ENTRY(__ieee754_pow)
        /* First see whether `y' is a natural number.  In this case we
           can use a more precise algorithm.  */
        fld     %st             // y : y : x
-       fistpll (%esp)          // y : x
-       fildll  (%esp)          // int(y) : y : x
+       fistpl  (%esp)          // y : x
+       fildl   (%esp)          // int(y) : y : x
        fucomp  %st(1)          // y : x
@@ -122,25 +122,20 @@ ENTRY(__ieee754_pow)
        cfi_adjust_cfa_offset (-4)
        popl    %edx
        cfi_adjust_cfa_offset (-4)
-       orl     $0, %edx
+       orl     $0, %eax
        fstp    %st(0)          // x
        jns     4f              // y >= 0, jump
        fdivrl  MO(one)         // 1/x          (now referred to as x)
        negl    %eax
-       adcl    $0, %edx
-       negl    %edx
 4:     fldl    MO(one)         // 1 : x

-6:     shrdl   $1, %edx, %eax
+6:     shrl    $1, %eax
        jnc     5f
        fmul    %st(1)          // x : ST*x
 5:     fmul    %st(0), %st     // x*x : ST*x
-       shrl    $1, %edx
-       movl    %eax, %ecx
-       orl     %edx, %ecx
        jnz     6b
        fstp    %st(0)          // ST*x
@@ -156,6 +151,7 @@ ENTRY(__ieee754_pow)
 31:    fstp    %st(1)

+32:    subl    $8, %esp
        cfi_adjust_cfa_offset (8)
        .align ALIGNARG(4)
 2:     /* y is a real number.  */
Comment 7 Jakub Jelinek 2005-11-22 16:19:09 EST
Created attachment 121374 [details]

Patch that both makes paranoia quiet and doesn't introduce make check
regressions.  The drawback is that it increases libm.so size by more than
2.7KB, not sure if that's the price we want to pay for it, especially when
it is for values nobody uses in real-world apps.
The patch uses the assembly for |y| < INT_MAX and for bigger exponents
falls back to more accurate C implementation (I chose the old e_pow.c from Sun
instead of IBM e_pow.c, because the latter would not enlarge libm.so by
2.7KB, but by several dozens KB because of all the dependencies and tables).