Bug 164153
Summary: | paranoia.c reports floating point problems | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | gpk | ||||
Component: | glibc | Assignee: | Jakub Jelinek <jakub> | ||||
Status: | CLOSED WONTFIX | QA Contact: | |||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 3.0 | CC: | benl, drepper, tao | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i386 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2006-02-09 22:51:27 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
gpk
2005-07-25 13:35:15 UTC
*** Bug 167781 has been marked as a duplicate of this bug. *** I agree the error from pow (0.99999999999999989, -18014398509481984.0) on i?86 is big even when using 80-bit precision, but libm is always a compromise between accurracy and speed. If accurracy is the only factor, then users should use GMP with MPFR. The C double implementation of pow is < 1ulp at least on the above, but the i?86 assembly implementation is on the other side on the above input 50% faster. The above 50% are btw. comparing i386 .S pow speed with x86_64 .c pow speed on the same machine, so i386 .c would likely be even slower (as it can't use SSE/SSE2 unlike x86-64 version, unless there is SSE2 specific libm). The arguments used by paranoia for pow (0.99999999999999989, -18014398509481984.0) very likely will not show up in any real-world app, the exponent is really to high for the result to make any sense. Some libm functions (e.g. most of double functions originating from IBM accurate precision library, but a few others as well) attempt to have <=0.5 accuracy for all inputs, others are within a few ulps accurate on all inputs and others are accurate within a few ulps only for reasonable arguments (e.g. pow/i386 in this case). It all depends on which exact function on which exact architecture. It is really hard to generalize. Some information about libm precision can be found in libm-test-ulps files: http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/i386/fpu/libm-test-ulps?cvsroot=glibc http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/x86_64/fpu/libm-test-ulps?cvsroot=glibc and some e.g. in: http://people.redhat.com/drepper/libm/index.html As for calling libm functions with rounding precision other than 3 on i?86, simply avoid that. That is not supported, pow is not the only function that relies on the excess precision for accurate results. Just use -mfpmath=sse -msse2 to compile code that relies on no excess precision, or temporarily switch rounding precision to 3 and back around calling all libm functions. That said, from my testing so far it seems sysdeps/i386/fpu/e_pow.S gives consistantly less accurate results when using the algorithm for integral y than when using the general algorithm for non-integral y if y is about 2^31 or higher. If that's the case, perhaps a patch like the following (untested) one could be applied. This wouldn't slow down the usual cases (pow called either with integral y, |y| < 2^31, or non-integral y (and neither |y| >= 2^63)), actually would even make the case of |y| < 2^31 case faster. --- sysdeps/i386/fpu/e_pow.S.jj 2005-05-04 20:12:50.000000000 +0200 +++ sysdeps/i386/fpu/e_pow.S 2005-09-16 12:08:49.000000000 +0200 @@ -49,9 +49,9 @@ one: .double 1.0 ASM_TYPE_DIRECTIVE(limit,@object) limit: .double 0.29 ASM_SIZE_DIRECTIVE(limit) - ASM_TYPE_DIRECTIVE(p63,@object) -p63: .byte 0, 0, 0, 0, 0, 0, 0xe0, 0x43 - ASM_SIZE_DIRECTIVE(p63) + ASM_TYPE_DIRECTIVE(p31,@object) +p31: .byte 0, 0, 0, 0, 0, 0, 0xe0, 0x41 + ASM_SIZE_DIRECTIVE(p31) #ifdef PIC #define MO(op) op##@GOTOFF(%ecx) @@ -99,10 +99,10 @@ ENTRY(__ieee754_pow) fxch // y : x - /* fistpll raises invalid exception for |y| >= 1L<<63. */ + /* fistpl raises invalid exception for |y| >= 1L<<31. */ fld %st // y : y : x fabs // |y| : y : x - fcompl MO(p63) // y : x + fcompl MO(p31) // y : x fnstsw sahf jnc 2f @@ -110,8 +110,8 @@ ENTRY(__ieee754_pow) /* First see whether `y' is a natural number. In this case we can use a more precise algorithm. */ fld %st // y : y : x - fistpll (%esp) // y : x - fildll (%esp) // int(y) : y : x + fistpl (%esp) // y : x + fildl (%esp) // int(y) : y : x fucomp %st(1) // y : x fnstsw sahf @@ -122,25 +122,20 @@ ENTRY(__ieee754_pow) cfi_adjust_cfa_offset (-4) popl %edx cfi_adjust_cfa_offset (-4) - orl $0, %edx + orl $0, %eax fstp %st(0) // x jns 4f // y >= 0, jump fdivrl MO(one) // 1/x (now referred to as x) negl %eax - adcl $0, %edx - negl %edx 4: fldl MO(one) // 1 : x fxch -6: shrdl $1, %edx, %eax +6: shrl $1, %eax jnc 5f fxch fmul %st(1) // x : ST*x fxch 5: fmul %st(0), %st // x*x : ST*x - shrl $1, %edx - movl %eax, %ecx - orl %edx, %ecx jnz 6b fstp %st(0) // ST*x ret @@ -156,6 +151,7 @@ ENTRY(__ieee754_pow) 31: fstp %st(1) ret +32: subl $8, %esp cfi_adjust_cfa_offset (8) .align ALIGNARG(4) 2: /* y is a real number. */ Created attachment 121374 [details]
glibc-i386-pow.patch
Patch that both makes paranoia quiet and doesn't introduce make check
regressions. The drawback is that it increases libm.so size by more than
2.7KB, not sure if that's the price we want to pay for it, especially when
it is for values nobody uses in real-world apps.
The patch uses the assembly for |y| < INT_MAX and for bigger exponents
falls back to more accurate C implementation (I chose the old e_pow.c from Sun
instead of IBM e_pow.c, because the latter would not enlarge libm.so by
2.7KB, but by several dozens KB because of all the dependencies and tables).
|