Bug 43354

Summary: expm1l(x) overflows when 11356.1768 < x <= 11356.5234
Product: [Fedora] Fedora Reporter: Trevin Beattie <trevin>
Component: glibcAssignee: Jakub Jelinek <jakub>
Status: CLOSED RAWHIDE QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: 8CC: drepper, dvlasenk, mattdm, notting, petrosyan
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-09-17 13:03:59 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
New algorithm and testcase for it
none
Speed measurement program for the testcase none

Description Trevin Beattie 2001-06-03 18:44:26 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.4.2-2 i586)

Description of problem:
The value of (e^x - 1) is representable as a long double when x <=
11356.5234, but expm1l(x) overflows to infinity when x > 11356.1768
(approximately).


How reproducible:
Always

Steps to Reproduce:
Compile and run the following program:

#include <stdio.h>
#include <math.h>

main ()
{
  long double a, b;
  const long double x = 11356.35L;

  a = expl (x) - 1.0L;
  b = expm1l (x);
  if (a == b) {
    printf ("e^%.2Lf - 1 = %Lg\n", b);
    return 0;
  } else {
    printf ("e^%.2Lf - 1 = %Lg; but expm1l(%.2Lf) returned %Lg\n",
	    x, a, x, b);
    return 1;
  }
}


Actual Results:  e^11356.35 - 1 = 1.00032e+4932; but expm1l(11356.35)
returned inf


Expected Results:  e^11356.35 - 1 = 1.00032e+4932


Additional info:

This bug probably isn't even worth reporting, except for completeness. 
expm1l is designed to compute e^x-1 more accurately when x is negative, so
[at least the i386 version] uses a slightly different algorithm than expl. 
This algorithm overflows for large x where expl would not.  But then, expm1
really isn't intended for large x anyway.

In my own port of the math library, I circumvent the problem by calling
__ieee754_expl when x >= 64, because at that magnitude, -1 is too small to
affect the result.  This is probably overkill, but it works.

Comment 1 Jakub Jelinek 2001-06-05 08:02:17 UTC
Ulrich, what do you think about this?

Comment 2 Ulrich Drepper 2001-06-13 23:06:54 UTC
> except for completeness. 
> expm1l is designed to compute e^x-1 more accurately when x is negative, so
> [at least the i386 version] uses a slightly different algorithm than expl.

Yes, true.  If you compute expm1 the "normal" way you use a different series
than for computing exp.  But on x86 you'd not want to stop using the FPU
instructions and there is none available allows the processor internally to
use the different series.  It's just more drawback of the Intel FPU.

If you want to change it intercept the +Inf path and test for the arguments
which lead to incorrect results.  I don't consider this important enough to
do myself right now.

Comment 3 Vladimir Makarov 2004-10-04 22:29:52 UTC
gcc-2.96 is too old.  Its release cycle was finished long ago. 
Therefore I am closing the case.  The bug was fixed at least in 3.2
(rhel3).  If  it is still important, the customer could reopen the case.


Comment 4 Vladimir Makarov 2004-10-04 22:34:26 UTC
Sorry, occasionally I modified a wrong case.  I am reopening the bug.


Comment 5 Trevin Beattie 2004-10-04 22:46:30 UTC
Since this bug is so old, I decided to re-confirm the bug.  It is
still present in glibc-2.3.2-27.9.7 (RedHat 9).

But it is still just as non-urgent. ;^)


Comment 6 Bill Nottingham 2006-08-05 03:08:51 UTC
Red Hat apologizes that these issues have not been resolved yet. We do want to
make sure that no important bugs slip through the cracks.

Red Hat Linux 7.3 and Red Hat Linux 9 are no longer supported by Red Hat, Inc.
They are maintained by the Fedora Legacy project (http://www.fedoralegacy.org/)
for security updates only. If this is a security issue, please reassign to the
'Fedora Legacy' product in bugzilla. Please note that Legacy security update
support for these products will stop on December 31st, 2006.

If this is not a security issue, please check if this issue is still present
in a current Fedora Core release. If so, please change the product and version
to match, and check the box indicating that the requested information has been
provided.

If you are currently still running Red Hat Linux 7.3 or 9, please note that
Fedora Legacy security update support for these products will stop on December
31st, 2006. You are strongly advised to upgrade to a current Fedora Core release
or Red Hat Enterprise Linux or comparable. Some information on which option may
be right for you is available at http://www.redhat.com/rhel/migrate/redhatlinux/.

Any bug still open against Red Hat Linux 7.3 or 9 at the end of 2006 will be
closed 'CANTFIX'. Again, if this bug still exists in a current release, or is a
security issue, please change the product as necessary. We thank you for your
help, and apologize again that we haven't handled these issues to this point.


Comment 7 Trevin Beattie 2006-08-07 18:11:41 UTC
I have confirmed that this bug is present in Fedora Core 5.  This is not
surprising, since I had listed the bug as unimportant.  Feel free to close it as
'WONTFIX' if you don't think it's important either.

Comment 8 Ulrich Drepper 2006-10-20 23:04:29 UTC
VERIFIED is the wrong state.  It means a patch is verified.

Comment 9 petrosyan 2008-02-03 09:30:25 UTC
This bug is still present in Fedora 8:
gcc-4.1.2-33
glibc-2.7-2

Comment 10 petrosyan 2008-02-26 07:06:06 UTC
this bug is also present on x86_64.
Filed upstream http://sourceware.org/bugzilla/show_bug.cgi?id=5794

Comment 11 Denys Vlasenko 2008-07-30 09:47:05 UTC
I created a testcase of new algorithm, will attach it now.
Extended explanation (I do hope bugzilla wont destroy text alignment):

http://ex-cs.sist.ac.jp/~tkouya/cgi-bin/try_mpfr.cgi - 8192-bit online calculator

x86 long double has 1 sign bit, 15-bit biased exponent, 64-bit mantissa
with explicit msb. Example: 128.0 has bit value (hex) of
       4006 8000 0000 0000 0000,
 2.0 = 4000 8000 0000 0000 0000,
 1.0 = 3fff 8000 0000 0000 0000,
 0.0 = 0000 0000 0000 0000 0000,
-1.0 = bfff 8000 0000 0000 0000 etc.

ln(2^64) = 44.3614195558364998034.
Thus e^x with x >= 45 is bigger than 2^64 and therefore subtracting 1
(e^x - 1) does not change it if mantissa has only 64 bits.

Testing for x >= 45 is not very easy, it's easier to test for x >= 64
(just test for binary exponent >= 6 (in biased form: >= 0x4005).

However, long doubles have explicit msb in mantissa, this even if
exponent > 5, mantissa still can be 0.000...0001 (binary).
Thus, we'll test for binary exponent >= 7: generally it means
that x >= 128, and even denormalized long doubles are > 64 in this case.

Asm code is added at the beginning of expm1:

        movw    8+8(%rsp), %ax  // load sign bit and 15-bit exponent
        xorb    $0x80, %ah      // invert sign bit (now 1 is "positive")
        cmpw    $0xc006, %ax    // is num positive and exp >= 6 (number is >=
128.0)?
        jae     expl            // (if num is denormal, it is at least > 64.0)

Testing results:
./a.out 127.999999999:
...
Ok, last checked:
1.27999999999999999993e+02:         3.88770840599459506441e+55
                                    3.88770840599459506441e+55
                                   (3.88770840599459506521e+55)
expm1(1.27999999999999999993e+02) = 3.887708405994595065008714846...e55
[expm1() is obtained using 8192-bit online calculator, see URL above]

Thus, all values up to largest possible x < 128 are the same as old algorithm.

./a.out 128:
Different!
1.28000000000000000000e+02:         3.88770840599459510163e+55
3.88770840599459509206e+55 (3.88770840599459509206e+55)
                             Value: 3.88770840599459510163e+55  Bit value: 40b7
caf2 a62e ea10 bc1e (new)
                             Value: 3.88770840599459509206e+55  Bit value: 40b7
caf2 a62e ea10 bbfa (current)
                       expm1(128) = 3.887708405994595092222673688...e55

New algorightm gives different result, which is closer to "exact" one
than result of current algorightm.

Let's see how smooth the transition is:

For x < 128, new and current algorithms are the same:

                             new                        current                
    exp(x)
 1.27999999999999999965e+02: 3.88770840599459495222e+55
3.88770840599459495222e+55 (3.88770840599459495727e+55)
 1.27999999999999999972e+02: 3.88770840599459498944e+55
3.88770840599459498944e+55 (3.88770840599459498439e+55)
 1.27999999999999999979e+02: 3.88770840599459502692e+55
3.88770840599459502692e+55 (3.88770840599459501124e+55)
 1.27999999999999999986e+02: 3.88770840599459506441e+55
3.88770840599459506441e+55 (3.88770840599459503836e+55)
 1.27999999999999999993e+02: 3.88770840599459506441e+55
3.88770840599459506441e+55 (3.88770840599459506521e+55)
[1.27999999999999999993e+02 = 4005 ffff ffff ffff ffff]
                             3.88770840599459509206e+55 - expm1(128)
approximated as exp(128) by new algorithm
                             3.88770840599459510163e+55 - expm1(128) result via
current algorithm

Thus, we see that around x = 128 incrementing mantissa by 1 lsb
makes expm1(x) grow by      ~0.000000000000000037e+55

New algorithm does not exhibit "jump" here, transition
from 4005 ffff ffff ffff ffff to 4006 8000 0000 0000 0000
adds ~0.000000000000000028e+55 to "new" expm1(x).

I did many runs for x < 128 (including 0, negative numbers, etc)
and it gives identical results as expected.

Upper bound where result goes to +inf (./a.out 11356.52340629414):

1.13565234062941435678e+04: 1.18973149535677761008e+4932
1.13565234062941435687e+04: 1.18973149535677866678e+4932
1.13565234062941435695e+04: 1.18973149535677972347e+4932
[online calculator says     1.1897314953567796767173743362...e4932]
1.13565234062941435704e+04: inf
1.13565234062941435713e+04: inf
1.13565234062941435722e+04: inf


Comment 12 Denys Vlasenko 2008-07-30 09:58:45 UTC
Created attachment 312979 [details]
New algorithm and testcase for it

This testcase shows that approximating expm1l(x) with expl(x)
for x >= 128.0 makes expm1l(x) work correctly for x up to
11356.5234062941435695, provides more precise results for these large x,
and does not exhibit worsened precision for x < 128.0.
It also does not exhibit bad behavior at x = 128 (the function is
still ascending and df is not significantly different there
from nearby non-transition points).

Comment 13 Denys Vlasenko 2008-07-30 10:40:57 UTC
Created attachment 312981 [details]
Speed measurement program for the testcase

Drop test_speed.c in the same directory as testcase, build:
gcc -Os -lm test_speed.c s_expm1l.S s_expm1l_fixed.S
and run to compare execution speeds.

Each iteration is 64 calls of __glibc27_expm1l[_fixed]. Test is done 9 times in
order to show variability.

On my machine:
# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU	 T7500	@ 2.20GHz
stepping	: 11
cpu MHz 	: 800.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id 	: 0
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2
ssse3 cx16 xtpr lahf_lm ida
bogomips	: 4394.30

# ./a.out 1
Value: 1.00000000000000000000e+00  Bit value: 3fff 8000 0000 0000 0000
Testing at:
1.00000000000000000000e+00: 1.71828182845904523543e+00
1.71828182845904523543e+00 (2.71828182845904523543e+00)
Current algorithm, iterations in 0.1 sec:   9009  11411  11923	11931  11933 
11973  11932  11918  11917
    New algorithm, iterations in 0.1 sec:  11510  11787  11797	11798  11794 
11804  11810  11816  11808
# ./a.out 127
Value: 1.27000000000000000000e+02  Bit value: 4005 fe00 0000 0000 0000
Testing at:
1.27000000000000000000e+02: 1.43020799583481045627e+55
1.43020799583481045627e+55 (1.43020799583481044630e+55)
Current algorithm, iterations in 0.1 sec:  10781  11621  12022	12017  12022 
12002  12000  12022  12021
    New algorithm, iterations in 0.1 sec:  11753  11819  11835	11812  11829 
11826  11835  11829  11849
# ./a.out 128
Value: 1.28000000000000000000e+02  Bit value: 4006 8000 0000 0000 0000
Testing at:
1.28000000000000000000e+02: 3.88770840599459510163e+55
3.88770840599459509206e+55 (3.88770840599459509206e+55)
Current algorithm, iterations in 0.1 sec:   9741  11620  11910	11929  11741 
11947  11956  11940  11938
    New algorithm, iterations in 0.1 sec:  11188  11272  11310	11320  11315 
11302  11292  11286  11293
# ./a.out 11356
Value: 1.13560000000000000000e+04  Bit value: 400c b170 0000 0000 0000
Testing at:
1.13560000000000000000e+04: 7.04914579998566566372e+4931
7.04914579998566243830e+4931 (7.04914579998566243830e+4931)
Current algorithm, iterations in 0.1 sec:  11142  11612  12007	12017  12018 
12018  11748  12012  11801
    New algorithm, iterations in 0.1 sec:  11117  11223  11230	11241  11239 
11239  11211  11223  11227

Comment 14 Denys Vlasenko 2008-07-30 12:02:10 UTC
Slight error in explanation:

-------------
./a.out 128:
Different!
1.28000000000000000000e+02:         3.88770840599459510163e+55
3.88770840599459509206e+55 (3.88770840599459509206e+55)
                             Value: 3.88770840599459510163e+55 (new)
                             Value: 3.88770840599459509206e+55 (current)
                       expm1(128) = 3.887708405994595092222673688...e55
New algorightm gives different result, which is closer to "exact" one
than result of current algorightm.
----------

(new) and (current) above should be swapped.

Comment 15 Ulrich Drepper 2008-08-03 17:55:10 UTC
Your patch is very weird.  It's x86-64 code but this is supposed to be for x86.  If it is meant to be for x86-64, then it is wrong because on that platform the floating point parameters are passed in the SSE registers.

About the change itself, you jump to expl in the double implementation.  You should jump to exp instead (no l suffix).  Furthermore, you probably shouldn't jujp to exp since this would duplicate the error handling wrapper.  Jump to __ieee754_exp.  This also takes care of the problem of jumping to a symbol which is exported and hence would have to go through the PLT.

About the code itself: all the additional text like those ASM_TYPE_DIRECTIVE are not optional.  You cannot just remove it.  Leave it all on place.

Comment 16 Denys Vlasenko 2008-08-04 10:51:19 UTC
> Your patch is very weird.  It's x86-64 code but this is supposed to be for x86.

I do not understand how you reached this conclusion. What exactly is weird?

> If it is meant to be for x86-64, then it is wrong because on that platform the floating point parameters are passed in the SSE registers.

The bug is about expm1l - note 'l'! It's a *long double* function.
I doubt very much you can pass 80-bit long doubles in 64-bit SSE registers.
Therefore this function has to use legacy x86 FP registers and as a result,
the function uses no x86_64 specific instructions. It's the same for x86_64
and x86_32.

> About the change itself, you jump to expl in the double implementation.  You should jump to exp instead (no l suffix).

See above. It's exp1ml, not exp1m. Therefore I have to jump to expl.
Had I jumped to exp, my testing would instantly catch it. Do you think
I lied about my test results?

> Furthermore, you probably shouldn't jujp to exp since this would
duplicate the error handling wrapper.  Jump to __ieee754_exp.
This also takes care of the problem of jumping to a symbol
which is exported and hence would have to go through the PLT.

You are correct here, in a glibc patch I would need to do that.

What I did attach is not a patch which can be applied to glibc tree,
I attached a test program which is meant to show that the proposed
change is working correctly. Making a glibc patch based
on this program is easy.

Comment 17 Ulrich Drepper 2008-08-05 22:09:31 UTC
I brought the changes into shape to be applied to glibc cvs.  The next rawhide build will have the fixed code.

Comment 18 Denys Vlasenko 2008-09-02 10:55:04 UTC
Thanks Ulrich, should I close this bug now?

Comment 19 Ulrich Drepper 2008-09-17 13:03:59 UTC
Fixed in rawhide.  Might or might not be backported to older Fedora releases.    It'st important enough to bother.