Bug 1127544 - openjpeg links with -ffast-math, modifies floating point handling of all loaders of the library
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: openjpeg
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Orion Poplawski
QA Contact: Fedora Extras Quality Assurance
URL: https://code.google.com/p/openjpeg/is...
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-08-07 06:08 UTC by Elliott Sales de Andrade
Modified: 2015-07-30 18:06 UTC

Fixed In Version: openjpeg-1.5.1-14.fc20
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-03 17:22:59 UTC


Attachments
diagnostic script (1.57 KB, text/plain)
2015-04-27 12:14 UTC, Petr Viktorin

Description Elliott Sales de Andrade 2014-08-07 06:08:16 UTC
Description of problem:
After importing osgeo/gdal, the printing of floating point values changes.

Version-Release number of selected component (if applicable):
gdal-python-1.10.1-2.fc20.x86_64

Steps to Reproduce:
$ python
>>> s = '4.74303020008e-322'
>>> f = float(s)
>>> print(str(f))
4.74303020008e-322
>>> f
4.74e-322
>>> str(f) == s
True
>>> import osgeo
>>> print(str(f))
0.0
>>> f
0.0
>>> str(f) == s
False

Actual results:
See above.

Expected results:
Stringifying floats should not change when osgeo is imported.

Additional info:
I tried this with miniconda, which installs gdal-1.10.1. The change does not occur there, so there's something weird with Fedora's gdal package.
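
For context, the value used in the reproducer is a subnormal (denormal) double: it lies below the smallest normal double, sys.float_info.min. A minimal Python sketch of that check, assuming IEEE 754 doubles; the helper name is_subnormal is illustrative only and not part of any library:

```python
import sys


def is_subnormal(x: float) -> bool:
    """Return True if x is nonzero but smaller in magnitude than the smallest normal double."""
    return x != 0.0 and abs(x) < sys.float_info.min


f = float('4.74303020008e-322')
print(is_subnormal(f))       # the reproducer value
print(is_subnormal(1e-300))  # an ordinary, normal double for comparison
```

Only values in this range are expected to be affected; ordinary floats print the same before and after the import.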

Comment 1 Elliott Sales de Andrade 2015-04-21 03:08:15 UTC
This is actually a bit worse than initially found. It doesn't just affect printing. For example,

>>> f = 1.401298464324817e-45
>>> import numpy as np
>>> np.array(f, dtype='f4')
array(1.401298464324817e-45, dtype=float32)
>>> import osgeo
>>> np.array(f, dtype='f4')
array(0.0, dtype=float32)

The number above is chosen to be 0x00000001 in memory as a float32, i.e. the smallest positive subnormal. After the import it is converted to zero instead, which is causing test failures in some other packages.
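
The bit pattern can be checked without numpy. A small sketch using only the standard struct module, assuming an unmodified floating point environment (with the problem described here active, the conversion may yield 0.0 instead):

```python
import struct

# Reinterpret the 32-bit integer 0x00000001 as an IEEE 754 single-precision float.
bits = 0x00000001
(value,) = struct.unpack('<f', struct.pack('<I', bits))

print(value)                # the float32 value widened to a Python float
print(value == 2.0 ** -149) # the smallest positive float32 subnormal is 2**-149

# Going the other way: the literal from the example packs back to the same bytes.
print(struct.pack('<f', 1.401298464324817e-45) == struct.pack('<I', bits))
```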

Comment 2 Orion Poplawski 2015-04-25 23:55:34 UTC
I'm reassigning to python just to bring in some Python folks to see if they have any ideas on what might be happening here.  I'm at a loss as to where to start looking.

One possible difference from the conda package *may* be regeneration of the swig bindings with a newer swig, so it might be a swig thing as well.

Comment 3 Petr Viktorin 2015-04-27 12:14:03 UTC
Hello,
Importing the osgeo module changes the floating point environment, see fenv(3). Resetting the FP environment restores the behavior.

In particular (on my x86_64 machine), MXCSR flags are changed
from: 0b0001111110100010
  to: 0b1001111111110000

According to a random internet guide [0], the affected bits are "Denormal", "Underflow", "Denormals Are Zero", and "Flush To Zero", so this is likely the culprit.

I'll attach a script I found this with, but I'm not familiar enough with gdal to go dig up where this happens. I believe a well-behaved library should restore the environment, though, so I don't think it's a Python bug.

[0] http://softpixel.com/~cwright/programming/simd/sse.php
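
The two MXCSR values above can be decoded mechanically. A minimal sketch, assuming the standard x86 SSE MXCSR bit layout (bits 0-5 exception flags, bit 6 DAZ, bits 7-12 exception masks, bits 13-14 rounding control, bit 15 FZ); the names MXCSR_BITS and decode are illustrative only:

```python
# Bit positions in the x86 MXCSR register (rounding control, bits 13-14, handled separately).
MXCSR_BITS = {
    0: 'IE', 1: 'DE', 2: 'ZE', 3: 'OE', 4: 'UE', 5: 'PE',   # exception flags
    6: 'DAZ',                                                # denormals are zero
    7: 'IM', 8: 'DM', 9: 'ZM', 10: 'OM', 11: 'UM', 12: 'PM', # exception masks
    15: 'FZ',                                                # flush to zero
}


def decode(value: int) -> list:
    """Return the names of the MXCSR bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(MXCSR_BITS.items()) if (value >> bit) & 1]


before = 0b0001111110100010
after = 0b1001111111110000

print(decode(before))
print(decode(after))
print(decode(before ^ after))              # bits that differ between the two values
print((before >> 13) & 3, (after >> 13) & 3)  # rounding control field, before and after
```

The differing bits are the DE and UE status flags plus the DAZ and FZ control bits; the rounding control field is 0 (round to nearest) in both values.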

Comment 4 Petr Viktorin 2015-04-27 12:14:39 UTC
Created attachment 1019306
diagnostic script

Comment 5 Orion Poplawski 2015-04-27 14:33:37 UTC
Thanks for the info, just what I was hoping for!

Comment 6 Orion Poplawski 2015-04-28 17:09:35 UTC
It appears to get modified in a couple of locations:

import osgeo
Watchpoint 1: $mxcsr

Old value = [ PE IM DM ZM OM UM PM ]
New value = [ IM DM ZM OM UM PM ]
0x00007fffe1d9b6dd in ?? () from /lib64/libgfortran.so.3
(gdb) bt
#0  0x00007fffe1d9b6dd in ?? () from /lib64/libgfortran.so.3
#1  0x00007fffe1d98f2d in ?? () from /lib64/libgfortran.so.3
#2  0x00007ffff7deaf2a in call_init (l=<optimized out>, argc=argc@entry=1, 
    argv=argv@entry=0x7fffffffd6a8, env=env@entry=0x7fffffffd6b8) at dl-init.c:76
#3  0x00007ffff7deb03b in call_init (env=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, l=<optimized out>) at dl-init.c:34
#4  _dl_init (main_map=main_map@entry=0x6d1670, argc=1, argv=0x7fffffffd6a8, 
    env=0x7fffffffd6b8) at dl-init.c:124
#5  0x00007ffff7defa31 in dl_open_worker (a=a@entry=0x7fffffffc548) at dl-open.c:566
#6  0x00007ffff7deadd4 in _dl_catch_error (objname=objname@entry=0x7fffffffc538, 
    errstring=errstring@entry=0x7fffffffc540, mallocedp=mallocedp@entry=0x7fffffffc537, 
    operate=operate@entry=0x7ffff7def570 <dl_open_worker>, args=args@entry=0x7fffffffc548)
    at dl-error.c:187
#7  0x00007ffff7deeec3 in _dl_open (
    file=0x7ffff7e7578c "/usr/lib64/python2.7/site-packages/osgeo/_gdal.so", mode=-2147483646, 
    caller_dlopen=0x7ffff7b1ab9f <_PyImport_GetDynLoadFunc+287>, nsid=-2, 
    argc=<optimized out>, argv=<optimized out>, env=0x7fffffffd6b8) at dl-open.c:650
#8  0x00007ffff75ee039 in dlopen_doit (a=a@entry=0x7fffffffc760) at dlopen.c:66
#9  0x00007ffff7deadd4 in _dl_catch_error (objname=0x679890, errstring=0x679898, 
    mallocedp=0x679888, operate=0x7ffff75edfe0 <dlopen_doit>, args=0x7fffffffc760)
    at dl-error.c:187
#10 0x00007ffff75ee69d in _dlerror_run (operate=operate@entry=0x7ffff75edfe0 <dlopen_doit>, 
    args=args@entry=0x7fffffffc760) at dlerror.c:163
#11 0x00007ffff75ee0d1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#12 0x00007ffff7b1ab9f in _PyImport_GetDynLoadFunc () from /lib64/libpython2.7.so.1.0
#13 0x00007ffff7b0309e in _PyImport_LoadDynamicModule () from /lib64/libpython2.7.so.1.0
#14 0x00007ffff7b01b17 in imp_load_module () from /lib64/libpython2.7.so.1.0


Continuing.
Watchpoint 1: $mxcsr

Old value = [ IM DM ZM OM UM PM ]
New value = [ DAZ IM DM ZM OM UM PM FZ ]
0x00007fffe619d112 in set_fast_math () from /lib64/libopenjpeg.so.1
(gdb) bt
#0  0x00007fffe619d112 in set_fast_math () from /lib64/libopenjpeg.so.1
#1  0x00007ffff7deaf2a in call_init (l=<optimized out>, argc=argc@entry=1,
    argv=argv@entry=0x7fffffffd6a8, env=env@entry=0x7fffffffd6b8) at dl-init.c:76
#2  0x00007ffff7deb03b in call_init (env=<optimized out>, argv=<optimized out>,
    argc=<optimized out>, l=<optimized out>) at dl-init.c:34
#3  _dl_init (main_map=main_map@entry=0x6d1670, argc=1, argv=0x7fffffffd6a8,
    env=0x7fffffffd6b8) at dl-init.c:124
#4  0x00007ffff7defa31 in dl_open_worker (a=a@entry=0x7fffffffc548) at dl-open.c:566
#5  0x00007ffff7deadd4 in _dl_catch_error (objname=objname@entry=0x7fffffffc538,
    errstring=errstring@entry=0x7fffffffc540, mallocedp=mallocedp@entry=0x7fffffffc537,
    operate=operate@entry=0x7ffff7def570 <dl_open_worker>, args=args@entry=0x7fffffffc548)
    at dl-error.c:187
#6  0x00007ffff7deeec3 in _dl_open (
    file=0x7ffff7e7578c "/usr/lib64/python2.7/site-packages/osgeo/_gdal.so", mode=-2147483646,
    caller_dlopen=0x7ffff7b1ab9f <_PyImport_GetDynLoadFunc+287>, nsid=-2,
    argc=<optimized out>, argv=<optimized out>, env=0x7fffffffd6b8) at dl-open.c:650


Looks like this may be triggered by openjpeg being compiled with -ffast-math.  Ah, what a mess.  I wonder what the best method is for mitigating this in general.

Comment 7 Orion Poplawski 2015-04-28 17:16:28 UTC
Just to be clear: libgfortran is just clearing the Precision Exception bit that was set by an earlier exception somewhere.

libopenjpeg.so.1 is setting:

FZ - Flush To Zero
DAZ - Denormals Are Zero

FZ mode causes all underflowing operations to simply go to zero. This saves some processing time, but loses precision.

DAZ tells the CPU to force all Denormals to zero. A Denormal is a number that is so small that FPU can't renormalize it due to limited exponent ranges. They're just like normal numbers, but they take considerably longer to process. Note that not all processors support DAZ.

which certainly looks like the observed behavior.
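
To make the two modes concrete, here is a simplified software model in Python. It does not touch the real MXCSR register; it only imitates the described behavior (DAZ zeroes subnormal inputs, FZ zeroes subnormal results) for a single multiplication. The function names are hypothetical, and real hardware has further details this sketch ignores:

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest positive normal double


def flush(x: float) -> float:
    """Replace a subnormal value with a zero of the same sign; leave others alone."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


def multiply_with_daz_fz(a: float, b: float) -> float:
    """Model a multiply with DAZ (inputs flushed) and FZ (result flushed) enabled."""
    return flush(flush(a) * flush(b))


subnormal = 5e-324  # smallest positive subnormal double

# DAZ: the subnormal input is treated as zero, so scaling it up gives zero.
print(subnormal * 2.0 ** 60, multiply_with_daz_fz(subnormal, 2.0 ** 60))

# FZ: normal inputs whose product would be subnormal give zero.
print(SMALLEST_NORMAL * 0.5, multiply_with_daz_fz(SMALLEST_NORMAL, 0.5))
```

In each printed pair the first number is the default IEEE 754 result and the second is the modeled DAZ/FZ result.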

Comment 8 Orion Poplawski 2015-04-28 19:34:56 UTC
Looks like openjpeg needs to not *link* with -ffast-math.  I should be able to put together an update soon.

Comment 9 Orion Poplawski 2015-04-28 21:55:06 UTC
Filed bug upstream: https://code.google.com/p/openjpeg/issues/detail?id=488

Comment 10 Fedora Update System 2015-04-28 22:11:58 UTC
openjpeg-1.5.1-14.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/openjpeg-1.5.1-14.fc20

Comment 11 Fedora Update System 2015-04-28 22:12:07 UTC
openjpeg-1.5.1-14.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/openjpeg-1.5.1-14.fc21

Comment 12 Fedora Update System 2015-04-28 22:12:16 UTC
openjpeg-1.5.1-14.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/openjpeg-1.5.1-14.fc22

Comment 13 Fedora Update System 2015-04-30 11:34:27 UTC
Package openjpeg-1.5.1-14.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing openjpeg-1.5.1-14.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-7165/openjpeg-1.5.1-14.fc21
then log in and leave karma (feedback).

Comment 14 Fedora Update System 2015-05-03 17:22:59 UTC
openjpeg-1.5.1-14.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2015-05-12 20:51:05 UTC
openjpeg-1.5.1-14.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 16 Fedora Update System 2015-05-15 13:33:44 UTC
openjpeg-1.5.1-14.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 17 Andy Lutomirski 2015-07-29 23:55:11 UTC
Slightly off topic, but I wonder whether rpmlint could or should be taught to disallow shared libraries linked with -ffast-math.

Comment 18 Dave Malcolm 2015-07-30 13:36:29 UTC
(In reply to Andy Lutomirski from comment #17)
> Slightly off topic, but I wonder whether rpmlint could or should be taught
> to disallow shared libraries linked with -ffast-math.

Good idea.  Please file this as an RFE against rpmlint.

Comment 19 Andy Lutomirski 2015-07-30 18:06:18 UTC
Done: bug 1248744
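
One way such a check might work, sketched in Python: the startup code in question shows up in the backtrace in comment 6 as the function set_fast_math, so a checker could look for that symbol in a library's symbol listing. This is only an illustration and not rpmlint's actual API; the function name has_fast_math_constructor and the sample listing (addresses included) are made up, and a stripped library may not expose the symbol at all.

```python
def has_fast_math_constructor(symbol_listing: str) -> bool:
    """Return True if any line of an nm-style listing names the symbol set_fast_math."""
    for line in symbol_listing.splitlines():
        fields = line.split()
        # nm-style lines end with the symbol name: "<address> <type> <name>"
        if fields and fields[-1] == 'set_fast_math':
            return True
    return False


# Made-up listing in the style of `nm` output, for illustration only.
sample_listing = """\
0000000000001110 t set_fast_math
0000000000001200 T opj_decode
"""

print(has_fast_math_constructor(sample_listing))
print(has_fast_math_constructor("0000000000001200 T opj_decode\n"))
```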

