The OpenEXR.testOptimizedInterleavePatterns test defined in openexr-3.1.7/src/test/OpenEXRTest/testOptimizedInterleavePatterns.cpp of the latest openexr package is failing when openexr is compiled with -march=x86-64-v3. This is causing this package to FTBFS in ELN despite building successfully in rawhide. Reproducible: Always Steps to Reproduce: On x86_64, build openexr from rawhide in ELN tag (e.g. fedpkg --release eln build|mockbuild) *or* build for rawhide with `%global __cflags_arch_x86_64 "-march=x86-64-v3"` defined in openexr.spec. Actual Results: Start 91: OpenEXR.testOptimizedInterleavePatterns 80/112 Test #91: OpenEXR.testOptimizedInterleavePatterns ...Subprocess aborted***Exception: 0.55 sec tempDir = /var/tmp/OpenEXRTest_RIMFCORH ======= Running testOptimizedInterleavePatterns Testing SSE optimisation with different interleave patterns (large images) ... 0, 0: RGBHalf read as RGBHalf... OK OPTIMISED 0, 1: RGBHalf read as RGBAHalf... OK OPTIMISED 0, 2: RGBHalf read as ABGRHalf... OK 0, 3: RGBHalf read as RGBFloat... error reading back channel B pixel 21,-76 got -nan expected -nan OpenEXRTest: /builddir/build/BUILD/openexr-3.1.7/src/test/OpenEXRTest/testOptimizedInterleavePatterns.cpp:233: bool {anonymous}::compare(const Imf_3_1::FrameBuffer&, const Imf_3_1::FrameBuffer&, const Imath_3_1::Box2i&, bool): Assertion `writtenHalf.bits()==readHalf.bits()' failed. Expected Results: Test passes, as in rawhide (which defaults to -mtune=generic).
To narrow things down a bit, this test passes with `%global __cflags_arch_x86_64 "-march=x86-64-v2"`, so it is specific to a -v3 optimization.
Further narrowing this down, looks like F16C is the culprit; `-march=x86-64-v2 -mf16c` (with or without AVX/AVX2/BMI/BMI2) fails, `-march=x86-64-v3 -mno-f16c` succeeds.
Reproduced: 99% tests passed, 1 tests failed out of 112 Total Test time (real) = 62.74 sec The following tests FAILED: 91 - OpenEXR.testOptimizedInterleavePatterns (Subprocess aborted) Errors while running CTest error: Bad exit status from /var/tmp/rpm-tmp.TSF4gE (%check)
Created attachment 1970772 [details] .ii file that seems to be miscompiled Looks like this file is miscompiled. I can't bisect it though.
Compiled with /usr/bin/g++ -DILM_IMF_TEST_IMAGEDIR=\"/builddir/build/BUILD/openexr-3.1.7/src/test/OpenEXRTest/\" -I/builddir/build/BUILD/openexr-3.1.7/redhat-linux-build/src/test/OpenEXRTest -I/builddir/build/BUILD/openexr-3.1.7/src/test/OpenEXRTest -I/builddir/build/BUILD/openexr-3.1.7/src/lib/OpenEXR -I/builddir/build/BUILD/openexr-3.1.7/redhat-linux-build/cmake -I/builddir/build/BUILD/openexr-3.1.7/src/lib/Iex -I/builddir/build/BUILD/openexr-3.1.7/src/lib/IlmThread -isystem /usr/include/Imath -O2 -fexceptions -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v3 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -DNDEBUG -MD -MT src/test/OpenEXRTest/CMakeFiles/OpenEXRTest.dir/testOptimizedInterleavePatterns.cpp.o -MF CMakeFiles/OpenEXRTest.dir/testOptimizedInterleavePatterns.cpp.o.d -o CMakeFiles/OpenEXRTest.dir/testOptimizedInterleavePatterns.cpp.o -c /builddir/build/BUILD/openexr-3.1.7/src/test/OpenEXRTest/testOptimizedInterleavePatterns.cpp Adding -mno-f16c fixes the problem.
The code is different when -mf16c is enabled, see src/lib/OpenEXRCore/unpack.c. Or for this case more importantly imath_float_to_half in /usr/include/Imath/half.h which uses _cvtss_sh if -mf16c but some hand-written bit-twiddling otherwise. The failure is that readHalf is 0xff26 while writtenHalf is 0xfd26, the former one is a qNaN and the latter corresponding sNaN. The corresponding word at ptr for readHalf is 0xffe4c000 which is again a qNaN, writtenHalf is read directly as half. I'd say something is canonicalizing the sNaN somewhere on the way to qNaN, which I believe the F16C instructions normally do but the hand-written code in half.h doesn't: // inf or nan if (IMATH_UNLIKELY (ui >= 0x7f800000)) { ret |= 0x7c00; if (ui == 0x7f800000) return ret; m = (ui & 0x7fffff) >> 13; // make sure we have at least one bit after shift to preserve nan-ness return ret | (uint16_t)m | (uint16_t)(m == 0); }
A small program which shows the differences between the f16c intrinsics and the pure C implementation of the <half.h> conversions (left in just the parts related to NaNs): #include <x86intrin.h> float foo (unsigned short x) { return _cvtsh_ss (x); } float bar (unsigned short x) { unsigned int hexpmant = ( (unsigned int)(x) << 17 ) >> 4; unsigned int vi = ((unsigned int)(x >> 15)) << 31; if (hexpmant >= 0x00800000) { vi |= hexpmant; if (hexpmant < 0x0f800000) vi += 0x38000000; else vi |= 0x7f800000; } else __builtin_abort (); float f; __builtin_memcpy (&f, &vi, sizeof (float)); return f; } unsigned short baz (float x) { return _cvtss_sh (x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC); } unsigned short qux (float x) { unsigned int vi; __builtin_memcpy (&vi, &x, sizeof (float)); unsigned int ui = (vi & ~0x80000000); unsigned int ret = ((vi >> 16) & 0x8000); if (ui >= 0x38800000) { if (ui >= 0x7f800000) { ret |= 0x7c00; if (ui == 0x7f800000) return ret; unsigned int m = (ui & 0x7fffff) >> 13; return ret | (unsigned short)m | (unsigned short)(m == 0); } else __builtin_abort (); } else __builtin_abort (); } int main () { union U { float f; unsigned int u; } x[6]; x[0].f = foo (0xff26); x[1].f = bar (0xff26); x[2].f = foo (0xfd26); x[3].f = bar (0xfd26); __builtin_printf ("%x %x %x %x\n", x[0].u, x[1].u, x[2].u, x[3].u); x[4].u = 0xffe4c000; x[5].u = 0xffa4c000; __builtin_printf ("%x %x %x %x\n", baz (x[4].f), qux (x[4].f), baz (x[5].f), qux (x[5].f)); return 0; } With -O2 -mf16c this prints: ffe4c000 ffe4c000 ffe4c000 ffa4c000 ff26 ff26 ff26 fd26 for me, which shows that in both directions, the f16c intrinsics convert sNaNs to qNaNs, while the handwritten code does not.
The half-to-float conversion already goes wrong. I re-ran the test with some instrumentation and reported this upstream: https://github.com/AcademySoftwareFoundation/openexr/issues/1456 At this point, I have no idea whether the test is incorrect, or whether OpenEXR should stop using imath because it does not do what it needs.
I'm going to switch this package to -mno-f16c on x86-64 until upstream figures out whether the code (i.e., dependency on imath) is wrong or the test.
(In reply to Florian Weimer from comment #9) > I'm going to switch this package to -mno-f16c on x86-64 until upstream > figures out whether the code (i.e., dependency on imath) is wrong or the > test. Well, the package is now FTBFS in rawhide, too. I filed two new upstream issues: SSE2 test inconsistency on i686 <https://github.com/AcademySoftwareFoundation/openexr/issues/1459> testDWAACompression and testDWABCompression fail on aarch64, i686 <https://github.com/AcademySoftwareFoundation/openexr/issues/1460> I've pushed my F16C fix nevertheless, but it currently won't build on those other architectures. I don't want to disable the testsuite and push a potentially broken package into rawhide.