Bug 2341839

Summary: flexiblas FTBFS with latest glibc, undefined symbol: _ZGVdN8v_cosf
Product: [Fedora] Fedora
Reporter: Yaakov Selkowitz <yselkowi>
Component: flexiblas
Assignee: Iñaki Ucar <i.ucar86>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: rawhide
CC: arjun, codonell, dj, fberat, fweimer, i.ucar86, jlaw, josmyers, mcermak, mcoufal, mfabian, pfrankli, rjones, sipoyare, skolosov, suraj.ghimire7
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Linux
Last Closed: 2025-02-07 09:42:33 UTC

Description Yaakov Selkowitz 2025-01-23 21:22:15 UTC
While flexiblas was successfully rebuilt[1] during the F42 mass rebuild, the subsequent ELN build failed[2] with a test error:

flexiblas dlopen: /builddir/build/BUILD/flexiblas-3.4.4-build/flexiblas-3.4.4/build/lib/libflexiblas_fallback_lapack.so: undefined symbol: _ZGVdN8v_cosf
flexiblas  Failed to load the LAPACK fallback library.  Abort!

Retries[3][4] show that both aarch64 and x86_64 on F42 (despite building earlier) and ELN are affected.

As the missing symbol is from libmvec, which is supposed to be automatically linked with -lm[5], and based on the buildroot differences between the F42 mass rebuild and today, it seems there is some sort of regression of the libmvec linker handling between glibc 2.40.9000-28 and 2.40.9000-31.

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=2619429
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=128331499
[3] https://koji.fedoraproject.org/koji/taskinfo?taskID=128379127
[4] https://koji.fedoraproject.org/koji/taskinfo?taskID=128380920
[5] https://sourceware.org/glibc/wiki/libmvec

Reproducible: Always
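
For readers unfamiliar with the missing symbol's name, here is a minimal Python sketch that splits it into its parts. It assumes the x86-64 vector function naming scheme as I understand it (`_ZGV`, an ISA letter, a mask letter, the lane count, one letter per parameter, then the scalar function name). The helper name `decode_vector_symbol` and both lookup tables are illustrative only and cover x86-64 letters, not the aarch64 ones.

```python
import re

# Letter codes for x86-64 only; other architectures use different letters.
ISA_NAMES = {"b": "SSE", "c": "AVX", "d": "AVX2", "e": "AVX-512"}
MASK_NAMES = {"N": "unmasked", "M": "masked"}

# _ZGV <isa> <mask> <lanes> <parameter kinds> _ <scalar function name>
VECTOR_SYMBOL = re.compile(r"_ZGV([bcde])([NM])(\d+)([a-z][a-z0-9]*)_(\w+)")


def decode_vector_symbol(symbol):
    """Split a vector-variant symbol name into its components (hypothetical helper)."""
    match = VECTOR_SYMBOL.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a recognised x86-64 vector symbol: {symbol!r}")
    isa, mask, lanes, params, scalar = match.groups()
    return {
        "isa": ISA_NAMES[isa],
        "mask": MASK_NAMES[mask],
        "lanes": int(lanes),
        "params": params,
        "scalar": scalar,
    }


print(decode_vector_symbol("_ZGVdN8v_cosf"))
```

Read this way, `_ZGVdN8v_cosf` would be the unmasked 8-lane AVX2 variant of `cosf`, which fits the report that the symbol comes from libmvec and not from libm proper.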

Comment 1 Florian Weimer 2025-01-24 12:16:05 UTC
Looking at flexiblas-netlib-3.4.4-7.fc42.x86_64, I see this:

# eu-readelf --symbols=.dynsym /usr/lib64/flexiblas/libflexiblas_fallback_lapack.so | grep cos
   15: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF sincos
   31: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF cosf
  124: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF cos
  148: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF sincosf

So this shared object is underlinked. Previously, this worked because something else loaded libm.so.6.
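
A rough way to express that check in code: undefined math symbols are normal in a shared object, and the object counts as underlinked only when none of its NEEDED entries names the library that provides them. The Python sketch below parses the symbol listing quoted above together with a dynamic-section listing. The `DYNAMIC` text is a hypothetical example (it is not taken from the package), and the function names are illustrative.

```python
import re

# The four libm symbols that appear in the listing above; not a complete list.
LIBM_SYMBOLS = {"cos", "cosf", "sincos", "sincosf"}

SYMTAB = """\
   15: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF sincos
   31: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF cosf
  124: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF cos
  148: 0000000000000000      0 NOTYPE  GLOBAL DEFAULT    UNDEF sincosf
"""

# Hypothetical dynamic-section listing, made up for illustration.
DYNAMIC = """\
  NEEDED            Shared library: [libgfortran.so.5]
  NEEDED            Shared library: [libc.so.6]
"""


def undefined_symbols(symtab_text):
    """Return the names of UNDEF entries in a readelf-style symbol listing."""
    names = set()
    for line in symtab_text.splitlines():
        fields = line.split()
        if "UNDEF" in fields:
            index = fields.index("UNDEF")
            if index + 1 < len(fields):
                # Drop any @VERSION suffix from the symbol name.
                names.add(fields[index + 1].split("@")[0])
    return names


def needed_libraries(dynamic_text):
    """Return the sonames listed as NEEDED in a readelf-style dynamic listing."""
    return set(re.findall(r"NEEDED.*?\[([^\]]+)\]", dynamic_text))


def is_underlinked(symtab_text, dynamic_text):
    """True if libm symbols are referenced but libm.so.6 is not a NEEDED entry."""
    uses_libm = undefined_symbols(symtab_text) & LIBM_SYMBOLS
    return bool(uses_libm) and "libm.so.6" not in needed_libraries(dynamic_text)


print(is_underlinked(SYMTAB, DYNAMIC))
```

With the hypothetical listing, the check reports the object as underlinked. Adding a `libm.so.6` NEEDED line to `DYNAMIC` makes it pass.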

Now we apparently end up with auto-vectorization (which is a bit surprising, I thought this was a -ffast-math thing), but nothing else loads libmvec.so.1. This means this kind of underlinking stops working.

/usr/bin/gcc -fPIC -O2 -flto=auto -ffat-lto-objects -fexceptions -g
  -grecord-gcc-switches -pipe -Wall -Werror=format-security
  -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS
  -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
  -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1
  -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables
  -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2
  -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fPIC
  -fstack-protector-strong -fstack-clash-protection -Wlto-type-mismatch
  -Wunused-parameter -Wunused-but-set-parameter -D_FILE_OFFSET_BITS=64
  -DNDEBUG -O3 -Wpedantic -Wstrict-prototypes -Wcast-qual -flto=auto
  -fno-fat-lto-objects
  -Wl,--dependency-file=CMakeFiles/flexiblas_fallback_lapack.dir/link.d
  -Wl,-z,relro -Wl,-z,pack-relative-relocs -Wl,-z,now
  -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
  -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1
  -specs=/usr/lib/rpm/redhat/redhat-package-notes -rdynamic
  -Wl,--export-dynamic -shared
  -Wl,-soname,libflexiblas_fallback_lapack.so -o
  ../../lib/libflexiblas_fallback_lapack.so
  CMakeFiles/flexiblas_fallback_lapack.dir/dummy_3_12_0.c.o
  /usr/lib64/liblapack_pic.a

I don't see any -lm on the command line, so this is clearly a flexiblas bug.
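
The same inspection can be done mechanically. The Python sketch below tokenizes a link command and lists the libraries requested with `-l`. The command string is an abbreviated copy of the one above, with most flags left out, and `linked_libraries` is an illustrative helper and not part of any build tool. Appending `-lm` in the last line only shows what the check would report for a fully linked command; it is not necessarily the fix adopted upstream.

```python
import shlex

# Abbreviated from the link command quoted above; most flags omitted.
COMMAND = (
    "/usr/bin/gcc -fPIC -O2 -flto=auto -shared "
    "-Wl,-soname,libflexiblas_fallback_lapack.so "
    "-o ../../lib/libflexiblas_fallback_lapack.so "
    "CMakeFiles/flexiblas_fallback_lapack.dir/dummy_3_12_0.c.o "
    "/usr/lib64/liblapack_pic.a"
)


def linked_libraries(link_command):
    """Return library names requested as -l<name> or -l <name> (hypothetical helper)."""
    libs = []
    tokens = shlex.split(link_command)
    for index, token in enumerate(tokens):
        if token == "-l" and index + 1 < len(tokens):
            libs.append(tokens[index + 1])
        elif token.startswith("-l") and len(token) > 2:
            libs.append(token[2:])
    return libs


print("m" in linked_libraries(COMMAND))           # abbreviated original command
print("m" in linked_libraries(COMMAND + " -lm"))  # same command with -lm appended
```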

Comment 2 Iñaki Ucar 2025-01-24 12:26:11 UTC
Thanks, I'll bring this upstream.

Comment 3 Iñaki Ucar 2025-01-24 12:26:32 UTC
Here: https://github.com/mpimd-csc/flexiblas/issues/62

Comment 4 Iñaki Ucar 2025-01-24 12:34:34 UTC
FlexiBLAS is just a wrapper that redirects calls to OpenBLAS (if the function is available) or Netlib's BLAS/LAPACK (if not). Is it possible that OpenBLAS and/or Netlib needs to be rebuilt with this version of glibc?

Comment 5 Florian Weimer 2025-01-24 12:48:47 UTC
(In reply to Iñaki Ucar from comment #4)
> FlexiBLAS is just a wrapper that redirects calls to OpenBLAS (if the
> function is available) or Netlib's BLAS/LAPACK (if not). Is it possible that
> OpenBLAS and/or Netlib needs to be rebuilt with this version of glibc?

The impacted object is /usr/lib64/flexiblas/libflexiblas_fallback_lapack.so, which looks like it's more than just a wrapper.

I don't think a glibc change is responsible for this. It's more likely that it's caused by some optimizer changes in GCC that trigger more vectorization. In any case, the underlinking is a pre-existing flexiblas bug.

Comment 6 Iñaki Ucar 2025-01-24 12:51:54 UTC
libflexiblas_fallback_lapack.so mainly contains a static version of Netlib taken from our build of the lapack source package. See https://src.fedoraproject.org/rpms/flexiblas/blob/rawhide/f/flexiblas.spec#_256

Comment 7 Iñaki Ucar 2025-01-28 14:22:38 UTC
I tried a scratch build with system_lapack off, which basically rebuilds Netlib's BLAS/LAPACK into this libflexiblas_fallback_lapack.so instead of using the system version, and the build succeeds: https://koji.fedoraproject.org/koji/taskinfo?taskID=128568478. So there may be underlinking issues in FlexiBLAS, but this means that the latest glibc updates contain some ABI incompatibility, right?