Bug 2160296

Summary: ld.bfd: undefine reference with
Product: Red Hat Enterprise Linux 9 Reporter: Ben Marzinski <bmarzins>
Component: binutils Assignee: Nick Clifton <nickc>
binutils sub component: gcc-toolset-11 QA Contact: qe-baseos-tools-bugs
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: unspecified    
Priority: unspecified CC: fweimer, mprchlik, ohudlick, sipoyare
Version: 9.2 Keywords: Bugfix, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-01-13 09:15:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Simplified reproducer for the ld.bfd issue. none

Description Ben Marzinski 2023-01-12 01:10:15 UTC
Created attachment 1937507
Simplified reproducer for the ld.bfd issue.

Description of problem:
Starting with binutils-2.35.2-26.el9.x86_64, the device-mapper-multipath package no longer builds due to an issue with ld.bfd and -flto=auto that appears to have been introduced in the fix to bz #2148469. I've created and attached a simple reproducer.

To hit this issue, you need two libraries and a program that uses them. Library_1 needs to use symbol versioning and define a function (function_1) that is called by a function (function_2) in Library_2. The calling program needs to override function_1 with its own implementation, and call function_2. If the -flto=auto compilation option is used, then when Library_2 is compiled, it will link against the versioned symbol for function_1 provided by Library_1. When the program is then linked, the reference in Library_2 will not be satisfied by the unversioned symbol provided by the program, and the link fails.

In the case of device-mapper-multipath, Library_1 is libmultipath, which defines the function get_multipath_conf(). Library_2 is libmpathpersist, which calls this function in do_mpath_persistent_reserve_out(). The implementation of get_multipath_conf() in libmultipath only works for non-threaded programs, and is used by multipath and mpathpersist. The multipath daemon, multipathd, is multi-threaded, and needs to add locking around getting the configuration, so it overrides get_multipath_conf() with a locking version.
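
For illustration only, a locking override of the kind described above might look roughly like the following. This is a minimal sketch, not the actual multipathd code: the struct config contents, the daemon_conf and conf_lock variables, the use of a plain mutex, and the put_multipath_conf() release helper as written here are all assumptions made for the example.

```c
#include <pthread.h>

/* Hypothetical stand-in for the real configuration structure. */
struct config {
    int verbosity;
};

/* Hypothetical daemon-side state: one config object guarded by a mutex. */
static struct config daemon_conf = { .verbosity = 2 };
static pthread_mutex_t conf_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Definition in the program with the same name as the library function.
 * Because the executable comes first in symbol lookup order, calls made
 * from the shared libraries are expected to resolve to this version.
 * The caller holds the lock until it calls put_multipath_conf().
 */
struct config *get_multipath_conf(void)
{
    pthread_mutex_lock(&conf_lock);
    return &daemon_conf;
}

/* Assumed companion function that releases the lock taken above. */
void put_multipath_conf(struct config *conf)
{
    (void)conf;
    pthread_mutex_unlock(&conf_lock);
}
```

The point of the sketch is only that the program's definition must win over the library's; the bug is that the linker refused to let it.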

device-mapper-multipath and my simplified reproducer compile fine with binutils-2.35.2-25.el9.x86_64. They also compile fine using binutils-2.39-6.fc38.x86_64 from Fedora Rawhide. If I define the symbol to be overridden (function_1 in my initial explanation and get_multipath_conf in device-mapper-multipath) as weak in the calling library (Library_2 in my initial explanation, and libmpathpersist in device-mapper-multipath), I can work around the issue. Linking with ld.gold also avoids the issue.
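
As a rough sketch of the weak-symbol workaround, the calling library could declare the overridable function with GCC's weak attribute. The name overridable_func comes from the reproducer's error output; the name extension_func, the int return type, and the NULL check are assumptions for this example and may not match the reproducer sources.

```c
/*
 * Weak declaration in the calling library (Library_2). A weak undefined
 * reference does not have to be satisfied at link time; if nothing
 * defines the symbol, its address evaluates to NULL.
 */
extern int overridable_func(void) __attribute__((weak));

/* Hypothetical Library_2 entry point that calls the overridable function. */
int extension_func(void)
{
    /* Guard against the case where no definition was linked in at all. */
    if (!overridable_func)
        return -1;
    return overridable_func();
}
```

When this file is built on its own with no definition of overridable_func, extension_func() takes the guarded path and returns -1. When the program or Library_1 provides a definition, the call goes to that definition.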

Version-Release number of selected component (if applicable):
binutils-2.35.2-26.el9.x86_64

How reproducible:
Always

Steps to Reproduce:
1. download and untar the attached reproducer.tgz
2. run "make" in the reproducer directory

Actual results:
[root@rhel-storage-100 reproducer]# rpm -q binutils
binutils-2.35.2-26.el9.x86_64
[root@rhel-storage-100 reproducer]# make
gcc -flto=auto -c -o no_override.o main.c
gcc -flto=auto -shared -fPIC -o libbase.so base.c \
        -Wl,--version-script,libbase.version
gcc -flto=auto -shared -fPIC -o libextension.so extension.c -L. -lbase
gcc  -o test_no_override no_override.o -L. -lbase -lextension
gcc -flto=auto -DOVERRIDE -c -o yes_override.o main.c
gcc  -o test_yes_override yes_override.o -L. -lbase -lextension
/usr/bin/ld: ./libextension.so: undefined reference to `overridable_func.0'
collect2: error: ld returned 1 exit status
make: *** [Makefile:38: test_yes_override] Error 1


Expected results:
[root@rhel-storage-102 reproducer]# rpm -q binutils
binutils-2.39-6.fc38.x86_64
[root@rhel-storage-102 reproducer]# make
gcc -flto=auto -c -o no_override.o main.c
gcc -flto=auto -shared -fPIC -o libbase.so base.c \
        -Wl,--version-script,libbase.version
gcc -flto=auto -shared -fPIC -o libextension.so extension.c -L. -lbase
gcc  -o test_no_override no_override.o -L. -lbase -lextension
gcc -flto=auto -DOVERRIDE -c -o yes_override.o main.c
gcc  -o test_yes_override yes_override.o -L. -lbase -lextension

or

[root@rhel-storage-100 reproducer]# rpm -q binutils
binutils-2.35.2-25.el9.x86_64
[root@rhel-storage-100 reproducer]# make
gcc -flto=auto -c -o no_override.o main.c
gcc -flto=auto -shared -fPIC -o libbase.so base.c \
        -Wl,--version-script,libbase.version
gcc -flto=auto -shared -fPIC -o libextension.so extension.c -L. -lbase
gcc  -o test_no_override no_override.o -L. -lbase -lextension
gcc -flto=auto -DOVERRIDE -c -o yes_override.o main.c
gcc  -o test_yes_override yes_override.o -L. -lbase -lextension


Additional info:
See the README in the attached reproducer.tgz for more information.

Comment 1 Nick Clifton 2023-01-12 14:05:04 UTC
Hi Ben,

  This problem should now be fixed with binutils-2.35.2-35.el9

  Please could you try it out and let me know if the issue persists ?

Cheers
  Nick

Comment 2 Ben Marzinski 2023-01-12 18:38:24 UTC
(In reply to Nick Clifton from comment #1)
> Hi Ben,
> 
>   This problem should now be fixed with binutils-2.35.2-35.el9
> 
>   Please could you try it out and let me know if the issue persists ?
> 

That package fixes my issue. Looks like I just missed your fix when I hit this. Thanks. You can feel free to close this bug, since 2148469 is still open.

Comment 3 Nick Clifton 2023-01-13 09:15:01 UTC
Ok, closing