Bug 2162798
| Summary: | Red Hat bfd linker gives "undefined reference to symbol" for specific programs using LLVM 15 with OpenMP offload and LTO | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Daniel Woodworth <daniel.woodworth> |
| Component: | binutils | Assignee: | Nick Clifton <nickc> |
| binutils sub component: | system-version | QA Contact: | qe-baseos-tools-bugs |
| Status: | CLOSED NOTABUG | Docs Contact: | |
| Severity: | low | ||
| Priority: | unspecified | CC: | fweimer, mprchlik, nickc, ohudlick, sarnex, sipoyare |
| Version: | 9.0 | Keywords: | Bugfix, Triaged |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-01-25 23:23:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Daniel Woodworth
2023-01-20 23:30:48 UTC
(In reply to Daniel Woodworth from comment #0)

Hi Daniel,

I am currently unable to reproduce this problem :-(

> 2.35.2-17.el9 for Red Hat Enterprise Linux 9.0

Are you able to reproduce the problem using the 2.35.2-37.el9 binutils?
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2340527
(This was the version that I used for my local tests.)

Also, does the problem occur if you use the bfd linker (ld.bfd) rather than the gold linker (ld.gold)?

> I tested applying
> these patches to see which one causes the error, and that patch was
> binutils-plugin-as-needed.patch.

Hmm, interesting. Once I can reproduce the problem, this should make fixing it a lot easier.

Cheers
Nick

Hi Daniel,

(In reply to Nick Clifton from comment #1)
> I am currently unable to reproduce this problem :-(

By which I mean, I can follow all the steps outlined in the description, but the link does not fail.

However, another question does occur to me:

$ echo "int main() {return 0;} void crc32_z(void) {}" > zlib-omptarget-clash.c
$ clang -flto -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu zlib-omptarget-clash.c -o zlib-omptarget-clash

With these two commands you are creating a program that calls a zlib function, but you are not explicitly linking in the zlib library. Why do you expect it to link correctly? Sure, in the past the zlib library has been brought in by libomptarget, but you should not rely upon this. If you use the zlib library, you should include it on the link command line.

I suspect that this might be a case of the linker giving you a valid error message...

Cheers
Nick

I'm sorry for the squished source example; what it's doing is defining a function with the same name as one from the zlib library and _not_ calling it:
int main() {
    return 0;
}
void crc32_z(void) {
}
The zlib library was previously _not_ brought in by libomptarget, but the bug is exposed by a change which does bring it in in recent versions of LLVM.
I am also seeing the problem with ld.bfd:
$ clang -flto -fuse-ld=bfd -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu zlib-omptarget-clash.c -o zlib-omptarget-clash
/usr/bin/ld.bfd: /tmp/zlib-omptarget-clash-a99f51.o (symbol from plugin): undefined reference to symbol 'crc32_z@@ZLIB_1.2.9'
/usr/bin/ld.bfd: /usr/lib64/libz.so.1: error adding symbols: DSO missing from command line
/iusers/dwoodwor/rhel9-lto-omp-zlib-repro/llvm-project-llvmorg-15.0.0/deploy/bin/clang-linker-wrapper: error: 'ld.bfd' failed
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
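The "DSO missing from command line" message above indicates that ld.bfd found crc32_z in /usr/lib64/libz.so.1, a library that was not itself named on the link line. One way to see how libz enters the link is to check whether a library that is on the link line lists it as a DT_NEEDED dependency. The following is only a sketch: has_needed is a hypothetical helper name, and it assumes the usual "readelf -d" output, where each dependency appears as "(NEEDED) Shared library: [name]".

```shell
# Hypothetical helper (not part of binutils): reads "readelf -d" output on
# stdin and succeeds if a DT_NEEDED entry starts with the given name prefix.
has_needed() {
    # Keep only the dependency lines, then look for "[<prefix>" literally.
    grep -F '(NEEDED)' | grep -qF "[$1"
}
```

For example, "readelf -d /path/to/deploy/lib/libomptarget.so | has_needed libz.so && echo yes" (the path is a placeholder) would report whether that build of libomptarget depends on zlib directly.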
I'm not sure how to upgrade to 2.35.2-37.el9 binutils; I have been able to get it to upgrade to 2.35.2-24.el9, but I'm still seeing the bug there. I tried following your link but am having problems connecting to brewweb.engineering.redhat.com; I can try again later in case it's just temporarily down. It sounds like it probably is fixed in the latest package version if you can't reproduce it there, but I should double-check it in my environment to make sure there isn't something else going on.
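To confirm which binutils build a given environment is actually using, the linker's own version banner can be compared against a target release. This is a rough sketch under two assumptions: the first line of "ld.bfd --version" has the form "GNU ld version 2.35.2-24.el9" (as the Red Hat builds in this report print), and GNU "sort -V" ordering is close enough to RPM's version comparison for these release strings. The helper names ld_version and version_at_least are hypothetical.

```shell
# Hypothetical helper: print the version-release from the first line of
# "ld --version" output read on stdin (Red Hat banner format assumed).
ld_version() {
    sed -n '1s/^GNU ld version //p'
}

# Hypothetical helper: succeed if version $1 is at least version $2.
# Uses GNU sort -V, which only approximates RPM version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A check such as 'version_at_least "$(ld.bfd --version | ld_version)" 2.35.2-37.el9' would then tell whether the installed linker is at or beyond the build mentioned above.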
I'm still not able to access https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2340527; is it an internal Red Hat link?

(In reply to Daniel Woodworth from comment #3)

Hi Daniel,

> I'm sorry for the squished source example; what it's doing is defining a
> function with the same name as one from the zlib library and _not_ calling
> it:

Ah, sorry, I did misread the example.

> I am also seeing the problem with ld.bfd:

Hmm, interesting.

> I'm not sure how to upgrade to 2.35.2-37.el9 binutils; I have been able to
> get it to upgrade to 2.35.2-24.el9, but I'm still seeing the bug there.

Hmm, OK, I will see if I can reproduce the problem with the -24.el9 release...

> I tried following your link but am having problems connecting to
> brewweb.engineering.redhat.com;

Sorry about that. It is an internal Red Hat site. I just assumed that you would have access to it. My bad. The -37.el9 build might be accessible now, as it has finally passed through gating, but if you want I could just upload the rpms...

Cheers
Nick

Bug 1896772 comment 1 describes the same issue.

As an additional data point, do you expect interposition to happen in your case? Or should the definition of crc32_z remain private to the main program?

Hi Daniel,
So I downloaded the -24.el9 binutils rpm and unpacked it locally:
% /home/nickc/Downloads/usr/bin/ld.bfd --version
GNU ld version 2.35.2-24.el9
But when I use it to reproduce the issue:
% clang -flto -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu zlib-omptarget-clash.c -o zlib-omptarget-clash -fuse-ld=/home/nickc/Downloads/usr/bin/ld.bfd
%
(i.e. successful compilation and link)
And adding "-v" to the command line to make sure that I am using the correct versions:
% clang -flto -fopenmp -fopenmp-targets=x86_64-pc-linux-gnu zlib-omptarget-clash.c -o zlib-omptarget-clash -fuse-ld=/home/nickc/Downloads/usr/bin/ld.bfd -v
clang version 15.0.0
Target: x86_64-unknown-linux-gnu
[...]
"/home/nickc/Downloads/usr/bin/ld.bfd" -pie --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o zlib-omptarget-clash /dev/shm/zlib-omptarget-clash-wrapper-72a801.o /usr/lib/gcc/x86_64-redhat-linux/12/../../../../lib64/Scrt1.o /usr/lib/gcc/x86_64-redhat-linux/12/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/12/crtbeginS.o -L /usr/lib/gcc/x86_64-redhat-linux/12 -L /usr/lib/gcc/x86_64-redhat-linux/12/../../../../lib64 -L /lib/../lib64 -L /usr/lib/../lib64 -L /lib -L /usr/lib -plugin /home/nickc/work/sources/llvm/15.0/llvm-project-llvmorg-15.0.0/deploy/bin/../lib/LLVMgold.so -plugin-opt=mcpu=x86-64 /dev/shm/zlib-omptarget-clash-3cf022.o -l omp -l omptarget -rpath /home/nickc/work/sources/llvm/15.0/llvm-project-llvmorg-15.0.0/deploy/lib -L /home/nickc/work/sources/llvm/15.0/llvm-project-llvmorg-15.0.0/deploy/lib -l gcc --as-needed -l gcc_s --no-as-needed -l pthread -l c -l gcc --as-needed -l gcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/12/crtendS.o /usr/lib/gcc/x86_64-redhat-linux/12/../../../../lib64/crtn.o
So right compiler, right linker, right plugin. What else could be different between your environment and mine?
I assume that you are running these tests on an x86_64 box, right?
It must be the libraries, or the crt files. I am running these tests on an x86_64 box with Fedora 36 installed. I will try setting up a RHEL-9 mock environment and testing there.
Cheers
Nick
Hi Daniel,

Sorry, even in a mock RHEL-9.0 environment with binutils-2.35.2-17.el9 installed I still cannot reproduce this problem. :-(

I tried both ld.bfd and ld.gold and both work.

I am a bit stuck now. Any ideas as to what could be different between your test environment and mine?

Cheers
Nick

(In reply to Florian Weimer from comment #6)
> As an additional data point, do you expect interposition to happen in your
> case? Or should the definition of crc32_z remain private to the main program?

The original use case is a benchmark which builds with its own copy of the zlib sources instead of linking against the system zlib, I think mainly for reproducibility. crc32_z remaining private to the main program makes more sense for this, but if none of the other libraries the benchmark uses actually call into zlib it might not make a difference whether there's interposition or not.

(In reply to Nick Clifton from comment #7)
> I assume that you are running these tests on an x86_64 box, right?

Yes, this is on x86_64.

(In reply to Nick Clifton from comment #8)
> I am a bit stuck now. Any ideas as to what could be different between
> your test environment and mine?

My test environment is a RHEL-9.0 container running (via podman) on a RHEL-8.4 host. I didn't think that should make a big difference here, but I was able to hunt down a machine running RHEL-9.0 directly with the same binutils (and zlib and LLVM) version and that machine does not show this bug. Since you're also unable to reproduce it on your RHEL-9.0 environment, I think this problem might be specific to this container setup or even the specific container image. I'll see if I can figure out what's different between these two environments that causes the error.
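One possible way to narrow down what differs between a failing container and a working host is to compare their installed package sets. The sketch below assumes each environment's package list was saved with "rpm -qa | LC_ALL=C sort" into a file; diff_packages and the file names are hypothetical, and this only covers RPM-managed content, not files added to an image by other means.

```shell
# Hypothetical helper: print package entries that appear in only one of two
# sorted "rpm -qa" listings. Lines unique to the second file are indented
# with a tab; entries common to both are suppressed.
diff_packages() {
    LC_ALL=C comm -3 "$1" "$2"
}
```

For instance, after saving host.txt on the working machine and container.txt inside the failing image, "diff_packages host.txt container.txt" would list the packages and versions that are not identical on both sides.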
I've also been able to reproduce this bug running the RHEL-9.0 container on the RHEL-9.0 system, and I was not able to reproduce it with a clean container image from https://catalog.redhat.com/software/containers/ubi9/ubi/615bcf606feffc5384e8452e?container-tabs=gti. It looks like this problem is specific to the internal RHEL-9.0 image I'm using and not a problem in RHEL-9.0 in general, so I'll close this bug and focus on getting that image fixed instead. Thanks for all the help triaging this!

Also, I've realized I was mistaken and was not using the gold linker after all, and it seems like it is actually unaffected. This bug seems to actually be bfd-specific; I'm also updating the title and description accordingly.