Bug 1500128
Summary: | glibc: Incomplete rollback of dynamic linker state on linking failure | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Ben Woodard <woodard>
Component: | glibc | Assignee: | glibc team <glibc-bugzilla>
Status: | CLOSED DUPLICATE | QA Contact: | qe-baseos-tools-bugs
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 8.2 | CC: | ashankar, codonell, dj, fweimer, mnewsome, pfrankli, tgummels
Target Milestone: | rc | Keywords: | Triaged
Target Release: | 8.2 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-10-01 13:10:23 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1599298 | |
Attachments: | | |

Description
Ben Woodard
2017-10-10 01:43:11 UTC
This problem also exists in F26 and so it is likely an upstream problem as well.

(In reply to Ben Woodard from comment #2)
> This problem also exists in F26 and so it is likely an upstream problem as
> well.

I'm surprised this fails like this. Can you create a smaller example I can use on RHEL7 to reproduce? That way we can talk about the smaller example and exclude any libudev issues.

Created attachment 1336898 [details]
non-functioning attempt at a reproducer

(In reply to Ben Woodard from comment #5)
> Created attachment 1336898 [details]
> non-functioning attempt at a reproducer

Please also attach all the shared objects that are involved in the working reproducer, and a shell script to run them in the failure mode. That way we have both sides of the equation. I'll look at the original objects and see what's unique about them.

Created attachment 1336924 [details]
self contained reproducer

Just untar this, then cd into dl-repo and run the script ./runme.sh

I can reproduce a SIGSEGV.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6474b8b in __pthread_initialize_minimal_internal () from ./libpthread.so.0
(gdb) bt
#0 0x00007ffff6474b8b in __pthread_initialize_minimal_internal () from ./libpthread.so.0
#1 0x00007ffff64745d1 in _init () from ./libpthread.so.0
#2 0x00007ffff7fd1f60 in ?? () from ./libudev.so
#3 0x00007ffff7de65da in call_init (l=0x7ffff7fffd60, argc=1, argv=0x7fffffffdcc0, env=0x7fffffffdcd0) at dl-init.c:58
#4 0x00007ffff7de6795 in call_init (env=0x7fffffffdcd0, argv=0x7fffffffdcc0, argc=1, l=<optimized out>) at dl-init.c:103
#5 _dl_init (main_map=main_map@entry=0x7ffff7fff280, argc=1, argv=0x7fffffffdcc0, env=0x7fffffffdcd0) at dl-init.c:86
#6 0x00007ffff7dea9ca in dl_open_worker (a=a@entry=0x7fffffffd960) at dl-open.c:562
#7 0x00007ffff794926c in __GI__dl_catch_exception (exception=0x7fffffffd940, operate=0x7ffff7dea660 <dl_open_worker>, args=0x7fffffffd960) at dl-error-skeleton.c:198
#8 0x00007ffff7dea2ba in _dl_open (file=0x400740 "libudev.so", mode=-2147483646, caller_dlopen=0x400640, nsid=<optimized out>, argc=1, argv=<optimized out>, env=0x7fffffffdcd0) at dl-open.c:645
#9 0x00007ffff7bd3f76 in dlopen_doit (a=a@entry=0x7fffffffdb90) at dlopen.c:66
#10 0x00007ffff794926c in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffdb30, operate=0x7ffff7bd3f20 <dlopen_doit>, args=0x7fffffffdb90) at dl-error-skeleton.c:198
#11 0x00007ffff79492ef in __GI__dl_catch_error (objname=0x7ffff7dd60d0 <last_result+16>, errstring=0x7ffff7dd60d8 <last_result+24>, mallocedp=0x7ffff7dd60c8 <last_result+8>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:217
#12 0x00007ffff7bd45a9 in _dlerror_run (operate=operate@entry=0x7ffff7bd3f20 <dlopen_doit>, args=args@entry=0x7fffffffdb90) at dlerror.c:162
#13 0x00007ffff7bd4002 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#14 0x0000000000400640 in ?? ()
#15 0x00007fffffffdcc0 in ?? ()
#16 0x0000000100400500 in ?? ()
#17 0x0000000000000000 in ?? ()
(gdb)

~/build/glibc/elf/ld.so --library-path /home/carlos/build/glibc:/home/carlos/build/glibc/elf:/home/carlos/build/glibc/rt:/home/carlos/build/glibc/dlfcn:/home/carlos/build/glibc/resolv/:. ./orig
Could not open libcuda.so - libnvidia-fatbinaryloader.so.384.90: cannot open shared object file: No such file or directory
./orig: Relink `./libudev.so' with `/home/carlos/build/glibc/rt/librt.so.1' for IFUNC symbol `clock_gettime'
Segmentation fault (core dumped)

The error is interesting, because what the dynamic loader is saying is that librt.so.1 is not yet resolved, but that a reference to clock_gettime exists. The hint to relink against librt is not correct.

[carlos@athas dl-repo]$ readelf -a -W libudev.so | grep librt
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x00c0: Version: 1 File: librt.so.1 Cnt: 1

As you can see libudev.so is already linked against librt.so.1. I assume this comes from the failure mode issue. This is present in upstream master.

Created attachment 1337016 [details]
working reproducer
The original reporter cracked the mystery. librt has the NODELETE flag.
Adding -Wl,-z,nodelete to libb.so's link line causes the problem to reproduce.
[ben@Mustang dl-bug]$ make run
LD_LIBRARY_PATH=. ./main
d_fn x=12
inside b_fn
rm libe.so
LD_LIBRARY_PATH=. ./main
Could not open liba.so - libe.so: cannot open shared object file: No such file or directory
make: *** [Makefile:38: run] Segmentation fault (core dumped)
We have a report that this can also happen due to ENOMEM while opening a shared object which has a NODELETE shared object as a dependency.

*** This bug has been marked as a duplicate of bug 1410154 ***