Bug 517001
Summary: | dlopen/dlclose of im-scim.so causes segfault
---|---
Product: | [Fedora] Fedora
Component: | glibc
Version: | rawhide
Status: | CLOSED RAWHIDE
Severity: | high
Priority: | high
Hardware: | All
OS: | Linux
Reporter: | Mamoru TASAKA <mtasaka>
Assignee: | Andreas Schwab <schwab>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | drepper, i18n-bugs, jakub, petersen, phuang, schwab, tagoh, zaitcev
Doc Type: | Bug Fix
Last Closed: | 2009-09-28 15:03:02 UTC

Created attachment 357138
gdb log for this test case

Note: I guess the main cause of bug 515350 and bug 514720 is this bug.

Created attachment 357145
gdb log for this test case (again)
(Please use this gdb log)

But normally scim-gtk is not installed?

So this is not specific to rawhide?

(If not then I would suggest to report this upstream.)

(In reply to comment #4)
> But normally scim-gtk is not installed?

This is a bug report against scim. Whether scim-gtk is installed by default or not does not matter here.

(In reply to comment #5)
> So this is not specific to rawhide?

I don't know.

(In reply to comment #6)
> (If not then I would suggest to report this upstream.)

It is not easy to determine whether this bug is also present in upstream scim or is specific to Fedora, because Fedora's scim carries many patches.

(In reply to comment #5)
> So this is not specific to rawhide?

I can see this issue on rawhide only; the test code works fine on F-11, for example.

Well, I unpacked F-11 scim{-libs,-gtk}-1.4.8-3.fc11.i586 on my rawhide machine and tried the test code, and it does NOT segfault. However, when I recompile scim-1.4.8-3.fc11 on my rawhide machine, it DOES seem to segfault.

By the way, when I recompile scim-1.4.9-2.fc12 on my rawhide machine with 's/-O2/-O0/', it does NOT segfault; with -O1, however, it segfaults. So the root problem is probably in gcc or ld.

I created a workaround for this problem: I changed the link arguments so that im-scim.so can no longer be unloaded. Please try https://koji.fedoraproject.org/koji/taskinfo?taskID=1637285

I tried 1.4.9-3.fc12 and the test program does not segfault anymore.

*** Bug 514720 has been marked as a duplicate of this bug. ***

*** Bug 515350 has been marked as a duplicate of this bug. ***

CC-ing the gcc maintainer. Jakub, would you investigate what the real cause is?

Move this bug to gcc

This clearly has nothing to do with gcc; it looks like a glibc bug to me so far. im-scim.so is dlopened and has DT_NEEDED on libscim-1.0.so.8. In LD_DEBUG=all I see:

    ...
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/tmp/x [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/libdl.so.2 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/libc.so.6 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/ld-linux.so.2 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/usr/lib/gtk-2.0/immodules/im-scim.so [0]
    4525: binding file /usr/lib/libscim-1.0.so.8 [0] to /usr/lib/gtk-2.0/immodules/im-scim.so [0]: normal symbol `_ZN4scim7PointerINS_10ConfigBaseEED1Ev' [LIBSCIM_1.0]
    ...
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/tmp/x [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/libdl.so.2 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/libc.so.6 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/lib/ld-linux.so.2 [0]
    4525: symbol=_ZN4scim7PointerINS_10ConfigBaseEED1Ev; lookup in file=/usr/lib/gtk-2.0/immodules/im-scim.so [0]
    4525: binding file /usr/lib/gtk-2.0/immodules/im-scim.so [0] to /usr/lib/gtk-2.0/immodules/im-scim.so [0]: normal symbol `_ZN4scim7PointerINS_10ConfigBaseEED1Ev'
    ...
    4525: file=/usr/lib/gtk-2.0/immodules/im-scim.so [0]; destroying link map

but note that libscim-1.0.so.8 wasn't unloaded (presumably STB_GNU_UNIQUE in action).

__cxa_atexit was called twice with _ZN4scim7PointerINS_10ConfigBaseEED1Ev function (which resolved to the im-scim.so copy, libscim-1.0.so.8 has its own too), the first time with libscim-1.0.so.8's __dso_handle, the second time with im-scim.so's __dso_handle. When im-scim.so was unloaded, __cxa_finalize removed the second dtor for that function, but as libscim-1.0.so.8 wasn't unloaded until exit, exit tries to call _ZN4scim7PointerINS_10ConfigBaseEED1Ev from im-scim.so, which no longer exists.

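The handler bookkeeping described above can be illustrated with a small toy model. This is only a sketch: every name prefixed with toy_ is hypothetical and none of it is glibc code. It assumes a handler list in which each entry remembers both the module whose code it points into and the __dso_handle it was registered with, and it replays the two registrations followed by the unload of im-scim.so.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are glibc internals. */
#define TOY_MAX_HANDLERS 8

struct toy_module {
    const char *name;
    int loaded;                           /* 1 while the module's code is mapped */
};

struct toy_handler {
    const struct toy_module *code_owner;  /* module whose copy of the dtor was bound */
    const struct toy_module *dso_handle;  /* module that registered the handler */
    int active;
};

static struct toy_handler toy_handlers[TOY_MAX_HANDLERS];
static size_t toy_handler_count;

/* Models __cxa_atexit: remember who owns the code and who registered it. */
static int toy_cxa_atexit(const struct toy_module *code_owner,
                          const struct toy_module *dso_handle)
{
    if (toy_handler_count == TOY_MAX_HANDLERS)
        return -1;
    toy_handlers[toy_handler_count++] =
        (struct toy_handler){ code_owner, dso_handle, 1 };
    return 0;
}

/* Models __cxa_finalize for one handle followed by unmapping the module:
   only handlers registered with that handle are dropped. */
static void toy_unload(struct toy_module *m)
{
    assert(m != NULL);
    for (size_t i = 0; i < toy_handler_count; i++)
        if (toy_handlers[i].dso_handle == m)
            toy_handlers[i].active = 0;
    m->loaded = 0;
}

/* Counts handlers that exit would still call although their code is gone. */
static size_t toy_dangling_handlers(void)
{
    size_t n = 0;
    for (size_t i = 0; i < toy_handler_count; i++)
        if (toy_handlers[i].active && !toy_handlers[i].code_owner->loaded)
            n++;
    return n;
}

/* Replays the sequence from the analysis and returns the dangling count. */
static size_t toy_replay_bug(void)
{
    static struct toy_module libscim = { "libscim-1.0.so.8", 1 };
    static struct toy_module im_scim = { "im-scim.so", 1 };

    toy_handler_count = 0;
    libscim.loaded = im_scim.loaded = 1;
    /* Both registrations resolved the destructor to im-scim.so's copy. */
    toy_cxa_atexit(&im_scim, &libscim);   /* registered with libscim's handle */
    toy_cxa_atexit(&im_scim, &im_scim);   /* registered with im-scim's handle */
    toy_unload(&im_scim);                 /* dlclose of im-scim.so */
    return toy_dangling_handlers();
}
```

In this model toy_replay_bug() returns 1: the entry registered with libscim-1.0.so.8's handle survives the unload while pointing into im-scim.so, which corresponds to the call that crashes at exit.
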
The questions are:
1) why isn't a relocation dependency generated?
2) how could im-scim.so be unloaded when libscim-1.0.so.8, which has a relocation dependency on it, couldn't be unloaded?

You need to remove the DF_1_NODELETE flag from im-scim.so to reproduce...

Self-contained testcase:

    #!/bin/sh
    sed 's/_TAB_/\t/g' > Makefile <<\EOF
    CXXFLAGS += -fpic -O2
    n1: n1.o n2.so
    _TAB_$(CC) -o n1 n1.c -ldl
    n2.so: n2.o n3.so n4.so
    _TAB_$(CXX) -shared -o $@ $< ./n3.so ./n4.so
    n3.so: n3.o n3.map
    _TAB_$(CXX) -shared -o $@ $< -Wl,--version-script,n3.map
    #_TAB_$(CXX) -shared -o $@ $<
    n4.so: n4.o
    _TAB_$(CXX) -shared -o $@ $<
    clean:
    _TAB_rm -f *.o *~ *core *.so n1
    EOF
    cat > n1.c <<\EOF
    #include <dlfcn.h>
    int
    main (void)
    {
      void *handle = dlopen ("./n2.so", RTLD_LAZY);
      if (handle)
        dlclose (handle);
      return 0;
    }
    EOF
    cat > n2.C <<\EOF
    #include <stdlib.h>
    inline void foo (void) { }
    __attribute__((constructor)) void ctor (void) { atexit (foo); }
    EOF
    cat > n3.C <<\EOF
    #include <stdlib.h>
    inline void foo (void) { }
    inline int bar (void) { static int barvar; return ++barvar; }
    int (*barp) (void) = bar;
    __attribute__((constructor)) void ctor (void) { atexit (foo); }
    EOF
    cat > n3.map <<\EOF
    N3 {
      global: _ZZ3barvE6barvar; barp; _Z3foov;
      local: *;
    };
    EOF
    cat > n4.C <<\EOF
    inline int bar (void) { static int barvar; return ++barvar; }
    int (*barp2) (void) = bar;
    EOF

Needs to be compiled with F12 gcc, so that _ZZ3barvE6barvar is STB_GNU_UNIQUE. If _Z3foov isn't versioned in n3.so, it works just fine, supposedly because a relocation dependency is added (or, if that happens after the _ZZ3barvE6barvar lookup, which marks n3.so as DF_1_NODELETE, it just marks the undef_map as DF_1_NODELETE too).

I think the problem is that dl-reloc.c (RESOLVE_MAP) has:

    int flags = DL_LOOKUP_ADD_DEPENDENCY;                           \
    if ((version) != NULL && (version)->hash != 0)                  \
      {                                                             \
        v = (version);                                              \
        flags = 0;                                                  \
      }                                                             \
    _lr = _dl_lookup_symbol_x (strtab + (*ref)->st_name, l, (ref),  \
                               scope, v, _tc, flags, NULL);         \

In the testcase version != NULL && version->hash != 0 and so it doesn't add a relocation dependency, even when it resolves to a completely different library.

When DL_LOOKUP_ADD_DEPENDENCY was introduced not all callers of _dl_lookup_versioned_symbol were properly adjusted. Fixed in 2.10.90-24.

Actually I tried to remove Patch33 in devel scim.spec and I don't see this segfault any more. So I think it is better to remove Patch33 workaround on scim.spec and rebuild scim.

(In reply to comment #20)
> Actually I tried to remove Patch33 in devel scim.spec and
> I don't see this segfault any more.
> So I think it is better to remove Patch33 workaround on scim.spec
> and rebuild scim.

I have removed the workaround patch added in scim-1.4.9-3.

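For readers following the RESOLVE_MAP analysis in this thread, the flag selection can be modelled in isolation. The sketch below is not glibc code: the toy_ names, the constant value and the struct are stand-ins invented for illustration. The first function mirrors the quoted excerpt; the second is only an assumption about what an adjusted caller could look like, since the thread does not show the actual change.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real definitions live inside glibc and are not
   reproduced here. The value 1 is arbitrary. */
#define TOY_LOOKUP_ADD_DEPENDENCY 1

struct toy_version {
    unsigned int hash;   /* non-zero when the reference carries a symbol version */
};

/* Mirrors the quoted excerpt: a versioned reference clears the flag. */
static int toy_flags_as_quoted(const struct toy_version *version)
{
    int flags = TOY_LOOKUP_ADD_DEPENDENCY;
    if (version != NULL && version->hash != 0)
        flags = 0;
    return flags;
}

/* Hypothetical adjustment: request dependency tracking regardless of
   whether the reference is versioned. */
static int toy_flags_adjusted(const struct toy_version *version)
{
    (void) version;
    return TOY_LOOKUP_ADD_DEPENDENCY;
}

/* Returns 1 when a lookup with these flags would record the dependency. */
static int toy_records_dependency(int flags)
{
    return (flags & TOY_LOOKUP_ADD_DEPENDENCY) != 0;
}

/* Self-check of the model for an unversioned and a versioned reference. */
static void toy_check_model(void)
{
    const struct toy_version versioned = { 0x1234u };

    assert(toy_records_dependency(toy_flags_as_quoted(NULL)) == 1);
    assert(toy_records_dependency(toy_flags_as_quoted(&versioned)) == 0);
    assert(toy_records_dependency(toy_flags_adjusted(&versioned)) == 1);
}
```

The versioned case is the one that matters for the testcase: with the quoted logic no dependency is recorded for a versioned symbol such as _Z3foov, which matches the observation that the unversioned build works.
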
Created attachment 357137
test program

Description of problem:
The attached test program causes segfault

Version-Release number of selected component (if applicable):
scim-gtk-1.4.9-2.fc12.i686

How reproducible:
100%

Steps to Reproduce:
1. Compile the attached test program with -ldl -g
2. execute

Actual results:
The test program causes segfault

Expected results:
Shouldn't segfault

Additional info:
It seems that some nasty exit handler is executed
(gdb log is not useful, though)