+++ This bug was initially created as a clone of Bug #658851 +++

+++ This bug was initially created as a clone of Bug #179072 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
dl_open_worker() in elf/dl-open.c calls _dl_debug_state() with .r_state==RT_CONSISTENT even though relocations have not yet been performed on newly-loaded objects. A debugger that is observing _dl_debug_state() would like to see the relocations the same way that any newly-loaded code will see them. The time to call _dl_debug_state() is just before running the initializer functions of the newly-loaded objects.

Version-Release number of selected component (if applicable):
glibc-2.3.90-30

How reproducible:
Always

Steps to Reproduce:
1. Look in elf/dl-open.c:158, function dl_open_worker().

Actual Results:
The call to _dl_debug_state() is on line 328, the relocations are performed just after that, and the call to _dl_init() is on line 470.

Expected Results:
The call to _dl_debug_state() should be just before the call to _dl_init().

Additional info:
Suggested patch will be attached.

--- Additional comment from jreiser on 2006-02-04 17:30:28 EST ---

Here is a testcase which shows that gdb runs into trouble when relocations are not performed before ld-linux calls _dl_debug_state() with RT_CONSISTENT.

$ cat my_lib.c
#include <stdio.h>

int sub1(int x)
{
    printf("sub1 %d\n", x);
}

$ cat my_main.c
#include <dlfcn.h>

int main()
{
    void *handle = dlopen("./my_lib.so", RTLD_LAZY);
    void (*sub1)(int) = (void (*)(int))dlsym(handle, "sub1");
    sub1(6);
    return 0;
}

$ gcc -o my_lib.so -shared -fPIC -g my_lib.c
$ gcc -o my_main -g my_main.c -ldl
$ gdb my_main
GNU gdb Red Hat Linux (6.3.0.0-1.98rh)
(gdb) set stop-on-solib-events 1   ## sets a breakpoint on _dl_debug_state()
(gdb) run
Starting program: /home/jreiser/my_main
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xc3c000
Stopped due to shared library event
(gdb) info shared   ## which modules are in memory now?
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
0x00dcc41c  0x00dcc53c  Yes         ./my_lib.so
## Now my_lib.so is loaded, and gdb believes that everything is ready to run.
## However, ld-linux has not performed relocations on my_lib.so,
## so there will be a SIGSEGV when the user calls sub1 in my_lib.so.
(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x000003f2 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (sub1) will be abandoned.
(gdb) x/i $pc   ## where was execution at time of SIGSEGV?
0x3f2:  Cannot access memory at address 0x3f2
(gdb) x/12i sub1
0xdcc4d8 <sub1>:     push   %ebp
0xdcc4d9 <sub1+1>:   mov    %esp,%ebp
0xdcc4db <sub1+3>:   push   %ebx
0xdcc4dc <sub1+4>:   sub    $0x14,%esp
0xdcc4df <sub1+7>:   call   0xdcc4d4 <__i686.get_pc_thunk.bx>
0xdcc4e4 <sub1+12>:  add    $0x1164,%ebx
0xdcc4ea <sub1+18>:  mov    0x8(%ebp),%eax
0xdcc4ed <sub1+21>:  mov    %eax,0x4(%esp)
0xdcc4f1 <sub1+25>:  lea    0xffffef10(%ebx),%eax
0xdcc4f7 <sub1+31>:  mov    %eax,(%esp)
0xdcc4fa <sub1+34>:  call   0xdcc3ec   ## printf@PLT
0xdcc4ff <sub1+39>:  add    $0x14,%esp
(gdb) x/i 0xdcc3ec   ## printf@PLT
0xdcc3ec:       jmp    *0xc(%ebx)
(gdb) x/x 0xdcc4e4+0x1164+0xc
0xdcd654:       0x000003f2   ## unrelocated
(gdb) q

Because ld-linux did not perform relocations before calling _dl_debug_state, gdb was presented with inconsistent state. If ld-linux calls _dl_debug_state after performing relocations, and just before calling _dl_init, then gdb will see a sane world, and the user's request "print sub1(42)" will execute correctly without SIGSEGV.
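
The ordering argument above can be written down as a small check. This is only a toy model under stated assumptions, not glibc code: the enum labels, the helper first_index(), the function notify_order_is_safe(), and the two arrays are hypothetical names invented for illustration. The model treats a load as a sequence of events and says the debugger notification is safe only if it comes after relocation and before the initializers run.

```c
#include <stddef.h>

/* Hypothetical event labels for this toy model; they are not glibc identifiers. */
enum load_event {
    EV_MAP_OBJECT,        /* object mapped and added to the link map */
    EV_RELOCATE,          /* relocations applied (GOT/PLT slots filled in) */
    EV_NOTIFY_CONSISTENT, /* _dl_debug_state() called with RT_CONSISTENT */
    EV_RUN_INIT           /* _dl_init() runs the object's initializers */
};

/* Return the index of the first occurrence of ev in seq, or -1 if absent. */
static int first_index(const enum load_event *seq, size_t n, enum load_event ev)
{
    for (size_t i = 0; i < n; i++) {
        if (seq[i] == ev)
            return (int)i;
    }
    return -1;
}

/* Return 1 if the debugger is notified after relocation and before the
 * initializers run, which is the order this report asks for; else 0. */
int notify_order_is_safe(const enum load_event *seq, size_t n)
{
    int reloc  = first_index(seq, n, EV_RELOCATE);
    int notify = first_index(seq, n, EV_NOTIFY_CONSISTENT);
    int init   = first_index(seq, n, EV_RUN_INIT);

    if (reloc < 0 || notify < 0 || init < 0)
        return 0; /* a required step is missing from the sequence */
    return reloc < notify && notify < init;
}

/* Order described under "Actual Results": notify first, relocate afterwards. */
const enum load_event reported_order[] = {
    EV_MAP_OBJECT, EV_NOTIFY_CONSISTENT, EV_RELOCATE, EV_RUN_INIT
};

/* Order described under "Expected Results": notify just before _dl_init(). */
const enum load_event requested_order[] = {
    EV_MAP_OBJECT, EV_RELOCATE, EV_NOTIFY_CONSISTENT, EV_RUN_INIT
};
```

In this model, notify_order_is_safe(reported_order, 4) is 0 and notify_order_is_safe(requested_order, 4) is 1. The first case corresponds to the gdb session above, where the printf@PLT slot still held the unrelocated value 0x000003f2 when gdb was told the state was consistent.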
Implemented by Gary Benson; it needs glibc stap probes:

Improved linker-debugger interface
http://sourceware.org/ml/archer/2011-q2/msg00000.html

Requirement for GDB: RHEL-5 Bug 658851
glibc-2.5-75 has this line, taken from RHEL-6:
%define systemtaparches %{ix86} x86_64 ppc ppc64 s390 s390x

For RHEL-5 it must also list ia64:
%define systemtaparches %{ix86} x86_64 ppc ppc64 s390 s390x ia64

My patch from Comment 11 was right; it had ia64 listed there. Please respin the errata.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0260.html