Bug 658851 - _dl_debug_state() RT_CONSISTENT called too early
Summary: _dl_debug_state() RT_CONSISTENT called too early
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gdb
Version: 5.5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
: ---
Assignee: Jan Kratochvil
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard: bzcl34nup
Depends On: 179072 711924
Blocks: 669432
Reported: 2010-12-01 13:55 UTC by Siddhesh Poyarekar
Modified: 2018-11-26 19:39 UTC (History)

Fixed In Version: gdb-7.0.1-42.el5
Doc Type: Bug Fix
Doc Text:
This text has been already reviewed for RHEL-6.2 Bug 669432: https://errata.devel.redhat.com/errata/edit/11694
Clone Of: 179072
: 669432 711924 (view as bug list)
Environment:
Last Closed: 2012-02-21 06:12:49 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2012:0238 0 normal SHIPPED_LIVE gdb bug fix update 2012-02-20 15:07:32 UTC
Sourceware 2328 0 None None None Never

Description Siddhesh Poyarekar 2010-12-01 13:55:03 UTC
+++ This bug was initially created as a clone of Bug #179072 +++

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
dl_open_worker() in elf/dl-open.c calls _dl_debug_state() with .r_state==RT_CONSISTENT even though relocations have not yet been performed on newly-loaded objects.  A debugger that is observing _dl_debug_state() would like to see the relocations the same way that any newly-loaded code will see them.  The time to call _dl_debug_state() is just before running the initializer functions of the newly-loaded objects.

Version-Release number of selected component (if applicable):
glibc-2.3.90-30

How reproducible:
Always

Steps to Reproduce:
1. Look in elf/dl-open.c:158, function dl_open_worker().

Actual Results:  The call to _dl_debug_state() is on line 328, the relocations are performed just after that, and the call to _dl_init() is on line 470.

Expected Results:  The call to _dl_debug_state() should be just before the call to _dl_init().

Additional info:

Suggested patch will be attached.

--- Additional comment from jreiser on 2006-01-26 18:41:13 EST ---

Created attachment 123756 [details]
patch to call _dl_debug_state just before _dl_init in dl_open_worker

--- Additional comment from jreiser on 2006-02-04 17:30:28 EST ---

Here is a testcase which shows that gdb runs into trouble when relocations are
not performed before ld-linux calls _dl_debug_state() with RT_CONSISTENT.

$ cat my_lib.c
#include <stdio.h>

void
sub1(int x)
{
        printf("sub1 %d\n", x);
}
$ cat my_main.c
#include <dlfcn.h>

int
main()
{
        void *handle = dlopen("./my_lib.so", RTLD_LAZY);
        void (*sub1)(int) = (void (*)(int))dlsym(handle, "sub1");
        sub1(6);
        return 0;
}
$ gcc -o my_lib.so -shared -fPIC -g my_lib.c
$ gcc -o my_main -g my_main.c -ldl
$ gdb my_main
GNU gdb Red Hat Linux (6.3.0.0-1.98rh)
(gdb) set stop-on-solib-events 1  ## sets a breakpoint on _dl_debug_state()
(gdb) run
Starting program: /home/jreiser/my_main
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xc3c000
Stopped due to shared library event
(gdb) info shared  ## which modules are in memory now?
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x006087f0  0x0061d15f  Yes         /lib/ld-linux.so.2
0x00777c00  0x00778a8c  Yes         /lib/libdl.so.2
0x0063a590  0x00727368  Yes         /lib/libc.so.6
0x00dcc41c  0x00dcc53c  Yes         ./my_lib.so

  ## Now my_lib.so is loaded, and gdb believes that everything is ready to run.
  ## However, ld-linux has not performed relocations on my_lib.so,
  ## so there will be a SIGSEGV when the user calls sub1 in my_lib.so.

(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x000003f2 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (sub1) will be abandoned.

(gdb) x/i $pc  ## where was execution at time of SIGSEGV?
0x3f2:  Cannot access memory at address 0x3f2
(gdb) x/12i sub1
0xdcc4d8 <sub1>:        push   %ebp
0xdcc4d9 <sub1+1>:      mov    %esp,%ebp
0xdcc4db <sub1+3>:      push   %ebx
0xdcc4dc <sub1+4>:      sub    $0x14,%esp
0xdcc4df <sub1+7>:      call   0xdcc4d4 <__i686.get_pc_thunk.bx>
0xdcc4e4 <sub1+12>:     add    $0x1164,%ebx
0xdcc4ea <sub1+18>:     mov    0x8(%ebp),%eax
0xdcc4ed <sub1+21>:     mov    %eax,0x4(%esp)
0xdcc4f1 <sub1+25>:     lea    0xffffef10(%ebx),%eax
0xdcc4f7 <sub1+31>:     mov    %eax,(%esp)
0xdcc4fa <sub1+34>:     call   0xdcc3ec  ## printf@PLT
0xdcc4ff <sub1+39>:     add    $0x14,%esp
(gdb) x/i 0xdcc3ec  ## printf@PLT
0xdcc3ec:       jmp    *0xc(%ebx)
(gdb) x/x 0xdcc4e4+0x1164+0xc
0xdcd654:       0x000003f2  ## unrelocated
(gdb) q

Because ld-linux did not perform relocations before calling _dl_debug_state,
gdb was presented with inconsistent state.  If ld-linux called
_dl_debug_state after performing relocations, and just before calling _dl_init,
then gdb would see a sane world, and the user's request "print sub1(42)" would
execute correctly without SIGSEGV.


--- Additional comment from sundaram on 2006-02-20 05:46:55 EST ---



These bugs are being closed since a large number of updates have been released
after the FC5 test1 and test2 releases. Kindly update your system by running yum
update as the root user, or try out the third and final test version of FC5,
which is being released in a short while, and verify whether the bugs are still
present on the system. Reopen or file new bug reports as appropriate after
confirming the presence of this issue. Thanks.

--- Additional comment from drepper.fsp on 2006-05-07 21:36:56 EDT ---

I changed the wrong bug.

--- Additional comment from mattdm on 2007-04-06 12:26:15 EDT ---

Fedora Core 5 and Fedora Core 6 are, as we're sure you've noticed, no longer
test releases. We're cleaning up the bug database and making sure important bug
reports filed against these test releases don't get lost. It would be helpful if
you could test this issue with a released version of Fedora or with the latest
development / test release. Thanks for your help and for your patience.

[This is a bulk message for all open FC5/FC6 test release bugs. I'm adding
myself to the CC list for each bug, so I'll see any comments you make after this
and do my best to make sure every issue gets proper attention.]


--- Additional comment from jreiser on 2007-04-07 17:22:46 EDT ---

This bug still exists in glibc-2.5.90-17 for fc7t2.  The testcase of comment #2
still demonstrates the problem.  The specific addresses involved are:
-----
(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x000002ea in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (sub1) will be abandoned.
(gdb) x/i $pc
0x2ea:  Cannot access memory at address 0x2ea
(gdb) x/12i sub1
0xb373dc <sub1>:        push   %ebp
0xb373dd <sub1+1>:      mov    %esp,%ebp
0xb373df <sub1+3>:      push   %ebx
0xb373e0 <sub1+4>:      sub    $0x14,%esp
0xb373e3 <sub1+7>:      call   0xb373d7 <__i686.get_pc_thunk.bx>
0xb373e8 <sub1+12>:     add    $0x116c,%ebx
0xb373ee <sub1+18>:     mov    0x8(%ebp),%eax
0xb373f1 <sub1+21>:     mov    %eax,0x4(%esp)
0xb373f5 <sub1+25>:     lea    0xffffef0c(%ebx),%eax
0xb373fb <sub1+31>:     mov    %eax,(%esp)
0xb373fe <sub1+34>:     call   0xb372e4 <printf@plt>
0xb37403 <sub1+39>:     add    $0x14,%esp
(gdb) x/i 0xb372e4
0xb372e4 <printf@plt>:  jmp    *0x10(%ebx)
(gdb) x/x 0xb373e8+0x116c+0x10
0xb38564:       0x000002ea
-----

I am changing the Version field of this bug report to devel.

--- Additional comment from mattdm on 2007-04-08 14:27:27 EDT ---

Thanks.

--- Additional comment from fedora-triage-list on 2008-04-03 12:49:36 EDT ---

Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

--- Additional comment from jreiser on 2008-04-04 22:57:41 EDT ---

The problem persists in Fedora 9 Beta rawhide glibc-2.7.90-9.i686.

Following comment #6 above, the specific displacements in 2.7.90-9 are:
-----
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x00ba1830  0x00bb9c1f  Yes         /lib/ld-linux.so.2
0x00d61aa0  0x00d62aa8  Yes         /lib/libdl.so.2
0x00bda3e0  0x00ce9a38  Yes         /lib/libc.so.6
0x00111360  0x00111488  Yes         ./my_lib.so
(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x0000033e in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (sub1) will be abandoned.
(gdb) x/i $pc
0x33e:	Cannot access memory at address 0x33e
(gdb) x/12i sub1
0x11141c <sub1>:	push   %ebp
0x11141d <sub1+1>:	mov    %esp,%ebp
0x11141f <sub1+3>:	push   %ebx
0x111420 <sub1+4>:	sub    $0x14,%esp
0x111423 <sub1+7>:	call   0x111417 <__i686.get_pc_thunk.bx>
0x111428 <sub1+12>:	add    $0x1170,%ebx
0x11142e <sub1+18>:	mov    0x8(%ebp),%eax
0x111431 <sub1+21>:	mov    %eax,0x4(%esp)
0x111435 <sub1+25>:	lea    -0x10f4(%ebx),%eax
0x11143b <sub1+31>:	mov    %eax,(%esp)
0x11143e <sub1+34>:	call   0x111338 <printf@plt>
0x111443 <sub1+39>:	add    $0x14,%esp
(gdb) x/i 0x111338
0x111338 <printf@plt>:	jmp    *0x10(%ebx)
(gdb) x/x 0x111428+0x1170+0x10
0x1125a8:	0x0000033e
-----


--- Additional comment from fedora-triage-list on 2008-05-13 22:04:49 EDT ---

Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from ppluzhnikov on 2008-11-01 17:53:29 EDT ---

Problem persists in current Fedora 9 (glibc-2.8-8.i686),
and also shows up as failure of GDB to debug programs
which dynamically load libpthread.so.0.

Full details here:
http://sourceware.org/ml/gdb/2008-11/msg00004.html

--- Additional comment from jan.kratochvil on 2008-11-12 17:38:41 EST ---

Created attachment 323394 [details]
Minimized to only move _dl_debug_state() after relocations, for glibc-2.8.90-16.

(In reply to comment #0)
> The time to call _dl_debug_state() is just before running the initializer
> functions of the newly-loaded objects.

_dl_debug_state() could probably be called at one of three places:
(A) r_map is already correct (guaranteed by RT_CONSISTENT); current version.
(B) Like (A) but also after the relocations are resolved.
(C) Like (B) but also after the initializers are executed.

I believe the call+RT_CONSISTENT was intended for (A) and neither (B) nor (C).
(C) is probably too late because we would not be able to debug initializers.
But still nobody guarantees we should stop at (B) instead of at current (A).

------------------------------------------------------------------------------

GDB currently assumes that libthread_db can be used as soon as the
`nptl_version', `_thread_db_*' and other symbols are present at
_dl_debug_state() time.

This is currently not true: it fails at least while reading the unrelocated
libpthread symbol `stack_used':
glibc-20081031T2102/nptl/allocatestack.c:
static LIST_HEAD (stack_used);
File: /lib64/libpthread.so.0
Symbol table '.symtab' contains 907 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    61: 0000000000217220    16 OBJECT  LOCAL  DEFAULT   28 stack_used
Relocation section '.rela.dyn' at offset 0x41a8 contains 68 entries:
  Offset          Info           Type           Sym. Value   Sym. Name + Addend
000000217220  000000000008 R_X86_64_RELATIVE                   0000000000217220

No other _dl_debug_state() call is done at a later time so the debugger cannot
start tracking the threads.

The only possibility on the GDB side is to delay libthread_db initialization:
at _dl_debug_state() notification time, place a breakpoint on the function
found for DT_INIT.  I implemented this in the attached GDB patch, so I am not
trying to avoid fixing it in GDB, but I do not consider it the right solution.

------------------------------------------------------------------------------

Another possibility is to make the libpthread data structures fully PIC, so that
libthread_db can understand them at the current _dl_debug_state() point (A).
It seems libthread_db can already parse libpthread at (B) time, still before
__pthread_initialize_minimal() is called from DT_INIT.
I consider this a realistic way to fix glibc libthread_db as well.

Still, I do not see any regression risk in delaying _dl_debug_state() until
after the relocations are resolved; nothing more useful can be done with an
unrelocated library.
(STT_IFUNC code - called code returning a value for relocations - would be
a bit harder to debug, but only when resolving with RTLD_NOW - not a problem.)

Adjusted the patch for glibc-2.8.90-16 making it a minimal working change.

--- Additional comment from jan.kratochvil on 2008-11-12 17:40:14 EST ---

Created attachment 323395 [details]
Two testcases (not suitable for the glibc testsuite).

Testcase for the problem where libpthread is first loaded only by dlopen().
Prerequisite is to run: prelink -uf /lib64/libpthread.so.0
GDB versions tested: gdb-6.8-1.fc9 up to a GDB CVS HEAD snapshot.

$ gdb ./testload
...
(gdb) r
Starting program: /tmp/rh179072/testload
[Thread debugging using libthread_db enabled]
Error while reading shared library symbols:
Cannot find new threads: generic error
Cannot find new threads: generic error
(gdb) _

Testcase showing that the patched library and/or patched GDB fixes the threads problem:
$ gdb ./testload
...
(gdb) r
Starting program: /tmp/rh179072/testload
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff7fde6f0 (LWP 7018)]
[New Thread 0x7ffff7bc0950 (LWP 7021)]
[Thread 0x7ffff7bc0950 (LWP 7021) exited]

Program exited normally.

------------------------------------------------------------------------------

Testcase showing that at (B) time (with the patch) we still cannot safely call
functions, as they may expect static class variables to be initialized:

# class C {
#   int i;
#   C () { i = 1; }
#   void f () { assert (i == 1); }
# };
$ gdb -x ./cxx.gdbinit
...
(gdb) p c
$1 = {i = 0}
(gdb) p c.f
$2 = {void (C *)} 0x7ffff7ddd73a <C::f()>
(gdb) p c.f()
cxxmain: cxxlib.C:8: void C::f(): Assertion `i == 1' failed.

--- Additional comment from jan.kratochvil on 2008-11-12 17:43:18 EST ---

Created attachment 323396 [details]
Illustrative fix for GDB CVS HEAD not requiring fixed glibc (proof of concept).

--- Additional comment from ppluzhnikov on 2008-11-12 18:05:03 EST ---

(In reply to comment #12)

> _dl_debug_state() call be probably called at three places:
> (A) r_map is already correct (guaranteed by RT_CONSISTENT); current version.
> (B) Like (A) but also after the relocations are resolved.
> (C) Like (B) but also after the initializers are executed.
> 
> I believe the call+RT_CONSISTENT was intended for (A) and neither (B) nor (C).

What makes you believe that?

I can't find official documentation for RT_CONSISTENT,
but Solaris (from which this interface is copied) AFAICT calls 
_dl_debug_state() (or rather its Solaris equivalent) at (B):
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/sgs/rtld/common/setup.c#1028

--- Additional comment from jan.kratochvil on 2008-11-12 18:16:36 EST ---

I also could not find any documentation for _dl_debug_state()/RT_CONSISTENT.
My text tries to advise (B) as the best choice out of the existing possibilities, and thanks for finding out that the Solaris code supports that.

--- Additional comment from fedora-triage-list on 2009-06-09 18:05:58 EDT ---


This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from jreiser on 2009-06-10 19:51:25 EDT ---

The problem of inconsistent assumptions about the state of the memory image at the call of _dl_debug_state() [have relocations been performed or not?] persists in Fedora 11 glibc-2.10.1-2.i686.

The significant details from the testcase of Comment #2 are now:
-----
Stopped due to shared library event
(gdb) info shared
From        To          Syms Read   Shared Object Library
0x005e4830  0x005fd27f  Yes         /lib/ld-linux.so.2
0x007a6a60  0x007a7a68  Yes         /lib/libdl.so.2
0x0061e840  0x0072ca78  Yes         /lib/libc.so.6
0x004c9380  0x004c94a8  Yes         ./my_lib.so
(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x0000035e in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(sub1) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) x/i $pc
0x35e:	Cannot access memory at address 0x35e
(gdb) x/12i sub1
0x4c943c <sub1>:	push   %ebp
0x4c943d <sub1+1>:	mov    %esp,%ebp
0x4c943f <sub1+3>:	push   %ebx
0x4c9440 <sub1+4>:	sub    $0x14,%esp
0x4c9443 <sub1+7>:	call   0x4c9437 <__i686.get_pc_thunk.bx>
0x4c9448 <sub1+12>:	add    $0x11b8,%ebx
0x4c944e <sub1+18>:	lea    -0x113c(%ebx),%eax
0x4c9454 <sub1+24>:	mov    0x8(%ebp),%edx
0x4c9457 <sub1+27>:	mov    %edx,0x4(%esp)
0x4c945b <sub1+31>:	mov    %eax,(%esp)
0x4c945e <sub1+34>:	call   0x4c9358 <printf@plt>
0x4c9463 <sub1+39>:	add    $0x14,%esp
(gdb) x/i 0x4c9358
0x4c9358 <printf@plt>:	jmp    *0x10(%ebx)
(gdb) x/x 0x4c9448+0x11b8+0x10
0x4ca610 <__cxa_finalize+4776>:	0x0000035e
-----

--- Additional comment from fedora-triage-list on 2010-04-27 07:38:32 EDT ---


This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from jreiser on 2010-04-27 11:45:19 EDT ---

This bug still is present in Fedora 13, glibc-2.11.90-20; I am changing Version of this bug to 13.  Here is the session of Comment #2 on x86_64:

(gdb) set stop-on-solib-events 1
(gdb) run
Starting program: /home/jreiser/179072/my_main 
Stopped due to shared library event
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x0000003712e00af0  0x0000003712e198e4  Yes         /lib64/ld-linux-x86-64.so.2
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x0000003712e00af0  0x0000003712e198e4  Yes         /lib64/ld-linux-x86-64.so.2
0x0000003713e00de0  0x0000003713e01998  Yes         /lib64/libdl.so.2
0x000000371321e9a0  0x000000371332f620  Yes         /lib64/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x0000003712e00af0  0x0000003712e198e4  Yes         /lib64/ld-linux-x86-64.so.2
0x0000003713e00de0  0x0000003713e01998  Yes         /lib64/libdl.so.2
0x000000371321e9a0  0x000000371332f620  Yes         /lib64/libc.so.6
(gdb) c
Continuing.
Stopped due to shared library event
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x0000003712e00af0  0x0000003712e198e4  Yes         /lib64/ld-linux-x86-64.so.2
0x0000003713e00de0  0x0000003713e01998  Yes         /lib64/libdl.so.2
0x000000371321e9a0  0x000000371332f620  Yes         /lib64/libc.so.6
0x00007ffff7de84b0  0x00007ffff7de85e8  Yes         ./my_lib.so
(gdb) print sub1(42)

Program received signal SIGSEGV, Segmentation fault.
0x000000000000048e in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(sub1) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) x/i $pc
=> 0x48e:	Cannot access memory at address 0x48e
(gdb) x/12i sub1
   0x7ffff7de857c <sub1>:	push   %rbp
   0x7ffff7de857d <sub1+1>:	mov    %rsp,%rbp
   0x7ffff7de8580 <sub1+4>:	sub    $0x10,%rsp
   0x7ffff7de8584 <sub1+8>:	mov    %edi,-0x4(%rbp)
   0x7ffff7de8587 <sub1+11>:	lea    0x68(%rip),%rax        # 0x7ffff7de85f6
   0x7ffff7de858e <sub1+18>:	mov    -0x4(%rbp),%edx
   0x7ffff7de8591 <sub1+21>:	mov    %edx,%esi
   0x7ffff7de8593 <sub1+23>:	mov    %rax,%rdi
   0x7ffff7de8596 <sub1+26>:	mov    $0x0,%eax
   0x7ffff7de859b <sub1+31>:	callq  0x7ffff7de8488 <printf@plt>
   0x7ffff7de85a0 <sub1+36>:	leaveq 
   0x7ffff7de85a1 <sub1+37>:	retq   
(gdb) x/i 0x7ffff7de8488
   0x7ffff7de8488 <printf@plt>:	jmpq   *0x2003aa(%rip)        # 0x7ffff7fe8838
(gdb) x/xg 0x7ffff7fe8838
0x7ffff7fe8838:	0x000000000000048e

--- Additional comment from jan.kratochvil on 2010-04-30 06:07:51 EDT ---

*** Bug 587576 has been marked as a duplicate of this bug. ***

Comment 2 Jan Kratochvil 2011-03-13 13:28:39 UTC
Implementation plan:

http://sources.redhat.com/bugzilla/show_bug.cgi?id=2328#c4
suggests GDB `stop-on-solib-events' should stop before STT_GNU_IFUNC entries.
But those are currently never resolved during solib load, filed:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=12575
If the PR 12575 gets fixed then GDB should put a breakpoint also at the first
STT_GNU_IFUNC entry.  Without the PR 12575 fix the general GDB implementation
plan is the same, just the first STT_GNU_IFUNC entry breakpoint would never
get used.

Without fixing PR 12575 it would be IMO best to fix this whole issue on the
glibc side, making its behavior compatible with the original Solaris linker as
described at
http://sources.redhat.com/bugzilla/show_bug.cgi?id=2328#c3
But such change has been rejected for glibc-to-glibc compatibility reasons in
http://sources.redhat.com/bugzilla/show_bug.cgi?id=2328#c6
and after fixing PR 12575 it would no longer be useful anyway.

When -Wl,-z,now/RTLD_NOW/LD_BIND_NOW is not in effect, the breakpoint
should be put at DT_INIT or at the first entry of DT_INIT_ARRAY, as called by
glibc `call_init'.  A single breakpoint on the first such entry point is enough.

If there is no DT_INIT/DT_INIT_ARRAY then GDB should stop straight at 
`_dl_debug_state'.

GDB needs to consider all the newly found solibs, as `_dl_debug_state' does not
report which solib(s) has/have been loaded during RT_ADD.

Comment 3 John Reiser 2011-03-14 16:30:03 UTC
It would be much simpler and quicker for all involved (glibc, gdb, other "auditors" of runtime loading, existing apps, ...) to extend the RT_CONSISTENT enum with another state such as the RT_RUNNABLE suggested in http://sources.redhat.com/bugzilla/show_bug.cgi?id=2328#c6, and then call _dl_debug_state() with RT_RUNNABLE just before _dl_init().

I would rather see something simple that works soon than endure more delay, which may result from implementing and adapting to complicated solutions, especially ones that may have to deal with interactions with another utility (prelinking).  The important difference in visible state at RT_CONSISTENT between Solaris and Linux (whether relocations have been done or not) suggests that extending the enum should be allowed.

Comment 5 RHEL Program Management 2011-06-20 22:20:06 UTC
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update.

Contact your manager or support representative in case you need to escalate this bug.

Comment 12 Jan Kratochvil 2011-10-25 15:11:16 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
This text has been already reviewed for RHEL-6.2 Bug 669432:
https://errata.devel.redhat.com/errata/edit/11694

Comment 14 errata-xmlrpc 2012-02-21 06:12:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0238.html

