Bug 110212 - Gdb fails with "Couldn't get registers" debugging multi-threaded app
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: gdb
Version: 1
Hardware: i686 Linux
Priority: medium, Severity: medium
Assigned To: Alexandre Oliva
Reported: 2003-11-16 23:58 EST by Stephen Moehle
Modified: 2014-01-16 07:27 EST

Last Closed: 2006-10-28 15:02:40 EDT


Attachments
testcase (603 bytes, text/plain)
2014-01-16 07:27 EST, Paweł Sikora

Description Stephen Moehle 2003-11-16 23:58:53 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031004

Description of problem:
When debugging a multi-threaded app with gdb, gdb fails soon after
issuing the run command with:

Couldn't get registers: No such process.

After that, gdb can do nothing with the program.

If I set LD_ASSUME_KERNEL=2.2.5 in the environment before running gdb,
gdb appears to debug the program correctly.

I have tried both the gdb 5.3.90 that came with Fedora and a
6.0 that I built myself, with the same results.

I am using the errata glibc, version 2.3.2-101.1, but the problem also
happened with the prior glibc.

The particular program I am trying to debug is gxine 0.3.3,
http://prdownloads.sourceforge.net/xine/gxine-0.3.3.tar.gz, but I have
no reason to believe the problem is specific to this app.

Here is the complete output from gdb:

GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
 
(gdb) run
Starting program: /home/stephe/work/gxine-0.3.3/src/gxine
[Thread debugging using libthread_db enabled]
[New Thread -1084398240 (LWP 8119)]
server: trying to connect to already running instance of gxine (/home/stephe/.gxine/socket)...
connect: Connection refused
server: socket '/home/stephe/.gxine/socket' created
[New Thread 38214576 (LWP 8120)]
dxr3_scr: Failed to open control device /dev/em8300-0 (No such file or directory)
[New Thread 116026288 (LWP 8121)]
[New Thread 48704432 (LWP 8122)]
[New Thread 70015920 (LWP 8123)]
[New Thread 89590704 (LWP 8124)]
[New Thread 145636272 (LWP 8125)]
[New Thread 1084206000 (LWP 8126)]
[New Thread 1094695856 (LWP 8127)]
[New Thread 1105185712 (LWP 8128)]
[New Thread 1115675568 (LWP 8129)]
libdvdnav: Using dvdnav version 1-rc2 from http://xine.sf.net
Couldn't get registers: No such process.
(gdb)


Version-Release number of selected component (if applicable):
5.3.90-0.20030710.41

How reproducible:
Always
Comment 1 Dan Nuffer 2003-12-05 00:22:30 EST
I am having the same problem with OpenWBEM (http://openwbem.sf.net/)
Comment 2 R.K.Aa. 2004-03-15 08:33:22 EST
Since there is no support for RH8 anymore, I just purchased RH9 and
upgraded, because of bug 110038. And now I *STILL* can't debug Mozilla
- I get the above error instead: Couldn't get registers: No such process.
:((
Comment 3 Jeff Johnston 2004-04-06 16:30:30 EDT
Please try gdb 6.1.  A fix was recently made for debugging NPTL
threads with gdb.  In certain cases, gdb would fail to attach to all
threads; if a global signal then occurred, the application would
terminate, which produces the message above.  There is also a fix for
debugging applications that fork.
Comment 4 R.K.Aa. 2004-04-06 22:17:15 EDT
Where is 6.1 located? According to gnu.org the latest is 6.0, rpmfind
only has some alien src package, and I get connection refused when
trying to access http://sources.redhat.com/gdb/
Comment 5 Stephen Moehle 2004-04-18 19:54:07 EDT
This bug is fixed in GDB 6.1 built from source.  I think this bug can
be safely closed.
Comment 6 Alex Sim 2004-04-22 13:54:53 EDT
This bug is not fixed in GDB 6.1, which I have been testing for the
last few days while trying to make things work.
We're running RH-WS with the 2.4.21-9.0.1.EL kernel and the gcc and
glibc that come with the OS.
The problem in GDB seems to be in its thread id handling: thread ids
under NPTL are too large and gdb cannot handle them.
I modified the tid type and its handling in a few gdb source files,
and gdb now seems to behave properly.

With the original, unmodified GDB 6.1 and without LD_ASSUME_KERNEL
(using a simple multithreaded program, foo):

% gdb ~/foo
GNU gdb 6.1
(gdb) r 2
Starting program: /.../foo 2
[Thread debugging using libthread_db enabled]
[New Thread -1220099968 (LWP 15348)]
[New Thread -1220101200 (LWP 15380)]
Couldn't get registers: No such process

With LD_ASSUME_KERNEL set:

% gdb ~/foo
GNU gdb 6.1
(gdb) r 2
Starting program: /.../foo 2
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 15312)]
[New Thread 32769 (LWP 15344)]
[New Thread 16386 (LWP 15345)]
current thread id=16386
[New Thread 32771 (LWP 15346)]
current thread id=32771



With the modified GDB 6.1 and without LD_ASSUME_KERNEL:

% /workspace/gdb-6.1/gdb/gdb ~/foo
GNU gdb 6.1
(gdb) r 2
Starting program: /home/users/asim/foo 2
[Thread debugging using libthread_db enabled]
[New Thread 3074867328 (LWP 15423)]
[New Thread 3074866096 (LWP 15453)]
[New Thread 3064376240 (LWP 15454)]
current thread id=3074866096
current thread id=3064376240

Comment 7 Jeff Hansen 2004-08-02 15:53:35 EDT
Could you post a diff of the changes that you made to GDB 6.1?  Are
there any side effects of the change that you've made?  Could you
submit this change to the GDB team so that they can apply it to their
normal branch?

The LD_ASSUME_KERNEL environment variable fixes the problem for me,
but I don't want the rest of my app to be living in a kernel 2.2.5
world, so I need the fix that you've described, if possible.  Thanks!
Comment 8 Matthew Miller 2006-07-11 13:22:50 EDT
Fedora Core 1 is maintained by the Fedora Legacy project for security updates
only. If this problem is a security issue, please reopen and reassign to the
Fedora Legacy product. If it is not a security issue and hasn't been resolved in
the current FC5 updates or in the FC6 test release, reopen and change the
version to match.

Thanks!

NOTE: Fedora Core 1 is reaching the final end of support even by the Legacy
project. After Fedora Core 6 Test 2 is released (currently scheduled for July
26th), there will be no more security updates for FC1. Please use these next two
weeks to upgrade any remaining FC1 systems to a current release.

Comment 9 John Thacker 2006-10-28 15:02:40 EDT
Note that FC1 and FC2 are no longer supported even by Fedora Legacy.  Please
install a still supported version and retest.  If this still occurs on FC3 or
FC4 and is a security issue, please reopen and assign to that version and Fedora
Legacy.  If it still occurs on FC5 or FC6, please reopen and assign to the
correct version.
Comment 10 Paweł Sikora 2014-01-16 07:26:35 EST
I can reproduce this gdb error on my Fedora 20 (fc20) machine:

glibc-2.18-11.fc20.x86_64
gdb-7.6.50.20130731-16.fc20.x86_64
kernel-3.12.7-300.fc20.x86_64

% gcc cant-save-core.c -o cant-save-core -pthread -g2 -Wall
% gdb ./cant-save-core

(gdb) r
Starting program: /home/pawels/bugs/cant-save-core 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff7fdf700 (LWP 17415)]
[New Thread 0x7ffff77de700 (LWP 17416)]
[Thread 0x7ffff7fdf700 (LWP 17415) exited]
bug

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff77de700 (LWP 17416)]
0x000000381ea35c59 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	  return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) generate-core-file 
Couldn't get registers: No such process.
Comment 11 Paweł Sikora 2014-01-16 07:27:10 EST
Created attachment 851032
testcase
