Bug 209445 - gdb hangs after stopping multithreaded application
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: gdb
Version: 5
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Jan Kratochvil
Blocks: 209670
Reported: 2006-10-05 09:15 EDT by Wade Hampton
Modified: 2007-11-30 17:11 EST
Fixed In Version: gdb-6.6-3.fc7
Doc Type: Bug Fix
Last Closed: 2007-02-05 04:43:42 EST

Attachments
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (2.43 KB, text/plain)
2006-10-05 16:41 EDT, Jan Kratochvil
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (fixed) (2.48 KB, text/plain)
2006-10-05 16:43 EDT, Jan Kratochvil

Description Wade Hampton 2006-10-05 09:15:18 EDT
Description of problem:

While debugging a multithreaded application with GDB, stopping the application
makes gdb report:
   Cannot fetch general-purpose registers for thread 130096048L  generic error.
Any attempt to run, load a new file, or quit then results in a gdb hang.

Version-Release number of selected component (if applicable):

Fedora Core 4:  gdb 6.3.0, gcc 4.0.2
Fedora Core 5:  gdb 6.3.0, gcc 4.1.1

How reproducible:

Every time I try to quit debugging my program.

Steps to Reproduce:
1.  start gdb with my program
2.  run the program
3.  quit program
  
Actual results:

gdb cannot restart the program or quit.

Expected results:

Ability to restart program, reload it, or quit gdb

Additional info:  Program is using glib2
Comment 1 Jan Kratochvil 2006-10-05 09:26:27 EDT
Since FC5 gdb generally works with threads, I think more information on how to
reproduce this is needed.
Could you please try the RawHide/FC6 gdb instead?
It has been rebased on 6.5, which improved thread support a lot:
wget http://sunsite.mff.cuni.cz/pub/fedora/development/source/SRPMS/gdb-6.5-8.fc6.src.rpm
rpmbuild --rebuild gdb-6.5-8.fc6.src.rpm
rpm -U /usr/src/redhat/RPMS/i386/gdb-6.5-8.fc6.i386.rpm
Comment 2 Wade Hampton 2006-10-05 11:31:34 EDT
On my FC5 box, I tried the updated GDB from FC6 beta (reported as 6.5-rh when it
starts).  I have the same result, however the error message now reads:  
   Couldn't get registers: No such process.
Comment 3 Jan Kratochvil 2006-10-05 11:55:49 EDT
It would be useful to get your application for test.
I could not reproduce it, for example on Ekiga with 13 threads.

In some cases I could reproduce a gdb lockup on quit. It is better to quit
gdb(1) and start it again for the next debugging session anyway, but I
understand the lockup should still get fixed.
Comment 4 Wade Hampton 2006-10-05 16:11:45 EDT
The application is for in-house use and can't be released. I can only provide a
bit of info, but it has over 50 threads, one talking to a postgresql server, one
running a gsoap server, and all controlled by glib2 (they are gthreads using
glib message passing).  I did add a call to wait on all threads to quit, then
sleep(3), then exit.  That seemed to let my program quit gracefully in the
debugger.  My concern is that I have some memory leak somewhere that is messing
up GDB, which GDB **should** handle.  (I also plan on using efence, etc. to
check the application.)  Any suggestions?  Wish I could provide more.
Comment 5 Jan Kratochvil 2006-10-05 16:41:07 EDT
Created attachment 137859
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686

The lockup of gdb on exit has been confirmed. It is not fixed yet, but that is
not what I am asking about here.

Efence use is very simple, I hope you are aware of it:
  LD_PRELOAD=/usr/lib/libefence.so application...
Valgrind should generally be more effective, but in practice I have always
relied more on efence myself.

A memory leak should never confuse GDB; I do not quite see why it would be a
problem.

I reproduced the problem here, but it looks like a rare race to me. Does your
application really create and delete threads all the time?

Thanks for the bug report; it will certainly be fixed.
Comment 6 Jan Kratochvil 2006-10-05 16:43:47 EDT
Created attachment 137861
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (fixed)
Comment 7 Wade Hampton 2006-10-06 15:56:59 EDT
My application only creates the threads at startup.  I think the problem could
be happening during cleanup/termination.  My code is terminating the 50+ threads
(joining each of them) and then quickly terminating the application. That could
be triggering a problem like the race you describe.  It would also explain why
adding a sleep before exiting the application seemed to fix the problem.
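
The shutdown pattern described above can be sketched in C as follows. This is
only an illustration under stated assumptions: it uses plain pthreads instead of
glib gthreads, and the names worker, run_and_join_workers and MAX_WORKERS are
hypothetical. It is not the in-house application and not the attached
reproducer.

```c
#include <pthread.h>
#include <unistd.h>

#define MAX_WORKERS 64  /* hypothetical cap, only for this sketch */

/* Hypothetical worker: stands in for the real application's threads. */
static void *worker(void *arg)
{
    (void)arg;
    usleep(1000);  /* pretend to do a little work */
    return NULL;
}

/*
 * Start `count` threads (capped at MAX_WORKERS), join all of them, then
 * optionally pause so the caller does not exit right after the last join.
 * Returns the number of threads that were joined successfully.
 */
int run_and_join_workers(int count, unsigned grace_seconds)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0, joined = 0;

    if (count > MAX_WORKERS)
        count = MAX_WORKERS;

    for (int i = 0; i < count; i++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0)
            break;  /* stop creating threads on the first failure */
        started++;
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    }
    if (grace_seconds > 0)
        sleep(grace_seconds);  /* the "sleep before exit" workaround */
    return joined;
}
```

Calling run_and_join_workers(50, 0) from main() and returning immediately
mirrors the original shutdown; passing 3 as the second argument mirrors the
sleep(3) workaround from comment 4. Whether this small program actually
triggers the hang is not established here.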

I tried a quick and dirty glib application but can't duplicate the problem at
this time.  I'll try your code first of the week.
Comment 8 Michal Babej 2006-10-09 08:07:25 EDT
Hello,

I'm running into this issue while testing Firefox's Totem plugin. It is very
easily reproducible (running the latest RawHide/i386). Steps to reproduce:
1. Firefox with the Totem plugin
2. Some test videos (I used the free Theora videos from
http://commons.wikimedia.org/wiki/Category:Video )
3. Run Firefox, attach to it from gdb, and watch some short videos. After 4-6
videos Firefox usually freezes, with gdb reporting the error.

Versions:
gdb-6.5-8.fc6
firefox-1.5.0.7-6.fc6
totem-2.16.1-1.fc6
totem-mozplugin-2.16.1-1.fc6
Comment 9 Jan Kratochvil 2007-02-05 04:43:42 EST
* Mon Feb  5 2007 Jan Kratochvil <jan.kratochvil@redhat.com> - 6.6-3
- Fix a race during attaching to dying threads; backport (BZ 209445).
