209445 – gdb hangs after stopping multithreaded application

Summary: gdb hangs after stopping multithreaded application

Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	gdb
Version:	5
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Jan Kratochvil
Blocks:	209670

Reported:	2006-10-05 13:15 UTC by Wade Hampton
Modified:	2007-11-30 22:11 UTC
CC List:	3 users
Fixed In Version:	gdb-6.6-3.fc7
Doc Type:	Bug Fix
Last Closed:	2007-02-05 09:43:42 UTC
Type:	---

Attachments
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (2.43 KB, text/plain) 2006-10-05 20:41 UTC, Jan Kratochvil	no flags
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (fixed) (2.48 KB, text/plain) 2006-10-05 20:43 UTC, Jan Kratochvil	no flags

Description Wade Hampton 2006-10-05 13:15:18 UTC

Description of problem:

When debugging a multithreaded application with GDB, stopping the application
makes gdb report:
   Cannot fetch general-purpose registers for thread 130096048L  generic error.
Any attempt to run, load a new file, or quit then results in a gdb hang.

Version-Release number of selected component (if applicable):

Fedora Core 4:  gdb 6.3.0, gcc 4.0.2
Fedora Core 5:  gdb 6.3.0, gcc 4.1.1

How reproducible:

Every time I try to quit debugging my program.

Steps to Reproduce:
1.  start gdb with my program
2.  run the program
3.  quit program
  
Actual results:

gdb cannot restart the program or quit.

Expected results:

Ability to restart program, reload it, or quit gdb

Additional info:  Program is using glib2

Comment 1 Jan Kratochvil 2006-10-05 13:26:27 UTC

As FC5 gdb generally works with threads, I think more reproducibility
information is needed.
Could you please try the RawHide/FC6 gdb instead?
It has been rebased on 6.5 and its thread support has improved a lot:
  wget http://sunsite.mff.cuni.cz/pub/fedora/development/source/SRPMS/gdb-6.5-8.fc6.src.rpm
  rpmbuild --rebuild gdb-6.5-8.fc6.src.rpm
  rpm -U /usr/src/redhat/RPMS/i386/gdb-6.5-8.fc6.i386.rpm

Comment 2 Wade Hampton 2006-10-05 15:31:34 UTC

On my FC5 box, I tried the updated GDB from FC6 beta (reported as 6.5-rh when it
starts).  I get the same result; however, the error message now reads:
   Couldn't get registers: No such process.

Comment 3 Jan Kratochvil 2006-10-05 15:55:49 UTC

It would be useful to have your application for testing.
I could not reproduce the problem, for example with Ekiga running 13 threads.

In some cases I could reproduce a gdb lockup on quit. It is better to quit
gdb(1) before the next debugging session anyway, but I understand the lockup
should still get fixed.

Comment 4 Wade Hampton 2006-10-05 20:11:45 UTC

The application is for in-house use and can't be released. I can only provide a
bit of info, but it has over 50 threads, one talking to a postgresql server, one
running a gsoap server, and all controlled by glib2 (they are gthreads using
glib message passing).  I did add a call to wait on all threads to quit, then
sleep(3), then exit.  That seemed to let my program quit gracefully in the
debugger.  My concern is that I have some memory leak somewhere that is messing
up GDB, which GDB **should** handle.  (I also plan on using efence, etc. to
check the application.)  Any suggestions?  Wish I could provide more.
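
For illustration, the shutdown workaround described above (wait for all
threads to quit, pause, then exit) might look like the sketch below. It uses
POSIX threads as a stand-in for the glib gthreads of the real application,
which is not public. The names start_workers, shutdown_workers, worker_main
and MAX_WORKERS are hypothetical and are not taken from the reporter's code.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_WORKERS 64              /* hypothetical upper bound for this sketch */

static pthread_t workers[MAX_WORKERS];
static int worker_count = 0;

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t quit_cond = PTHREAD_COND_INITIALIZER;
static int quit_requested = 0;

/* Hypothetical worker: blocks until the main thread asks it to quit. */
static void *worker_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!quit_requested)
        pthread_cond_wait(&quit_cond, &lock);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start `count` workers; returns how many were actually created. */
int start_workers(int count)
{
    assert(count >= 0 && count <= MAX_WORKERS);

    pthread_mutex_lock(&lock);
    quit_requested = 0;
    pthread_mutex_unlock(&lock);

    worker_count = 0;
    for (int i = 0; i < count; i++) {
        if (pthread_create(&workers[worker_count], NULL, worker_main, NULL) != 0)
            break;
        worker_count++;
    }
    return worker_count;
}

/* Ask every worker to quit, join them all, then pause for `grace_seconds`
 * before returning (the sleep(3) step from the comment above).
 * Returns the number of threads joined. */
int shutdown_workers(unsigned grace_seconds)
{
    int joined = 0;

    pthread_mutex_lock(&lock);
    quit_requested = 1;
    pthread_cond_broadcast(&quit_cond);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < worker_count; i++)
        if (pthread_join(workers[i], NULL) == 0)
            joined++;
    worker_count = 0;

    if (grace_seconds > 0)
        sleep(grace_seconds);
    return joined;
}
```

A caller would run start_workers(50) at startup and shutdown_workers(3) just
before exit(); build with -pthread. The pause is an application-side
workaround only and does not change anything in gdb itself.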

Comment 5 Jan Kratochvil 2006-10-05 20:41:07 UTC

Created attachment 137859
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686

The lockup of gdb on exit has been confirmed. It is not yet fixed, but that is
not what I am asking about here.

Using efence is very simple; I hope you are aware of it:
  LD_PRELOAD=/usr/lib/libefence.so application...
Valgrind should generally be more effective, but in practice I have always
relied more on efence myself.

A memory leak should never confuse GDB; I do not really see why it would be a
problem.

I reproduced the problem here, but it looks to me like a rare race. Does your
application really create and delete threads all the time?

Thanks for the bug report; this is sure to be fixed.
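
The question above is whether threads are created and destroyed continuously.
A program with that behaviour could be sketched as follows, for example as
something to run under or attach gdb to while hunting the race. This is not
the attached reproducer, whose contents are not shown in this report. The
names churn_threads, short_lived and MAX_BATCH are hypothetical, and POSIX
threads stand in for glib gthreads.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_BATCH 64                /* hypothetical upper bound for this sketch */

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long finished = 0;

/* Hypothetical short-lived thread: records that it ran and exits at once. */
static void *short_lived(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&count_lock);
    finished++;
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Repeatedly create `batch` threads and join them, `rounds` times over.
 * Returns the total number of threads that ran to completion. */
long churn_threads(int rounds, int batch)
{
    pthread_t tids[MAX_BATCH];

    assert(batch >= 0 && batch <= MAX_BATCH);

    pthread_mutex_lock(&count_lock);
    finished = 0;
    pthread_mutex_unlock(&count_lock);

    for (int r = 0; r < rounds; r++) {
        int started = 0;
        for (int i = 0; i < batch; i++) {
            if (pthread_create(&tids[started], NULL, short_lived, NULL) != 0)
                break;
            started++;
        }
        /* Every thread of this round is gone before the next round starts. */
        for (int i = 0; i < started; i++)
            pthread_join(tids[i], NULL);
    }
    return finished;
}
```

Calling churn_threads with a large round count from main() keeps threads
appearing and dying while a debugger is attached; build with -pthread.
Whether this actually triggers the hang is untested here.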

Comment 6 Jan Kratochvil 2006-10-05 20:43:47 UTC

Created attachment 137861
Reproducibility for FC6(RawHide) / gdb-6.5-8.fc6.i386 / kernel-2.6.18-1.2693.fc6.i686 (fixed)

Comment 7 Wade Hampton 2006-10-06 19:56:59 UTC

My application only creates the threads at startup.  I think the problem could
be happening during cleanup/termination.  My code is terminating the 50+ threads
(joining each of them) and then quickly terminating the application. That could
be triggering a problem like the race you describe.  It would also explain why
adding a sleep before exiting the application seemed to fix the problem.

I tried a quick and dirty glib application but can't duplicate the problem at
this time.  I'll try your code first of the week.

Comment 8 Michal Babej 2006-10-09 12:07:25 UTC

Hello,

I'm running into this issue while testing Firefox's Totem plugin. It is very
easily reproducible (running the latest RawHide/i386). To reproduce:
1. Firefox with the Totem plugin
2. Some test videos (I used the free Theora videos from
http://commons.wikimedia.org/wiki/Category:Video)
3. Run Firefox, attach to it from gdb, and watch some short videos. After 4-6
videos Firefox usually freezes, with gdb reporting the error.

Versions:
gdb-6.5-8.fc6
firefox-1.5.0.7-6.fc6
totem-2.16.1-1.fc6
totem-mozplugin-2.16.1-1.fc6

Comment 9 Jan Kratochvil 2007-02-05 09:43:42 UTC

* Mon Feb  5 2007 Jan Kratochvil <jan.kratochvil> - 6.6-3
- Fix a race during attaching to dying threads; backport (BZ 209445).
