Bug 117471

Summary: Gdb loses track of threads in small example program
Product: Red Hat Enterprise Linux 3 Reporter: Johan Walles <johan.walles>
Component: gdb Assignee: Jan Kratochvil <jan.kratochvil>
Status: CLOSED CURRENTRELEASE QA Contact: Jay Turner <jturner>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0 CC: cagney, jan.kratochvil, jjohnstn, srevivo
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: gdb-6.3.0.0-1.96.i386 Doc Type: Bug Fix
Last Closed: 2006-08-15 19:59:58 UTC Type: ---
Attachments:
Description Flags
Buggy C program none

Description Johan Walles 2004-03-04 12:18:45 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624

Description of problem:
I've written a buggy program (which I will attach shortly).  It has
two threads that try to communicate with each other using a socket.
When it segfaults, gdb says:

Program received signal SIGSEGV, Segmentation fault.
Couldn't get registers: Processen finns inte.

"Processen finns inte" is Swedish for "the process doesn't exist".


Version-Release number of selected component (if applicable):
gdb-6.0post-0.20031117.6

How reproducible:
Always

Steps to Reproduce:
1. gcc -Wall -Werror -g -lpthread socketaccepts.c -o socketaccepts
2. gdb socketaccepts
3. run
    

Actual Results:
Starting program: /home/johan/src/test/socketaccepts
[Thread debugging using libthread_db enabled]
[New Thread -1218593984 (LWP 6212)]
 
Program received signal SIGSEGV, Segmentation fault.
Couldn't get registers: Processen finns inte.
(gdb) bt
Cannot fetch general-purpose registers for thread -1218593984: generic error
Cannot fetch general-purpose registers for thread -1218593984: generic error
(gdb) quit
The program is running.  Exit anyway? (y or n) y
Quitting: thread_db_get_info: cannot get thread info: generic error


Expected Results:  Gdb should have allowed me to debug the program. 
Specifically, I would like to be able to get a backtrace using "bt".


Additional info:

I have the same problem with a home-built gdb 5.3 on RHAS21, so this
seems to be an old problem.  I have seen this on two different SMP
systems, but haven't tried it on any single-CPU boxes.

Severity is "high" because, even though gdb doesn't actually crash, it
becomes entirely unusable, which from my perspective is the same thing.

Comment 1 Johan Walles 2004-03-04 12:19:50 UTC
Created attachment 98284
Buggy C program

Comment 2 Elena Zannoni 2004-03-18 16:49:07 UTC
What versions of glibc and kernel do you have installed?


Comment 3 Johan Walles 2004-03-19 07:13:04 UTC
The RHAS21 system has:
glibc-2.2.4-32.8
Linux version 2.4.9-e.27smp (bhcompile.redhat.com) (gcc
version 2.96 20000731 (Red Hat Linux 7.2 2.96-118.7.2)) #1 SMP Tue Aug
5 15:49:54 EDT 2003

The RHEL3 box has:
glibc-2.3.2-95.6
Linux version 2.4.21-9.ELsmp (bhcompile.redhat.com)
(gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-26)) #1 SMP Thu Jan 8
17:08:56 EST 2004

Have you been able to reproduce?


Comment 4 Jeff Johnston 2004-04-05 22:28:53 UTC
So far I am unable to reproduce.

I have run the test case on both an RHEL3-U1 machine with a non-SMP
kernel and one with an up2date'd RHEL3-U1 SMP kernel (2.4.21-9.0.1.ELsmp)
on a hyperthreaded i686 machine.  Both test runs are successful:

[Thread debugging using libthread_db enabled]
[New Thread -1218598080 (LWP 7036)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1218598080 (LWP 7036)]
main (argc=1, argv=0xbfffbf04) at socketaccepts.c:55
55	   addr.sin_addr = *(struct in_addr*)INADDR_LOOPBACK;
(gdb) bt
#0  main (argc=1, argv=0xbfffbf04) at socketaccepts.c:55

I am looking for a different smp machine to test against.  Can you
attempt to update your U1 kernel via up2date to the one listed above?

Also, does your gdb state at startup that it is loading libthread_db
from /lib/tls/libthread_db.so.1?
....
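
The faulting statement in the backtrace above (socketaccepts.c:55) casts INADDR_LOOPBACK to a pointer and dereferences it. INADDR_LOOPBACK is an integer constant (0x7f000001 in host byte order), not an address in the program, so that read is the likely source of the SIGSEGV. The following is only a sketch of the usual way to set a loopback address. The helper name make_loopback_addr and the port parameter are assumptions made for illustration and are not taken from the attachment:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper (not from the attached program): build an IPv4
 * loopback address without dereferencing INADDR_LOOPBACK as a pointer. */
static struct sockaddr_in make_loopback_addr(unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    /* INADDR_LOOPBACK is an integer in host byte order, so convert it
     * to network byte order and store it; do not cast it to a pointer. */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return addr;
}
```

The returned structure can be passed to bind() or connect() after a cast to struct sockaddr *.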


Comment 5 Johan Walles 2004-04-06 08:44:09 UTC
I just realized I have a .gdbinit with a bunch of stuff in it. 
Running gdb with -nx resolves the problem.  I was able to narrow the
problem down to the following statement:

  handle SIGSEGV nostop

I'm guessing the problem is that the program falls back on the default
SIGSEGV handler and terminates, but gdb fails to realize the program
has terminated.

If you do "handle SIGSEGV nostop" before "run", it should repro for you.
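
The guess above, that the program falls back on the default SIGSEGV action and terminates, can be checked outside gdb. The sketch below is not gdb code and is not from the attachment. The helper name signal_that_killed_child is an assumption. It forks a child that raises SIGSEGV under the default disposition and reports what the parent learns from waitpid():

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that receives SIGSEGV under the
 * default disposition. Returns the signal that terminated the child,
 * or -1 if the child did not die from a signal. */
static int signal_that_killed_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL); /* make sure no handler is installed */
        raise(SIGSEGV);           /* default action: terminate */
        _exit(0);                 /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    /* The parent is told the child was killed by a signal. */
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling signal_that_killed_child() from main() should return SIGSEGV, which is the kind of termination report a debugger also has to notice.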


Comment 6 Andrew Cagney 2004-09-01 18:51:32 UTC
This is expected behavior.

Comment 8 Johan Walles 2004-09-02 06:17:41 UTC
Andrew,

I expected gdb to realize when the program it is debugging terminates.
Are you really saying I should expect gdb, under some circumstances,
*not* to understand what my program is doing?


Comment 9 Andrew Cagney 2004-10-21 20:31:38 UTC
Reopened.  GDB shouldn't be printing the second line of:

Program received signal SIGSEGV, Segmentation fault.
Couldn't get registers: Processen finns inte.
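
"Processen finns inte" ("No such process") is the message for errno ESRCH, which is consistent with a register read attempted after the traced process has already died. The sketch below is Linux-only and is not gdb's implementation. The helper name errno_after_tracee_died is an assumption. It passes SIGSEGV through to a traced child, roughly as "handle SIGSEGV nostop" asks for, reaps the child, and then tries one more ptrace read:

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): returns the errno of a ptrace read
 * issued after the tracee has been killed by a passed-through SIGSEGV,
 * or -1 if the sequence did not go as expected. */
static int errno_after_tracee_died(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSEGV); /* stops here and is reported to the tracer */
        _exit(0);       /* not reached once the signal is delivered */
    }
    /* First report: the child is stopped because of SIGSEGV. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSEGV) {
        if (WIFSTOPPED(status)) {
            kill(pid, SIGKILL);
            waitpid(pid, NULL, 0);
        }
        return -1;
    }
    /* Resume the child and deliver the signal to it. */
    if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)SIGSEGV) == -1)
        return -1;
    /* Second report: the default action has terminated the child. */
    if (waitpid(pid, &status, 0) != pid || !WIFSIGNALED(status))
        return -1;
    /* A tracer that misses the termination and reads anyway gets ESRCH. */
    errno = 0;
    ptrace(PTRACE_PEEKUSER, pid, NULL, NULL);
    return errno;
}
```

On a system that permits ptrace, errno_after_tracee_died() is expected to return ESRCH.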



Comment 10 Jan Kratochvil 2006-07-24 10:19:05 UTC
Tested on FC5 with gdb-6.3.0.0-1.122: the bug is no longer present.  RHEL versions
are untested (for possible backporting).
$ gdb ./pr98284
(gdb)   handle SIGSEGV nostop
Signal        Stop      Print   Pass to program Description
SIGSEGV       No        Yes     Yes             Segmentation fault
(gdb) r
Starting program: /tmp/pr98284
[Thread debugging using libthread_db enabled]
[New Thread -1208654144 (LWP 16649)]

Program received signal SIGSEGV, Segmentation fault.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb) bt
No stack.


Comment 11 Johan Walles 2006-07-26 08:22:02 UTC
Comment 10 describes the behaviour I would expect.


Comment 12 Jan Kratochvil 2006-08-05 16:14:09 UTC
Tested on RHEL4U3 with gdb-6.3.0.0-1.96.i386: the bug is also fixed there, as in FC5.
Please submit a BEA Issue Tracker issue instead if you would like a backport to RHEL3.
Suggesting CLOSED-CURRENTRELEASE.


Comment 13 Johan Walles 2006-08-15 06:08:46 UTC
I'm fine without a backport; it's enough that you've fixed this in later releases.

Comment 14 Jan Kratochvil 2006-08-15 19:59:58 UTC
Thanks for your bug report. Sorry it was not resolved earlier.