Red Hat Bugzilla – Bug 116583
gdb gets confused with multithreaded app
Last modified: 2007-11-30 17:07:00 EST
Description of problem:
I'll attach a reproducer for this that confuses gdb every time. It
runs fine inside gdb on all non-NPTL platforms I have tried, and it
confuses gdb on all NPTL platforms I have tried.
Works: RHAS21/ia32, GNU gdb 5.3
Broken: RHEL3/ia32, GNU gdb Red Hat Linux (6.0post-0.20031117.6rh)
Broken: RHEL3/ia64, GNU gdb Red Hat Linux (5.3.90-0.20030710.40rh)
The exact failure mode differs somewhat between platforms,
but it has reproducibly and obviously failed on all NPTL platforms I
have tried.
The program starts 100 threads and waits for them to finish. It does
this over and over again.
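For readers without the attachment, the behaviour described above can be sketched roughly as follows. This is not the attached lphello.c; the names worker, run_batch, run_batches and BATCH_MAX are made up for illustration, and the worker body is assumed to be trivial.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BATCH_MAX 100   /* the report mentions 100 threads per round */

/* Hypothetical worker: returns immediately, so each thread is short-lived. */
void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start nthreads threads, then wait for all of them to finish.
   Returns the number of threads that were successfully joined. */
int run_batch(int nthreads)
{
    pthread_t tids[BATCH_MAX];
    int started = 0;
    int joined = 0;

    assert(nthreads >= 0 && nthreads <= BATCH_MAX);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0)
            break;              /* stop creating on the first failure */
        started++;
    }
    for (int i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    }
    return joined;
}

/* Repeat the create-and-join cycle; returns the total number joined. */
long run_batches(int rounds, int nthreads)
{
    long total = 0;
    for (int r = 0; r < rounds; r++)
        total += run_batch(nthreads);
    return total;
}
```

Calling run_batches() from main with a large round count and 100 threads per round should approximate the "over and over again" pattern, so that new threads are still being created when the program is interrupted.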
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Build the to-be-attached lphello using "gcc -g -Wall -lpthread -o
2. Launch "gdb lphello".
4. Wait for a while (10 secs).
5. Interrupt the program using Ctrl-C.
6. Do "info threads".
Actual Results: Varies. Here are some examples:
..Cannot get thread event message: generic error
#0 0xb75cf6a1 in __nptl_create_event () from /lib/tls/libpthread.so.0
Error accessing memory address 0xb75cf6a0: No such process.
ptrace: No such process.
thread_db_get_info: cannot get thread info: generic error
(gdb) info threads
Cannot find new threads: generic error
#0 0x20000000000470b0 in __nptl_create_event () from
Expected Results: The program should have run inside of gdb the same
as it does outside of gdb. "info threads" should have given me a
listing of all active threads. "bt" should give me a stack trace for
the current thread.
Created attachment 97947
Reproducer. Build w. "gcc -g -lpthread lphello.c -o lphello", run in gdb
Suspect kernel bug. Threads spontaneously disappearing.
I've done some investigation with mainline gdb and Linux 2.6,
which (some of the time) behaves consistently with what was reported here.
I believe I know what's going on, though not precisely why.
The process is dying because there is a thread that gdb has not
attached to. This is always a potential race condition with
process-wide signals such as those generated from the terminal or by
`kill'. That is not due to a kernel bug, but rather is a limitation
of gdb's support for NPTL-style threads. The only way to avoid that
race condition ever coming up is to use the new 2.6 ptrace feature
PTRACE_O_TRACECLONE instead of relying on libthread_db to tell you
about new threads.
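To illustrate the PTRACE_O_TRACECLONE approach, here is a minimal sketch of the tracer side on Linux. It is not gdb's code; the helper names enable_clone_tracing, is_clone_event and new_thread_id are invented for this example, and it assumes the tracee is already attached and stopped.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Ask the kernel to report every clone made by the traced thread and to
   start tracing the new thread automatically.
   Returns 0 on success, -1 on error (errno is set by ptrace). */
int enable_clone_tracing(pid_t tid)
{
    assert(tid > 0);
    if (ptrace(PTRACE_SETOPTIONS, tid, NULL,
               (void *)(long)PTRACE_O_TRACECLONE) == -1)
        return -1;
    return 0;
}

/* Nonzero if a waitpid() status is the stop announcing a new clone:
   a SIGTRAP stop with PTRACE_EVENT_CLONE in the next-higher byte. */
int is_clone_event(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_CLONE << 8));
}

/* After a clone event stop on tid, fetch the id of the new thread.
   Returns -1 on error. */
pid_t new_thread_id(pid_t tid)
{
    unsigned long msg = 0;
    if (ptrace(PTRACE_GETEVENTMSG, tid, NULL, &msg) == -1)
        return -1;
    return (pid_t)msg;
}
```

With this option set, the kernel starts tracing the new thread as part of the clone itself, so the debugger does not depend on libthread_db noticing the thread before a process-wide signal arrives.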
However, in this case this failure mode is arising without a race.
When I run the test case under gdb, I see exactly 100 "New Thread"
messages, and then no more, while the program goes on to create many
more threads (it does many iterations of creating 100 threads, then
waiting for those 100 threads to finish). Then the terminal-generated
SIGINT is taken by one of these later threads to which gdb never
attached, and it kills the whole process (attached threads included).
Please look on the gdb end as to why the threads after the 100th are
not getting attached.
A patch has been put in place in the next RHEL3 update.