Bug 240962 - Hangs and/or multithreaded process left Stopped (T) on CTRL-C of strace
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: strace
Version: rawhide
Hardware: ia64 Linux
Priority: high
Severity: high
Assigned To: Roland McGrath
Brian Brock
Blocks: 222053
Reported: 2007-05-23 10:29 EDT by Jan Kratochvil
Modified: 2007-11-30 17:12 EST (History)
Fixed In Version: 4.5.16-1.fc7
Doc Type: Bug Fix
Last Closed: 2007-08-06 13:59:30 EDT

Attachments:
Testcase. (1016 bytes, text/plain)
2007-05-23 10:29 EDT, Jan Kratochvil
Bugfix. (1.20 KB, patch)
2007-05-23 10:40 EDT, Jan Kratochvil
Bugfix updated according to the Roland's comments. (1.69 KB, patch)
2007-05-24 05:40 EDT, Jan Kratochvil
Bugfix update #2 according to the Roland's comments. (1.67 KB, patch)
2007-05-24 11:08 EDT, Jan Kratochvil

Description Jan Kratochvil 2007-05-23 10:29:15 EDT
Description of problem:
strace sometimes hangs while detaching from a multithreaded application when
strace itself is interrupted with CTRL-C.
As a side effect, in some cases the multithreaded application is left Stopped
(T, by SIGSTOP) and must be resumed with `kill -CONT'.  The shell prints:
[1]+  Stopped                 appname args

Version-Release number of selected component (if applicable):
strace-4.5.15-1.el5.ia64
kernel-2.6.18-8.el5.ia64

How reproducible:
The application is always left in the Stopped (T) state - see also Bug 240961.
The strace hang reproduces best (possibly only) on ia64, in about 10% of test runs.

Steps to Reproduce:
1. gcc -o mt3-tkill mt3-tkill.c -Wall -ggdb2 -pthread
2. ./mt3-tkill
3. On another console: strace -o /tmp/x -f -p `pidof mt3-tkill`
4. Press CTRL-C in the strace console.

Actual results:
Process 13968 attached with 64 threads - interrupt to quit
Process 13907 detached
...
Process 13905 detached
[HANG]

Expected results:
Process 13968 attached with 64 threads - interrupt to quit
Process 13907 detached
...
Process 13900 detached
[EXIT]

Additional info:
The process being traced gets into state:
/proc/12664/task/12664/status:State:	S (sleeping)
/proc/12664/task/12665/status:State:	T (tracing stop)
...
/proc/12664/task/12730/status:State:	T (tracing stop)
with STRACE in state:
#0  0xa000000000010641 in __kernel_syscall_via_break ()
#1  0x2000000000162fe0 in wait4 () from /lib/tls/libc.so.6.1
#2  0x4000000000008420 in detach (tcp=0x600000000001c050, sig=0) at strace.c:1337
1337	  if (wait4(tcp->pid, &status, __WALL, NULL) < 0) {
#3  0x40000000000093b0 in cleanup () at strace.c:1516
#4  0x4000000000006ea0 in main (argc=6, argv=0x60000fffffff9f18) at strace.c:803

- strace sends SIGSTOP to the thread group leader of the process, but the
resulting stop is never reported back to it through wait4().
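
For illustration, the stop-and-wait step that detach() depends on can be
sketched on a single-threaded child, where it cannot go wrong.  This is only a
minimal sketch and not strace's code: stop_and_detach_demo and kill_and_reap
are hypothetical names, and the SIGSTOP is sent with tgkill(2) to one specific
TID so that the following wait on that same TID is the one that sees the stop.
It assumes ptrace(2) of a child process is permitted on the test machine.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: kill the still-running child and reap it. */
static void kill_and_reap(pid_t child)
{
    int status;
    kill(child, SIGKILL);
    waitpid(child, &status, __WALL);
}

/* Returns 1 if the SIGSTOP sent to the tracee was reported by the wait on
   the same TID, 0 if some other event was reported, -1 on setup failure. */
static int stop_and_detach_demo(void)
{
    int status;
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Tracee: ask to be traced, stop once, then idle forever. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGSTOP);
        for (;;)
            pause();
    }
    assert(child > 0);  /* tracer side from here on */

    /* Wait for the initial stop.  If the child exited instead, it has
       already been reaped by this waitpid(). */
    if (waitpid(child, &status, __WALL) != child || !WIFSTOPPED(status))
        return -1;
    /* Resume the tracee; a NULL data argument suppresses the SIGSTOP. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) != 0) {
        kill_and_reap(child);
        return -1;
    }

    /* Detach sequence: stop exactly this TID, then wait on this TID. */
    if (syscall(SYS_tgkill, child, child, SIGSTOP) != 0 ||
        waitpid(child, &status, __WALL) != child) {
        kill_and_reap(child);
        return -1;
    }
    int saw_sigstop = WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;
    if (saw_sigstop)
        ptrace(PTRACE_DETACH, child, NULL, NULL);  /* detach, signal dropped */

    kill_and_reap(child);
    return saw_sigstop;
}
```

With a multithreaded tracee the same wait can block forever if the SIGSTOP was
taken by a different task than the one being waited on, which matches the
wait4() frame in the backtrace above.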

The testcase contains a workaround for a Linux kernel bug that leaks
ERESTARTNOINTR to userland; that bug is present in some older Linux kernel
variants around 2.6.9.  The kernel problem does not otherwise affect this bug.
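
The per-task states quoted under "Additional info" can be collected with a
small helper.  The following is a minimal sketch, assuming a Linux /proc
filesystem; print_task_states is a hypothetical name and is not part of
strace or of the attached testcase.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Print the "State:" line of every task (thread) of process `pid`, in the
   same path:line form as shown in the report.  Returns the number of
   tasks printed, or -1 if /proc/<pid>/task cannot be opened. */
static int print_task_states(pid_t pid)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/%d/task", (int)pid);
    assert(n > 0 && (size_t)n < sizeof path);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue;  /* skip "." and ".." */

        char status_path[512];
        snprintf(status_path, sizeof status_path, "%s/%s/status",
                 path, ent->d_name);
        FILE *f = fopen(status_path, "r");
        if (f == NULL)
            continue;  /* the task may have exited in the meantime */

        char line[256];
        while (fgets(line, sizeof line, f) != NULL) {
            if (strncmp(line, "State:", 6) == 0) {
                printf("%s:%s", status_path, line);
                count++;
                break;
            }
        }
        fclose(f);
    }
    closedir(dir);
    return count;
}
```

Calling print_task_states() with the PID of the traced application while
strace is hung should show which tasks are still in tracing stop.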
Comment 1 Jan Kratochvil 2007-05-23 10:29:15 EDT
Created attachment 155244 [details]
Testcase.
Comment 2 Jan Kratochvil 2007-05-23 10:40:51 EDT
Created attachment 155246 [details]
Bugfix.
Comment 3 Jan Kratochvil 2007-05-23 12:26:41 EDT
The problem occurs because kill() may arbitrarily choose any task of the
thread group as its target, while we later wait on just one specific TID.
For a process under ptrace(2), a wait on a PID becomes a wait on that one
specific TID (task).
[ Roland McGrath originally provided this useful info. ]

Unfortunately the POSIX specification does not seem to mention this behavior:
        http://www.opengroup.org/onlinepubs/009695399/functions/kill.html
This paragraph talks only about kill (getpid (), ...):
        If the value of pid causes sig to be generated for the sending process,
        and if sig is not blocked for the calling thread and if no other thread
        has sig unblocked or is waiting in a sigwait() function for sig, either
        sig or at least one pending unblocked signal shall be delivered to the
        sending thread before kill() returns.
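
To illustrate the difference, a signal can be directed at one specific task
with tgkill(2) instead of kill(2).  The following is a minimal sketch and not
the attached patch: send_to_thread and demo_tgkill_targets_thread are
hypothetical names, and SIGUSR1 with a handler stands in for SIGSTOP so that
the receiving TID can be recorded.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static atomic_int worker_tid;   /* TID of the worker thread, 0 until set */
static atomic_int handler_tid;  /* TID that ran the handler, 0 until set */

/* Hypothetical helper: deliver `sig` to task `tid` of thread group `tgid`. */
static int send_to_thread(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Record which task actually received the signal. */
static void on_sigusr1(int sig)
{
    (void)sig;
    atomic_store(&handler_tid, (int)syscall(SYS_gettid));
}

/* Publish our TID, then idle until the handler has run somewhere. */
static void *worker(void *arg)
{
    (void)arg;
    atomic_store(&worker_tid, (int)syscall(SYS_gettid));
    while (atomic_load(&handler_tid) == 0)
        usleep(1000);
    return NULL;
}

/* Returns 1 if the handler ran in the targeted worker task, 0 if it ran in
   another task, -1 on setup failure. */
static int demo_tgkill_targets_thread(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    atomic_store(&worker_tid, 0);
    atomic_store(&handler_tid, 0);

    pthread_t th;
    if (pthread_create(&th, NULL, worker, NULL) != 0)
        return -1;
    while (atomic_load(&worker_tid) == 0)
        usleep(1000);

    int sent = send_to_thread(getpid(), (pid_t)atomic_load(&worker_tid),
                              SIGUSR1);
    if (sent != 0)
        atomic_store(&handler_tid, -1);  /* release the worker loop */

    int rc = pthread_join(th, NULL);
    assert(rc == 0);
    (void)rc;

    if (sent != 0)
        return -1;
    return atomic_load(&handler_tid) == atomic_load(&worker_tid);
}
```

Build with -pthread, as for the testcase.  With kill (getpid (), SIGUSR1)
instead, the kernel is free to run the handler in either task, which is the
arbitrary choice described above.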
Comment 4 Jan Kratochvil 2007-05-24 05:40:30 EDT
Created attachment 155329 [details]
Bugfix updated according to the Roland's comments.
Comment 5 Jan Kratochvil 2007-05-24 11:08:16 EDT
Created attachment 155354 [details]
Bugfix update #2 according to the Roland's comments.
Comment 6 Jan Kratochvil 2007-08-03 07:59:35 EDT
Fixed in Rawhide strace-4.5.16-1.fc8:
* Fri Aug  3 2007 Roland McGrath <roland@redhat.com> - 4.5.16-1
- fix multithread issues (#240962, [...])

and upstream:

2007-05-24  Jan Kratochvil  <jan.kratochvil@redhat.com>

        * strace.c [LINUX] (my_tgkill): New macro.
        [LINUX] (detach): Use my_tgkill () instead of kill(2).
        Fixes RH#240962.
Comment 7 Fedora Update System 2007-08-06 13:58:57 EDT
strace-4.5.16-1.fc7 has been pushed to the Fedora 7 stable repository.  If problems still persist, please make note of it in this bug report.
