Bug 63370
Summary: | strace -f -p pid causes pid to hang after a child exits
---|---
Product: | [Retired] Red Hat Linux
Component: | strace
Version: | 7.3
Hardware: | i386
OS: | Linux
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | medium
Reporter: | Jay Berkenbilt <ejb>
Assignee: | Jakub Jelinek <jakub>
QA Contact: | Brian Brock <bbrock>
CC: | bbaetz, ejb, geoffrey, jwm
Doc Type: | Bug Fix
Last Closed: | 2002-08-09 14:50:20 UTC
Description
Jay Berkenbilt
2002-04-13 00:22:50 UTC
Created attachment 53691: source code for simple example program
*** This bug has been marked as a duplicate of 62591 ***

I have reopened this bug, changed it to a Red Hat 7.3 bug (rather than a skipjack bug), and set the severity to normal instead of high. Although this bug was marked as a duplicate of bug 62591, it really isn't. The problem reported there does indeed appear to have been fixed by strace-4.4-4, but the problem reported here has not been entirely fixed, though its nature has changed.

Before, I had marked this as high severity because it was possible to unknowingly disable the system with this problem. That is no longer possible, which is why I have dropped the severity back down to normal. The behavior now is that one or more of the traced processes may have their status changed to T, but a kill -CONT fixes the problem and lets things continue where they left off.

For example, again run my C program in the background and run strace -f -p pid, where pid is the primary process assigned to the job as returned by jobs -l. It is now possible to get out of strace with CTRL-C. When you do, one of the child processes will be left with 'T' as its status. If you kill -CONT that process, its parent gets left with 'T' as its status. If you then kill -CONT the parent, everything is back to normal. A kill -CONT to the job from the shell from which it was started (propagating the signal to the process group) should also work. I think the specifics are that whatever process strace is printing information about at the time it is interrupted remains STOPped, at least if the process is in certain states.

Let me give a more exact recipe for reproducing the problem. Start my program in the background:

    $ ./a.out &
    [1] 4790

Now, in another window, type strace -f -p 4790. You should see plenty of output that more or less alternates between 4790 and each new child process. Wait until a child process enters nanosleep(). As soon as it does, hit CTRL-C on the strace process. Do ps lpid, where pid is the child process that was in nanosleep. You'll see that it has T as its status. Send kill -CONT to it. Everything may return to normal, or the parent process may be stopped.

Run strace again. This time, hit CTRL-C when the parent is in nanosleep. The same thing happens: the parent process ends up STOPped. Hit return in the window with the shell that originally started the program. You'll see

    [1]+ Stopped ./a.out

Run kill -CONT %, and everything again returns to normal.

I do not believe that interrupting strace should leave the child process stopped.

*** Bug 64560 has been marked as a duplicate of this bug. ***

This problem still exists in limbo. Is anyone investigating?

I'm going to post a new bug for the one small aspect of this that is still around. Then maybe this one should be closed again.
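For anyone repeating the recipe in this report, the two manual steps (checking whether a process was left with T as its status, then sending kill -CONT) could be scripted roughly as below. This is only a sketch and not part of the original report or its attachment. It assumes a Linux /proc filesystem, and the helper names parse_state, process_state and resume_if_stopped are hypothetical.

```python
import os
import signal


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid):
    """Read the current state letter of pid (assumes Linux /proc)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


def resume_if_stopped(pid):
    """Send SIGCONT to pid if its state is 'T'; return True if sent."""
    if process_state(pid) == "T":
        os.kill(pid, signal.SIGCONT)  # same effect as `kill -CONT pid`
        return True
    return False
```

With the job from the recipe, resume_if_stopped(4790) would continue the parent only if interrupting strace had left it stopped. The child and the parent may each need the check, since resuming one can leave the other in state T.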