Bug 209873
| Summary: | broken strace/gdb of threaded programs |
|---|---|
| Product: | [Fedora] Fedora |
| Component: | kernel |
| Status: | CLOSED RAWHIDE |
| Severity: | medium |
| Priority: | medium |
| Version: | rawhide |
| Hardware: | All |
| OS: | Linux |
| Reporter: | David Woodhouse <dwmw2> |
| Assignee: | Roland McGrath <roland> |
| QA Contact: | Brian Brock <bbrock> |
| CC: | davej, wtogami |
| Doc Type: | Bug Fix |
| Last Closed: | 2006-10-28 05:40:03 UTC |
Description
David Woodhouse
2006-10-07 09:42:56 UTC
[New Thread 844313792 (LWP 9809)]
-- Executing V110("mISDN/1-u1", "") in new stack
[Thread 844313792 (LWP 9809) exited]
reading register pc (#64): No such process.
(gdb) c
Continuing.
reading register pc (#64): No such process.
WTF?
Report omits kernel info. Current FC6 kernel on the architecture indicated above:
Linux pegasos.infradead.org 2.6.18-1.2741.fc6 #1 Wed Oct 4 20:18:10 EDT 2006 ppc ppc ppc GNU/Linux
Also on ppc64 kernel. Here's what I see when 'strace -f' observes a threaded
program exiting...
[pid 9385] write(1, "Setting timer 268505240 for 5-se"..., 51Setting timer
268505240 for 5-second expiration...
) = 51
[pid 9385] timer_settime(0, 0, {it_interval={5, 0}, it_value={5, 0}}, NULL) = 0
[pid 9385] exit_group(1) = ?
[pid 9386] --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
[pid 9386] SYS_300(0x30033510, 0xc, 0x30033508, 0x3003a940, 0x30033508) = 0
[pid 9386] rt_sigtimedwait([RTMIN],
<unfinished ...>
Process 9386 detached
[1]+ Stopped strace -f ./sigev_thread
[root@pegasos dwmw2]# kill -9 %1
[1]+ Killed strace -f ./sigev_thread
I can only test ppc64 kernels. I tried my vanilla 2.6.18+utrace ppc64 on an otherwise fc5 ppc/ppc64 installation, and strace -f worked fine. I still don't have your actual test case, so I used a trivial multithreaded program of my own. Please attach your test program source so I can try what you tried. I have a lot of downloading and installing to do before I can test that fc6 kernel, or test any kernel in an fc6 environment.

Created attachment 138320: test program
This is sufficient.

Ok, I did reproduce some weirdness with 2.6.18+utrace on ppc64 using your test. It looks like the bug is specifically with a group exit that should be killing many live threads, which did not happen in my trivial test. Same problem on i386, it's not machine-specific, just this test case. I'll figure it out.

Created attachment 138377: utrace fix
This fixes the test case. I am also looking into other utrace interactions
with SIGKILL.
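
For anyone without access to the attachments, the condition described above (a group exit while several other threads are still alive) can be approximated with a short script. This is only a sketch and is not the attached sigev_thread test program: the function name `run_group_exit` and its parameters are made up for illustration, and it assumes Linux with glibc, where `os._exit()` ends up in the exit_group system call. It has not been checked against the affected kernels.

```python
import os
import threading


def run_group_exit(n_threads: int = 4, status: int = 1) -> int:
    """Fork a child that exits while n_threads workers are still alive.

    Returns the child's exit status as seen by the parent, or -1 if the
    child did not exit normally. (Hypothetical helper, not from the bug.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: start workers that block on an event nobody ever sets.
        never_set = threading.Event()
        for _ in range(n_threads):
            threading.Thread(target=never_set.wait, daemon=True).start()
        # os._exit() terminates the whole process at once. Assumption:
        # Linux with glibc, where this is an exit_group call, so the
        # kernel has to kill the still-live workers.
        os._exit(status)
    # Parent: reap the child and report how it ended.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -1


if __name__ == "__main__":
    print("child exit status:", run_group_exit())
```

Running the script under 'strace -f' should show the exit_group call from the child's main thread while the workers are still blocked; on a kernel without the problem, strace is expected to finish on its own rather than end up stopped.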