Bug 526007
Summary: | ltrace cannot properly handle multi-threaded processes | |
---|---|---|---
Product: | Red Hat Enterprise Linux 5 | Reporter: | Martin Osvald 🛹 <mosvald>
Component: | ltrace | Assignee: | Petr Machata <pmachata>
Status: | CLOSED ERRATA | QA Contact: | Miloš Prchlík <mprchlik>
Severity: | high | Docs Contact: |
Priority: | urgent | |
Version: | 5.3 | CC: | alanm, bgollahe, bugzilla.acct, collura, cww, jwest, mfuruta, mnewsome, mprchlik, myamazak, ohudlick, patrickm, pmuller, rbinkhor, redhat-bz, rprice, samukawa-oxa, tao, ykawada
Target Milestone: | rc | Keywords: | Patch, ZStream
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2013-09-30 22:39:01 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 668957, 719046, 733216, 742340, 878756, 978304 | |
Attachments: | | |
Description
Martin Osvald 🛹
2009-09-28 09:51:04 UTC
Created attachment 362874 [details]
temporary patch
Also occurs on Fedora Core 13
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Confirmed that the upstream version fixes this issue. I'll look into how difficult it would be to port this over to RHEL (or rather the other way around, to merge our patches upstream; note we cannot simply rebase, ltrace hasn't been able to pass its own testsuite for quite some time now).
Just a note, upstream started the 0.6 release and is looking for ltrace patches lying around. Seems like a good time to merge RHEL & Fedora patches upstream.
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Created attachment 515874 [details]
Source RPM with rebased patchset
This src rpm includes the complete patch set. I'll spin scratch binaries next and link them here. This works on Fedora 14 on x86_64, I haven't had a chance to check this anywhere else yet. Test case is included.
Created attachment 516201 [details]
Source RPM with rebased patchset
On RHEL 5, the process status of a stopped process is always reported as "T" in /proc/pid/status, regardless of whether it's in a tracing stop or a job control stop. The distinction is spelled out further along the same line. This SRPM contains an additional patch that teaches ltrace to make that distinction. Multi-threaded tracing now works for me on an x86_64 RHEL 5 machine.
Scratch build for x86_64 here: https://brewweb.devel.redhat.com/taskinfo?taskID=3530264
(In reply to comment #45)
> Created attachment 516201 [details]
> Source RPM with rebased patchset
>
> On RHEL 5, process status of stopped process is always reported as "T" in
> /proc/pid/status, regardless of whether it's in tracing or job control stop.
> The status is clarified further on the line. This SRPM contains an additional
> patch that implements this. Multi-threaded tracing now works for me on x86_64
> RHEL 5 machine.
Dear Tanaka-san,
Would you please rebuild the attached SRPM on x86_64 and verify it, if possible?
Best Regards,
Masaki Furuta
Verified that the same SRPM also works on i386, ppc and ppc64.
Created attachment 516376 [details]
Source RPM with rebased patchset
This includes an extra patch that fixes a race between waiting for SIGSTOP to be delivered to all tasks, and one of those tasks exiting.
Furthermore, I verified that multi-threaded tracing also works on s390 and s390x.
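For readers following along, the /proc/pid/status check described for attachment 516201 might look roughly like the sketch below. It is an illustration only, not the code from the SRPM: the names `classify_state_line` and `task_stop_kind` are made up here, and it assumes the kernel writes the state as "T (tracing stop)" for a ptrace stop and "T (stopped)" for a job control stop.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification, not taken from the ltrace sources. */
enum stop_kind { STATE_UNKNOWN, NOT_STOPPED, JOB_CONTROL_STOP, TRACING_STOP };

/* Classify one "State:" line as it appears in /proc/<pid>/status.
 * Assumed formats: "State:\tT (tracing stop)" and "State:\tT (stopped)". */
enum stop_kind classify_state_line(const char *line)
{
    static const char prefix[] = "State:";

    if (line == NULL || strncmp(line, prefix, sizeof prefix - 1) != 0)
        return STATE_UNKNOWN;

    const char *p = line + sizeof prefix - 1;
    while (*p == ' ' || *p == '\t')
        p++;

    if (*p == '\0')
        return STATE_UNKNOWN;
    if (*p != 'T')
        return NOT_STOPPED;

    /* The letter alone is ambiguous; the parenthesised text
     * further on the line says which kind of stop it is. */
    if (strstr(p, "(tracing stop)") != NULL)
        return TRACING_STOP;
    return JOB_CONTROL_STOP;
}

/* Read /proc/<pid>/status and classify its "State:" line. */
enum stop_kind task_stop_kind(int pid)
{
    char path[64];
    char line[256];
    enum stop_kind kind = STATE_UNKNOWN;

    assert(pid > 0);
    snprintf(path, sizeof path, "/proc/%d/status", pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return STATE_UNKNOWN;   /* task may have exited already */

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "State:", 6) == 0) {
            kind = classify_state_line(line);
            break;
        }
    }
    fclose(f);
    return kind;
}
```

A tracer could call `task_stop_kind(tid)` after sending SIGSTOP to tell whether the task is already stopped under ptrace or only stopped by job control.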
Created attachment 516786 [details]
Source RPM with rebased patchset
This SRPM contains one additional correctness fix and two race condition fixes.
Created attachment 516896 [details]
Additional compilation fix for ia64
Hi Furuta-san,
NEC confirmed the fix with the test package.
Thanks,
Ken'ichi Tanaka
Created attachment 518786 [details]
Partial fix
This is a partial fix that removes a race between obtaining the list of tasks and attaching to those tasks. This fixes the SIGSEGVs that ltrace was getting and the copious "can't attach" messages that it was putting out when it tried to attach to processes with many (100-ish) threads.
I still see a different attach problem, where the tracee dies of SIGTRAP when ltrace attempts to attach. Similarly, detach problems were not addressed yet.
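One common way to close a race between listing tasks and attaching to them is to rescan /proc/<pid>/task until a full pass finds nothing new. The sketch below illustrates that idea only; it is not the attached patch, and `attach_all_tasks`, `attach_fn` and `count_task` are hypothetical names. The callback stands in for the real attach step, which in a tracer would be `ptrace(PTRACE_ATTACH, tid, 0, 0)` followed by a `waitpid` on that task.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_TASKS 4096

/* Hypothetical attach hook: returns 0 on success, non-zero on failure. */
typedef int (*attach_fn)(pid_t tid, void *data);

int already_seen(const pid_t *seen, size_t n, pid_t tid)
{
    for (size_t i = 0; i < n; i++)
        if (seen[i] == tid)
            return 1;
    return 0;
}

/* Stand-in for a real attach: only counts the tasks it is handed. */
int count_task(pid_t tid, void *data)
{
    (void)tid;
    ++*(int *)data;
    return 0;
}

/* Attach to every task of PID. Threads created while we were scanning
 * show up on the next pass, so keep rescanning until a pass adds nothing.
 * Returns the number of attached tasks, or -1 on error. */
long attach_all_tasks(pid_t pid, attach_fn attach, void *data)
{
    pid_t seen[MAX_TASKS];
    size_t nseen = 0;
    char path[64];

    assert(attach != NULL);
    snprintf(path, sizeof path, "/proc/%d/task", (int)pid);

    for (;;) {
        DIR *d = opendir(path);
        if (d == NULL)
            return -1;

        int found_new = 0;
        struct dirent *ent;
        while ((ent = readdir(d)) != NULL) {
            char *end;
            long tid = strtol(ent->d_name, &end, 10);
            if (end == ent->d_name || *end != '\0')
                continue;               /* skips "." and ".." */
            if (already_seen(seen, nseen, (pid_t)tid))
                continue;
            if (nseen == MAX_TASKS) {
                closedir(d);
                return -1;
            }
            /* A task that exited in the meantime just fails to attach. */
            if (attach((pid_t)tid, data) == 0) {
                seen[nseen++] = (pid_t)tid;
                found_new = 1;
            }
        }
        closedir(d);

        if (!found_new)
            return (long)nseen;
    }
}
```

The loop terminates because each attached task is stopped and can no longer create threads, so eventually a pass over the directory finds only tasks that are already recorded.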
Created attachment 519070 [details]
Additional fix for attach
This is an additional patch that amends the previous "Partial fix" and fixes attach to many-threaded processes. The gist of the addressed problem is that some tasks were still running unattached while the breakpoints were already inserted.
Created attachment 519156 [details]
Additional fix for detach
This is an additional patch that fixes a number of bugs in the detach logic that would leave the tracee in an inconsistent state (with pending events, instruction pointers pointing mid-instruction, breakpoints left inside, and perhaps others). As a result, the tracee would be killed, or could generally change behavior as a result of being traced.
I have ltrace with this fix spinning on my Fedora machine, attaching and detaching to a multi-threaded process, and it seems to be stable and well-behaved, at least on x86_64. I'll do more thorough cross-arch testing on Monday.
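To make two of the detach problems above concrete (breakpoints left inside, and an instruction pointer left mid-instruction), here is a toy model. It is not ptrace code and not the attached patch: `toy_tracee`, `insert_breakpoint` and `prepare_detach` are invented for illustration, and the model assumes x86, where a breakpoint is the one-byte `int3` opcode (0xCC) and the instruction pointer sits one byte past it after the trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC   /* x86 breakpoint opcode, one byte long */

/* Toy stand-ins for tracer state; all names here are hypothetical. */
struct breakpoint {
    size_t addr;        /* offset into the tracee's text */
    uint8_t orig;       /* byte that the breakpoint overwrote */
    int enabled;
};

struct toy_tracee {
    uint8_t text[16];           /* pretend code of the traced task */
    size_t ip;                  /* pretend instruction pointer */
    int stopped_on_breakpoint;  /* trap reported but not yet handled */
};

void insert_breakpoint(struct toy_tracee *t, struct breakpoint *bp, size_t addr)
{
    assert(addr < sizeof t->text);
    bp->addr = addr;
    bp->orig = t->text[addr];
    bp->enabled = 1;
    t->text[addr] = INT3;
}

/* Bring the tracee back to a state in which it can run untraced:
 * rewind an unhandled breakpoint trap and restore every patched byte. */
void prepare_detach(struct toy_tracee *t, struct breakpoint *bps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!bps[i].enabled)
            continue;

        /* After the trap the IP is one past the int3. Left alone, the
         * task would resume in the middle of the original instruction. */
        if (t->stopped_on_breakpoint && t->ip == bps[i].addr + 1) {
            t->ip = bps[i].addr;
            t->stopped_on_breakpoint = 0;
        }

        /* A breakpoint left behind would kill the task with SIGTRAP
         * once no tracer is around to handle it. */
        t->text[bps[i].addr] = bps[i].orig;
        bps[i].enabled = 0;
    }
}
```

A real tracer would do the equivalent through ptrace register and memory accesses for every thread, and would also have to drain events that are already queued for delivery; the model leaves that part out.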
Created attachment 522392 [details]
Additional fix
Additional fix for proper syscall detection, ported from upstream.
(In reply to comment #71)
> Created attachment 522392 [details]
> Additional fix
>
> Additional fix for proper syscall detection, ported from upstream.
Dear NEC,
Would you please verify this as well, if possible?
Best Regards,
Masaki Furuta
Patch series updated to add tracing of ppc64 binaries. Fix in CVS.
Patch series updated with additional fixlets in the test suite and error reporting; the overzealous syscall detection on s390, introduced in a past fix, has been removed. Fix in CVS.
Dear Furuta-san,
>>Dear NEC,
>>
>>Would you please verify this as well, if possible?
I confirmed this problem is fixed on a server updated with ltrace-0.5-syscall_p.patch.
Dear NEC,
Sorry for bothering you, but as per a suggestion from engineering, could you please verify our latest ltrace package again? I'm uploading ltrace-0.5-13.45svn.el5_7.4 at http://people.redhat.com/~mfuruta/.526007/
Best Regards,
Masaki Furuta
Dear Furuta-San,
>> Sorry for bothering you, but as per a suggestion from engineering, could you
>> please verify our latest ltrace package again? I'm uploading
>> ltrace-0.5-13.45svn.el5_7.4 at http://people.redhat.com/~mfuruta/.526007/
I confirmed this problem is fixed with ltrace-0.5-13.45svn.el5_7.4.
Patch series updated to not attempt to run a binary from an unsupported architecture, which was a regression. This is just a nit related to ltrace start-up, and should have no impact on the tracing itself.
Created attachment 525218 [details]
Additional fix for vfork
Now that ltrace recognizes threads, it also triggers when vfork is called, but it mishandles it. This patch implements the fixes necessary for proper vfork support. Only verified on x86_64 as of now, I may update the fix if it turns out to be broken on other arches.
The above patch is racy: we need to re-enable the vfork breakpoint in _parent_, not _leader_. The fix is trivial and already in CVS.
Dear Samukawa-san,
Could you verify a new test package? Our engineer believes it is complete.
http://people.redhat.com/myamazak/bz526007/
Best regards,
M Yamazaki
Dear Yamazaki-San,
>>Could you verify a new test package? Our engineer believes it is complete.
>>http://people.redhat.com/myamazak/bz526007/
I confirmed this problem is fixed with ltrace-0.5-13.45svn.el5_7.7.
Dear Samukawa-san,
Thank you for the confirmation and feedback.
Regards,
M Yamazaki
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-1317.html