Bug 504007
| Field | Value |
|---|---|
| Summary | possible intermittent deadlock with uprobes due to task_finder mm_lock holding |
| Product | Red Hat Enterprise Linux 5 |
| Component | systemtap |
| Version | 5.4 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | low |
| Reporter | Frank Ch. Eigler <fche> |
| Assignee | David Smith <dsmith> |
| QA Contact | BaseOS QE <qe-baseos-auto> |
| CC | aconway, mjw, pmuller |
| Target Milestone | rc |
| Hardware | All |
| OS | Linux |
| Doc Type | Bug Fix |
| Last Closed | 2009-09-02 10:01:01 UTC |

Description
Frank Ch. Eigler
2009-06-03 18:44:17 UTC
Created attachment 347075 [details]
original systemtap script
Here's some additional information from aconway. The system was running as part of a cluster. qpidd (a message broker) was running and the perftest program (from qpidc-perftest) was shoving messages through qpidd.
I've also attached the systemtap script he was using.
Created attachment 347084 [details]
test source
Here's a test program from the Open POSIX Test Suite (included with LTP) that reproduces the problem. Untar and build using 'make'.
Created attachment 347085 [details]
systemtap script that attaches to the 'stress' program
Here's the systemtap script I've used to duplicate the problem. The script is very similar to aconway's original script. The script should be run as:
# cd pthread_mutex_lock
# stap -v 504007_stress.stp -c ./stress
Note that this script was run on an x86_64 system. The path to the 'stress' executable will need to be updated. If the system isn't an x86_64 system, the path to libpthread.so may also need to be adjusted.
Note that the deadlock is intermittent. It may help to stress the system by running 'while true; do ps awux > /dev/null; sleep 1; done' in another terminal.
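Because the hang is intermittent, it can help to repeat the run under a time limit and count how many attempts never finish. The following is only a rough sketch, not part of the attached scripts: `count_hangs` is a hypothetical helper name, the run count and time limit are arbitrary, and it assumes the GNU coreutils `timeout` command is installed (it may be missing on older systems).

```shell
# Hypothetical helper (not from the attachments): run a command several
# times under a time limit and print how many runs hit the limit.
# usage: count_hangs <runs> <limit-seconds> <command> [args...]
count_hangs() {
    runs=$1
    limit=$2
    shift 2
    hangs=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # Assumes GNU coreutils 'timeout'; it exits with 124 when the
        # command was still running at the time limit.
        timeout "$limit" "$@" > /dev/null 2>&1 || status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
}
```

With the 'ps' loop above running in another terminal, something like `count_hangs 20 300 stap -v 504007_stress.stp -c ./stress` would report how many of 20 runs exceeded five minutes. If the deadlock leaves the process unkillable, the helper itself may stall, which is also a sign that the problem was hit.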
Created attachment 347225 [details]
patch 1 (commit 9b59029)
Created attachment 347226 [details]
patch 2 (commit bec8cf)
Created attachment 347227 [details]
patch 3 (commit 920d63)
The 3 patches I've attached (with their associated upstream commit ids) seem to fix this problem.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2009-1313.html