Bug 712188 - corosync-objctl -t run multiple times with heavy load deadlocks corosync
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: corosync
Version: 6.1
Hardware: All
OS: All
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jan Friesse
QA Contact: Cluster QE
Depends On:
Blocks:
Reported: 2011-06-09 14:20 EDT by Steven Dake
Modified: 2016-04-26 10:05 EDT

See Also:
Fixed In Version: corosync-1.4.0-1.el6
Doc Type: Bug Fix
Doc Text:
Cause: A race condition when using the tracking functionality of the internal object database. Consequence: Corosync would lock up under heavy load with contrived test cases. Fix: Resolved race condition. Result: Corosync no longer locks up when corosync-objctl -t is run multiple times under heavy objdb load.
Last Closed: 2011-12-06 06:51:09 EST


Attachments
Proposed patch (5.95 KB, patch)
2011-06-16 11:21 EDT, Jan Friesse
Second version of proposed patch (6.04 KB, patch)
2011-06-17 08:35 EDT, Jan Friesse


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2011:1515
Priority: normal
Status: SHIPPED_LIVE
Summary: corosync bug fix and enhancement update
Last Updated: 2011-12-05 19:38:47 EST

Description Steven Dake 2011-06-09 14:20:23 EDT
Description of problem:
During a customer investigation of a different issue, a scenario was found in which it is possible to lock up corosync. This usage pattern could occur in the typical cluster suite "reload" operation.

Version-Release number of selected component (if applicable):
corosync-1.2.3-36.el6

How reproducible:
100%

Steps to Reproduce:
1. run corosync (or cluster suite)
2. run testa one time
3. run corosync-objctl -t test
4. run testall (make sure test is on the system in the cwd)
  
Actual results:
corosync deadlocks and stops processing and participating in cluster membership

Expected results:
corosync shouldn't deadlock

Additional info:
Comment 2 Steven Dake 2011-06-09 14:22:26 EDT
Honza,

Make sure to add a test case to the automated test suite for this scenario.
Comment 3 Steven Dake 2011-06-09 14:23:13 EDT
Step 3 should be run at least 2 times to trigger the deadlock.
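
The reproduction steps from the description, with step 3 repeated as noted in this comment, can be sketched as a small shell script. This is only an illustration under stated assumptions: the report does not say where testa and testall live (the script assumes the current directory) or whether the trackers were started in separate terminals (the script starts them in the background). The DRY_RUN and TRACKERS variables and the helper function names are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the reproduction steps; prints the commands unless DRY_RUN=0.
# Assumptions (not from the report): testa and testall are executables in
# the current directory, and corosync is already running (step 1).

DRY_RUN="${DRY_RUN:-1}"     # hypothetical switch: 1 = only print commands
TRACKERS="${TRACKERS:-2}"   # comment 3: step 3 must run at least 2 times

# Run a command in the foreground, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Run a command in the background, or just print it in dry-run mode.
run_bg() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run in background: $*"
    else
        "$@" &
    fi
}

reproduce() {
    # Step 2: run testa one time.
    run ./testa

    # Step 3: start several trackers on the "test" object.
    i=0
    while [ "$i" -lt "$TRACKERS" ]; do
        run_bg corosync-objctl -t test
        i=$((i + 1))
    done

    # Step 4: run testall while the trackers are active.
    run ./testall
}

reproduce
```

Setting DRY_RUN=0 makes the script execute the commands instead of printing them. Depending on how corosync-objctl handles its terminal input, the trackers may need to be started in separate terminals rather than in the background.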
Comment 5 Jan Friesse 2011-06-16 11:21:04 EDT
Created attachment 505074
Proposed patch

Patch sent to ML
Comment 6 Jan Friesse 2011-06-17 08:35:26 EDT
Created attachment 505267
Second version of proposed patch

Patch sent to ML. It should address all reviewer notes.
Comment 11 Steven Dake 2011-10-27 14:49:25 EDT
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: A race condition when using the tracking functionality of the internal object database.
Consequence: Corosync would lock up under heavy load with contrived test cases.
Fix: Resolved race condition.
Result: Corosync no longer locks up when corosync-objctl -t is run multiple times under heavy objdb load.
Comment 12 errata-xmlrpc 2011-12-06 06:51:09 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1515.html
