Bug 983296 - rgmanager segfault while starting service
Summary: rgmanager segfault while starting service
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rgmanager
Version: 6.2
Hardware: x86_64
OS: Linux
Target Milestone: rc
Assignee: Ryan McCabe
QA Contact: Cluster QE
Depends On:
Reported: 2013-07-10 22:13 UTC by Chester Knapp
Modified: 2018-12-03 19:18 UTC (History)

Fixed In Version: rgmanager-
Doc Type: Bug Fix
Doc Text:
* Previously, attempts to start an MRG Messaging (MRG-M) broker caused rgmanager to terminate unexpectedly with a segmentation fault. This was caused by subtle memory corruption introduced by calling pthread_mutex_unlock() on a mutex that was not locked. This update addresses scenarios where memory could be corrupted when calling pthread_mutex_unlock(), and crashes no longer occur in the described scenario.
Clone Of:
Last Closed: 2013-11-21 10:56:10 UTC
Target Upstream Version:

Attachments
abrt output (5.32 MB, application/gzip)
2013-07-16 13:57 UTC, Chester Knapp
another abrt core dump (5.32 MB, application/gzip)
2013-07-16 13:58 UTC, Chester Knapp

System: Red Hat Product Errata
ID: RHBA-2013:1600
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: rgmanager bug fix update
Last Updated: 2013-11-20 21:39:20 UTC

Description Chester Knapp 2013-07-10 22:13:56 UTC
Created attachment 771875
abrt output (core dump)

Description of problem: rgmanager dumps core while starting up an HA MRG-M broker

Version-Release number of selected component (if applicable):

How reproducible:
Only once, so far

Steps to Reproduce:
1. Attempt to start the MRG broker (e.g. /usr/sbin/clusvcadm -e BR.0 -m hostname)

Actual results:
rgmanager crashes

Expected results:
rgmanager starts services normally

Additional info:
uname -a: 
Linux omhq1adf 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

Comment 2 Ryan McCabe 2013-07-12 19:50:31 UTC
Hopefully this is fixed by cluster.git commit 156a32063baa99fefdb445c68111b174dc06753e

Comment 3 Chester Knapp 2013-07-16 13:57:33 UTC
Created attachment 774266
abrt output

Comment 4 Chester Knapp 2013-07-16 13:58:52 UTC
Created attachment 774281
another abrt core dump

I've added two more core dumps after seeing this behavior repeat. Consider raising the reported frequency of this defect.

Comment 5 Ryan McCabe 2013-07-16 15:21:20 UTC
Are you able to try a test package with a proposed fix?

Comment 6 Chester Knapp 2013-07-16 16:43:44 UTC
I'll contact the customer to determine whether this is feasible, although I don't have a reliable reproducer either way. rgmanager has dumped core 5 times out of several hundred over the last week, so it is still intermittent at best.

Comment 7 Chester Knapp 2013-07-16 16:44:54 UTC
(In reply to Ryan McCabe from comment #5)
> Are you able to try a test package with a proposed fix?

Yes, the customer is agreeable. We'll try out the test package you provide.

Comment 8 Ryan McCabe 2013-07-17 15:22:34 UTC
(In reply to Chester Knapp from comment #7)
> (In reply to Ryan McCabe from comment #5)
> > Are you able to try a test package with a proposed fix?
> Yes, the customer is agreeable. We'll try out the test package you provide.

You can grab the build from https://brewweb.devel.redhat.com/buildinfo?buildID=282334

Let me know how it goes.

Thanks for testing!

Comment 11 Chester Knapp 2013-08-27 15:15:16 UTC
Customer opted not to use the test package.

Comment 16 errata-xmlrpc 2013-11-21 10:56:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

