Bug 983296 - rgmanager segfault while starting service
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rgmanager
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Ryan McCabe
QA Contact: Cluster QE
Keywords: OtherQA
Depends On:
Reported: 2013-07-10 18:13 EDT by Chester Knapp
Modified: 2014-07-21 00:26 EDT

See Also:
Fixed In Version: rgmanager-
Doc Type: Bug Fix
Doc Text:
* Previously, attempts to start an MRG Messaging (MRG-M) broker caused rgmanager to terminate unexpectedly with a segmentation fault. This was caused by subtle memory corruption introduced by calling pthread_mutex_unlock() on a mutex that was not locked. This update addresses scenarios where memory could be corrupted when calling pthread_mutex_unlock(), and crashes no longer occur in the described scenario.
Last Closed: 2013-11-21 05:56:10 EST
Type: Bug

Attachments
abrt output (5.32 MB, application/gzip)
2013-07-16 09:57 EDT, Chester Knapp
another abrt core dump (5.32 MB, application/gzip)
2013-07-16 09:58 EDT, Chester Knapp

Description Chester Knapp 2013-07-10 18:13:56 EDT
Created attachment 771875
abrt output (core dump)

Description of problem: rgmanager dumps core while starting up an HA MRG-M broker

Version-Release number of selected component (if applicable):

How reproducible:
Only once, so far

Steps to Reproduce:
1. Attempt to start the MRG broker (e.g. /usr/sbin/clusvcadm -e BR.0 -m hostname)

Actual results:
rgmanager crashes

Expected results:
rgmanager starts services normally

Additional info:
uname -a: 
Linux omhq1adf 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
Comment 2 Ryan McCabe 2013-07-12 15:50:31 EDT
Hopefully this is fixed by cluster.git commit 156a32063baa99fefdb445c68111b174dc06753e
Comment 3 Chester Knapp 2013-07-16 09:57:33 EDT
Created attachment 774266
abrt output
Comment 4 Chester Knapp 2013-07-16 09:58:52 EDT
Created attachment 774281
another abrt core dump

I've added two additional core dumps after seeing this behavior repeated. Consider upping the frequency of the defect.
Comment 5 Ryan McCabe 2013-07-16 11:21:20 EDT
Are you able to try a test package with a proposed fix?
Comment 6 Chester Knapp 2013-07-16 12:43:44 EDT
I'll contact the customer to determine whether this is feasible. However, I don't have a reliable reproducer either way: rgmanager has cored 5 times out of several hundred over the last week, so it is still intermittent at best.
Comment 7 Chester Knapp 2013-07-16 12:44:54 EDT
(In reply to Ryan McCabe from comment #5)
> Are you able to try a test package with a proposed fix?

Yes, the customer is agreeable. We'll try out the test package you provide.
Comment 8 Ryan McCabe 2013-07-17 11:22:34 EDT
(In reply to Chester Knapp from comment #7)
> (In reply to Ryan McCabe from comment #5)
> > Are you able to try a test package with a proposed fix?
> Yes, the customer is agreeable. We'll try out the test package you provide.

You can grab the build from https://brewweb.devel.redhat.com/buildinfo?buildID=282334

Let me know how it goes.

Thanks for testing!
Comment 11 Chester Knapp 2013-08-27 11:15:16 EDT
Customer opted not to use the test package.
Comment 16 errata-xmlrpc 2013-11-21 05:56:10 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

