Bug 825375 - dbus-related crash in rgmanager
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rgmanager
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Ryan McCabe
QA Contact: Cluster QE
Duplicates: 970017 970018 970550
Depends On:
Blocks:
Reported: 2012-05-25 16:16 EDT by Ryan McCabe
Modified: 2013-09-20 15:26 EDT
Fixed In Version: rgmanager-3.0.12.1-13.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 05:18:16 EST
Type: Bug


Attachments
fix (700 bytes, patch), 2012-05-25 16:17 EDT, Ryan McCabe


External Trackers
Tracker                            ID      Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  491363  None      None    None     Never

Description Ryan McCabe 2012-05-25 16:16:03 EDT
When running rgmanager without the -q flag (the default), rgmanager can crash inside dbus library functions as a result of unlocked access to internal dbus data structures from different rgmanager threads.

I've observed the following crash (and others similar to it):

process 26806: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
  D-Bus not built with -rdynamic so unable to print a backtrace
Aborted (core dumped)

(gdb) bt
#0  0x00007ff38ceed8a5 in raise () from /lib64/libc.so.6
#1  0x00007ff38ceef085 in abort () from /lib64/libc.so.6
#2  0x00007ff38d47f975 in _dbus_abort () at dbus-sysdeps.c:88
#3  0x00007ff38d47b845 in _dbus_warn_check_failed (
    format=0x7ff38d484388 "The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.\n%s") at dbus-internals.c:283
#4  0x00007ff38d465c62 in _dbus_connection_read_write_dispatch (
    connection=0xe573a0, timeout_milliseconds=500, 
    dispatch=<value optimized out>) at dbus-connection.c:3512
#5  0x000000000041b261 in ?? ()
#6  0x00007ff38d8a1851 in start_thread (arg=0x7ff38bea3700)
    at pthread_create.c:301
#7  0x00007ff38cfa267d in clone () from /lib64/libc.so.6

This can be reproduced by repeatedly relocating and restarting services. I was able to reproduce it fairly reliably (albeit after a couple of hours in some cases) by using the following configuration snippet:
	<rm>
		<service name="a"/>
		<service name="b"/>
	</rm>

and running the following commands at the same time on two nodes:
 while [ 1 ] ; do clusvcadm -r a ; done
 while [ 1 ] ; do clusvcadm -R a ; done
 while [ 1 ] ; do clusvcadm -r b ; done
 while [ 1 ] ; do clusvcadm -R b ; done
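
To illustrate the class of bug described above (several threads touching one connection's internal state without a lock), here is a minimal C sketch of serializing every access through a single mutex. It is not rgmanager code and not the attached patch: struct fake_conn, conn_touch(), worker() and run_demo() are hypothetical stand-ins, and a plain counter takes the place of the library's internal data. libdbus itself also provides dbus_threads_init_default() to turn on its own locking; which approach the attached fix takes is not stated in this report.

```c
/* Illustrative sketch only: not rgmanager code and not the attached patch.
 * fake_conn stands in for a shared connection; all names are hypothetical. */
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS   100000

struct fake_conn {
    pthread_mutex_t lock;  /* serializes every access to the field below */
    long ops;              /* stand-in for the library's internal state */
};

static struct fake_conn conn = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Every thread goes through this wrapper instead of touching conn directly. */
static void conn_touch(struct fake_conn *c)
{
    pthread_mutex_lock(&c->lock);
    c->ops++;              /* read-modify-write: unsafe without the lock */
    pthread_mutex_unlock(&c->lock);
}

/* Thread body: hammer the shared connection through the locked wrapper. */
static void *worker(void *arg)
{
    struct fake_conn *c = arg;
    for (int i = 0; i < NITERS; i++)
        conn_touch(c);
    return NULL;
}

/* Runs NTHREADS workers against the one shared connection.
 * Returns the final counter, or -1 if a thread could not be created. */
long run_demo(void)
{
    pthread_t tid[NTHREADS];
    int started = 0;

    conn.ops = 0;
    for (; started < NTHREADS; started++)
        if (pthread_create(&tid[started], NULL, worker, &conn) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return started == NTHREADS ? conn.ops : -1;
}
```

Calling run_demo() from a main() (built with -pthread) should return NTHREADS * NITERS. Removing the lock/unlock pair in conn_touch() makes the total unreliable, which is the same kind of race on much simpler state.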
Comment 1 Ryan McCabe 2012-05-25 16:17:58 EDT
Created attachment 586943
fix
Comment 6 errata-xmlrpc 2013-02-21 05:18:16 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0409.html
Comment 7 Ryan McCabe 2013-06-03 13:17:31 EDT
*** Bug 970017 has been marked as a duplicate of this bug. ***
Comment 8 Ryan McCabe 2013-06-04 09:09:26 EDT
*** Bug 970550 has been marked as a duplicate of this bug. ***
Comment 9 Ryan McCabe 2013-06-12 08:09:11 EDT
*** Bug 970018 has been marked as a duplicate of this bug. ***
