Bug 825375 - dbus-related crash in rgmanager
Summary: dbus-related crash in rgmanager
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rgmanager
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Ryan McCabe
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 970017 970018 970550
Depends On:
Blocks:
Reported: 2012-05-25 20:16 UTC by Ryan McCabe
Modified: 2018-12-03 17:37 UTC (History)

Fixed In Version: rgmanager-3.0.12.1-13.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-21 10:18:16 UTC
Target Upstream Version:
Embargoed:


Attachments
fix (700 bytes, patch)
2012-05-25 20:17 UTC, Ryan McCabe


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 491363 | 0 | None | None | None | Never
Red Hat Product Errata | RHBA-2013:0409 | 0 | normal | SHIPPED_LIVE | rgmanager bug fix update | 2013-02-20 20:50:33 UTC

Description Ryan McCabe 2012-05-25 20:16:03 UTC
When running rgmanager without the -q flag (the default), rgmanager can crash inside dbus library functions as a result of unlocked access to internal dbus data structures from different rgmanager threads.

I've observed the following crash (and others similar to it):

process 26806: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
  D-Bus not built with -rdynamic so unable to print a backtrace
Aborted (core dumped)

(gdb) bt
#0  0x00007ff38ceed8a5 in raise () from /lib64/libc.so.6
#1  0x00007ff38ceef085 in abort () from /lib64/libc.so.6
#2  0x00007ff38d47f975 in _dbus_abort () at dbus-sysdeps.c:88
#3  0x00007ff38d47b845 in _dbus_warn_check_failed (
    format=0x7ff38d484388 "The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.\n%s") at dbus-internals.c:283
#4  0x00007ff38d465c62 in _dbus_connection_read_write_dispatch (
    connection=0xe573a0, timeout_milliseconds=500, 
    dispatch=<value optimized out>) at dbus-connection.c:3512
#5  0x000000000041b261 in ?? ()
#6  0x00007ff38d8a1851 in start_thread (arg=0x7ff38bea3700)
    at pthread_create.c:301
#7  0x00007ff38cfa267d in clone () from /lib64/libc.so.6
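
The unlocked cross-thread access described above can be illustrated with a minimal sketch. This is not the attached patch: `fake_conn`, `worker`, and `run_locked_demo` are hypothetical names, and a plain counter stands in for the connection's internal reference count. It only shows the general idea of serializing every access to shared connection state behind one mutex. libdbus also offers `dbus_threads_init_default()` to enable its own internal locking; which approach the fix takes is not stated in this report.

```c
/* Sketch only: fake_conn is a hypothetical stand-in for a shared
 * DBusConnection; no libdbus calls are made here. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct fake_conn {
    long refcount;          /* non-atomic state shared by all threads */
    pthread_mutex_t lock;   /* serializes every access to the connection */
};

#define N_THREADS 4
#define N_ITERS   100000

/* Each worker takes and then drops a reference, always under the lock. */
static void *worker(void *arg)
{
    struct fake_conn *c = arg;

    for (int i = 0; i < N_ITERS; i++) {
        pthread_mutex_lock(&c->lock);
        c->refcount++;                  /* "ref" */
        pthread_mutex_unlock(&c->lock);

        pthread_mutex_lock(&c->lock);
        c->refcount--;                  /* "unref" */
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Runs the workers and returns the final reference count.
 * With the lock held around every update, the count returns to 1. */
long run_locked_demo(void)
{
    struct fake_conn c = { .refcount = 1 };
    pthread_t t[N_THREADS];
    int started = 0;

    pthread_mutex_init(&c.lock, NULL);
    for (int i = 0; i < N_THREADS; i++) {
        if (pthread_create(&t[i], NULL, worker, &c) != 0)
            break;                      /* join only threads that started */
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    pthread_mutex_destroy(&c.lock);

    return c.refcount;
}
```

Without the lock, the increments and decrements can interleave and leave the count wrong, which is the same class of failure as a connection's last reference being dropped while another thread is still dispatching on it.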

The crash can be reproduced by repeatedly relocating and restarting services. I was able to reproduce it fairly reliably (albeit after a couple of hours in some cases) by using the following configuration snippet:
	<rm>
		<service name="a"/>
		<service name="b"/>
	</rm>

and running the following commands at the same time on two nodes:
 while [ 1 ] ; do clusvcadm -r a ; done
 while [ 1 ] ; do clusvcadm -R a ; done
 while [ 1 ] ; do clusvcadm -r b ; done
 while [ 1 ] ; do clusvcadm -R b ; done

Comment 1 Ryan McCabe 2012-05-25 20:17:58 UTC
Created attachment 586943
fix

Comment 6 errata-xmlrpc 2013-02-21 10:18:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0409.html

Comment 7 Ryan McCabe 2013-06-03 17:17:31 UTC
*** Bug 970017 has been marked as a duplicate of this bug. ***

Comment 8 Ryan McCabe 2013-06-04 13:09:26 UTC
*** Bug 970550 has been marked as a duplicate of this bug. ***

Comment 9 Ryan McCabe 2013-06-12 12:09:11 UTC
*** Bug 970018 has been marked as a duplicate of this bug. ***

