When rgmanager runs without the -q flag (the default), it can crash inside dbus library functions because different rgmanager threads access internal dbus data structures without locking.

I've observed the following crash (and others similar to it):

process 26806: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
  D-Bus not built with -rdynamic so unable to print a backtrace
Aborted (core dumped)

(gdb) bt
#0  0x00007ff38ceed8a5 in raise () from /lib64/libc.so.6
#1  0x00007ff38ceef085 in abort () from /lib64/libc.so.6
#2  0x00007ff38d47f975 in _dbus_abort () at dbus-sysdeps.c:88
#3  0x00007ff38d47b845 in _dbus_warn_check_failed (
    format=0x7ff38d484388 "The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.\n%s") at dbus-internals.c:283
#4  0x00007ff38d465c62 in _dbus_connection_read_write_dispatch (
    connection=0xe573a0, timeout_milliseconds=500,
    dispatch=<value optimized out>) at dbus-connection.c:3512
#5  0x000000000041b261 in ?? ()
#6  0x00007ff38d8a1851 in start_thread (arg=0x7ff38bea3700) at pthread_create.c:301
#7  0x00007ff38cfa267d in clone () from /lib64/libc.so.6

This can be reproduced by repeatedly relocating and restarting services. I was able to reproduce it fairly reliably (albeit after a couple of hours in some cases) by using the following configuration snippet:

<rm>
    <service name="a"/>
    <service name="b"/>
</rm>

and running the following commands at the same time on two nodes:

while [ 1 ] ; do clusvcadm -r a ; done
while [ 1 ] ; do clusvcadm -R a ; done
while [ 1 ] ; do clusvcadm -r b ; done
while [ 1 ] ; do clusvcadm -R b ; done
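
For convenience, the relocate/restart loops from the reproduction steps could be wrapped in one small script, roughly as sketched here. This is only an illustration, not part of the attached fix: the function names hammer and hammer_all, the CLUSVCADM override variable, and the optional iteration limit are all assumptions added for this sketch. The only real commands used are clusvcadm -r and clusvcadm -R, as in the report.

```shell
#!/bin/sh
# Hypothetical knob (not part of rgmanager): override CLUSVCADM to
# dry-run the loops with a stub command instead of the real clusvcadm.
CLUSVCADM="${CLUSVCADM:-clusvcadm}"

# Run "clusvcadm <flag> <service>" repeatedly.
#   $1 = flag (-r relocates, -R restarts)
#   $2 = service name
#   $3 = iteration count; 0 or omitted loops forever, as in the report
hammer() {
    flag="$1"
    svc="$2"
    max="${3:-0}"
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        "$CLUSVCADM" "$flag" "$svc"
        i=$((i + 1))
    done
}

# Start the four loops (relocate and restart, for services a and b)
# in the background, then wait for all of them to finish.
hammer_all() {
    max="${1:-0}"
    for svc in a b; do
        hammer -r "$svc" "$max" &
        hammer -R "$svc" "$max" &
    done
    wait
}
```

Sourcing this file and calling hammer_all with no argument on each of the two nodes should be equivalent to running the four while loops by hand. Passing a count (for example hammer_all 1000) bounds the run, which may be handy when checking a patched build.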
Created attachment 586943: fix
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0409.html
*** Bug 970017 has been marked as a duplicate of this bug. ***
*** Bug 970550 has been marked as a duplicate of this bug. ***
*** Bug 970018 has been marked as a duplicate of this bug. ***