Created attachment 447451 [details]
log file

Description of problem:
When cluster-url is specified, restarting two brokers running in a cluster results in "critical Modified cluster state outside of cluster context" on one node, which then aborts. The remaining node then becomes unresponsive.

# cat /etc/qpidd.conf
cluster-mechanism=ANONYMOUS
auth=no
log-enable=debug+
log-enable=debug+:cluster
log-to-file=/tmp/qpidd.log
log-to-stdout=no
log-to-stderr=no
cluster-name=bz602198_verified
cluster-url=nec-em13.rhts.eng.bos.redhat.com,hp-dl360g5-02.rhts.eng.bos.redhat.com

# tailf ./qpidd.log
2010-09-15 05:37:33 debug min_ssf: 0, max_ssf: 256
2010-09-15 05:37:33 debug CyrusSasl::start(ANONYMOUS): selected ANONYMOUS response: 'anonymous.eng.bos.redhat.com'
2010-09-15 05:37:33 debug Known-brokers for connection: amqp:tcp:nec-em13.rhts.eng.bos.redhat.com:5672,tcp:hp-dl360g5-02.rhts.eng.bos.redhat.com:5672
2010-09-15 05:37:33 debug Connection [39773 nec-em13.rhts.eng.bos.redhat.com:5672] no security layer in place
2010-09-15 05:37:33 debug SessionState::SessionState .: 0x1067ae98
2010-09-15 05:37:33 debug SessionState::SessionState anonymous.qpid.cluster-update: 0x2aaab0041f40
2010-09-15 05:37:33 debug anonymous.qpid.cluster-update: attached on broker.
2010-09-15 05:37:33 debug Attached channel 1 to anonymous.qpid.cluster-update
2010-09-15 05:37:33 debug anonymous.qpid.cluster-update: ready to send, activating output.
2010-09-15 05:37:33 critical Modified cluster state outside of cluster context

#gdb
(gdb)
  9 Thread 4436  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8 Thread 4438  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 4439  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  6 Thread 4440  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  5 Thread 4441  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  4 Thread 4442  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  3 Thread 4473  0x0000003f43c0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 4474  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
* 1 Thread 4435  0x0000003f43030265 in raise () from /lib64/libc.so.6
(gdb)

Thread 9 (Thread 4436):
#0  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002b3b86ab1abf in qpid::sys::Timer::run (this=0x10664d40) at ../include/qpid/sys/posix/Condition.h:69
#2  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x10664d74) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 8 (Thread 4438):
#0  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002b3b86ab1abf in qpid::sys::Timer::run (this=0x10670e90) at ../include/qpid/sys/posix/Condition.h:69
#2  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x10670ec4) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 7 (Thread 4439):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0) at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 6 (Thread 4440):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0) at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 5 (Thread 4441):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0) at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 4 (Thread 4442):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0) at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 3 (Thread 4473):
#0  0x0000003f43c0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002b3b87927a6d in wait (this=0x1067abd0) at ../include/qpid/sys/posix/Condition.h:63
#2  wait (this=0x1067abd0) at ../include/qpid/sys/Monitor.h:41
#3  qpid::sys::Waitable::wait (this=0x1067abd0) at qpid/sys/Waitable.h:88
#4  0x00002b3b87922dd9 in waitFor (this=0x1067ab60, timeout=0) at qpid/sys/StateMonitor.h:65
#5  waitFor (this=0x1067ab60, timeout=0) at qpid/client/SessionImpl.cpp:769
#6  qpid::client::SessionImpl::open (this=0x1067ab60, timeout=0) at qpid/client/SessionImpl.cpp:109
#7  0x00002b3b878ef8c8 in qpid::client::ConnectionImpl::newSession (this=0x106773d0, name=..., timeout=0, channel=65535) at qpid/client/ConnectionImpl.cpp:437
#8  0x00002b3b878e09a9 in qpid::client::Connection::newSession (this=0x2aaaac000d08, name=..., timeout=0) at qpid/client/Connection.cpp:141
#9  0x00002b3b87224e33 in qpid::cluster::UpdateClient::run (this=0x2aaaac000bb0) at qpid/cluster/UpdateClient.cpp:130
#10 0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x1067abfc) at qpid/sys/posix/Thread.cpp:35
#11 0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#12 0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 2 (Thread 4474):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10677a70, timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10677a70) at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x1b) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 1 (Thread 4435):
#0  0x0000003f43030265 in raise () from /lib64/libc.so.6
#1  0x0000003f43031d10 in abort () from /lib64/libc.so.6
#2  0x00002b3b86aa8b34 in qpid::sys::assertClusterSafe () at qpid/sys/ClusterSafe.cpp:42
#3  0x00002b3b865da0f7 in qpid::broker::SemanticState::attached (this=0x1153) at qpid/broker/SemanticState.cpp:803
#4  0x00002b3b865fa3e9 in qpid::broker::SessionState::readyToSend (this=0x2aaab0041f40) at qpid/broker/SessionState.cpp:367
#5  0x00002b3b86a74b9f in qpid::amqp_0_10::SessionHandler::attach (this=0x2aaab0041a50, name_=<value optimized out>, force=96) at qpid/amqp_0_10/SessionHandler.cpp:169
#6  0x00002b3b86a3221d in invoke<qpid::framing::AMQP_AllOperations::SessionHandler> (this=0x7fff11594830, body=...) at qpid/framing/SessionAttachBody.h:67
#7  qpid::framing::AMQP_AllOperations::SessionHandler::Invoker::visit (this=0x7fff11594830, body=...) at qpid/framing/AllInvoker.cpp:790
#8  0x00002b3b86a761de in qpid::framing::invoke<qpid::amqp_0_10::SessionHandler> (target=<value optimized out>, body=...) at qpid/framing/Invoker.h:67
#9  0x00002b3b86a71f23 in qpid::amqp_0_10::SessionHandler::invoke (this=0x1153, m=...) at qpid/amqp_0_10/SessionHandler.cpp:67
#10 0x00002b3b86a75139 in qpid::amqp_0_10::SessionHandler::handleIn (this=0x2aaab0041a50, f=...) at qpid/amqp_0_10/SessionHandler.cpp:82
#11 0x00002b3b86542349 in operator() (this=0x10679e70, frame=...) at qpid/framing/Handler.h:42
#12 qpid::broker::Connection::received (this=0x10679e70, frame=...) at qpid/broker/Connection.cpp:158
#13 0x00002b3b8721330f in qpid::cluster::Connection::received (this=0x10678c20, f=...) at qpid/cluster/Connection.cpp:194
#14 0x00002b3b8720c80e in qpid::cluster::Connection::decode (this=0x10678c20, data=0x2aaab0031a40 "\017", size=<value optimized out>) at qpid/cluster/Connection.cpp:311
#15 0x00002b3b86aa6d52 in qpid::sys::AsynchIOHandler::readbuff (this=0x10679e10, buff=0x2aaab0001090) at qpid/sys/AsynchIOHandler.cpp:135
#16 0x00002b3b869d69da in boost::function2<void, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() (this=0x0, a0=..., a1=0x6) at /usr/include/boost/function/function_template.hpp:576
#17 0x00002b3b869d4140 in qpid::sys::posix::AsynchIO::readable (this=0x2aaab0000b00, h=...) at qpid/sys/posix/AsynchIO.cpp:428
#18 0x00002b3b86aac287 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() (this=0x0, a0=...) at /usr/include/boost/function/function_template.hpp:576
#19 0x00002b3b86aaba60 in qpid::sys::DispatchHandle::processEvent (this=0x2aaab0000b08, type=<value optimized out>) at qpid/sys/DispatchHandle.cpp:278
#20 0x00002b3b869e2d74 in process (this=0x10647ec0) at qpid/sys/Poller.h:125
#21 qpid::sys::Poller::run (this=0x10647ec0) at qpid/sys/epoll/EpollPoller.cpp:519
#22 0x00002b3b86531142 in qpid::broker::Broker::run (this=<value optimized out>) at qpid/broker/Broker.cpp:344
#23 0x00000000004093b9 in QpiddDaemon::child (this=0x7fff11597080) at posix/QpiddBroker.cpp:141
#24 0x00002b3b8654e2be in qpid::broker::Daemon::fork (this=0x7fff11597080) at qpid/broker/Daemon.cpp:91
#25 0x0000000000406c45 in QpiddBroker::execute (this=<value optimized out>, options=<value optimized out>) at posix/QpiddBroker.cpp:179
#26 0x000000000040559f in main (argc=4, argv=0x7fff11597678) at qpidd.cpp:80

Version-Release number of selected component (if applicable):
rpm -qa | grep -P '(qpid|qmf|ais)' | sort -u
openais-0.80.6-16.el5_5.7
openais-devel-0.80.6-16.el5_5.7
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-0.7.946106-14.el5
qmf-devel-0.7.946106-14.el5
qpid-cpp-client-0.7.946106-14.el5
qpid-cpp-client-devel-0.7.946106-14.el5
qpid-cpp-client-devel-docs-0.7.946106-14.el5
qpid-cpp-client-ssl-0.7.946106-14.el5
qpid-cpp-mrg-debuginfo-0.7.946106-14.el5
qpid-cpp-server-0.7.946106-14.el5
qpid-cpp-server-0.7.946106-3.el5
qpid-cpp-server-cluster-0.7.946106-14.el5
qpid-cpp-server-devel-0.7.946106-14.el5
qpid-cpp-server-ssl-0.7.946106-14.el5
qpid-cpp-server-store-0.7.946106-14.el5
qpid-cpp-server-xml-0.7.946106-14.el5
qpid-java-client-0.7.946106-9.el5
qpid-java-common-0.7.946106-9.el5
qpid-tools-0.7.946106-10.el5

How reproducible:
Always

Steps to Reproduce:
(both nodes)
0. rm -rf /var/lib/qpidd/*cluster* /var/lib/qpidd/rhm
1. service openais start
2. service qpidd start
3. qpid-cluster
(node A)
3. qpid-cluster -f -k
(both nodes)
4. service qpidd start

For the result on node A, see the attached log and core.

Actual results:
qpidd fails to join cluster

Expected results:
qpidd joins cluster and stays running

Additional info:
Created attachment 447452 [details] gdb core examined
Are nec-em13.rhts.eng.bos.redhat.com and hp-dl360g5-02.rhts.eng.bos.redhat.com two separate hosts? If so, they should not both be included in the cluster-url option of any one broker. The cluster-url option is only intended to control the addresses by which the local node advertises itself. Each node should specify a different value, corresponding to the addresses it is listening on.
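For illustration only, a minimal sketch of what per-node settings might look like. It assumes the two hosts are separate machines, that each broker listens only on its own hostname, and that all other options stay as in the reported /etc/qpidd.conf. The bare-hostname form of cluster-url is copied from the original report; this has not been verified against the reporter's setup.

```ini
# /etc/qpidd.conf on nec-em13 (sketch; other options unchanged)
cluster-name=bz602198_verified
# Advertise only this node's own address (assumption: one address per node)
cluster-url=nec-em13.rhts.eng.bos.redhat.com

# /etc/qpidd.conf on hp-dl360g5-02 (sketch; other options unchanged)
cluster-name=bz602198_verified
cluster-url=hp-dl360g5-02.rhts.eng.bos.redhat.com
```

The cluster-name is the same on both nodes, while cluster-url differs so that neither broker advertises the other's address as its own.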
Suspected duplicate of bug 625540: a misconfigured cluster-url causes the broker to connect to itself for the update.
*** This bug has been marked as a duplicate of bug 625540 ***