Bug 634168 - cluster-url -> "critical Modified cluster state outside of cluster context"
Summary: cluster-url -> "critical Modified cluster state outside of cluster context"
Keywords:
Status: CLOSED DUPLICATE of bug 625540
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Target Milestone: 1.3
Target Release: ---
Assignee: messaging-bugs
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-09-15 12:21 UTC by ppecka
Modified: 2011-08-12 16:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-15 14:10:35 UTC
Target Upstream Version:
Embargoed:


Attachments
log file (522.47 KB, application/octet-stream)
2010-09-15 12:21 UTC, ppecka
gdb core examined (26.38 KB, application/octet-stream)
2010-09-15 12:23 UTC, ppecka

Description ppecka 2010-09-15 12:21:57 UTC
Created attachment 447451 [details]
log file

Description of problem:
With cluster-url specified, restarting two brokers running in a cluster results in "critical Modified cluster state outside of cluster context" on one node, which then aborts. The remaining node is unresponsive afterwards.

#cat /etc/qpidd.conf
cluster-mechanism=ANONYMOUS
auth=no
log-enable=debug+
log-enable=debug+:cluster
log-to-file=/tmp/qpidd.log
log-to-stdout=no
log-to-stderr=no
cluster-name=bz602198_verified
cluster-url=nec-em13.rhts.eng.bos.redhat.com,hp-dl360g5-02.rhts.eng.bos.redhat.com




#tailf ./qpidd.log

2010-09-15 05:37:33 debug min_ssf: 0, max_ssf: 256
2010-09-15 05:37:33 debug CyrusSasl::start(ANONYMOUS): selected ANONYMOUS response: 'anonymous.eng.bos.redhat.com'
2010-09-15 05:37:33 debug Known-brokers for connection: amqp:tcp:nec-em13.rhts.eng.bos.redhat.com:5672,tcp:hp-dl360g5-02.rhts.eng.bos.redhat.com:5672
2010-09-15 05:37:33 debug Connection [39773 nec-em13.rhts.eng.bos.redhat.com:5672] no security layer in place
2010-09-15 05:37:33 debug SessionState::SessionState .: 0x1067ae98
2010-09-15 05:37:33 debug SessionState::SessionState anonymous.qpid.cluster-update: 0x2aaab0041f40
2010-09-15 05:37:33 debug anonymous.qpid.cluster-update: attached on broker.
2010-09-15 05:37:33 debug Attached channel 1 to anonymous.qpid.cluster-update
2010-09-15 05:37:33 debug anonymous.qpid.cluster-update: ready to send, activating output.
2010-09-15 05:37:33 critical Modified cluster state outside of cluster context



#gdb


(gdb)   9 Thread 4436  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
  8 Thread 4438  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
  7 Thread 4439  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  6 Thread 4440  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  5 Thread 4441  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  4 Thread 4442  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
  3 Thread 4473  0x0000003f43c0aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
  2 Thread 4474  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
* 1 Thread 4435  0x0000003f43030265 in raise () from /lib64/libc.so.6
(gdb) 
Thread 9 (Thread 4436):
#0  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002b3b86ab1abf in qpid::sys::Timer::run (this=0x10664d40)
    at ../include/qpid/sys/posix/Condition.h:69
#2  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x10664d74) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 8 (Thread 4438):
#0  0x0000003f43c0b150 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002b3b86ab1abf in qpid::sys::Timer::run (this=0x10670e90)
    at ../include/qpid/sys/posix/Condition.h:69
#2  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x10670ec4) at qpid/sys/posix/Thread.cpp:35
#3  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#4  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 7 (Thread 4439):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 6 (Thread 4440):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 5 (Thread 4441):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 4 (Thread 4442):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10647ec0, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10647ec0)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (p=0x6)
    at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 3 (Thread 4473):
#0  0x0000003f43c0aee9 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0
#1  0x00002b3b87927a6d in wait (this=0x1067abd0)
    at ../include/qpid/sys/posix/Condition.h:63
#2  wait (this=0x1067abd0) at ../include/qpid/sys/Monitor.h:41
#3  qpid::sys::Waitable::wait (this=0x1067abd0) at qpid/sys/Waitable.h:88
#4  0x00002b3b87922dd9 in waitFor (this=0x1067ab60, timeout=0)
    at qpid/sys/StateMonitor.h:65
#5  waitFor (this=0x1067ab60, timeout=0) at qpid/client/SessionImpl.cpp:769
#6  qpid::client::SessionImpl::open (this=0x1067ab60, timeout=0)
    at qpid/client/SessionImpl.cpp:109
#7  0x00002b3b878ef8c8 in qpid::client::ConnectionImpl::newSession (
    this=0x106773d0, name=..., timeout=0, channel=65535)
    at qpid/client/ConnectionImpl.cpp:437
#8  0x00002b3b878e09a9 in qpid::client::Connection::newSession (
    this=0x2aaaac000d08, name=..., timeout=0) at qpid/client/Connection.cpp:141
#9  0x00002b3b87224e33 in qpid::cluster::UpdateClient::run (
    this=0x2aaaac000bb0) at qpid/cluster/UpdateClient.cpp:130
#10 0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x1067abfc) at qpid/sys/posix/Thread.cpp:35
#11 0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#12 0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 2 (Thread 4474):
#0  0x0000003f430d4108 in epoll_wait () from /lib64/libc.so.6
#1  0x00002b3b869e22b1 in qpid::sys::Poller::wait (this=0x10677a70, 
    timeout=<value optimized out>) at qpid/sys/epoll/EpollPoller.cpp:563
#2  0x00002b3b869e2d47 in qpid::sys::Poller::run (this=0x10677a70)
    at qpid/sys/epoll/EpollPoller.cpp:515
#3  0x00002b3b869d8f9a in qpid::sys::(anonymous namespace)::runRunnable (
    p=0x1b) at qpid/sys/posix/Thread.cpp:35
#4  0x0000003f43c0673d in start_thread () from /lib64/libpthread.so.0
#5  0x0000003f430d3d1d in clone () from /lib64/libc.so.6

Thread 1 (Thread 4435):
#0  0x0000003f43030265 in raise () from /lib64/libc.so.6
#1  0x0000003f43031d10 in abort () from /lib64/libc.so.6
#2  0x00002b3b86aa8b34 in qpid::sys::assertClusterSafe ()
    at qpid/sys/ClusterSafe.cpp:42
#3  0x00002b3b865da0f7 in qpid::broker::SemanticState::attached (this=0x1153)
    at qpid/broker/SemanticState.cpp:803
#4  0x00002b3b865fa3e9 in qpid::broker::SessionState::readyToSend (
    this=0x2aaab0041f40) at qpid/broker/SessionState.cpp:367
#5  0x00002b3b86a74b9f in qpid::amqp_0_10::SessionHandler::attach (
    this=0x2aaab0041a50, name_=<value optimized out>, force=96)
    at qpid/amqp_0_10/SessionHandler.cpp:169
#6  0x00002b3b86a3221d in invoke<qpid::framing::AMQP_AllOperations::SessionHandler> (this=0x7fff11594830, body=...) at qpid/framing/SessionAttachBody.h:67
#7  qpid::framing::AMQP_AllOperations::SessionHandler::Invoker::visit (
    this=0x7fff11594830, body=...) at qpid/framing/AllInvoker.cpp:790
#8  0x00002b3b86a761de in qpid::framing::invoke<qpid::amqp_0_10::SessionHandler> (target=<value optimized out>, body=...) at qpid/framing/Invoker.h:67
#9  0x00002b3b86a71f23 in qpid::amqp_0_10::SessionHandler::invoke (
    this=0x1153, m=...) at qpid/amqp_0_10/SessionHandler.cpp:67
#10 0x00002b3b86a75139 in qpid::amqp_0_10::SessionHandler::handleIn (
    this=0x2aaab0041a50, f=...) at qpid/amqp_0_10/SessionHandler.cpp:82
#11 0x00002b3b86542349 in operator() (this=0x10679e70, frame=...)
    at qpid/framing/Handler.h:42
#12 qpid::broker::Connection::received (this=0x10679e70, frame=...)
    at qpid/broker/Connection.cpp:158
#13 0x00002b3b8721330f in qpid::cluster::Connection::received (
    this=0x10678c20, f=...) at qpid/cluster/Connection.cpp:194
#14 0x00002b3b8720c80e in qpid::cluster::Connection::decode (this=0x10678c20, 
    data=0x2aaab0031a40 "\017", size=<value optimized out>)
    at qpid/cluster/Connection.cpp:311
#15 0x00002b3b86aa6d52 in qpid::sys::AsynchIOHandler::readbuff (
    this=0x10679e10, buff=0x2aaab0001090) at qpid/sys/AsynchIOHandler.cpp:135
#16 0x00002b3b869d69da in boost::function2<void, qpid::sys::AsynchIO&, qpid::sys::AsynchIOBufferBase*, std::allocator<boost::function_base> >::operator() (
    this=0x0, a0=..., a1=0x6)
    at /usr/include/boost/function/function_template.hpp:576
#17 0x00002b3b869d4140 in qpid::sys::posix::AsynchIO::readable (
    this=0x2aaab0000b00, h=...) at qpid/sys/posix/AsynchIO.cpp:428
#18 0x00002b3b86aac287 in boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >::operator() (this=0x0, a0=...)
    at /usr/include/boost/function/function_template.hpp:576
#19 0x00002b3b86aaba60 in qpid::sys::DispatchHandle::processEvent (
    this=0x2aaab0000b08, type=<value optimized out>)
    at qpid/sys/DispatchHandle.cpp:278
#20 0x00002b3b869e2d74 in process (this=0x10647ec0) at qpid/sys/Poller.h:125
#21 qpid::sys::Poller::run (this=0x10647ec0)
    at qpid/sys/epoll/EpollPoller.cpp:519
#22 0x00002b3b86531142 in qpid::broker::Broker::run (
    this=<value optimized out>) at qpid/broker/Broker.cpp:344
#23 0x00000000004093b9 in QpiddDaemon::child (this=0x7fff11597080)
    at posix/QpiddBroker.cpp:141
#24 0x00002b3b8654e2be in qpid::broker::Daemon::fork (this=0x7fff11597080)
    at qpid/broker/Daemon.cpp:91
#25 0x0000000000406c45 in QpiddBroker::execute (this=<value optimized out>, 
    options=<value optimized out>) at posix/QpiddBroker.cpp:179
#26 0x000000000040559f in main (argc=4, argv=0x7fff11597678) at qpidd.cpp:80





Version-Release number of selected component (if applicable):
rpm -qa | grep -P '(qpid|qmf|ais)' | sort -u
openais-0.80.6-16.el5_5.7
openais-devel-0.80.6-16.el5_5.7
python-qmf-0.7.946106-13.el5
python-qpid-0.7.946106-14.el5
qmf-0.7.946106-14.el5
qmf-devel-0.7.946106-14.el5
qpid-cpp-client-0.7.946106-14.el5
qpid-cpp-client-devel-0.7.946106-14.el5
qpid-cpp-client-devel-docs-0.7.946106-14.el5
qpid-cpp-client-ssl-0.7.946106-14.el5
qpid-cpp-mrg-debuginfo-0.7.946106-14.el5
qpid-cpp-server-0.7.946106-14.el5
qpid-cpp-server-0.7.946106-3.el5
qpid-cpp-server-cluster-0.7.946106-14.el5
qpid-cpp-server-devel-0.7.946106-14.el5
qpid-cpp-server-ssl-0.7.946106-14.el5
qpid-cpp-server-store-0.7.946106-14.el5
qpid-cpp-server-xml-0.7.946106-14.el5
qpid-java-client-0.7.946106-9.el5
qpid-java-common-0.7.946106-9.el5
qpid-tools-0.7.946106-10.el5


How reproducible:
Always 

Steps to Reproduce:
(both nodes)
0. rm -rf /var/lib/qpidd/*cluster* /var/lib/qpidd/rhm
1. service openais start
2. service qpidd start
3. qpid-cluster

(node A)
4. qpid-cluster -f -k
  
(both nodes)
5. service qpidd start

Result:
On node A, see the attached log and core.

Actual results:
qpidd fails to join cluster

Expected results:
qpidd joins cluster and stays running

Additional info:

Comment 1 ppecka 2010-09-15 12:23:27 UTC
Created attachment 447452 [details]
gdb core examined

Comment 2 Gordon Sim 2010-09-15 13:05:27 UTC
Are nec-em13.rhts.eng.bos.redhat.com and hp-dl360g5-02.rhts.eng.bos.redhat.com two separate hosts? If so they should not be included in the cluster-url option for any one broker. The cluster-url option is only intended to control the addresses by which the local node advertises itself. Each node would specify a different value (corresponding to the addresses it is listening on).
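As a sketch of what that would look like for this report, assuming nec-em13 and hp-dl360g5-02 are indeed separate hosts and each broker listens on its own host name, the two nodes would differ only in cluster-url. All other settings stay as in the reporter's /etc/qpidd.conf; the comment lines are added here for illustration.

```
# /etc/qpidd.conf on nec-em13
cluster-name=bz602198_verified
cluster-url=nec-em13.rhts.eng.bos.redhat.com

# /etc/qpidd.conf on hp-dl360g5-02
cluster-name=bz602198_verified
cluster-url=hp-dl360g5-02.rhts.eng.bos.redhat.com
```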

Comment 3 Gordon Sim 2010-09-15 13:11:15 UTC
Suspected dup of bug 625540, i.e. a misconfiguration of cluster-url causes broker to connect to itself for update.
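
One way to spot this kind of misconfiguration before starting the broker is to compare each cluster-url entry with the local host name. The following is a minimal sketch and not part of qpid: the function name foreign_cluster_url_hosts is hypothetical, the accepted entry forms (a bare host, or amqp:/tcp: prefixes with a :port suffix) are assumed from the config and log lines in this report, and the comparison is a plain string match, so a host known under several names would be flagged too.

```shell
#!/bin/sh
# Hypothetical helper (not part of qpid): print every cluster-url entry in a
# qpidd.conf-style file whose host differs from the given local host name.
# Returns 1 if any such entry is found, 0 otherwise.
foreign_cluster_url_hosts() {
    conf=$1
    local_host=$2
    status=0
    # Take the value of the last cluster-url= line, if there is one.
    url=$(sed -n 's/^cluster-url=//p' "$conf" | tail -n 1)
    # Split the comma-separated list into separate words.
    for entry in $(printf '%s\n' "$url" | tr ',' ' '); do
        # Strip optional "amqp:" and "tcp:" prefixes and a trailing ":port".
        host=$(printf '%s\n' "$entry" |
            sed -e 's/^amqp://' -e 's/^tcp://' -e 's/:[0-9][0-9]*$//')
        if [ "$host" != "$local_host" ]; then
            printf '%s\n' "$host"
            status=1
        fi
    done
    return $status
}
```

Possible usage on each node, assuming `hostname -f` returns the name the broker listens on: `foreign_cluster_url_hosts /etc/qpidd.conf "$(hostname -f)"`. Any host it prints is a candidate for removal from that node's cluster-url.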

Comment 4 ppecka 2010-09-15 14:10:35 UTC

*** This bug has been marked as a duplicate of bug 625540 ***

