Description of problem:
If a cluster is started by recovering queues and messages from disk, and a new node then tries to join that cluster, the new node fails to connect.

Version-Release number of selected component (if applicable):
qpidd-cluster-0.5.752581-1.el5
rhm-0.5.3153-1.el5

How reproducible:
100%

Steps to Reproduce:
1. Start a two-node cluster
2. Create a durable queue
3. Send a persistent message to that queue
4. Stop the cluster
5. Restart one node using the durable store containing this queue and message data
6. Try to start another node (with an empty store, as required)

Actual results:
2009-mar-16 14:28:40 notice Journal "TplStore": Created
2009-mar-16 14:28:40 notice Store module initialized; dir=test-data-1
2009-mar-16 14:28:40 notice Recovering from cluster, no recovery from local journal
2009-mar-16 14:28:40 notice SASL disabled: No Authentication Performed
2009-mar-16 14:28:40 notice Listening on TCP port 5674
2009-mar-16 14:28:40 notice 20.0.10.15:20727(INIT) joining cluster grs with url=amqp:tcp:10.16.44.222:5674,tcp:20.0.10.15:5674,tcp:192.168.122.1:5674
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes. This will severely limit memory registrations.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
2009-mar-16 14:28:40 notice Broker running
2009-mar-16 14:28:40 notice Journal "durable-test-queue": Created
2009-mar-16 14:28:41 error Connection exception: framing-error: Unexpected command start frame. (qpid/SessionState.cpp:57)
2009-mar-16 14:28:41 error Connection 10.16.44.222:60749 closed by error: Unexpected command start frame. (qpid/SessionState.cpp:57)(501)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[Bbe; channel=1; {MessageTransferBody: destination=qpid.cluster-update; accept-mode=1; acquire-mode=0; }]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[be; channel=1; header (61 bytes); properties={{MessageProperties: content-length=3; application-headers={sn:F4:int32(2)}; }{DeliveryProperties: delivery-mode=2; exchange=; routing-key=durable-test-queue; }}]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[BEbe; channel=1; content (3 bytes) eos...]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[BEbe; channel=1; {ExchangeUnbindBody: queue=durable-test-queue; exchange=qpid.cluster-update; binding-key=; }]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[BEbe; channel=1; {QueueDeclareBody: queue=qpid.cluster-update; alternate-exchange=; auto-delete=1; arguments={}; }]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 error Channel exception: not-attached: receiving Frame[BEbe; channel=1; {ExecutionSyncBody: }]: channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:79)
2009-mar-16 14:28:41 critical 20.0.10.15:20727(UPDATEE) catch-up connection closed prematurely 20.0.10.15:20727-1(local,catchup)
2009-mar-16 14:28:41 notice 20.0.10.15:20727(LEFT) leaving cluster grs
2009-mar-16 14:28:41 notice Shut down

Expected results:
The second node should join the cluster as expected.
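When re-running the reproduction steps, the failure signature in the log above (a run of "error" entries followed by the "critical ... catch-up connection closed prematurely" entry) can be checked for mechanically. The following is only a minimal sketch: the function name `summarize`, the regular expression, and the `sample` list are illustrative assumptions and not part of qpidd; the line format is inferred solely from the log excerpt in this report.

```python
import re

# Assumed line format, inferred from the excerpt in this report:
#   2009-mar-16 14:28:41 error Connection exception: ...
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-[a-z]{3}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>notice|warning|error|critical) (?P<message>.*)$"
)

# Text of the critical entry seen when the joining node gives up.
CATCHUP_FAILURE = "catch-up connection closed prematurely"


def summarize(lines):
    """Count entries per log level and flag a failed cluster catch-up.

    Hypothetical helper for illustration; returns (counts, catchup_failed).
    """
    counts = {}
    catchup_failed = False
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            # Lines without a timestamp (e.g. libibverbs warnings) are skipped.
            continue
        level = match.group("level")
        counts[level] = counts.get(level, 0) + 1
        if level == "critical" and CATCHUP_FAILURE in match.group("message"):
            catchup_failed = True
    return counts, catchup_failed


# A few lines copied from the log in this report.
sample = [
    "2009-mar-16 14:28:40 notice Broker running",
    "libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes. This will severely limit memory registrations.",
    "2009-mar-16 14:28:41 error Connection exception: framing-error: Unexpected command start frame. (qpid/SessionState.cpp:57)",
    "2009-mar-16 14:28:41 critical 20.0.10.15:20727(UPDATEE) catch-up connection closed prematurely 20.0.10.15:20727-1(local,catchup)",
]

counts, catchup_failed = summarize(sample)
print(counts, catchup_failed)
```

On a fixed build, the same check run against the joining node's log would be expected to report no failed catch-up.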
Fixed on trunk by r755316.
The issue has been fixed, validated on RHEL 5.2 / 5.3 i386 / x86_64 on packages:

[root@intel-d3x1311-01 bz490506]# rpm -qa | egrep '(qpid|rhm)' | sort -u
python-qpid-0.5.752581-1.el5
qpidc-0.5.752581-3.el5
qpidc-devel-0.5.752581-3.el5
qpidc-perftest-0.5.752581-3.el5
qpidc-rdma-0.5.752581-3.el5
qpidc-ssl-0.5.752581-3.el5
qpidd-0.5.752581-3.el5
qpidd-acl-0.5.752581-3.el5
qpidd-cluster-0.5.752581-3.el5
qpidd-devel-0.5.752581-3.el5
qpidd-rdma-0.5.752581-3.el5
qpidd-ssl-0.5.752581-3.el5
qpidd-xml-0.5.752581-3.el5
qpid-java-client-0.5.751061-1.el5
qpid-java-common-0.5.751061-1.el5
rhm-0.5.3206-1.el5
rhm-docs-0.5.756148-1.el5

->VERIFIED
An advisory has been issued which should resolve the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0434.html