Red Hat Bugzilla – Bug 503024
If there are unacked messages on a queue with ring policy when new member joins cluster, queue becomes inconsistent
Last modified: 2009-06-12 13:39:20 EDT
Description of problem:
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. start one node
2. create a queue with ring policy and fixed count (e.g. qpid-config add queue test-queue --max-queue-count 5 --limit-policy ring)
3. send some messages to the queue (e.g. for m in one two three four five; do echo $m; done | ./qpid/cpp/src/tests/sender)
4. receive but don't yet ack a message (e.g. start and leave running: ./qpid/cpp/src/tests/receiver --ack-frequency 10 --credit-window 1)
5. add new node to cluster
6. release (or consume) message from step 4 (e.g. kill receiver started in step 4)
7. check messages on each node (e.g. by running ./qpid/cpp/src/tests/receiver --browse --messages 5 against each node in turn)
Nodes show different sets of messages, e.g.
Messages reported from each node should be the same.
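The check in step 7 can be sketched as a small shell helper that captures the browse output from each node and compares the two. This is only an illustration, not part of the reproducer: the function names browse_node and compare_nodes are hypothetical, and the --port option passed to the receiver is an assumption about how a particular node would be selected.

```shell
#!/bin/sh
# Hypothetical helper: browse up to 5 messages from the node listening on port $1.
# The receiver command is taken from step 7; the --port option is an assumption.
browse_node() {
    ./qpid/cpp/src/tests/receiver --browse --messages 5 --port "$1"
}

# Compare two files holding captured browse output.
# Prints "consistent" and returns 0 if the files are identical;
# otherwise prints "inconsistent" followed by a diff and returns 1.
compare_nodes() {
    if cmp -s "$1" "$2"; then
        echo "consistent"
        return 0
    fi
    echo "inconsistent"
    diff "$1" "$2" || true
    return 1
}

# Example usage (ports 5672 and 5673 are placeholders):
#   browse_node 5672 > node1.txt
#   browse_node 5673 > node2.txt
#   compare_nodes node1.txt node2.txt
```

With the bug present, compare_nodes would be expected to report "inconsistent" once the reproducer has been run; after a fix, both nodes should produce identical output.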
Sorry, I missed one step in the reproducer above. Between steps 5 and 6, another message needs to be sent to the queue (e.g. echo six | ./qpid/cpp/src/tests/sender).
Created attachment 345774 [details]
Fix (created against code for 752581-8)
Fixed in qpidd-0.5.752581-10.
Tested on RHEL 5.3 i386/x86_64 qpidd-0.5.752581-10.el5 and it works --> VERIFIED.
This is not fully fixed: if the sixth message is sent *before* the second node is added to the cluster in the case above, then the queues remain inconsistent after the join.
Created attachment 346706 [details]
Further fix (created against code for 752581-12)
This second case is fixed in qpidd-0.5.752581-13.el5
Tested on RHEL 5.3 i386/x86_64 qpidd-0.5.752581-13.el5 and it works as we expected in Comment 5 -->
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.