506769 – Broker redelivering dequeued messages (?)

Summary: Broker redelivering dequeued messages (?)

Keywords:
Status:	CLOSED UPSTREAM
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	qpid-cpp
Sub Component:
Version:	1.1.1
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ken Giusti
QA Contact:	MRG Quality Engineering
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2009-06-18 15:53 UTC by Gordon Sim
Modified:	2025-02-10 03:13 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2025-02-10 03:13:27 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Apache JIRA	QPID-3079	0	None	None	None	Never

Description Gordon Sim 2009-06-18 15:53:58 UTC

Description of problem:

See https://issues.apache.org/jira/browse/QPID-1917

Version-Release number of selected component (if applicable):

Qpid built from trunk on 18th June 2009.

Comment 1 Carl Trieloff 2009-06-18 20:25:41 UTC


The bug is that we never wait for the async IO (AIO) to complete on dequeue. In a txn we block on the AIO on the queue.

On enqueue we don't ack until the AIO has completed. On dequeue, however, we send the ack without waiting for the AIO; we should check isDequeueComplete() before sending the ack for the dequeue.

Comment 2 Kim van der Riet 2009-06-25 17:47:51 UTC

Initial test overview:
* Start a broker with a store:
./qpidd --load-module path/to/msgstore.so --auth no --data-dir path/to/datadir --log-enable info+ --port 0

* Load the broker (and store) with 24 persistent messages (contained in file messages.txt):
./sender -b localhost -p ${PORT} --exchange TEST_EXCHANGE --routing-key TEST_QUEUE --durable yes < messages.txt

* Consume 8 messages, 5 at a time, then kill the broker:
./receiver -b localhost -p ${PORT} --messages 8 --ack-frequency 5 --credit-window 5 --queue TEST_QUEUE --trace; kill -9 ${BROKER_PID}

* Restart the broker and consume the remaining messages, making sure that there are no duplicates. This usually fails: all 8 messages consumed in the previous step are resent.

When the broker is killed immediately after the receiver completes, the dequeue records still sitting in the store's write cache are lost, because the store's flush timer has not yet fired. However, if the broker is stopped rather than killed (using -TERM), the store is flushed and the records are written before the broker terminates, and the test passes (no duplicates). The Java test referred to in the Apache JIRA above was thought to be stopping (rather than killing) the broker, but it appears to stop the broker and then immediately kill it.

Even so, this kill test, which should not fail, exposes a missing piece in the broker: the receiver should not exit until the dequeues have hit the disk. Currently, the broker does not flush the dequeues at the end of the receive portion of the test, even though it affirms that the MessageAcceptBodys are complete.
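
The difference between kill -9 and -TERM described in this comment can be reproduced without a broker. The following is a toy sketch only, not the store's code: a background shell process stands in for the store, holds two made-up "dequeue records" in memory, and writes them to a file only when it receives SIGTERM. The names toy_store, run_case and the journal file names are invented for illustration.

```shell
#!/usr/bin/env bash
# Toy stand-in for the store (hypothetical, not qpid code): the dequeue
# records live only in memory until a SIGTERM handler writes them to disk.
toy_store() {
    journal=$1
    cached="dequeue_01 dequeue_02"
    # Flush the cached records on SIGTERM. SIGKILL cannot be trapped,
    # so kill -9 never reaches this handler.
    trap 'printf "%s\n" $cached > "$journal"; exit 0' TERM
    while :; do sleep 0.1; done
}

# Start the toy store, stop it with the given signal, and print how many
# records reached the journal file.
run_case() {
    sig=$1
    journal=$2
    rm -f "$journal"
    toy_store "$journal" &
    pid=$!
    sleep 0.5                        # let the background process install its trap
    kill "-$sig" "$pid"
    wait "$pid" 2>/dev/null || true  # a killed process has a non-zero status
    if [ -f "$journal" ]; then
        wc -l < "$journal" | tr -d ' '
    else
        echo 0
    fi
}
```

Calling `run_case TERM toy_term.jrnl` should report both records, while `run_case KILL toy_kill.jrnl` should report none, which mirrors the pass and fail cases of the broker test.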

Comment 3 Kim van der Riet 2009-06-26 12:56:25 UTC

More details on the test above:

1. Create a message file messages.txt containing 24 messages, one message per line:
message_01
message_02
...
message_24
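
The file can be generated rather than typed by hand. A minimal sketch, assuming a POSIX shell; the helper name make_messages is made up for illustration:

```shell
# Hypothetical helper: print N lines, "message_01" .. "message_NN".
make_messages() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'message_%02d\n' "$i"
        i=$((i + 1))
    done
}

# Write the 24-message file used by the test.
make_messages 24 > messages.txt
```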

2. In one window, start a broker after removing the previous store:
rm -rf ~/.qpidd /tmp/lock /tmp/systemId /tmp/rhm
./qpidd --load-module /path/to/msgstore.so --auth no --data-dir /tmp --log-enable info+ --port 0
pgrep qpidd

Note both the port number printed by the broker and the PID returned by pgrep; these are needed in steps 3 and 4.

3. In another window in the test dir, prepare the broker and store by doing the following (the port number is that observed above):
export PORT=12345
qpid-config -a localhost:${PORT} add exchange direct TEST_EXCHANGE
qpid-config -a localhost:${PORT} add queue TEST_QUEUE --durable
qpid-config -a localhost:${PORT} bind TEST_EXCHANGE TEST_QUEUE TEST_QUEUE
./sender -b localhost -p ${PORT} --exchange TEST_EXCHANGE --routing-key TEST_QUEUE --durable yes < messages.txt

4. Now extract a small number of messages and immediately kill the broker (the PID used in the kill is that observed in step 2 above):
./receiver -b localhost -p ${PORT} --messages 8 --ack-frequency 5 --credit-window 5 --queue TEST_QUEUE; kill -9 4321

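5. Restart the broker and rerun receiver to receive the remaining messages (with --port 0 the restarted broker will typically listen on a different port, so update PORT first):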
./qpidd --load-module /path/to/msgstore.so --auth no --data-dir /tmp --log-enable info+ --port 0
./receiver -b localhost -p ${PORT} --messages 16 --ack-frequency 5 --credit-window 5 --queue TEST_QUEUE

Alternatively, look at the store and check for the presence of the 8 dequeue records. If the store source is checked out, then the jhexdump script will help:
./tests/jrnl/jhexdump /tmp/rhm/jrnl/000d/TEST_QUEUE
This creates files j0.txt through j7.txt. Look at j0.txt to see the enqueue and dequeue records.
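
The duplicate check in step 5 can also be scripted. This is a sketch under an assumption not stated in this report: that the messages received in steps 4 and 5 have each been captured to a file with one message body per line. The helper name find_duplicates and the file names run1.txt and run2.txt are made up for illustration.

```shell
# Hypothetical helper: print every message that appears more than once
# across the given files (one message per line). Returns non-zero if any
# duplicate is found, zero otherwise.
find_duplicates() {
    dups=$(sort "$@" | uniq -d)
    if [ -n "$dups" ]; then
        printf '%s\n' "$dups"
        return 1
    fi
    return 0
}

# Example use, once both runs have been captured:
#   find_duplicates run1.txt run2.txt || echo "redelivered messages found"
```

If the bug is present, the 8 messages consumed before the kill would show up in both files and be listed here.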

Comment 5 Gordon Sim 2011-05-17 13:06:12 UTC

See also https://issues.apache.org/jira/browse/QPID-3079

Comment 7 Red Hat Bugzilla 2025-02-10 03:13:27 UTC

This product has been discontinued or is no longer tracked in Red Hat Bugzilla.
