Bug 1146849 - Re-subscribing to auto-delete queue fails when the queue is just being deleted
Summary: Re-subscribing to auto-delete queue fails when the queue is just being deleted
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 2.5
Hardware: All
OS: All
low
medium
Target Milestone: 3.2.2
: ---
Assignee: messaging-bugs
QA Contact: Zdenek Kraus
URL:
Whiteboard:
Depends On:
Blocks: 1377722
TreeView+ depends on / blocked
 
Reported: 2014-09-26 08:44 UTC by Pavel Moravec
Modified: 2018-12-09 18:40 UTC (History)
5 users (show)

Fixed In Version: qpid-cpp-0.34-13
Doc Type: Bug Fix
Doc Text:
Cause: The creation of a receiver in the messaging APIs involves two main network interactions when using AMQP 0-10: declaring the queue, then subscribing to it. The semantics of declare is that it will create the queue if necessary. However it does not reset any auto-deletion timer. Consequence: It is possible the the queue expires just after the declare but before the subscribe, meaning the subscribe will fail as there is no queue. Fix: The declare semantics were altered to include resetting of any auto-deletion timer associated with the queue. Result: The queue is unlikely to expire between the declare and the subscribe (usually milliseconds apart at most) and the subscribe call will not fail.
Clone Of:
: 1377722 (view as bug list)
Environment:
Last Closed: 2016-10-11 07:36:08 UTC
Target Upstream Version:


Attachments (Terms of Use)
qpid trace logs from a reproducer (218.96 KB, application/x-gzip)
2014-09-26 08:47 UTC, Pavel Moravec
no flags Details


Links
System ID Priority Status Summary Last Updated
Apache JIRA QPID-7302 None None None 2016-06-15 08:49:40 UTC
Red Hat Product Errata RHBA-2016:2049 normal SHIPPED_LIVE Red Hat Enterprise MRG Messaging 3.2.2 Bug Fix Release 2016-10-11 11:36:01 UTC

Description Pavel Moravec 2014-09-26 08:44:29 UTC
Description of problem:
Having an auto-delete queue with deletion timeout that:
- is just being deleted due to the timeout
- is getting a new subscription / consumer from address containing "create:always" part

there is a probability that the new consumer will fail while getting "not-found: Queue not found" error.

Checking trace logs, this sequence of events happen:
1) A autoDel queue with timeout is created, a consumer subscribes
2) The consumer unsubscribes, broker starts autoDel timer
3) A new consumer will to subscribe to the queue, optionally to create it.
4) The consumer / client sends queue.declare - while the queue is still present
5) Broker deletes the queue as the autoDel timer fires.
6) The consumer / client tries to subscribe to the queue that is not further present, and gets the error.

(sometimes between steps 4 and 6, either client or broker tries to create some bindings, but that is omitted in the scenario. The key point is, the queue with create:always is _not_ present at the end)


Version-Release number of selected component (if applicable):
every currently available (incl. 0.18-31, 0.22-49, upstream)


How reproducible:
100% within a minute (nondeterministic scenario, though)


Steps to Reproduce:
1. Have a script like:

# cat ./resubscribe_autoDel_queues.sh
queues=10
sleep=0.95
while true; do
	for i in $(seq 1 $queues); do
		qpid-receive -a "autoDelQueue_${i}; {create:always, node:{ x-declare:{auto-delete:True, arguments:{'qpid.auto_delete_timeout':1}}}}" --print-content=no -m 10 &
	done
	wait
	sleep $sleep
	date "+%T.%N"
done
#

2. Tune its parameters (esp. "sleep" one) to your system, such that one loop is executed within one second on average.

3. Run it for some time.


Actual results:
After a while, it will log errors like:
2014-09-26 10:34:26 [Client] warning Exception received from broker: not-found: Queue not found: autoDelQueue_1 (/data_xfs/qpid/cpp/src/qpid/broker/QueueRegistry.cpp:127) [caused by 4 \x04:\x07]
qpid-receive: not-found: Queue not found: autoDelQueue_1 (/data_xfs/qpid/cpp/src/qpid/broker/QueueRegistry.cpp:127)


Expected results:
No such errors to be raised.


Additional info:
The key problem is that "queue.create" and "message.subscribe" AMQP primitives are not executed as an atomic operation / in one "transaction". If the queue is deleted meantime, we have this problem.

Potential fix (imho the only one not too complex) would be to cancel the autodelete timer when broker gets the queue create request for already existing queue. Though this change is debatable, as it modifies the broker's behaviour (whenever auto-del queue is re-created, its timer is newly cancelled).

Comment 1 Pavel Moravec 2014-09-26 08:47:42 UTC
Created attachment 941486 [details]
qpid trace logs from a reproducer

Try:

grep "09:43:12" qpidd.log | grep autoDelQueue_8

to see the below:

2014-09-26 09:43:12.883655798 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {ExchangeBoundBody: exchange=autoDelQueue_8; queue=autoDelQueue_8; binding-key=; arguments={}; }]
2014-09-26 09:43:12.883669992 [Broker] trace anonymous.1c947525-353b-4ec1-90fc-e5e7645dacac: recv cmd 1: {ExchangeBoundBody: exchange=autoDelQueue_8; queue=autoDelQueue_8; binding-key=; arguments={}; }
2014-09-26 09:43:12.884029886 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {QueueDeclareBody: queue=autoDelQueue_8; alternate-exchange=; auto-delete=1; arguments={qpid.auto_delete_timeout:F8:int64(1)}; }]
2014-09-26 09:43:12.884058439 [Broker] trace anonymous.1c947525-353b-4ec1-90fc-e5e7645dacac: recv cmd 2: {QueueDeclareBody: queue=autoDelQueue_8; alternate-exchange=; auto-delete=1; arguments={qpid.auto_delete_timeout:F8:int64(1)}; }
2014-09-26 09:43:12.884208606 [Model] debug Create queue. name:autoDelQueue_8 user:anonymous rhost:127.0.0.1:5672-127.0.0.1:51485 durable:F exclusive:F autodelete:T alternateExchange:
2014-09-26 09:43:12.884253534 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {ExchangeBindBody: queue=autoDelQueue_8; exchange=amq.fanout; binding-key=; arguments={}; }]
2014-09-26 09:43:12.884265984 [Broker] trace anonymous.1c947525-353b-4ec1-90fc-e5e7645dacac: recv cmd 3: {ExchangeBindBody: queue=autoDelQueue_8; exchange=amq.fanout; binding-key=; arguments={}; }
2014-09-26 09:43:12.885678185 [Model] debug Auto-delete queue: autoDelQueue_8 user:anonymous rhost:127.0.0.1:5672-127.0.0.1:51474
2014-09-26 09:43:12.885690601 [Broker] debug Unbinding key [autoDelQueue_8] from queue autoDelQueue_8 on exchange  origin=)
2014-09-26 09:43:12.885701060 [Management] trace Management object marked deleted: org.apache.qpid.broker:binding:org.apache.qpid.broker:exchange:,org.apache.qpid.broker:queue:autoDelQueue_8,autoDelQueue_8
2014-09-26 09:43:12.885711864 [Broker] debug Unbinding queue autoDelQueue_8 from exchange amq.fanout origin=)
2014-09-26 09:43:12.885721083 [Management] trace Management object marked deleted: org.apache.qpid.broker:binding:org.apache.qpid.broker:exchange:amq.fanout,org.apache.qpid.broker:queue:autoDelQueue_8,
2014-09-26 09:43:12.885731930 [Management] trace Management object marked deleted: org.apache.qpid.broker:queue:autoDelQueue_8
2014-09-26 09:43:12.884376579 [Broker] debug SemanticState::addBinding [queue=autoDelQueue_8, exchange=amq.fanout, key=, args={}]
2014-09-26 09:43:12.886745915 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageSubscribeBody: queue=autoDelQueue_8; destination=autoDelQueue_8; accept-mode=0; acquire-mode=0; resume-id=; resume-ttl=0; arguments={}; }]
2014-09-26 09:43:12.886772909 [Broker] trace anonymous.1c947525-353b-4ec1-90fc-e5e7645dacac: recv cmd 5: {MessageSubscribeBody: queue=autoDelQueue_8; destination=autoDelQueue_8; accept-mode=0; acquire-mode=0; resume-id=; resume-ttl=0; arguments={}; }
2014-09-26 09:43:12.886931414 [System] debug Exception constructed: Queue not found: autoDelQueue_8 (qpid/broker/SessionAdapter.cpp:683)
2014-09-26 09:43:12.887073649 [Protocol] error Execution exception: not-found: Queue not found: autoDelQueue_8 (qpid/broker/SessionAdapter.cpp:683)
2014-09-26 09:43:12.887157158 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageStopBody: destination=autoDelQueue_8; }]
2014-09-26 09:43:12.887173756 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageSetFlowModeBody: destination=autoDelQueue_8; flow-mode=1; }]
2014-09-26 09:43:12.887184521 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageFlowBody: destination=autoDelQueue_8; unit=0; value=1000; }]
2014-09-26 09:43:12.887194595 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageFlowBody: destination=autoDelQueue_8; unit=1; value=4294967295; }]
2014-09-26 09:43:12.887361036 [Protocol] trace RECV [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {MessageFlushBody: destination=autoDelQueue_8; }]
2014-09-26 09:43:12.887389631 [Protocol] trace SENT [127.0.0.1:5672-127.0.0.1:51485]: Frame[BEbe; channel=1; {ExecutionExceptionBody: error-code=404; command-id=5; class-code=4; command-code=7; field-index=0; description=not-found: Queue not found: autoDelQueue_8 (qpid/broker/SessionAdapter.cpp:683); error-info={}; }]

The traces are in fact commented/described in c#0.

Comment 2 Pavel Moravec 2016-05-16 14:52:38 UTC
Trivial reproducer using auto-delete queue even without timeout:

#include <qpid/messaging/Connection.h>
#include <qpid/messaging/Receiver.h>
#include <qpid/messaging/Session.h>
#include <stdio.h>
#include <iostream>

using namespace qpid::messaging;
using namespace std;

void printUsage(){
    cout << "Usage: QpidTester host BROKER EXHANGE INSTANCE" << endl;
    cout << endl << "Example:" << endl << "\t ./QpidTester bosystem1:10030" << endl;
}

int main(int argc, char** argv) {
    if (argc != 4){
        printUsage();
        return 1;
    }

    string broker = argv[1];
    string exchange = argv[2];
    string instance = argv[3];
    Connection connection = NULL;
    Session session = NULL;
    Receiver receiver = NULL;

    int tryCount = 0;
    while(tryCount++ < 1000){
        try{
            if (connection == NULL)
            {
                connection = Connection(broker, "");
                connection.open();
                session = connection.createSession();
            }
            string queue = exchange+"/foo.#; {create: always, link: {name: 'foo_receiver_queue' , x-declare: { auto-delete: true, exclusive: false } } }";
            receiver = session.createReceiver(queue);
            cout << "receiver created->" << instance << endl;
            receiver.getAvailable();
            cout << "get available->" << instance << endl;
            receiver.close();
            cout << "receiver closed->" << instance << endl;
        }
        catch(const std::exception& e){
            cout << "Instance-> " << instance << "Exception ..."  << e.what() << endl;
            connection.close();
            connection = NULL;
        }
    }
    connection.close();
}


Compile and run 2 instances in parallel like:

./QpidTester localhost:5672 amq.fanout 1 &
./QpidTester localhost:5672 amq.fanout 2 &

and notice occasional errors:

2016-05-16 16:48:25 [Client] warning Exception received from broker: not-found: Bind failed. No such queue: foo_receiver_queue (/builddir/build/BUILD/qpid-cpp-0.34/src/qpid/broker/Broker.cpp:1602) [caused by 6 \x07:\x04]
Instance-> 1Exception ...not-found: Bind failed. No such queue: foo_receiver_queue (/builddir/build/BUILD/qpid-cpp-0.34/src/qpid/broker/Broker.cpp:1602)

Comment 3 Gordon Sim 2016-06-15 08:49:41 UTC
Fixed upstream for timed auto-delete queues: https://svn.apache.org/r1748523

(Note this will address reproducer from original description, but not the alternative given in comment #2).

Comment 4 Pavel Moravec 2016-06-22 13:49:56 UTC
/me verified that 0.34-13 fixes reproducer from c#0 but not fixes reproducer from c#2 (code fix did not attempt to fix that scenario)

Comment 14 Zdenek Kraus 2016-09-20 12:54:51 UTC
moving this back -> ON_QA

Comment 15 Zdenek Kraus 2016-09-20 12:58:28 UTC
This bug was tested, with first scenario on RHEL6 i686, x86_64 and RHEL7 x86_64 with following packages:
qpid-cpp-client-0.34-17
qpid-cpp-client-devel-0.34-17
qpid-cpp-client-devel-docs-0.34-17
qpid-cpp-client-rdma-0.34-17
qpid-cpp-debuginfo-0.34-17
qpid-cpp-server-0.34-17
qpid-cpp-server-devel-0.34-17
qpid-cpp-server-ha-0.34-17
qpid-cpp-server-linearstore-0.34-17
qpid-cpp-server-rdma-0.34-17


Fix Works as expected.

-> VERIFIED

Comment 17 errata-xmlrpc 2016-10-11 07:36:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2049.html


Note You need to log in before you can comment on or make changes to this bug.