Bug 701777

Summary: Qmf Agent stuck in flow-stopped mode
Product: Red Hat Enterprise MRG
Reporter: Ted Ross <tross>
Component: qpid-qmf
Assignee: Ted Ross <tross>
Status: CLOSED ERRATA
QA Contact: Jan Sarenik <jsarenik>
Severity: unspecified
Docs Contact:
Priority: urgent
Version: Development
CC: freznice, iboverma, jneedle, jross, jsarenik, matt
Target Milestone: 2.0
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qpid-qmf-0.10-7.el6, qpid-qmf-0.10-8.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-06-23 15:44:44 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Ted Ross 2011-05-03 20:12:59 UTC
Description of problem:

The qmf agent library used by sesame and condor can remain stuck in a flow-controlled state even after the message congestion has cleared.

Version-Release number of selected component (if applicable):

qpid-qmf-0.10-6

How reproducible:

100%

Steps to Reproduce:
1. Start a broker
2. Build and run the qmf-agent example from SVN: qpid/cpp/examples/qmf-agent
3. Start qpid-tool, verify that there are 3 "parent" objects
4. On one of the parent objects, call the create_child method

   qpid: call <id> create_child child1

   You should get a return value.

5. Create a new queue with a size limit:

   qpid-config add queue test --max-queue-size=1000

6. Bind the queue to receive agent messages:

   qpid-config bind qmf.default.topic test 'agent.#'

7. Wait a few seconds (the queue should quickly go into flow-stop mode)

8. In the qpid-tool window, add another child:

   qpid: call <id> create_child child2

   There will be no response because the agent is flow-stopped.

9. Delete the test queue:

   qpid-config del queue test --force-if-not-empty

   Messages (i.e. the pending method calls) should begin to flow
   again.
  
Actual results:

   The agent remains hung

Expected results:

   The agent resumes operation
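
The expected behaviour can be illustrated with a toy model. This is only
a sketch in Python: the class ToySender and its method names are
hypothetical, and nothing here is taken from the qmf agent library, whose
internal flow-control handling is not described in this report. It shows
a sender that holds messages while flow-stopped and releases them once
flow resumes.

```python
from collections import deque


class ToySender:
    """Toy model of a sender that honours flow-stop and flow-resume signals."""

    def __init__(self):
        self.flow_stopped = False
        self.pending = deque()   # messages held back while flow-stopped
        self.sent = []           # messages that actually went out

    def send(self, msg):
        # While flow-stopped, queue the message instead of sending it.
        if self.flow_stopped:
            self.pending.append(msg)
        else:
            self.sent.append(msg)

    def on_flow_stop(self):
        self.flow_stopped = True

    def on_flow_resume(self):
        # Expected behaviour: clear the flag and flush everything pending.
        self.flow_stopped = False
        while self.pending:
            self.sent.append(self.pending.popleft())


sender = ToySender()
sender.send("response: create_child child1")
sender.on_flow_stop()
sender.send("response: create_child child2")   # held back, no response yet
sender.on_flow_resume()                        # congestion cleared
print(sender.sent)
```

In this model the "Actual results" above correspond to on_flow_resume
never taking effect, so the child2 response stays in the pending queue.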

Comment 2 Ted Ross 2011-05-03 21:02:50 UTC
Fixed upstream in revision r1099225.

Comment 3 Ted Ross 2011-05-03 21:05:34 UTC
Updated reproducer notes:

1. Start a broker
2. Build and run the qmf-agent example from SVN: qpid/cpp/examples/qmf-agent
3. Start qpid-tool, verify that there are 3 "parent" objects
4. On one of the parent objects, call the create_child method

   qpid: call <id> create_child child1

   You should get a return value.

5. Create a new queue with a flow stop value:

   qpid-config add queue test --flow-stop-count=25

6. Bind the queue to receive agent messages:

   qpid-config bind qmf.default.topic test 'agent.#'

7. Wait a few seconds (the queue should quickly go into flow-stop mode)

8. In the qpid-tool window, add another child:

   qpid: call <id> create_child child2

   There will be no response because the agent is flow-stopped.  You may
   have to do this a couple of times.

9. Drain the test queue:

   drain test -c 10000 -f

   Subsequent method calls should now succeed.
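
The broker-side commands in steps 5, 6 and 9 above could be scripted
roughly as follows. This is a minimal sketch in Python, assuming that
qpid-config and drain are on the PATH and that a broker is already
running; the helper names reproducer_commands and run, and the dry_run
flag, are hypothetical and not part of any qpid tool. Steps 4 and 8 are
interactive qpid-tool commands and are left to be done by hand.

```python
import shlex
import subprocess

QUEUE = "test"


def reproducer_commands(queue=QUEUE, flow_stop_count=25):
    """Return the commands from steps 5, 6 and 9 as argv lists."""
    return {
        # Step 5: create a queue with a flow-stop threshold.
        "create": ["qpid-config", "add", "queue", queue,
                   f"--flow-stop-count={flow_stop_count}"],
        # Step 6: bind the queue so it receives agent messages.
        "bind": ["qpid-config", "bind", "qmf.default.topic", queue, "agent.#"],
        # Step 9: drain the queue to clear the congestion.
        "drain": ["drain", queue, "-c", "10000", "-f"],
    }


def run(step, dry_run=True):
    """Print the command for a step and, unless dry_run is set, execute it."""
    argv = reproducer_commands()[step]
    print("$", shlex.join(argv))
    if not dry_run:
        # The drain command as given may not return promptly;
        # it may need to be interrupted by hand.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    for step in ("create", "bind", "drain"):
        run(step)   # dry run: only prints the commands
```

By default the script only prints the commands; pass dry_run=False to
run() to execute a step against a live broker.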

Comment 5 Jan Sarenik 2011-05-26 11:49:41 UTC
I am verifying on RHEL5 x86_64 with qpid-qmf-0.10-9.el5,
qmf-agent from current qpid git tree, drain from
qpid-cpp-client-devel-0.10-7.el5

Even after running the last step (draining messages from the test
queue), I still get the following messages over and over in the
qmf-agent output:
-------------------------------------------------------------------
2011-05-26 13:46:10 warning Exception received from broker: resource-limit-exceeded: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on test, policy: size: max=1000, current=984; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:87) [caused by 9 \x00:\x00]
2011-05-26 13:46:10 error Exception caught in sendMessage: resource-limit-exceeded: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on test, policy: size: max=1000, current=984; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:87)
2011-05-26 13:46:10 error guest.qmfagent-76b1df05-31fb-4c2a-9cae-1cee459f9e7e error: resource-limit-exceeded: resource-limit-exceeded: resource-limit-exceeded: Policy exceeded on test, policy: size: max=1000, current=984; count: unlimited; type=reject (qpid/broker/QueuePolicy.cpp:87)
2011-05-26 13:46:10 warning Connection to the broker has been lost
-------------------------------------------------------------------
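
To read the limits out of log lines like these, a small parser can help.
This is a sketch in Python; the regular expression and the function name
parse_policy_exceeded are hypothetical and only match the message layout
shown above. For the sample line it reports a size limit of 1000 with 984
in use, which appears to match a queue created with
--max-queue-size=1000 as in the original reproducer, rather than with
--flow-stop-count=25.

```python
import re

# Matches the "Policy exceeded" fragment as it appears in the log above.
POLICY_RE = re.compile(
    r"Policy exceeded on (?P<queue>[^,\s]+), policy: "
    r"size: max=(?P<max>\d+), current=(?P<current>\d+)"
)


def parse_policy_exceeded(line):
    """Return (queue, max_size, current_size), or None if the line does not match."""
    m = POLICY_RE.search(line)
    if m is None:
        return None
    return m.group("queue"), int(m.group("max")), int(m.group("current"))


sample = (
    "2011-05-26 13:46:10 error Exception caught in sendMessage: "
    "resource-limit-exceeded: Policy exceeded on test, policy: "
    "size: max=1000, current=984; count: unlimited; type=reject "
    "(qpid/broker/QueuePolicy.cpp:87)"
)

queue, max_size, current = parse_policy_exceeded(sample)
# Remaining headroom before the size policy rejects further messages.
print(queue, max_size, current, max_size - current)   # prints: test 1000 984 16
```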

Comment 6 Jan Sarenik 2011-05-26 14:38:06 UTC
I have reproduced the bug with qpid-qmf-0.10-6.el5. The only
difference is that with this old version, qpid-tool (running since
the beginning in another terminal) never receives responses to the
create_child calls it sent while the queue was congested. With
the latest version, qpid-tool receives messages acknowledging
the creation of the children as soon as the full queue starts
to be drained.

Will verify the other architectures as well. So far it is
verified on qpid-qmf-0.10-9.el5.x86_64

Comment 9 Jan Sarenik 2011-05-30 09:44:59 UTC
I assume there is a typo in the "Fixed In Version:" field of this bug,
because the latest RHEL6 package I am able to find is 0.10-6.el6, and I
have verified that it works there as expected.

Verified both on RHEL6.1 i386 and x86_64.

Comment 10 Jan Sarenik 2011-06-06 10:11:28 UTC
I am sure I have verified this bug on both RHEL5.6 and RHEL6.1,
even though that is not clearly stated in the comments above.

Comment 11 errata-xmlrpc 2011-06-23 15:44:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html