Bug 625541 - C++ New API: Connection::close() hangs on suspended broker
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: All
OS: Linux
Target Milestone: 1.3
Assignee: Gordon Sim
QA Contact: MRG Quality Engineering
Depends On:
Reported: 2010-08-19 18:58 UTC by Ted Ross
Modified: 2012-12-11 19:09 UTC (History)

Clone Of:
Last Closed: 2012-12-11 19:09:29 UTC

Attachments
testcase (1.33 KB, text/plain)
2010-08-20 08:54 UTC, Gordon Sim
Suggested fix (2.31 KB, patch)
2010-08-20 09:21 UTC, Gordon Sim
tross: review+

Description Ted Ross 2010-08-19 18:58:47 UTC
Description of problem:

If an application using the C++ API (new style) is connected to a broker and the broker is stopped/suspended, the client application cannot close the connection.  Connection::close() hangs indefinitely.

Version-Release number of selected component (if applicable): MRG1.3 beta5

How reproducible:  100% (I've reproduced this on Fedora 11)

Steps to Reproduce:
1. Write a C++ client app that connects to a broker, delays, closes the connection, then exits.
2. Start the broker, run the app, and suspend the broker (^Z, SIGSTOP) during the delay.
3. Observe the app hanging and not exiting.
4. Resume the broker (fg) and see the app exit immediately.

Actual results:  Hanging app

Expected results:  App exits normally

Comment 1 Ted Ross 2010-08-19 19:00:42 UTC
Impact:  This is likely to be a problem in deployments where the broker is running on a virtualized guest.  If the guest is force-stopped or suspended, client applications that clean up connections on exit will not be able to exit.

Comment 2 Gordon Sim 2010-08-19 19:07:05 UTC
Not a bug in my view; enable heartbeats, which are there to allow this sort of condition to be detected.

Comment 3 Gordon Sim 2010-08-19 19:08:34 UTC
I guess we could also support a form of close that was not clean; i.e. abort or detach without attempting to do a clean handshake with the broker. That would not be a 1.3 change however.

Comment 4 Gordon Sim 2010-08-19 19:12:16 UTC
Further to my comment above, note that an unclean close would be only a limited solution. If the application also, for example, cancelled a subscription as part of shutdown, then that too would hang. Heartbeats are intended to solve exactly this problem, so that is really my recommendation.

Comment 5 Ted Ross 2010-08-19 21:47:01 UTC
More info...  This can be solved by turning on heartbeats, turning on reconnect, and setting a reconnect-limit.  The blockage will clear once the reconnect limit is reached (typically after a long time).

There is still a problem if a) reconnect is not desired, or b) reconnect-limit is not desired or is very long.

In these cases, there is no clean way to shut down an application/daemon connected to a stopped/suspended broker.

A related effect: If reconnect is in use and the broker is shut down (cleanly, no need to suspend), the client application/daemon cannot close the session/connection without waiting for the reconnect-limit (if present) to expire.

Comment 6 Gordon Sim 2010-08-20 08:32:31 UTC
Turning on reconnect is not required; that is orthogonal to the problem.

However my initial assessment was incorrect. Heartbeats *don't* solve this problem. The heartbeat timer is turned off just before the close attempt.

Comment 7 Gordon Sim 2010-08-20 08:54:34 UTC
Created attachment 439888
testcase

Comment 8 Gordon Sim 2010-08-20 09:21:40 UTC
Created attachment 439895
Suggested fix

The attached patch fixes the issue by waiting only for the length of the heartbeat interval (if specified) for the broker to respond to the close request.
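
For readers without access to the attachment, the approach described above can be sketched roughly as follows. This is not the attached patch and uses no qpid code: the class CloseHandshake and its methods acknowledged() and waitForAck() are hypothetical names introduced only for illustration, and treating a zero interval as "no heartbeat configured" is an assumption of the sketch.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the client-side state of a close handshake.
// Illustration only; none of these names come from qpid.
class CloseHandshake {
public:
    // Would be called by the I/O thread when the broker confirms the close.
    void acknowledged() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            acked_ = true;
        }
        cond_.notify_all();
    }

    // Waits for the broker's confirmation.
    // Assumption: a zero interval means no heartbeat was specified, so the
    // wait is unbounded (the behaviour reported in this bug). Otherwise the
    // wait is bounded by the heartbeat interval.
    // Returns true if the close was confirmed, false if the wait timed out.
    bool waitForAck(std::chrono::milliseconds heartbeatInterval) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (heartbeatInterval.count() == 0) {
            cond_.wait(lock, [this] { return acked_; });
            return true;
        }
        return cond_.wait_for(lock, heartbeatInterval,
                              [this] { return acked_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool acked_ = false;
};
```

With a suspended broker nothing ever calls acknowledged(), so waitForAck() returns false once the interval elapses and the caller can drop the connection without a clean handshake. When no heartbeat is specified the wait remains unbounded, which matches the "(if specified)" qualifier above.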
