Bug 625541 - C++ New API: Connection::close() hangs on suspended broker
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: 1.3
Target Release: ---
Assigned To: Gordon Sim
QA Contact: MRG Quality Engineering
Reported: 2010-08-19 14:58 EDT by Ted Ross
Modified: 2012-12-11 14:09 EST

Doc Type: Bug Fix
Last Closed: 2012-12-11 14:09:29 EST

Attachments
testcase (1.33 KB, text/plain)
2010-08-20 04:54 EDT, Gordon Sim
Suggested fix (2.31 KB, patch)
2010-08-20 05:21 EDT, Gordon Sim
tross: review+

Description Ted Ross 2010-08-19 14:58:47 EDT
Description of problem:

If an application using the C++ API (new style) is connected to a broker and the broker is stopped/suspended, the client application cannot close the connection.  Connection::close() hangs indefinitely.

Version-Release number of selected component (if applicable): MRG1.3 beta5

How reproducible:  100% (I've reproduced this on Fedora 11)


Steps to Reproduce:
1. Write a C++ client app that connects to a broker, delays, closes the connection, then exits.
2. Start the broker, run the app, and suspend the broker (^Z or SIGSTOP) during the delay.
3. Observe that the app hangs and does not exit.
4. Resume the broker (fg) and see the app exit immediately.
  
Actual results:  Hanging app

Expected results:  App exits normally
Comment 1 Ted Ross 2010-08-19 15:00:42 EDT
Impact:  This is likely to be a problem in deployments where the broker is running on a virtualized guest.  If the guest is force-stopped or suspended, client applications that clean up connections on exit will not be able to exit.
Comment 2 Gordon Sim 2010-08-19 15:07:05 EDT
Not a bug in my view; enable heartbeats, which exist to allow this sort of condition to be detected.
Comment 3 Gordon Sim 2010-08-19 15:08:34 EDT
I guess we could also support a form of close that was not clean; i.e. abort or detach without attempting to do a clean handshake with the broker. That would not be a 1.3 change however.
Comment 4 Gordon Sim 2010-08-19 15:12:16 EDT
Further to my comment above, note that this would be only a limited solution. If the application also did other work as part of shutdown, e.g. cancelled a subscription, then that too would hang. Heartbeats are intended to solve exactly this problem, so that is really my recommendation.
Comment 5 Ted Ross 2010-08-19 17:47:01 EDT
More info...  This can be solved by turning on heartbeats, turning on reconnect, and setting a reconnect-limit.  The blockage will clear after the reconnect limit is reached (typically a long time).

There is still a problem if a) reconnect is not desired, or b) reconnect-limit is not desired or is very long.

In these cases, there is no clean way to shut down an application/daemon connected to a stopped/suspended broker.

A related effect: If reconnect is in use and the broker is shut down (cleanly, no need to suspend), the client application/daemon cannot close the session/connection without waiting for the reconnect-limit (if present) to expire.
Comment 6 Gordon Sim 2010-08-20 04:32:31 EDT
Turning on reconnect is not required; that is orthogonal to the problem. 

However, my initial assessment was incorrect. Heartbeats *don't* solve this problem, because the heartbeat timer is turned off just before the close attempt.
Comment 7 Gordon Sim 2010-08-20 04:54:34 EDT
Created attachment 439888
testcase
Comment 8 Gordon Sim 2010-08-20 05:21:40 EDT
Created attachment 439895
Suggested fix

The attached patch fixes the issue by waiting only for the length of the heartbeat interval (if specified) for the broker to respond to the close request.
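
For readers without access to the attachment, the idea of bounding the wait can be sketched in isolation. This is not the attached patch and does not use any qpid code: `CloseHandshake`, `brokerResponded`, `waitForCloseOk`, and `closeAgainstBroker` are hypothetical names, and treating a zero interval as "no heartbeat configured" is an assumption made for illustration. The sketch uses only the standard library (`std::condition_variable::wait_for`).

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the client-side state of a close handshake.
class CloseHandshake {
  public:
    // Called when the peer acknowledges the close request. In a real client
    // this would be invoked from the I/O thread.
    void brokerResponded() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            acknowledged_ = true;
        }
        cond_.notify_all();
    }

    // Waits for the acknowledgement. A zero interval is taken to mean that no
    // heartbeat was specified (an assumption of this sketch), in which case
    // the wait is unbounded, as in the behaviour reported here.
    // Returns true if the peer acknowledged, false if the wait timed out.
    bool waitForCloseOk(std::chrono::milliseconds heartbeatInterval) {
        assert(heartbeatInterval.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        auto done = [this] { return acknowledged_; };
        if (heartbeatInterval.count() == 0) {
            cond_.wait(lock, done);
            return true;
        }
        return cond_.wait_for(lock, heartbeatInterval, done);
    }

  private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool acknowledged_ = false;
};

// Simulates one close attempt. A suspended broker never acknowledges; a
// responsive one is modelled by delivering the acknowledgement up front.
// Calling this with brokerSuspended == true and a zero interval blocks forever.
bool closeAgainstBroker(bool brokerSuspended,
                        std::chrono::milliseconds heartbeatInterval) {
    CloseHandshake handshake;
    if (!brokerSuspended) {
        handshake.brokerResponded();
    }
    return handshake.waitForCloseOk(heartbeatInterval);
}
```

In this sketch, `closeAgainstBroker(true, std::chrono::milliseconds(100))` returns `false` once the interval has elapsed instead of blocking, while a responsive broker yields `true`. The caller can then decide how to tear down the connection locally when the result is `false`.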
