Bug 505231 - If broker disconnects client for heartbeat timeout, client can crash if it then heartbeat timeouts broker
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: 1.1.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: messaging-bugs
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-06-11 06:00 UTC by Andrew Stitcher
Modified: 2020-11-04 17:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Attachments
combined changes that should fix most of the issues that cause this behaviour (3.10 KB, patch)
2009-06-23 02:38 UTC, Andrew Stitcher
Corrected patch (not reverse patch) (3.10 KB, patch)
2009-06-23 02:40 UTC, Andrew Stitcher

Description Andrew Stitcher 2009-06-11 06:00:30 UTC
Description of problem:

When running the reproducer for BZ504590, the perftest client occasionally crashes with an assertion failure like:
lt-perftest: ../../../qpid-working/cpp/src/qpid/sys/DispatchHandle.cpp:337: void qpid::sys::DispatchHandle::call(boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >): Assertion `poller' failed.

How reproducible:

Less than 1 in 20 loops of the test

Steps to Reproduce:
(may be reproducible without a cluster)

Run clustered brokers:
qpidd --auth no --cluster-name ams --port 21022 --no-data-dir
qpidd --auth no --cluster-name ams --port 21023 --no-data-dir
qpidd --auth no --cluster-name ams --port 21024 --no-data-dir

Run perftest in a loop:
while true; do src/tests/perftest --port 21022 --heartbeat 1 & sleep 2 ; kill -STOP %% ; sleep 4 ; kill -CONT %%; done

Comment 1 Andrew Stitcher 2009-06-11 06:06:34 UTC
I'm pretty sure the crash happens just after the perftest process receives the continue signal, by which point the broker has already disconnected it.

Sometimes the client's own heartbeat failure code is triggered just after the client has destroyed enough of the connection state to make handling the heartbeat timeout fail.
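
To illustrate the suspected race, here is a minimal sketch of one way a late heartbeat timeout could be made harmless. It is not the attached patch and does not use qpid's real classes: FakeConnection, pollerAttached, closeByPeer, abortOnHeartbeatTimeout and makeHeartbeatTimeout are all hypothetical names invented for this example. The idea is that the timeout callback holds only a weak reference and re-checks the connection state before acting, so a timer that fires after teardown is ignored instead of touching destroyed state. The sketch is single-threaded; real code would also need to synchronise the state check with the teardown path.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a client connection; NOT qpid's real classes.
struct FakeConnection {
    // Plays the role of the `poller` that the failing assertion checks.
    bool pollerAttached = true;
    std::vector<std::string> log;

    // The broker dropped the connection: tear down the dispatch state.
    void closeByPeer() {
        pollerAttached = false;
        log.push_back("closed by peer");
    }

    // Local heartbeat timeout: only valid while the dispatch state exists.
    void abortOnHeartbeatTimeout() {
        assert(pollerAttached);
        log.push_back("aborted by local heartbeat timeout");
        pollerAttached = false;
    }
};

// Builds a timeout callback that keeps only a weak reference to the
// connection and re-checks its state before acting.
// Returns true if the timeout was acted on, false if it was ignored.
std::function<bool()> makeHeartbeatTimeout(
        const std::shared_ptr<FakeConnection>& connection) {
    std::weak_ptr<FakeConnection> weak = connection;
    return [weak]() {
        std::shared_ptr<FakeConnection> conn = weak.lock();
        if (!conn || !conn->pollerAttached) {
            return false;  // late timer after teardown: ignore it
        }
        conn->abortOnHeartbeatTimeout();
        return true;
    };
}
```

With this shape, a timeout that fires after closeByPeer() returns false and leaves the connection untouched, whereas an unguarded call to abortOnHeartbeatTimeout() at that point would hit the assertion, which is analogous to the `poller' failure reported above.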

Comment 2 Andrew Stitcher 2009-06-11 14:50:01 UTC
This issue does not occur in the qpid trunk (code leading to 1.2)

Comment 3 Andrew Stitcher 2009-06-23 02:38:37 UTC
Created attachment 349027
combined changes that should fix most of the issues that cause this behaviour

Comment 4 Andrew Stitcher 2009-06-23 02:40:49 UTC
Created attachment 349028
Corrected patch (not reverse patch)

Comment 5 Andrew Stitcher 2009-06-23 02:41:32 UTC
This patch applied to 758581-19

