Red Hat Bugzilla – Bug 505231
If broker disconnects client for heartbeat timeout, client can crash if it then heartbeat timeouts broker
Last modified: 2011-08-12 12:16:09 EDT
Description of problem:
When running the reproducer for BZ504590, the perftest client occasionally crashes with an assertion failure like:
lt-perftest: ../../../qpid-working/cpp/src/qpid/sys/DispatchHandle.cpp:337: void qpid::sys::DispatchHandle::call(boost::function1<void, qpid::sys::DispatchHandle&, std::allocator<boost::function_base> >): Assertion `poller' failed.
The crash occurs in fewer than 1 in 20 loops of the test.
Steps to Reproduce:
(may be reproducible without a cluster)
Run clustered brokers:
qpidd --auth no --cluster-name ams --port 21022 --no-data-dir
qpidd --auth no --cluster-name ams --port 21023 --no-data-dir
qpidd --auth no --cluster-name ams --port 21024 --no-data-dir
Run perftest in a loop:
while true; do src/tests/perftest --port 21022 --heartbeat 1 & sleep 2 ; kill -STOP %% ; sleep 4 ; kill -CONT %%; done
I'm pretty sure the crash happens just after the perftest process receives the continue signal, by which point the broker has already disconnected it for missing heartbeats.
Sometimes the client's own heartbeat timeout code is triggered just after the client has destroyed enough of the connection state to make the timeout handling fail.
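The race described above can be illustrated with a minimal sketch. This is not the qpid code and not the attached patch: Poller, GuardedHandle, stopWatch and replayLateTimeout are hypothetical names invented for illustration. The sketch only shows the general idea of having a late heartbeat timeout check, under a lock, whether the connection teardown has already released the poller, and dropping the callback instead of asserting.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the poller that the real handle asserts on.
struct Poller {
    int dispatched = 0;
    void dispatch(const std::function<void()>& f) { ++dispatched; f(); }
};

// Hypothetical handle: call() reports failure instead of asserting
// when the poller has already been released by connection teardown.
class GuardedHandle {
public:
    explicit GuardedHandle(std::shared_ptr<Poller> p) : poller(std::move(p)) {}

    // Teardown path, e.g. the broker has closed the connection.
    void stopWatch() {
        std::lock_guard<std::mutex> l(lock);
        poller.reset();
    }

    // Heartbeat timer path. Returns true only if the callback was dispatched.
    bool call(const std::function<void()>& callback) {
        std::shared_ptr<Poller> p;
        {
            std::lock_guard<std::mutex> l(lock);
            p = poller;  // keep the poller alive while we use it
        }
        if (!p) return false;  // teardown won the race: ignore the late timeout
        p->dispatch(callback);
        return true;
    }

private:
    std::mutex lock;
    std::shared_ptr<Poller> poller;
};

// Replays the order of events from this report: teardown first,
// then the client's own heartbeat timeout firing afterwards.
inline void replayLateTimeout() {
    GuardedHandle handle(std::make_shared<Poller>());
    bool ran = false;
    handle.stopWatch();
    bool dispatched = handle.call([&ran] { ran = true; });
    assert(!dispatched && !ran);
}
```

Calling replayLateTimeout() exercises the problematic ordering; with the guard in place the late timeout is simply dropped rather than tripping an assertion.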
This issue does not occur in the qpid trunk (code leading to 1.2)
Created attachment 349027
combined changes that should fix most of the issues that cause this behaviour
Created attachment 349028
Corrected patch (not reverse patch)
This patch applied to 758581-19