Description of problem:
Running the reproducer (to be attached), I got one node shut down with this log entry:

2012-12-20 14:44:55 critical Error delivering frames: Unknown connection: Frame[BEbe; channel=0; {ClusterConnectionDeliverDoOutputBody: limit=2048; }] control 10.34.1.241:24393-3125 (qpid/cluster/Cluster.cpp:554)

A random node crashes, usually within 30 minutes.

Version-Release number of selected component (if applicable):
qpid-cpp-server-0.14-22.el6_3.x86_64

How reproducible:
30% (the longer the reproducer runs, the more likely the failure)

Steps to Reproduce:
Run the reproducer (to be attached).

Actual results:
After some time, one node stops with the above critical error.

Expected results:
No cluster de-sync.

Additional info:
Created attachment 666734 [details]
reproducer

To reproduce:
1) Unpack the attachment.
2) Compile the auxiliary C++ client:
g++ -lqpidclient -lqpidcommon -lqpidmessaging -lqpidtypes OptionParser.o spout_drain_in_one_session.cpp -o spout_drain_in_one_session
3) Have qpid-receive in $PATH.
4) Run ./889241_reproducer.sh

In a nutshell, the reproducer sends and consumes messages to/from 30 durable queues in parallel, such that the queues are usually almost empty. A normal 3-node (old) cluster is used.
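Since steps 2 and 3 depend on tools being available before the script is started, a small pre-flight check can save a failed run. The following is only a sketch: the helper name require_cmds is hypothetical and is not part of the attached reproducer.

```shell
# Hypothetical helper (not part of the attachment): report every command
# from the argument list that cannot be found in $PATH.
# Returns 0 if all commands are present, 1 otherwise.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing prerequisite: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be used as `require_cmds g++ qpid-receive && ./889241_reproducer.sh`, so that the reproducer only starts when both tools are found.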
Sometimes a broker shut down for a different reason while running the same reproducer:

2012-12-20 15:19:47 error Execution exception: invalid-argument: anonymous.69a3b08c-c6ec-47fa-bc5b-d2be5a725b7d: Known-completed has invalid commands. (qpid/SessionState.cpp:219)
2012-12-20 15:19:47 critical cluster(10.34.1.241:18980 READY/error) local error 10152172 did not occur on member 10.34.1.241:18977: invalid-argument: anonymous.69a3b08c-c6ec-47fa-bc5b-d2be5a725b7d: Known-completed has invalid commands. (qpid/SessionState.cpp:219)
2012-12-20 15:19:47 critical Error delivering frames: local error did not occur on all cluster members : invalid-argument: anonymous.69a3b08c-c6ec-47fa-bc5b-d2be5a725b7d: Known-completed has invalid commands. (qpid/SessionState.cpp:219) (qpid/cluster/ErrorCheck.cpp:89)
2012-12-20 15:19:47 notice cluster(10.34.1.241:18980 LEFT/error) leaving cluster dst
2012-12-20 15:19:47 critical Error in cluster dispatch: Error in CPG dispatch: library (2)
2012-12-20 15:19:47 notice Shut down
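To tell the two failure modes apart on each node, it may help to pull only the critical entries out of the broker log. This is a rough sketch that assumes the log lines start with a "YYYY-MM-DD HH:MM:SS" timestamp followed by the severity, as in the excerpts above; the function name critical_lines is made up for illustration, and the location of the broker log depends on the local configuration.

```shell
# Hypothetical helper: print only the "critical" entries of a qpidd log.
# Reads the files given as arguments, or standard input if none are given.
# Assumes each entry starts with "YYYY-MM-DD HH:MM:SS <severity> ".
critical_lines() {
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} critical ' "$@"
}
```

An "Unknown connection" entry would then point to the first failure, and a "local error did not occur on all cluster members" entry to the second.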
(In reply to comment #1)
> Created attachment 666734 [details]
> reproducer
>
> To reproduce:
> 1) unpack
> 2) Translate some auxiliary C++ client:
> g++ -lqpidclient -lqpidcommon -lqpidmessaging -lqpidtypes OptionParser.o
> spout_drain_in_one_session.cpp -o spout_drain_in_one_session
> 3) Have qpid-receive in $PATH
> 4) ./889241_reproducer.sh
>
>
> In nutshell, the reproducer sends and consumes messages to/from 30 durable
> queues in parallel, such that the queues are usually almost empty. Normal
> 3node (old) cluster is used.

me--: I forgot to attach OptionParser; take it from the qpid-cpp-client-devel package.
This scenario comes from my internal testing and is not based on a customer use case.