Created attachment 481391 [details]
Test script run from qpid/java/tools/bin directory

When the JMS client is run with the attached script to measure topic scalability, the test fails on the 100 test (100 subscribers) with a "Connection reset" error. The test increases the number of subscribers to a single topic, fed by a single producer, using the progression 1, 3, 10, 30, 100, 300 ..., but it has never succeeded beyond the 30 test.

This bug is one of two possible failure outcomes of the 100 test. The 100 test always fails; this bug's outcome accounts for approximately 50% of the failures. Note that it seems odd for the failure to be reported with code 200 (success).

Error when running test Connection reset
org.apache.qpid.AMQConnectionFailureException: Connection reset [error code 200: reply success]
    at org.apache.qpid.client.AMQConnection.<init>(AMQConnection.java:472)
    at org.apache.qpid.client.AMQConnection.<init>(AMQConnection.java:246)
    at org.apache.qpid.tools.PerfBase.setUp(PerfBase.java:55)
    at org.apache.qpid.tools.PerfConsumer.setUp(PerfConsumer.java:105)
    at org.apache.qpid.tools.PerfConsumer.test(PerfConsumer.java:222)
    at org.apache.qpid.tools.PerfConsumer$1.run(PerfConsumer.java:301)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.apache.qpid.AMQException: Cannot connect to broker: Connection reset [error code 200: reply success]
    at org.apache.qpid.client.AMQConnectionDelegate_0_10.makeBrokerConnection(AMQConnectionDelegate_0_10.java:197)
    at org.apache.qpid.client.AMQConnection.makeBrokerConnection(AMQConnection.java:617)
    at org.apache.qpid.client.AMQConnection.<init>(AMQConnection.java:396)
    ... 6 more
Caused by: org.apache.qpid.transport.ConnectionException: Connection reset
    at org.apache.qpid.transport.ConnectionException.rethrow(ConnectionException.java:67)
    at org.apache.qpid.transport.Connection.connect(Connection.java:267)
    at org.apache.qpid.client.AMQConnectionDelegate_0_10.makeBrokerConnection(AMQConnectionDelegate_0_10.java:178)
    ... 8 more
Caused by: org.apache.qpid.transport.ConnectionException: Connection reset
    at org.apache.qpid.transport.Connection.exception(Connection.java:511)
    at org.apache.qpid.transport.network.Assembler.exception(Assembler.java:105)
    at org.apache.qpid.transport.network.InputHandler.exception(InputHandler.java:197)
    at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:145)
    ... 1 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:185)
    at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:123)
    ... 1 more
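For anyone triaging similar traces: the "error code 200: reply success" text appears only on the two outer wrapper exceptions, while the innermost cause in the trace above is a plain java.net.SocketException. A minimal sketch of walking a cause chain to its root is shown below. The class RootCauseDemo and its methods are hypothetical, and plain java.lang.Exception is used as a stand-in for the Qpid exception classes, so this only illustrates the shape of the chain and does not explain why the client reports code 200.

```java
import java.net.SocketException;

// Hypothetical demo class; not part of the Qpid client or test tools.
final class RootCauseDemo {

    // Follow getCause() links until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Build a chain shaped like the trace in this report. Plain Exception
    // is an assumed stand-in for the Qpid wrapper exception types.
    static Throwable sampleChain() {
        Throwable socket = new SocketException("Connection reset");
        Throwable transport = new Exception("Connection reset", socket);
        Throwable broker = new Exception(
                "Cannot connect to broker: Connection reset [error code 200: reply success]",
                transport);
        return new Exception("Connection reset [error code 200: reply success]", broker);
    }

    public static void main(String[] args) {
        Throwable root = rootCause(sampleChain());
        // Prints the class and message of the innermost cause.
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

Logging the root cause alongside the outer message makes it easier to tell a socket-level reset from a protocol-level close when comparing the two failure outcomes.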
The other outcome described above (connection timed out) is in bug 680943.
While testing on a 2-machine config (i.e. broker on mrg43, JMS client on mrg42), and while running transient-only topic tests, I see the client freeze up on the 100, 300, and 1000-client sections. If I run jdb against the client, I see that several instances of PerfConsumer are waiting for the end of the test, even though the producer has long since exited. See the patch attachment for the modifications and scripts which produce this result.

In the example below, the 1000-client test hung, and 5 threads are still waiting for the end of the test (although I have seen up to 100 waiting threads on some hangs):

[kpvdr@mrg42 java]$ jdb -attach 8000
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
> threads
Group system:
  (java.lang.ref.Reference$ReferenceHandler)0x7a4 Reference Handler cond. waiting
  (java.lang.ref.Finalizer$FinalizerThread)0x7a5 Finalizer cond. waiting
  (java.lang.Thread)0x7a6 Signal Dispatcher running
Group main:
  (java.lang.Thread)0x1 main cond. waiting
  (java.lang.Thread)0x7a8 Thread-34 cond. waiting
  (java.lang.Thread)0x7a9 Thread-123 cond. waiting
  (java.lang.Thread)0x7aa Thread-160 cond. waiting
  (java.lang.Thread)0x7ab Thread-299 cond. waiting
  (java.lang.Thread)0x7ac Thread-732 cond. waiting
  (java.lang.Thread)0x7ad IoSender - /20.0.10.43:5672 cond. waiting
  (java.lang.Thread)0x7ae IoSender - /20.0.10.43:5672 cond. waiting
  (java.lang.Thread)0x7af IoSender - /20.0.10.43:5672 cond. waiting
  (java.lang.Thread)0x7b0 IoReceiver - /20.0.10.43:5672 running
  (java.lang.Thread)0x7b1 IoReceiver - /20.0.10.43:5672 running
  (java.lang.Thread)0x7b2 IoSender - /20.0.10.43:5672 cond. waiting
  (java.lang.Thread)0x7b3 IoSender - /20.0.10.43:5672 cond. waiting
  (java.lang.Thread)0x7b4 IoReceiver - /20.0.10.43:5672 running
  (java.lang.Thread)0x7b5 IoReceiver - /20.0.10.43:5672 running
  (java.lang.Thread)0x7b6 IoReceiver - /20.0.10.43:5672 running
  (java.util.TimerThread)0x7b7 ack-flusher cond. waiting
  (java.lang.Thread)0x7b8 Dispatcher-Channel-0 cond. waiting
  (java.lang.Thread)0x7b9 Dispatcher-Channel-0 cond. waiting
  (java.lang.Thread)0x7ba Dispatcher-Channel-0 cond. waiting
  (java.lang.Thread)0x7bb Dispatcher-Channel-0 cond. waiting
  (java.lang.Thread)0x7bc Dispatcher-Channel-0 cond. waiting
> suspend
All threads suspended.
> thread 0x1
main[1] where
  [1] java.lang.Object.wait (native method)
  [2] java.lang.Thread.join (Thread.java:1,160)
  [3] java.lang.Thread.join (Thread.java:1,213)
  [4] org.apache.qpid.tools.PerfConsumer.main (PerfConsumer.java:322)
main[1] thread 0x7a8
Thread-34[1] where
  [1] java.lang.Object.wait (native method)
  [2] java.lang.Object.wait (Object.java:502)
  [3] org.apache.qpid.tools.PerfConsumer.calcResults (PerfConsumer.java:148)
  [4] org.apache.qpid.tools.PerfConsumer.test (PerfConsumer.java:225)
  [5] org.apache.qpid.tools.PerfConsumer$1.run (PerfConsumer.java:301)
  [6] java.lang.Thread.run (Thread.java:636)
Thread-34[1] thread 0x7ad
IoSender - /20.0.10.43:5672[1] where
  [1] java.lang.Object.wait (native method)
  [2] java.lang.Object.wait (Object.java:502)
  [3] org.apache.qpid.transport.network.io.IoSender.run (IoSender.java:247)
  [4] java.lang.Thread.run (Thread.java:636)
IoSender - /20.0.10.43:5672[1] thread 0x7b0
IoReceiver - /20.0.10.43:5672[1] where
  [1] java.net.SocketInputStream.socketRead0 (native method)
  [2] java.net.SocketInputStream.read (SocketInputStream.java:146)
  [3] org.apache.qpid.transport.network.io.IoReceiver.run (IoReceiver.java:123)
  [4] java.lang.Thread.run (Thread.java:636)
IoReceiver - /20.0.10.43:5672[1] thread 0x7b8
Dispatcher-Channel-0[1] where
  [1] java.lang.Object.wait (native method)
  [2] java.lang.Object.wait (Object.java:502)
  [3] org.apache.qpid.client.util.FlowControllingBlockingQueue.take (FlowControllingBlockingQueue.java:92)
  [4] org.apache.qpid.client.AMQSession$Dispatcher.run (AMQSession.java:3,242)
  [5] java.lang.Thread.run (Thread.java:636)
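The same kind of information as the jdb session above (which threads are parked in a wait) can also be collected in-process with the standard Thread.getAllStackTraces() call. The following is only a rough sketch of that idea: the class WaitingThreadsDemo, the "consumer-" thread-name prefix, and the toy threads it starts are all hypothetical and are not taken from PerfConsumer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the Qpid test tools.
final class WaitingThreadsDemo {

    // Names of live threads in WAITING state whose name starts with prefix.
    static List<String> waitingThreads(String prefix) {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix) && t.getState() == Thread.State.WAITING) {
                names.add(t.getName());
            }
        }
        return names;
    }

    // Start n toy threads that block in Object.wait(), count how many are
    // observed as WAITING, then release them. Returns the observed count.
    static int demo(int n) {
        final Object lock = new Object();
        final boolean[] done = {false};
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                synchronized (lock) {
                    while (!done[0]) {
                        try {
                            lock.wait(); // untimed wait, as in the hung consumers
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }, "consumer-" + i);
            t.start();
            threads.add(t);
        }
        try {
            int waiting = 0;
            // Poll (up to about 5 s) until every toy thread has reached wait().
            for (int attempt = 0; attempt < 500 && waiting < n; attempt++) {
                Thread.sleep(10);
                waiting = waitingThreads("consumer-").size();
            }
            synchronized (lock) {
                done[0] = true;
                lock.notifyAll(); // release the waiters so the demo can exit
            }
            for (Thread t : threads) {
                t.join();
            }
            return waiting;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while inspecting threads", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("waiting consumer threads: " + demo(5));
    }
}
```

A periodic check like waitingThreads(...) in the test harness would flag a hang without attaching a debugger, although jdb remains the better tool for inspecting individual frames.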
Created attachment 489213 [details]
Diff to produce modified test setup allowing multiple consumers

This is the patch for producing the symptoms in Comment #2 above. It includes all the changes to the Java test source and the test script perf-topic.sh (which has its durable section commented out at present so that all runs are transient only).
The symptoms and attachment from comment 2 and comment 3 above are not directly related to this bug; my mistake. Please ignore.
Comment on attachment 489213 [details]
Diff to produce modified test setup allowing multiple consumers

Not related to this bug.