Description of problem:

The following thread dump shows a deadlock between the failover mutex in AMQConnection.java and the current_exception_lock in AMQSession.java.
This is a regression introduced in rev 985262.

Found one Java-level deadlock:
=============================
"IoReceiver - localhost/127.0.0.1:15672":
  waiting to lock monitor 0x0000002ac2ea3b70 (object 0x0000002ab70156b0, a java.lang.Object),
  which is held by "main"
"main":
  waiting to lock monitor 0x0000002ac28db1b8 (object 0x0000002ab7048d70, a java.lang.Object),
  which is held by "IoReceiver - localhost/127.0.0.1:15672"

Java stack information for the threads listed above:
===================================================
"IoReceiver - localhost/127.0.0.1:15672":
	at org.apache.qpid.client.AMQConnection.exceptionReceived(AMQConnection.java:1297)
	* waiting to lock<0x0000002ab70156b0> (a java.lang.Object)
	at org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1033)
	* locked<0x0000002ab7048d70> (a java.lang.Object)
	at org.apache.qpid.client.AMQSession_0_10.exception(AMQSession_0_10.java:913)
	at org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:156)
	at org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:32)
	at org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:112)
	at org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:50)
	at org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:32)
	at org.apache.qpid.transport.Method.delegate(Method.java:159)
	at org.apache.qpid.transport.Session.received(Session.java:528)
	at org.apache.qpid.transport.Connection.dispatch(Connection.java:404)
	at org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:64)
	at org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:40)
	at org.apache.qpid.transport.MethodDelegate.executionException(MethodDelegate.java:110)
	at org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:112)
	at org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:54)
	at org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:40)
	at org.apache.qpid.transport.Method.delegate(Method.java:159)
	at org.apache.qpid.transport.Connection.received(Connection.java:369)
	at org.apache.qpid.transport.Connection.received(Connection.java:59)
	at org.apache.qpid.transport.network.Assembler.emit(Assembler.java:95)
	at org.apache.qpid.transport.network.Assembler.assemble(Assembler.java:196)
	at org.apache.qpid.transport.network.Assembler.frame(Assembler.java:129)
	at org.apache.qpid.transport.network.Frame.delegate(Frame.java:133)
	at org.apache.qpid.transport.network.Assembler.received(Assembler.java:100)
	at org.apache.qpid.transport.network.Assembler.received(Assembler.java:42)
	at org.apache.qpid.transport.network.InputHandler.next(InputHandler.java:187)
	at org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:103)
	at org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:42)
	at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:128)
	at java.lang.Thread.run(Thread.java:619)
"main":
	at org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1025)
	* waiting to lock<0x0000002ab7048d70> (a java.lang.Object)
	at org.apache.qpid.client.BasicMessageConsumer_0_10.sendCancel(BasicMessageConsumer_0_10.java:193)
	at org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:573)
	* locked<0x0000002ab70156b0> (a java.lang.Object)
	at org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:535)
	at org.apache.qpid.client.AMQQueueBrowser.close(AMQQueueBrowser.java:102)
	at org.apache.qpid.test.client.QueueBrowserAutoAckTest.testFailoverWithQueueBrowser(QueueBrowserAutoAckTest.java:501)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at junit.framework.TestCase.runTest(TestCase.java:154)
	at junit.framework.TestCase.runBare(TestCase.java:127)
	at org.apache.qpid.test.utils.QpidBrokerTestCase.runBare(QpidBrokerTestCase.java:234)
	at junit.framework.TestResult$1.protect(TestResult.java:106)
	at junit.framework.TestResult.runProtected(TestResult.java:124)
	at junit.framework.TestResult.run(TestResult.java:109)
	at junit.framework.TestCase.run(TestCase.java:118)
	at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:120)
	at junit.framework.TestSuite.runTest(TestSuite.java:208)
	at junit.framework.TestSuite.run(TestSuite.java:203)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:546)

Found 1 deadlock

Version-Release number of selected component (if applicable):
All beta/RC packages created for the 2.0 release.

How reproducible:
Fairly regularly when running QueueBrowserAutoAckTest.

Steps to Reproduce:
1. Run ant test -Dprofile=cpp
2. When the tests seem to be stuck in QueueBrowserAutoAckTest, find the process and run kill -3 on it to obtain a thread dump.

Actual results:
The client deadlocks as described above.

Expected results:
The test should complete without deadlocking.
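As a complement to the kill -3 step, the same monitor deadlock can also be detected from inside the JVM. The sketch below is not part of the Qpid test suite; the class name DeadlockProbe and the method describeDeadlocks are hypothetical. It relies only on the standard java.lang.management API and assumes a JVM that supports monitoring of ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Qpid: reports deadlocked threads in the current JVM.
class DeadlockProbe {

    // Returns a description of all deadlocked threads, or an empty string if there are none.
    static String describeDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when no deadlock is found
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // Request locked monitors and synchronizers so the output names the held locks.
        for (ThreadInfo info : bean.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have terminated in the meantime
                sb.append(info);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlock detected" : report);
    }
}
```

A test harness could call describeDeadlocks() from a watchdog thread when a test exceeds its time budget, which would record the lock owners without manual intervention.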
See also https://bugzilla.redhat.com/show_bug.cgi?id=698657
This is tracked upstream as QPID-3214.
A fix was committed upstream at rev 1099060:
http://svn.apache.org/viewvc?view=revision&revision=1099060
It was ported to the internal mrg_2.0.x release branch at:
http://mrg1.lab.bos.redhat.com/cgit/qpid.git/commit/?h=mrg_2.0.x&id=cd703ec7af8dd9c14c6dd10ceb47c445ac177c2b
*** Bug 698657 has been marked as a duplicate of this bug. ***
The fix is included in qpid-java-0.10-6.
This issue has been fixed.

Verified on RHEL5.6 and RHEL6.1, architectures i386 and x86_64.
The Java unit tests from the qpid-java-0.10-6 package were executed in a loop, and no deadlock occurred.

During the verification of this bug, I noticed that several Java unit tests fail; please see Bug 709383.

Packages installed:
python-qpid-0.10-1.el5
python-qpid-qmf-0.10-9.el5
qpid-cpp-client-0.10-7.el5
qpid-cpp-client-devel-0.10-7.el5
qpid-cpp-client-devel-docs-0.10-7.el5
qpid-cpp-client-rdma-0.10-7.el5
qpid-cpp-client-ssl-0.10-7.el5
qpid-cpp-mrg-debuginfo-0.10-7.el5
qpid-cpp-server-0.10-7.el5
qpid-cpp-server-cluster-0.10-7.el5
qpid-cpp-server-devel-0.10-7.el5
qpid-cpp-server-rdma-0.10-7.el5
qpid-cpp-server-ssl-0.10-7.el5
qpid-cpp-server-store-0.10-7.el5
qpid-cpp-server-xml-0.10-7.el5
qpid-java-client-0.10-6.el5
qpid-java-common-0.10-6.el5
qpid-java-example-0.10-6.el5
qpid-qmf-0.10-9.el5
qpid-qmf-devel-0.10-9.el5
qpid-tools-0.10-5.el5

-> VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:

Cause
This happens when the application uses a synchronous operation and the broker reports an exception. The Qpid client tries to report the exception both via the connection listener and as a JMS exception thrown from the blocking method call.

Consequence
The two reporting paths acquire the failover mutex and the current_exception_lock in opposite order, so the client can deadlock and the application can become unresponsive.

Fix
The call to connection.exceptionReceived() is now made outside the scope of the current_exception_lock in AMQSession.java.

Result
The Qpid client no longer deadlocks.
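The lock ordering described in the technical note can be illustrated with a small standalone sketch. This is not the Qpid source: the class SessionSketch and its fields and methods are hypothetical stand-ins that only mirror the names in the thread dump. It assumes Java 8 or later for the lambda syntax. The receiver thread records the exception under the session lock and releases it before calling into the connection, so it never holds one lock while waiting for the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the session; illustrates the lock ordering only.
class SessionSketch {
    private final Object currentExceptionLock = new Object(); // models current_exception_lock
    private final Object failoverMutex;                       // models the connection's failover mutex
    private Exception currentException;
    private final List<Exception> reported = new ArrayList<>();

    SessionSketch(Object failoverMutex) {
        this.failoverMutex = failoverMutex;
    }

    // Models connection.exceptionReceived(): it needs the failover mutex.
    private void exceptionReceived(Exception e) {
        synchronized (failoverMutex) {
            reported.add(e);
        }
    }

    // Receiver-thread path after the fix: record under the session lock,
    // then notify the connection only after that lock has been released.
    void setCurrentException(Exception e) {
        synchronized (currentExceptionLock) {
            currentException = e;
        }
        exceptionReceived(e); // called outside the scope of currentExceptionLock
    }

    // Application-thread path: holds the failover mutex, then takes the session lock.
    Exception closeConsumer() {
        synchronized (failoverMutex) {
            synchronized (currentExceptionLock) {
                return currentException;
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish in time.
    static boolean runDemo() {
        SessionSketch session = new SessionSketch(new Object());
        Thread receiver = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                session.setCurrentException(new RuntimeException("broker error " + i));
            }
        }, "receiver");
        Thread app = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                session.closeConsumer();
            }
        }, "app");
        receiver.setDaemon(true); // daemon threads so a hang cannot block JVM exit
        app.setDaemon(true);
        receiver.start();
        app.start();
        try {
            receiver.join(5_000);
            app.join(5_000);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !receiver.isAlive() && !app.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runDemo());
    }
}
```

Moving the exceptionReceived(e) call back inside the synchronized (currentExceptionLock) block would recreate the cycle seen in the thread dump: the receiver would hold the session lock while waiting for the failover mutex, and the application thread would hold the failover mutex while waiting for the session lock.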
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0890.html