Bug 923818
| Summary: | Failed to send entries to node N : InterruptedException: CacheException: InterruptedException | | |
|---|---|---|---|
| Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Ladislav Thon <lthon> |
| Component: | Clustering | Assignee: | Paul Ferraro <paul.ferraro> |
| Status: | CLOSED DEFERRED | QA Contact: | Jitka Kozana <jkudrnac> |
| Severity: | high | Docs Contact: | Russell Dickenson <rdickens> |
| Priority: | unspecified | | |
| Version: | 6.1.1 | CC: | cdewolf, jkudrnac, lcosti, myarboro, rhusar |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | TBD EAP 7 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-03-19 11:08:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Still seeing this with EAP 6.1.0.ER8. For example:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/24/artifact/report/config/jboss-perf18/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/54/artifact/report/config/jboss-perf18/server.log

Still seeing this with EAP 6.1.1.ER7. For example:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync/26/
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/22/

The issue is still with us in EAP 6.2.0.CR3. For example:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/31/artifact/report/config/jboss-perf19/server.log

Graceful shutdown is not supported in the affected version of EAP 6. The described behavior is to be expected. The RFE to implement graceful shutdown is https://issues.jboss.org/browse/EAP6-7
Issues relating to HTTP can be avoided by using mod_cluster and session draining prior to shutdown/undeploy.

This bug pertains to clean shutdown, which is not scheduled to be implemented in EAP 6.x and is targeted to be addressed in 7.0. Setting resolution to WONTFIX.

Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved
Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened
Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved
Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened
Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved
Seen these exceptions during EAP 6.1.0.ER3 testing:

07:07:57,766 ERROR [org.infinispan.remoting.rpc.RpcManagerImpl] (transport-thread-7) ISPN000073: Unexpected error while replicating: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961) [rt.jar:1.6.0_43]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2128) [rt.jar:1.6.0_43]
	at org.jgroups.blocks.Request.responsesComplete(Request.java:197)
	at org.jgroups.blocks.Request.execute(Request.java:89)
	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:370)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:190)
	at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:257)
	at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:187)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]

07:07:57,771 ERROR [org.infinispan.statetransfer.OutboundTransferTask] (transport-thread-7) Failed to send entries to node perf20/web : java.lang.InterruptedException: org.infinispan.CacheException: java.lang.InterruptedException
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:179)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:190)
	at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:257)
	at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:187)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]
Caused by: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961) [rt.jar:1.6.0_43]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2128) [rt.jar:1.6.0_43]
	at org.jgroups.blocks.Request.responsesComplete(Request.java:197)
	at org.jgroups.blocks.Request.execute(Request.java:89)
	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:370)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169)
	... 12 more

I believe that the first one is actually the cause of the second one. As can be seen in the log, the node receives a new view, then logs a bunch of "Caught exception when handling command CacheTopologyControlCommand{cache=dist, type=REBALANCE_CONFIRM, sender=perf19/web, joinInfo=null, topologyId=26, currentCH=null, pendingCH=null, throwable=null, viewId=10}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf19/web for cache dist, we don't have a rebalance in progress" and then these two exceptions appear.

Seen in https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-async/23/console-perf18/

So yeah, it's dist/async, but perf20 is online when this happens (and it's in the new view).
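
For readers unfamiliar with this exception shape, the following is a minimal sketch of the mechanism, not Infinispan or JGroups code. It assumes the interrupt comes from the worker pool being stopped while a task is blocked waiting for a remote response; the traces above do not show what actually interrupted transport-thread-7. The class `InterruptedTransferSketch`, the exception `TransferException` (a stand-in for org.infinispan.CacheException), and the methods `sendEntries` and `runAndInterrupt` are all hypothetical names. It uses lambdas, so it needs Java 8 or later rather than the 1.6 runtime seen in the traces.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class InterruptedTransferSketch {

    /** Hypothetical stand-in for org.infinispan.CacheException. */
    static class TransferException extends RuntimeException {
        TransferException(Throwable cause) {
            super(cause);
        }
    }

    /** Blocks on a Condition for a response that never arrives, like a synchronous RPC. */
    static void sendEntries(CountDownLatch aboutToBlock) {
        ReentrantLock lock = new ReentrantLock();
        Condition responseReceived = lock.newCondition();
        lock.lock();
        try {
            aboutToBlock.countDown();
            while (true) {                  // loop guards against spurious wakeups
                responseReceived.await();   // throws InterruptedException when interrupted
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // restore the interrupt flag
            throw new TransferException(e);      // wrap, as in the second trace
        } finally {
            lock.unlock();
        }
    }

    /** Starts the task, stops the pool with shutdownNow(), and returns the task's failure. */
    static Throwable runAndInterrupt() throws InterruptedException {
        ExecutorService transportPool = Executors.newSingleThreadExecutor();
        CountDownLatch aboutToBlock = new CountDownLatch(1);
        Future<?> task = transportPool.submit(() -> sendEntries(aboutToBlock));
        aboutToBlock.await();           // make sure the task is running first
        transportPool.shutdownNow();    // interrupts the running worker thread
        try {
            task.get(5, TimeUnit.SECONDS);
            return null;                // not expected: the task never finishes normally
        } catch (ExecutionException e) {
            return e.getCause();        // the TransferException thrown by the task
        } catch (TimeoutException e) {
            return e;
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable failure = runAndInterrupt();
        System.out.println(failure + " caused by " + failure.getCause());
    }
}
```

In this sketch the wrapped exception only reaches the caller through the `Future`. In the report, the same pattern is logged twice: once where the interrupt is caught and once where the wrapping exception is handled, which would be consistent with the first trace being the cause of the second.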