923818 – Failed to send entries to node N : InterruptedException: CacheException: InterruptedException

Bug 923818 - Failed to send entries to node N : InterruptedException: CacheException: InterruptedException

Keywords:
Status:	CLOSED DEFERRED
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	Clustering
Sub Component:
Version:	6.1.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	TBD EAP 7
Assignee:	Paul Ferraro
QA Contact:	Jitka Kozana
Docs Contact:	Russell Dickenson
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-03-20 14:11 UTC by Ladislav Thon
Modified:	2016-02-28 16:47 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-03-19 11:08:30 UTC
Type:	Bug
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	EAP7-86	0	Critical	Closed	Graceful Shutdown and Quiescing	2017-12-21 16:14:20 UTC

Description Ladislav Thon 2013-03-20 14:11:39 UTC

Seen these exceptions during EAP 6.1.0.ER3 testing:

07:07:57,766 ERROR [org.infinispan.remoting.rpc.RpcManagerImpl] (transport-thread-7) ISPN000073: Unexpected error while replicating: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961) [rt.jar:1.6.0_43]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2128) [rt.jar:1.6.0_43]
	at org.jgroups.blocks.Request.responsesComplete(Request.java:197)
	at org.jgroups.blocks.Request.execute(Request.java:89)
	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:370)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:190)
	at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:257)
	at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:187)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]

07:07:57,771 ERROR [org.infinispan.statetransfer.OutboundTransferTask] (transport-thread-7) Failed to send entries to node perf20/web : java.lang.InterruptedException: org.infinispan.CacheException: java.lang.InterruptedException
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:179)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:190)
	at org.infinispan.statetransfer.OutboundTransferTask.sendEntries(OutboundTransferTask.java:257)
	at org.infinispan.statetransfer.OutboundTransferTask.run(OutboundTransferTask.java:187)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]
Caused by: java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961) [rt.jar:1.6.0_43]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2128) [rt.jar:1.6.0_43]
	at org.jgroups.blocks.Request.responsesComplete(Request.java:197)
	at org.jgroups.blocks.Request.execute(Request.java:89)
	at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:370)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169)
	... 12 more

I believe that the first one is actually the cause of the second one.
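
The relationship between the two traces can be reproduced in isolation with plain JDK classes. The following is only a minimal sketch, not Infinispan or JGroups code: WrappedException is a hypothetical stand-in for org.infinispan.CacheException, and awaitResponse is a hypothetical stand-in for the blocking wait in org.jgroups.blocks.Request.responsesComplete. It assumes the worker thread is interrupted by its executor being stopped, which the log does not confirm.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class InterruptedTransferSketch {

    /** Hypothetical stand-in for org.infinispan.CacheException. */
    static class WrappedException extends RuntimeException {
        WrappedException(Throwable cause) {
            super(cause);
        }
    }

    /** Hypothetical stand-in for a wait on a remote response that never arrives. */
    static void awaitResponse(CountDownLatch started) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Condition responded = lock.newCondition();
        lock.lock();
        try {
            started.countDown();
            while (true) {
                // Throws InterruptedException once the thread is interrupted.
                responded.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Runs the blocking task, interrupts it, and returns the exception the task failed with. */
    static Throwable runAndInterrupt() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        final CountDownLatch started = new CountDownLatch(1);
        Future<?> task = pool.submit(new Runnable() {
            @Override
            public void run() {
                try {
                    awaitResponse(started);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    // The low-level interrupt is rethrown wrapped, so it is
                    // reported again one layer up with a "Caused by" entry.
                    throw new WrappedException(e);
                }
            }
        });
        started.await();
        pool.shutdownNow(); // interrupts the running worker thread
        try {
            task.get(5, TimeUnit.SECONDS);
            return null;
        } catch (ExecutionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable failure = runAndInterrupt();
        System.out.println(failure.getClass().getName()
                + " caused by " + failure.getCause().getClass().getName());
    }
}
```

In this sketch a single interrupt surfaces twice: once as the raw InterruptedException from the wait, and once as the wrapper thrown by the caller, which matches the shape of the two log entries above.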

As can be seen in the log, the node receives a new view and then logs a series of "Caught exception when handling command CacheTopologyControlCommand{cache=dist, type=REBALANCE_CONFIRM, sender=perf19/web, joinInfo=null, topologyId=26, currentCH=null, pendingCH=null, throwable=null, viewId=10}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf19/web for cache dist, we don't have a rebalance in progress" messages. The two exceptions above appear after that.

Seen in https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-async/23/console-perf18/

So yeah, it's dist/async, but perf20 is online when this happens (and it's in the new view).

Comment 1 Ladislav Thon 2013-05-15 07:54:26 UTC

Still seeing this with EAP 6.1.0.ER8. For example:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/24/artifact/report/config/jboss-perf18/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/54/artifact/report/config/jboss-perf18/server.log

Comment 2 Ladislav Thon 2013-08-23 12:33:34 UTC

Still seeing this with EAP 6.1.1.ER7. For example:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync/26/
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/22/

Comment 3 Jitka Kozana 2013-12-09 12:36:58 UTC

The issue is still with us in EAP 6.2.0.CR3.

For example:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/31/artifact/report/config/jboss-perf19/server.log

Comment 4 Radoslav Husar 2013-12-16 23:28:28 UTC

Graceful shutdown is not supported in the affected version of EAP 6. The
described behavior is to be expected.

The RFE to implement graceful shutdown is 
https://issues.jboss.org/browse/EAP6-7

Issues relating to HTTP can be avoided using mod_cluster and session 
draining prior to shutdown/undeploy.
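
The general idea of draining before stopping can be sketched with a plain JDK executor. This is only an analogy under stated assumptions, not how mod_cluster or EAP implements session draining: the class and method names are hypothetical, and the sleeping tasks stand in for in-flight requests.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainBeforeStopSketch {

    /** Submits the given number of tasks, drains them, and returns how many completed. */
    static int drainThenStop(int tasks, long drainTimeoutMillis) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        final AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(new Runnable() {
                @Override
                public void run() {
                    try {
                        Thread.sleep(20); // simulated in-flight request
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        // Work abandoned because the pool was stopped abruptly.
                        Thread.currentThread().interrupt();
                    }
                }
            });
        }
        // Stop accepting new work, but let already submitted work finish.
        pool.shutdown();
        if (!pool.awaitTermination(drainTimeoutMillis, TimeUnit.MILLISECONDS)) {
            // Drain window elapsed: interrupt whatever is still running.
            pool.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed after drain: " + drainThenStop(4, 5000));
    }
}
```

With a drain window that is long enough, no task is interrupted. Calling shutdownNow() without draining first is the case that leaves interrupted work behind.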

Comment 5 Paul Ferraro 2014-05-21 13:26:04 UTC

This bug pertains to clean shutdown, which is not scheduled to be implemented in EAP 6.x and is targeted to be addressed in 7.0. Setting resolution to WONTFIX.

Comment 7 Carlo de Wolf 2015-03-19 11:08:30 UTC

Moved to https://issues.jboss.org/browse/EAP7-86

Comment 8 JBoss JIRA Server 2015-08-11 01:39:58 UTC

Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved

Comment 9 JBoss JIRA Server 2015-08-13 12:59:35 UTC

Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened

Comment 10 JBoss JIRA Server 2015-12-09 16:23:51 UTC

Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved

Comment 11 JBoss JIRA Server 2015-12-16 13:13:03 UTC

Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened

Comment 12 JBoss JIRA Server 2016-02-28 16:47:39 UTC

Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved
