923769 – Infinispan tries to transfer 0 segments

Bug 923769 - Infinispan tries to transfer 0 segments

Summary: Infinispan tries to transfer 0 segments

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	Clustering
Sub Component:
Version:	6.1.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	EAP 6.3.0
Assignee:	Paul Ferraro
QA Contact:	Jitka Kozana
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2013-03-20 12:00 UTC by Ladislav Thon
Modified:	2014-04-23 02:14 UTC (History)
CC List:	3 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-12-09 14:48:39 UTC
Type:	Bug
Embargoed:

Attachments	(Terms of Use)

Description Ladislav Thon 2013-03-20 12:00:43 UTC

Found this exception during EAP 6.1.0.ER3 testing:

WARN  [org.infinispan.statetransfer.InboundTransferTask] (transport-thread-10) ISPN000210: Failed to request segments [] of cache org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl from node perf21/ejb (node will not be retried): org.infinispan.remoting.RemoteException: ISPN000217: Received exception from perf21/ejb, see cause for remote stack trace
	at org.infinispan.remoting.transport.AbstractTransport.checkResponse(AbstractTransport.java:72)
	at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:102)
	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:541)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169)
	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:195)
	at org.infinispan.statetransfer.InboundTransferTask.requestSegments(InboundTransferTask.java:141)
	at org.infinispan.statetransfer.StateConsumerImpl$1.run(StateConsumerImpl.java:707)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43]
	at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]
Caused by: java.lang.IllegalArgumentException: Segments must not be null or empty
	at org.infinispan.statetransfer.OutboundTransferTask.<init>(OutboundTransferTask.java:103)
	at org.infinispan.statetransfer.StateProviderImpl.startOutboundTransfer(StateProviderImpl.java:286)
	at org.infinispan.statetransfer.StateRequestCommand.perform(StateRequestCommand.java:91)
	at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101)
	at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122)
	at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:247)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220)
	at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
	at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
	at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
	at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
	at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130)
	at org.jgroups.JChannel.up(JChannel.java:707)
	at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
	at org.jgroups.protocols.RSVP.up(RSVP.java:188)
	at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
	at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
	at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
	at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
	at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245)
	at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765)
	at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420)
	at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606)
	at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143)
	at org.jgroups.protocols.FD.up(FD.java:253)
	at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
	at org.jgroups.protocols.MERGE3.up(MERGE3.java:290)
	at org.jgroups.protocols.Discovery.up(Discovery.java:359)
	at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616)
	at org.jgroups.protocols.TP.passMessageUp(TP.java:1263)
	at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1825)
	at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1798)
	... 3 more

Seen in https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-jvmkill-dist-sync/35/console-perf19/ and indeed, the perf21 log contains the causing exception ("Segments must not be null or empty").

Note that this exception is logged immediately after this one:

03:40:18,218 WARN  [org.infinispan.topology.CacheTopologyControlCommand] (OOB-426,shared=udp) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl, type=REBALANCE_CONFIRM, sender=perf21/ejb, joinInfo=null, topologyId=32, currentCH=null, pendingCH=null, throwable=null, viewId=12}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf21/ejb for cache org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl, we don't have a rebalance in progress
	at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:206)
	at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160)
	at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:253)
	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220)
	at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
	at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
	at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
	at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
	at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130)
	at org.jgroups.JChannel.up(JChannel.java:707)
	at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
	at org.jgroups.protocols.RSVP.up(RSVP.java:172)
	at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
	at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
	at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
	at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
	at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245)
	at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765)
	at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420)
	at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606)
	at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143)
	at org.jgroups.protocols.FD.up(FD.java:253)
	at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
	at org.jgroups.protocols.MERGE3.up(MERGE3.java:290)
	at org.jgroups.protocols.Discovery.up(Discovery.java:359)
	at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616)
	at org.jgroups.protocols.TP.passMessageUp(TP.java:1263)
	at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1825)
	at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1798)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43]

This happens while the entire cluster is being brought down as the test finishes. This particular node (perf19) is still running, and the other nodes mentioned in the exceptions (perf21) are running as well. But note that the exceptions on perf19 start to appear immediately after it receives new view, because perf18 (the first node that is being shut down) is already gone.

Comment 1 Jitka Kozana 2013-05-13 14:01:13 UTC

Again, seen in ER8.

Link to job:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/

Link to server log:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/artifact/report/config/jboss-perf20/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/artifact/report/config/jboss-perf19/server.log

Comment 2 Ladislav Thon 2013-08-23 12:27:11 UTC

Still seeing this with EAP 6.1.1.ER7. For example:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync/26/
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/22/

Comment 3 Paul Ferraro 2013-12-09 14:48:39 UTC

This is expected when shutting down a cluster that uses DIST, you need to either:
* Stagger node shutdown
* Ignore exceptions during shutdown

Comment 4 Scott Mumford 2014-04-23 02:14:07 UTC

Marking for exclusion from Release Notes documentation as not a bug.

Note You need to log in before you can comment on or make changes to this bug.