Found this exception during EAP 6.1.0.ER3 testing: WARN [org.infinispan.statetransfer.InboundTransferTask] (transport-thread-10) ISPN000210: Failed to request segments [] of cache org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl from node perf21/ejb (node will not be retried): org.infinispan.remoting.RemoteException: ISPN000217: Received exception from perf21/ejb, see cause for remote stack trace at org.infinispan.remoting.transport.AbstractTransport.checkResponse(AbstractTransport.java:72) at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:102) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:541) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:169) at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:195) at org.infinispan.statetransfer.InboundTransferTask.requestSegments(InboundTransferTask.java:141) at org.infinispan.statetransfer.StateConsumerImpl$1.run(StateConsumerImpl.java:707) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) [rt.jar:1.6.0_43] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_43] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_43] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43] Caused by: java.lang.IllegalArgumentException: Segments must not be null or empty at org.infinispan.statetransfer.OutboundTransferTask.<init>(OutboundTransferTask.java:103) at org.infinispan.statetransfer.StateProviderImpl.startOutboundTransfer(StateProviderImpl.java:286) at org.infinispan.statetransfer.StateRequestCommand.perform(StateRequestCommand.java:91) at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101) at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122) at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:247) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220) at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484) at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391) at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598) at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130) at org.jgroups.JChannel.up(JChannel.java:707) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020) at org.jgroups.protocols.RSVP.up(RSVP.java:188) at org.jgroups.protocols.FRAG2.up(FRAG2.java:181) at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245) at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765) at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420) at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606) at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143) at org.jgroups.protocols.FD.up(FD.java:253) at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) at org.jgroups.protocols.MERGE3.up(MERGE3.java:290) at org.jgroups.protocols.Discovery.up(Discovery.java:359) at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616) at org.jgroups.protocols.TP.passMessageUp(TP.java:1263) at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1825) at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1798) ... 3 more Seen in https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-jvmkill-dist-sync/35/console-perf19/ and indeed, the perf21 log contains the causing exception ("Segments must not be null or empty"). Note that this exception is logged immediately after this one: 03:40:18,218 WARN [org.infinispan.topology.CacheTopologyControlCommand] (OOB-426,shared=udp) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl, type=REBALANCE_CONFIRM, sender=perf21/ejb, joinInfo=null, topologyId=32, currentCH=null, pendingCH=null, throwable=null, viewId=12}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf21/ejb for cache org.jboss.test.clusterbench.ejb.stateful.RemoteStatefulSBImpl, we don't have a rebalance in progress at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:206) at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160) at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:253) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220) at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484) at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391) at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598) at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130) at org.jgroups.JChannel.up(JChannel.java:707) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020) at org.jgroups.protocols.RSVP.up(RSVP.java:172) at org.jgroups.protocols.FRAG2.up(FRAG2.java:181) at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245) at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765) at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420) at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606) at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143) at org.jgroups.protocols.FD.up(FD.java:253) at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) at org.jgroups.protocols.MERGE3.up(MERGE3.java:290) at org.jgroups.protocols.Discovery.up(Discovery.java:359) at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2616) at org.jgroups.protocols.TP.passMessageUp(TP.java:1263) at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1825) at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1798) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_43] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_43] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_43] This happens while the entire cluster is being brought down as the test finishes. This particular node (perf19) is still running, and the other nodes mentioned in the exceptions (perf21) are running as well. But note that the exceptions on perf19 start to appear immediately after it receives new view, because perf18 (the first node that is being shut down) is already gone.
Again, seen in ER8. Link to job: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/ Link to server log: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/artifact/report/config/jboss-perf20/server.log https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbservlet-undeploy-dist-sync/14/artifact/report/config/jboss-perf19/server.log
Still seeing this with EAP 6.1.1.ER7. For example: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync/26/ https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/22/
This is expected when shutting down a cluster that uses DIST, you need to either: * Stagger node shutdown * Ignore exceptions during shutdown
Marking for exclusion from Release Notes documentation as not a bug.