EAP 6.1.0.DR3 When using DIST cache, we are seeing the following exception at (graceful) server shutdown: 04:15:47,719 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-2) ISPN000082: Stopping the RpcDispatcher 04:15:47,721 ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-11) Failed to start rebalance: null: java.lang.InterruptedException at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1302) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:227) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.get(FutureTask.java:91) [rt.jar:1.6.0_38] at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:567) at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:410) at org.infinispan.topology.ClusterTopologyManagerImpl.startRebalance(ClusterTopologyManagerImpl.java:399) at org.infinispan.topology.ClusterTopologyManagerImpl.access$000(ClusterTopologyManagerImpl.java:66) at org.infinispan.topology.ClusterTopologyManagerImpl$1.call(ClusterTopologyManagerImpl.java:128) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38] 04:15:47,724 INFO [org.jboss.as] (MSC service thread 1-8) JBAS015950: JBoss EAP 6.1.0.DR3 (AS 7.2.0.Alpha1-redhat-3) stopped in 33738ms See the full log here: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Failover/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync-tcpStack/2/artifact/report/config/jboss-perf18/server.log
Seen something that looks related: WARN [org.infinispan.topology.CacheTopologyControlCommand] (transport-thread-9) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=dist, type=REBALANCE_CONFIRM, sender=perf18/ejb, joinInfo=null, topologyId=25, currentCH=null, pendingCH=null, throwable=null, viewId=10}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf18/ejb for cache dist, we don't have a rebalance in progress [JBossINF] at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:206) [JBossINF] at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160) [JBossINF] at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137) [JBossINF] at org.infinispan.topology.LocalTopologyManagerImpl$1.call(LocalTopologyManagerImpl.java:281) [JBossINF] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_38] [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_38] [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38] [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38] [JBossINF] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38] Which also looks similar to https://issues.jboss.org/browse/ISPN-2349 Full log: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync-3owners/6/console-perf18/ This is EAP 6.1.0.ER1.
And this one looks related too: ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-22) Failed to start rebalance: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly: java.util.concurrent.ExecutionException: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.get(FutureTask.java:91) [rt.jar:1.6.0_38] at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:567) at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:410) at org.infinispan.topology.ClusterTopologyManagerImpl.startRebalance(ClusterTopologyManagerImpl.java:399) at org.infinispan.topology.ClusterTopologyManagerImpl.access$000(ClusterTopologyManagerImpl.java:66) at org.infinispan.topology.ClusterTopologyManagerImpl$1.call(ClusterTopologyManagerImpl.java:128) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38] Caused by: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:96) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:541) at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:549) at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:546) ... 5 more Caused by: java.lang.IllegalStateException: transport was closed at org.jgroups.blocks.GroupRequest.transportClosed(GroupRequest.java:273) at org.jgroups.blocks.RequestCorrelator.stop(RequestCorrelator.java:269) at org.jgroups.blocks.MessageDispatcher.stop(MessageDispatcher.java:152) at org.jgroups.blocks.MessageDispatcher.channelDisconnected(MessageDispatcher.java:455) at org.jgroups.Channel.notifyChannelDisconnected(Channel.java:507) at org.jgroups.JChannel.disconnect(JChannel.java:363) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.stop(JGroupsTransport.java:258) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_38] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_38] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38] at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38] at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203) at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:883) at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:690) at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:568) at org.infinispan.factories.GlobalComponentRegistry.stop(GlobalComponentRegistry.java:260) at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:739) at org.infinispan.manager.AbstractDelegatingEmbeddedCacheManager.stop(AbstractDelegatingEmbeddedCacheManager.java:179) at org.jboss.as.clustering.infinispan.subsystem.EmbeddedCacheManagerService.stop(EmbeddedCacheManagerService.java:76) at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1911) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:1874) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] ... 3 more Full log: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/10/console-perf18/ Again EAP 6.1.0.ER1.
Maybe related: https://bugzilla.redhat.com/show_bug.cgi?id=923769
Still seeing this with EAP 6.1.0.ER8. For example: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/24/artifact/report/config/jboss-perf21/server.log https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/54/artifact/report/config/jboss-perf21/server.log
Still seeing this with EAP 6.1.1.ER7. For example: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/28/ https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/58/
Seen in 6.2.0.CR3 (GA), this is one of the more common warnings: 14:16:22,320 WARN [org.infinispan.topology.CacheTopologyControlCommand] (OOB-13,shared=udp) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=default-host/clusterbench, type=REBALANCE_CONFIRM, sender=perf19/web, joinInfo=null, topologyId=26, currentCH=null, pendingCH=null, throwable=null, viewId=9}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf19/web for cache default-host/clusterbench, we don't have a rebalance in progress at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:207) at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:169) at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:146) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:253) at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220) at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484) at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391) at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249) at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600) at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130) at org.jgroups.JChannel.up(JChannel.java:707) at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1025) at org.jgroups.protocols.RSVP.up(RSVP.java:172) at org.jgroups.protocols.FRAG2.up(FRAG2.java:181) at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245) at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765) at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420) at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:645) at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:147) at org.jgroups.protocols.FD.up(FD.java:253) at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) at org.jgroups.protocols.MERGE3.up(MERGE3.java:290) at org.jgroups.protocols.Discovery.up(Discovery.java:359) at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2607) at org.jgroups.protocols.TP.passMessageUp(TP.java:1260) at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1822) at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1795) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45] Link to sample build: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/ Links to server logs: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf18/server.log https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf19/server.log https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf21/server.log
Graceful shutdown is not support in the affected version of EAP 6. The described behavior is to be expected. The RFE to implement graceful shutdown is https://issues.jboss.org/browse/EAP6-7 Issues relating to HTTP can be avoided using mod_cluster and session draining prior to shutdown/undeploy.
This bug pertains to clean shutdown which is not scheduled to be implemented in EAP 6.x - and is targeted to be addressed in 7.0. Setting resolution to WONTFIX.
Moved to https://issues.jboss.org/browse/EAP7-86
Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved
Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened