Bug 908333
| Summary: | DIST: Failed to start rebalance: null: java.lang.InterruptedException at server shutdown | ||
|---|---|---|---|
| Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Jitka Kozana <jkudrnac> |
| Component: | Clustering | Assignee: | Paul Ferraro <paul.ferraro> |
| Status: | CLOSED DEFERRED | QA Contact: | Jitka Kozana <jkudrnac> |
| Severity: | medium | Docs Contact: | Russell Dickenson <rdickens> |
| Priority: | unspecified | ||
| Version: | 6.1.1, 6.2.0 | CC: | cdewolf, jkudrnac, lcosti, lthon, myarboro, rhatlapa, rhusar, rjanik, rsvoboda |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | TBD EAP 7 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-03-19 11:04:13 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Jitka Kozana
2013-02-06 13:04:46 UTC
Seen something that looks related:
WARN [org.infinispan.topology.CacheTopologyControlCommand] (transport-thread-9) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=dist, type=REBALANCE_CONFIRM, sender=perf18/ejb, joinInfo=null, topologyId=25, currentCH=null, pendingCH=null, throwable=null, viewId=10}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf18/ejb for cache dist, we don't have a rebalance in progress
[JBossINF] at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:206)
[JBossINF] at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160)
[JBossINF] at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137)
[JBossINF] at org.infinispan.topology.LocalTopologyManagerImpl$1.call(LocalTopologyManagerImpl.java:281)
[JBossINF] at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_38]
[JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_38]
[JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38]
[JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38]
[JBossINF] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38]
Which also looks similar to https://issues.jboss.org/browse/ISPN-2349
Full log: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-ejb-ejbremote-shutdown-dist-sync-3owners/6/console-perf18/
This is EAP 6.1.0.ER1.
And this one looks related too: ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread-22) Failed to start rebalance: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly: java.util.concurrent.ExecutionException: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.get(FutureTask.java:91) [rt.jar:1.6.0_38] at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:567) at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastRebalanceStart(ClusterTopologyManagerImpl.java:410) at org.infinispan.topology.ClusterTopologyManagerImpl.startRebalance(ClusterTopologyManagerImpl.java:399) at org.infinispan.topology.ClusterTopologyManagerImpl.access$000(ClusterTopologyManagerImpl.java:66) at org.infinispan.topology.ClusterTopologyManagerImpl$1.call(ClusterTopologyManagerImpl.java:128) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) [rt.jar:1.6.0_38] at java.util.concurrent.FutureTask.run(FutureTask.java:138) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38] Caused by: org.infinispan.CacheException: Remote (perf20/ejb) failed unexpectedly at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:96) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:541) at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:549) at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:546) ... 5 more Caused by: java.lang.IllegalStateException: transport was closed at org.jgroups.blocks.GroupRequest.transportClosed(GroupRequest.java:273) at org.jgroups.blocks.RequestCorrelator.stop(RequestCorrelator.java:269) at org.jgroups.blocks.MessageDispatcher.stop(MessageDispatcher.java:152) at org.jgroups.blocks.MessageDispatcher.channelDisconnected(MessageDispatcher.java:455) at org.jgroups.Channel.notifyChannelDisconnected(Channel.java:507) at org.jgroups.JChannel.disconnect(JChannel.java:363) at org.infinispan.remoting.transport.jgroups.JGroupsTransport.stop(JGroupsTransport.java:258) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_38] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_38] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38] at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38] at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203) at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:883) at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:690) at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:568) at org.infinispan.factories.GlobalComponentRegistry.stop(GlobalComponentRegistry.java:260) at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:739) at org.infinispan.manager.AbstractDelegatingEmbeddedCacheManager.stop(AbstractDelegatingEmbeddedCacheManager.java:179) at org.jboss.as.clustering.infinispan.subsystem.EmbeddedCacheManagerService.stop(EmbeddedCacheManagerService.java:76) at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1911) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:1874) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] ... 3 more Full log: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/10/console-perf18/ Again EAP 6.1.0.ER1. Maybe related: https://bugzilla.redhat.com/show_bug.cgi?id=923769 Still seeing this with EAP 6.1.0.ER8. For example: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/24/artifact/report/config/jboss-perf21/server.log https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/54/artifact/report/config/jboss-perf21/server.log Still seeing this with EAP 6.1.1.ER7. For example: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/28/ https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-sync/58/ Seen in 6.2.0.CR3 (GA), this is one of the more common warnings:
14:16:22,320 WARN [org.infinispan.topology.CacheTopologyControlCommand] (OOB-13,shared=udp) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=default-host/clusterbench, type=REBALANCE_CONFIRM, sender=perf19/web, joinInfo=null, topologyId=26, currentCH=null, pendingCH=null, throwable=null, viewId=9}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf19/web for cache default-host/clusterbench, we don't have a rebalance in progress
at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:207)
at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:169)
at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:146)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:253)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:220)
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600)
at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130)
at org.jgroups.JChannel.up(JChannel.java:707)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1025)
at org.jgroups.protocols.RSVP.up(RSVP.java:172)
at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:245)
at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:765)
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:420)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:645)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:147)
at org.jgroups.protocols.FD.up(FD.java:253)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
at org.jgroups.protocols.MERGE3.up(MERGE3.java:290)
at org.jgroups.protocols.Discovery.up(Discovery.java:359)
at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2607)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1260)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1822)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1795)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) [rt.jar:1.6.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) [rt.jar:1.6.0_45]
at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Link to sample build:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/
Links to server logs:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf18/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf19/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-shutdown-dist-async/67/artifact/report/config/jboss-perf21/server.log
Graceful shutdown is not support in the affected version of EAP 6. The described behavior is to be expected. The RFE to implement graceful shutdown is https://issues.jboss.org/browse/EAP6-7 Issues relating to HTTP can be avoided using mod_cluster and session draining prior to shutdown/undeploy. This bug pertains to clean shutdown which is not scheduled to be implemented in EAP 6.x - and is targeted to be addressed in 7.0. Setting resolution to WONTFIX. Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved Radim Hatlapatka <rhatlapa> updated the status of jira EAP7-86 to Reopened Jason Greene <jason.greene> updated the status of jira EAP7-86 to Resolved |