In our failover tests with an invalidation cache and a shared cache store, we see "CacheException: Initial state transfer timed out", which causes the EAP server to abort startup.

Scenario description:
HTTP traffic accesses a clustered web application that has replicated sessions (using a mod_cluster load balancer). Each client waits 1000 ms after receiving a response before sending a new request. Session size is 34 KB. The setup is a 4-node EAP cluster plus a 4-node JDG cluster. One EAP node at a time is shut down and, after some time, started again, while 6000 standalone clients keep calling the application.

Configuration:
- 4-node EAP cluster with an invalidation cache + a shared cache store (remote JDG cluster)
- 4-node JDG cluster with distributed cache
- 4 nodes generating load (6000 clients in total)
- cache mode: ASYNC or SYNC (for both invalidation and distributed caches; the "write-behind" element is also set on the "remote-store" element accordingly)
- Links to configuration files:
  EAP: http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stressfailover-remote-jdg-session-shutdown-invalidation-async-4nodes-perf17/4/artifact/report/config/jboss-perf18/standalone-ha.xml
  JDG: http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stressfailover-remote-jdg-session-shutdown-invalidation-async-4nodes-perf17/4/artifact/report/config/jboss-perf22/clustered.xml

When the EAP server is started up again (after the previous shutdown), it sometimes logs this error, which causes the EAP server to abort startup:

[JBossINF] 22:17:51,081 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 55) MSC000001: Failed to start service jboss.infinispan.web.dist: org.jboss.msc.service.StartException in service jboss.infinispan.web.dist: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
[JBossINF]     at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:91) [jboss-as-clustering-common-7.5.8.Final-redhat-2.jar:7.5.8.Final-redhat-2]
[JBossINF]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_91]
[JBossINF]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_91]
[JBossINF]     at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_91]
[JBossINF]     at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.2.Final-redhat-1.jar:2.1.2.Final-redhat-1]
[JBossINF] Caused by: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
[JBossINF]     at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
[JBossINF]     at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:886)
[JBossINF]     at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:657)
[JBossINF]     at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:646)
[JBossINF]     at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
[JBossINF]     at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:217)
[JBossINF]     at org.infinispan.CacheImpl.start(CacheImpl.java:582)
[JBossINF]     at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:686)
[JBossINF]     at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)
[JBossINF]     at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:545)
[JBossINF]     at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:559)
[JBossINF]     at org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:113)
[JBossINF]     at org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:104)
[JBossINF]     at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)
[JBossINF]     at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:86) [jboss-as-clustering-common-7.5.8.Final-redhat-2.jar:7.5.8.Final-redhat-2]
[JBossINF]     ... 4 more
[JBossINF] Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache dist on perf19/web
[JBossINF]     at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:216)
[JBossINF]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_91]
[JBossINF]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_91]
[JBossINF]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_91]
[JBossINF]     at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_91]
[JBossINF]     at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
[JBossINF]     ... 18 more

Please note that the state transfer times out for the *dist* cache, which is not actually used for anything in this test. The only utilized cache is the invalidation cache named "offload".

This issue occurs only under high load.

Link to server log: http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stressfailover-remote-jdg-session-shutdown-invalidation-async-4nodes-perf17/4/console-perf19/
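
To put "high load" in numbers, here is a rough back-of-the-envelope sketch based only on the figures from the scenario description (6000 clients, 1000 ms delay between requests, 34 KB sessions). The class and method names are made up for illustration. The calculation ignores response time, so the results are upper bounds, and the data-volume figure additionally assumes that every request writes the whole session to the store, which this report does not confirm.

```java
public class LoadEstimate {
    // Figures taken from the scenario description.
    static final int CLIENTS = 6000;
    static final long THINK_TIME_MS = 1000;
    static final int SESSION_SIZE_KB = 34;

    /** Upper bound on requests per second across the cluster (response time ignored). */
    static double maxRequestsPerSecond() {
        return CLIENTS * (1000.0 / THINK_TIME_MS);
    }

    /**
     * Upper bound on session data written per second, in KB.
     * ASSUMPTION: each request writes the full session; not confirmed by the report.
     */
    static double maxSessionKbPerSecond() {
        return maxRequestsPerSecond() * SESSION_SIZE_KB;
    }

    public static void main(String[] args) {
        System.out.printf("max requests/s: %.0f%n", maxRequestsPerSecond());
        System.out.printf("max session KB/s: %.0f%n", maxSessionKbPerSecond());
    }
}
```

Under these assumptions the cluster handles at most 6000 requests per second and at most 204000 KB of session data per second in total. The real figures will be lower, since each client also waits for its response.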