Bug 918791 - Starting a server with multiple web apps causes deployment failures
Summary: Starting a server with multiple web apps causes deployment failures
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Clustering
Version: 6.0.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: DR1
Target Release: EAP 6.2.0
Assignee: Paul Ferraro
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-06 21:08 UTC by Shay Matasaro
Modified: 2023-09-14 01:41 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
In some cases, web applications on a clustered server would fail to deploy when multiple applications were deployed at the same time. Each application in this situation would attempt to lock the cache manager in order to create its cache, and the first application to obtain the lock would deploy successfully. However, depending on how long that deployment took, the remaining deployments could time out while waiting for access to the cache manager and fail to deploy. JBoss EAP 6 now includes a `GlobalComponentRegistryService` which handles this scenario, and applications now deploy successfully in this situation.
Clone Of:
Environment:
Last Closed: 2013-12-15 16:12:45 UTC
Type: Bug
Embargoed:




Links
System ID Priority Status Summary Last Updated
Red Hat Issue Tracker AS7-6685 Major Closed Starting a server with multiple web apps causes deployment failures 2018-07-27 01:54:02 UTC
Red Hat Issue Tracker ISPN-2898 Major Open Web Cache fails to start when multiple apps are trying to create it 2018-07-27 01:54:02 UTC

Description Shay Matasaro 2013-03-06 21:08:45 UTC
When the server starts with multiple web apps and each one tries to create its cache, the first one blocks all the others because of a lock on the cache manager; see ISPN-2898.

This results in deployment failures, and the apps do not get redeployed.
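For illustration, here is a minimal standalone sketch of the race (hypothetical class and cache names, not the EAP deployer code), assuming a shared DefaultCacheManager and several caches starting concurrently:

import org.infinispan.manager.DefaultCacheManager;

// Hypothetical reproduction sketch: several "deployments" race to create
// their caches on one shared cache manager. Each getCache() call competes
// for the manager-wide cache-creation lock described in ISPN-2898.
public class ConcurrentCacheStartSketch {
    public static void main(String[] args) {
        final DefaultCacheManager manager = new DefaultCacheManager();
        for (int i = 0; i < 10; i++) {
            // cache names follow the default-host/<context> pattern used
            // for web session caches
            final String name = "default-host/app" + i;
            new Thread(new Runnable() {
                public void run() {
                    // Under contention this is where "Unable to acquire lock
                    // on cache with name ..." surfaces.
                    manager.getCache(name);
                }
            }).start();
        }
    }
}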

Comment 1 JBoss JIRA Server 2013-03-06 21:11:23 UTC
Dennis Reed <dereed> made a comment on jira ISPN-2898

The issue is the cacheCreateLock in org.infinispan.manager.DefaultCacheManager#wireCache

When multiple caches are started at the same time, some are timing out waiting for this lock.
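To make the failure mode concrete, here is a simplified sketch of that pattern (illustrative only, not the Infinispan source): a single manager-wide lock guards cache wiring, and each caller waits only as long as its own cache's lockAcquisitionTimeout before giving up.

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of the mechanism, not the actual wireCache body.
class CacheWiringSketch {
    private final ReentrantLock cacheCreateLock = new ReentrantLock();

    void wireCache(String cacheName, long lockAcquisitionTimeoutMillis) throws InterruptedException {
        if (!cacheCreateLock.tryLock(lockAcquisitionTimeoutMillis, TimeUnit.MILLISECONDS)) {
            // The branch behind "Unable to acquire lock on cache with name ..."
            throw new RuntimeException("Unable to acquire lock on cache with name " + cacheName);
        }
        try {
            // wire and start the cache's components while holding the
            // manager-wide lock; a slow first deployment keeps every other
            // deployment parked in tryLock() above
        } finally {
            cacheCreateLock.unlock();
        }
    }
}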

Comment 2 JBoss JIRA Server 2013-03-07 02:45:20 UTC
jaikiran pai <jpai> made a comment on jira AS7-6685

Which version of AS7 is this reproducible against, and is it always reproducible?

Comment 3 Radoslav Husar 2013-03-07 10:19:34 UTC
You really need to specify the affected versions. Bug reports without them are useless.

Also, please paste the exception; that makes it much easier to check whether it's a known issue or not.

Comment 4 Shay Matasaro 2013-03-07 14:38:56 UTC
Version numbers added to all 3 bugs.

Comment 5 Shay Matasaro 2013-03-07 14:41:21 UTC
The issue depends on the time it takes to start the cache manager and the number of deployed apps. I assume that if the manager comes up really fast, deployments might not fail, but generally this is pretty consistent.

Comment 6 Shay Matasaro 2013-03-07 15:14:52 UTC
:22:40,823 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 60) MSC000001: Failed to start service jboss.infinispan.web.default-host/FT: org.jboss.msc.service.StartException in service jboss.infinispan.web.default-host/FT: org.infinispan.CacheException: Unable to acquire lock on cache with name default-host/FT
	at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:87) [jboss-as-clustering-common-7.1.3.Final-redhat-4.jar:7.1.3.Final-redhat-4]
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_37]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_37]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_37]
	at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.0.0.GA-redhat-2.jar:2.0.0.GA-redhat-2]
Caused by: org.infinispan.CacheException: Unable to acquire lock on cache with name default-host/FT
	at org.infinispan.manager.DefaultCacheManager.wireCache(DefaultCacheManager.java:675)
	at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)
	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:549)
	at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:563)
	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:125)
	at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:116)
	at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)
	at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:82) [jboss-as-clustering-common-7.1.3.Final-redhat-4.jar:7.1.3.Final-redhat-4]
	... 4 more

Comment 7 JBoss JIRA Server 2013-04-09 07:30:25 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-2898

What is the lockAcquisitionTimeout on the caches being started? I think increasing that timeout should be a good enough workaround.

As for Infinispan code changes, we should add a global startupTimeout to be used for acquiring the global cacheCreationLock instead of each cache's lockAcquisitionTimeout. Changing DefaultCacheManager.start() to start all the components in the global component registry, including the transport, would help as well.

Comment 8 JBoss JIRA Server 2013-04-09 14:47:15 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-2898

Correction: the proper workaround is to increase the lockAcquisitionTimeout of the *default* cache, not of the cache being started. I have tested it and it works.
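In programmatic form the workaround looks roughly like this (an assumed standalone Infinispan setup, not EAP subsystem code; in the EAP 6 infinispan subsystem the equivalent knob is the acquire-timeout attribute of the <locking> element on the container's default cache):

import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

// Sketch of the workaround from comments 7 and 8: raise lockAcquisitionTimeout
// on the *default* cache configuration, since the manager-wide cache-creation
// lock is acquired using the default cache's timeout.
Configuration defaultConfig = new ConfigurationBuilder()
        .locking().lockAcquisitionTimeout(60000) // 60s instead of the 10s default
        .build();
DefaultCacheManager manager = new DefaultCacheManager(defaultConfig);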

Comment 10 Paul Ferraro 2013-08-29 22:53:36 UTC
We can backport the GlobalComponentRegistryService from upstream master.  This should alleviate the problem significantly.
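As a sketch of the idea behind the backport (the class name comes from this comment; the body is illustrative, assuming the jboss-msc Service API and Infinispan's GlobalComponentRegistry): a dedicated MSC service starts the manager's global components once, before any per-cache service, so individual deployments no longer race to initialize them under the cache-creation lock.

import org.infinispan.factories.GlobalComponentRegistry;
import org.jboss.msc.service.Service;
import org.jboss.msc.service.StartContext;
import org.jboss.msc.service.StopContext;

// Illustrative sketch only; the real backported service lives in the
// jboss-as clustering subsystem.
public class GlobalComponentRegistryService implements Service<GlobalComponentRegistry> {
    private final GlobalComponentRegistry registry;

    public GlobalComponentRegistryService(GlobalComponentRegistry registry) {
        this.registry = registry;
    }

    public GlobalComponentRegistry getValue() {
        return this.registry;
    }

    public void start(StartContext context) {
        // Start transport and other global components up front, once,
        // so each per-cache start only does cache-local work.
        this.registry.start();
    }

    public void stop(StopContext context) {
        this.registry.stop();
    }
}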

Comment 11 Paul Ferraro 2013-09-03 12:39:17 UTC
https://github.com/jbossas/jboss-eap/pull/327

Comment 18 Red Hat Bugzilla 2023-09-14 01:41:59 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

