Description of problem:
If an EAP domain deployment takes more than a couple of minutes to upload, or more than 10 seconds to copy to the assigned managed servers, the update operation can fail.

Version-Release number of selected component (if applicable):
3.3.7

How reproducible:
Always

Steps to Reproduce:
1. Install, configure, and start the JBoss ON system and install the latest EAP plug-in pack update.
2. Install, configure, and start an EAP 6.4 domain.
3. Import the EAP domain into the JBoss ON inventory and configure its connection settings.
4. Create a new DomainDeployment child resource with version 1 of the WAR.
5. Assign the newly imported WAR to one or more server groups that have managed servers running in them.
6. Create a JBoss ON content repository named WAR Repo and subscribe version 1 of the WAR to it from its content subscription page.
7. Create version 2 of the WAR and add it to the WAR Repo. Version 2 of the WAR should be very large. In my testing I was using 240 MB, but the bigger the better.
8. Attempt to update the existing WAR with version 2 of the WAR from the new content page.

Using these steps, with slight variation, we should have a total of 4 scenarios.

Scenario 1: The upload of the WAR needs to take more than 2 minutes.

Scenario 2: The upload of the WAR needs to take less than 1 minute, and the "add" operation, which adds the WAR to the managed servers in the assigned server group, needs to take more than 10 seconds.

Scenarios 3 and 4 repeat scenarios 1 and 2 but use version numbers in the file names of the WARs. This is because domain deployments that contain version identifiers in their file names are deployed using a different set of operations (not sure why, but this is what we do). For example, in scenarios 1 and 2 no version info is used, such as jboss-helloworld.war. In scenarios 3 and 4, version info is included in the file name, such as jboss-helloworld-1.0.war and jboss-helloworld-2.0.war.
When deploying the content in all scenarios, the run-time name should be jboss-helloworld.war.

Actual results:
For scenarios 1 and 3, the JBoss ON UI and the EAP server will reflect that version 1 of the WAR is still deployed. For a short period of time, WAR version 2 will be visible in the EAP server's domain/data/content/ directory, but it will not actually be deployed.

For scenarios 2 and 4, the JBoss ON UI will reflect that version 1 of the WAR is still deployed. The EAP server will reflect that version 2 of the WAR is deployed and version 1 has been replaced.

Expected results:
For all scenarios, JBoss ON and EAP should reflect that version 2 of the WAR is deployed.

Additional info:
The cause of this is the hard-coded 10 second operation timeout in ASConnection when executing API operations. In many cases, it will take longer than 10 seconds for the EAP host controller to copy a WAR to all the other host controllers in a domain and for those to copy it to their respective servers. This 10 second timeout may be acceptable for basic get attribute or read resource operations, but start, stop, add, and even remove operations can be blocking. For those, 10 seconds is not nearly enough. This should probably be 10 minutes.

Additionally, the add-content REST operation performed to upload the WAR to the domain's content repository can take some time. I am not certain whether an internal timeout is occurring or the issue is just that resource discovery doesn't get executed. In either case, the actual content never gets replaced if the upload takes too long. I think this happens because the add-content operation times out, so the actual add/deploy operations never get executed.
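The distinction drawn under Additional info, a short timeout for quick reads and a long one for potentially blocking operations, can be sketched as below. This is only an illustration and not the actual ASConnection code: the class `OperationTimeouts` and its method are hypothetical, and the two durations are simply the 10 seconds and 10 minutes mentioned in this report.

```java
import java.time.Duration;
import java.util.Set;

/**
 * Hypothetical helper (not part of the JBoss ON plug-in): picks an operation
 * timeout depending on whether the management operation can block.
 */
final class OperationTimeouts {
    // Assumed values taken from the report: the existing hard-coded 10 seconds,
    // and the 10 minutes suggested for blocking operations.
    static final Duration SHORT_TIMEOUT = Duration.ofSeconds(10);
    static final Duration LONG_TIMEOUT = Duration.ofMinutes(10);

    // Operations the report names as potentially blocking.
    private static final Set<String> BLOCKING_OPERATIONS =
            Set.of("start", "stop", "add", "remove");

    private OperationTimeouts() {
    }

    /** Returns the long timeout for blocking operations, the short one otherwise. */
    static Duration timeoutFor(String operationName) {
        return BLOCKING_OPERATIONS.contains(operationName) ? LONG_TIMEOUT : SHORT_TIMEOUT;
    }

    public static void main(String[] args) {
        String[] operations = {"read-attribute", "read-resource", "add", "remove"};
        for (String operation : operations) {
            System.out.println(operation + " -> " + timeoutFor(operation).getSeconds() + " s");
        }
    }
}
```

A per-operation choice like this keeps quick reads failing fast while giving deployment-related operations room to finish. Whether the real fix should key on operation name or on a user-supplied setting is a separate design question.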
This issue seems to have been reported in the past and perhaps not totally fixed.
Fixed in the master:

commit 0a67b2da3772c3671e4088c3238e40b68719e68a
Author: Michael Burman <miburman>
Date:   Wed Mar 8 16:03:22 2017 +0200

    [BZ 1381640] Add configurable timeout for domain deployment server-group assignment
Created attachment 1261293
Assign to group example with timeout
Additional commit from the master:

commit 616c609dc51baac9ea31297da991f09146ddc959
Author: Michael Burman <miburman>
Date:   Wed Mar 8 17:02:23 2017 +0200

    Fix potential NPE in the DomainDeployment
In the master:

commit e59d8ba96237664a4cfcdb60ac0096e868084906
Author: Michael Burman <miburman>
Date:   Wed Apr 19 18:07:07 2017 +0300

    [BZ 1381640] Use userProvidedTimeoutMillis for ASUploadConnection timeout and increase the default timeout
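For readers unfamiliar with the pattern, here is a rough sketch of what "use a user-provided timeout, otherwise a larger default" can look like for an HTTP upload connection. It does not reproduce the commit: the class `UploadTimeouts`, the 10 minute default, and the fallback rules are assumptions, and only the name `userProvidedTimeoutMillis` comes from the commit message.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch; not the actual ASUploadConnection implementation. */
final class UploadTimeouts {
    // Assumed default: the commit message does not state the new default value.
    static final int DEFAULT_UPLOAD_TIMEOUT_MILLIS = 10 * 60 * 1000;

    private UploadTimeouts() {
    }

    /** Falls back to the default when no positive user value is supplied. */
    static int resolveTimeoutMillis(Long userProvidedTimeoutMillis) {
        if (userProvidedTimeoutMillis == null || userProvidedTimeoutMillis <= 0) {
            return DEFAULT_UPLOAD_TIMEOUT_MILLIS;
        }
        // HttpURLConnection takes int milliseconds, so clamp very large values.
        return (int) Math.min(userProvidedTimeoutMillis, (long) Integer.MAX_VALUE);
    }

    /** Prepares (but does not send) a POST upload request with the resolved timeout. */
    static HttpURLConnection openUpload(String url, Long userProvidedTimeoutMillis)
            throws IOException {
        // openConnection() only creates the connection object; no network I/O happens yet.
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        int timeoutMillis = resolveTimeoutMillis(userProvidedTimeoutMillis);
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        return connection;
    }
}
```

A caller would pass the management upload URL (host and port being deployment-specific) together with whatever timeout the resource's settings provide, or `null` to get the default.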
And in the master:

commit 35f31d0d61f0efa3b923cb55c3666b5bcd53050f
Author: Michael Burman <miburman>
Date:   Wed Apr 26 13:49:53 2017 +0300

    [BZ 1381640] Add server connection settings to manage deploymentTimeouts