Bug 1381640 - DomainDeployment content update fails due to upload or deploy timeouts
Summary: DomainDeployment content update fails due to upload or deploy timeouts
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Plugin -- JBoss EAP 6
Version: JON 3.3.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ER02
Target Release: One-off release
Assignee: Michael Burman
QA Contact: Filip Brychta
URL:
Whiteboard:
Depends On:
Blocks: 1327633
Reported: 2016-10-04 15:43 UTC by Larry O'Leary
Modified: 2019-12-16 06:59 UTC (History)

Fixed In Version:
Clone Of:
: 1387296 (view as bug list)
Environment:
Last Closed: 2017-07-13 15:17:39 UTC
Type: Bug
Embargoed:


Attachments
Assign to group example with timeout (78.74 KB, image/png)
2017-03-08 14:10 UTC, Michael Burman


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 707759 | 0 | high | CLOSED | Deployment of new WAR to EAP server fails due to thread timeout - Need configurable thread timeout | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 785211 | 0 | medium | NEW | Content deploy in UI can hang forever | 2022-03-31 04:28:10 UTC
Red Hat Bugzilla | 802796 | 0 | high | CLOSED | Request more information on Deployment timeout setting | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 812452 | 0 | high | CLOSED | [eap6] timeout issues - deployment | 2021-02-22 00:41:40 UTC
Red Hat Bugzilla | 1387296 | 0 | high | CLOSED | DomainDeployment content update fails due to upload or deploy timeouts | 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Solution) | 2664701 | 0 | None | None | None | 2016-10-04 15:44:18 UTC

Internal Links: 707759 785211 802796 812452 1387296

Description Larry O'Leary 2016-10-04 15:43:20 UTC
Description of problem:
If an EAP domain deployment takes more than a couple of minutes to upload, or takes more than 10 seconds to copy to the assigned managed servers, the update operation can fail.

Version-Release number of selected component (if applicable):
3.3.7

How reproducible:
Always

Steps to Reproduce:
1. Install, configure, and start JBoss ON system and install latest EAP plug-in pack update.
2. Install, configure, and start EAP 6.4 domain
3. Import EAP domain into JBoss ON inventory and configure connection settings.
4. Create new DomainDeployment child resource with version 1 of WAR.
5. Assign newly imported WAR to one or more server groups that have managed servers running in them.
6. Create a JBoss ON content repository named WAR Repo and subscribe version 1 of the WAR to it from its content subscription page.
7. Create version 2 of the WAR and add it to the WAR Repo. Version 2 of the WAR should be very large. In my testing I was using 240 MB, but the bigger the better.
8. Attempt to update the existing WAR with version 2 of the WAR from the new content page.

Using these steps, with slight variations, we should have a total of 4 scenarios.

Scenario 1: The upload of the WAR needs to take more than 2 minutes.
Scenario 2: The upload of the WAR needs to take less than 1 minute, and the "add" operation, which adds the WAR to the managed servers in the assigned server group, needs to take more than 10 seconds.

Scenarios 3 and 4 repeat scenarios 1 and 2 but use version numbers in the WAR file names. This is because domain deployments that contain version identifiers in their file names are deployed using a different set of operations (not sure why, but this is what we do). For example, scenarios 1 and 2 use a file name with no version info, such as jboss-helloworld.war. Scenarios 3 and 4 include version info in the file name, such as jboss-helloworld-1.0.war and jboss-helloworld-2.0.war.

When deploying the content in all scenarios, the run-time name should be jboss-helloworld.war.

Actual results:
For scenarios 1 and 3, the JBoss ON UI and the EAP server will show that version 1 of the WAR is still deployed. For a short period of time, WAR version 2 will be visible in the EAP server's domain/data/content/ directory, but it will not actually be deployed.

For scenarios 2 and 4, the JBoss ON UI will show that version 1 of the WAR is still deployed. The EAP server will show that version 2 of the WAR is deployed and that version 1 has been replaced.

Expected results:
For all scenarios, JBoss ON and EAP should reflect version 2 of the WAR being deployed.

Additional info:
The cause of this is the hard-coded 10 second operation timeout in ASConnection that applies when executing API operations. In many cases, it will take longer than 10 seconds for the EAP host controller to copy a WAR to all the other host controllers in a domain and for those to copy it to their respective servers.

This 10 second timeout may be acceptable for basic get attribute or read resource operations, but start, stop, add, and even remove operations can be blocking. For those, 10 seconds is not nearly enough. The timeout should probably be 10 minutes.
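
The distinction above between quick read operations and potentially blocking ones can be sketched as a small timeout lookup. This is only an illustration under stated assumptions, not the plug-in's code: the class OperationTimeouts, its method, and the optional override parameter are hypothetical names. The 10 second and 10 minute values and the operation names start, stop, add, and remove are taken from this report.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of the JBoss ON EAP 6 plug-in. */
public final class OperationTimeouts {

    /** The existing hard-coded value described in this report. */
    public static final long SHORT_TIMEOUT_MILLIS = TimeUnit.SECONDS.toMillis(10);

    /** The value suggested in this report for potentially blocking operations. */
    public static final long LONG_TIMEOUT_MILLIS = TimeUnit.MINUTES.toMillis(10);

    /** Operations the report names as potentially blocking. */
    private static final Set<String> BLOCKING_OPERATIONS = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("start", "stop", "add", "remove")));

    private OperationTimeouts() {
    }

    /**
     * Picks a timeout for an operation.
     *
     * @param operationName  name of the management operation, for example "add"
     * @param overrideMillis optional caller-supplied timeout (hypothetical); ignored if null or not positive
     * @return the timeout to apply, in milliseconds
     */
    public static long timeoutMillisFor(String operationName, Long overrideMillis) {
        if (overrideMillis != null && overrideMillis > 0) {
            return overrideMillis;
        }
        return BLOCKING_OPERATIONS.contains(operationName)
                ? LONG_TIMEOUT_MILLIS
                : SHORT_TIMEOUT_MILLIS;
    }
}
```

With this sketch, a read operation keeps the short timeout, an "add" gets the long one, and a caller-supplied value takes precedence over both.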

Additionally, the add-content REST operation performed to upload the WAR to the domain's content repository can take some time. I am not certain whether an internal timeout is occurring or whether the issue is just that resource discovery doesn't get executed. However, in either case, the actual content never gets replaced if the upload takes too long. I think this is because the add-content operation times out, so the actual add/deploy operations never get executed.
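
To illustrate what an explicit, caller-controlled upload timeout might look like, here is a minimal sketch using the standard java.net.HttpURLConnection API. It is not the plug-in's ASUploadConnection: the class UploadConnectionSketch and its method are hypothetical, and the URL shown in the usage note is an assumed example of a management upload endpoint.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch for illustration; not the plug-in's ASUploadConnection. */
public final class UploadConnectionSketch {

    private UploadConnectionSketch() {
    }

    /**
     * Prepares, without connecting, an HTTP POST connection with explicit timeouts.
     * openConnection() performs no network I/O; the socket is opened on first use.
     *
     * @param uploadUrl            upload endpoint (assumed; supplied by the caller)
     * @param connectTimeoutMillis maximum time to establish the connection
     * @param readTimeoutMillis    maximum time to wait for response data
     */
    public static HttpURLConnection prepare(String uploadUrl,
                                            int connectTimeoutMillis,
                                            int readTimeoutMillis) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(uploadUrl).toURL().openConnection();
            connection.setRequestMethod("POST");
            connection.setDoOutput(true); // the request will carry a body
            connection.setConnectTimeout(connectTimeoutMillis);
            connection.setReadTimeout(readTimeoutMillis);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller might use something like prepare("http://localhost:9990/management/add-content", 5000, 600000), where the host, port, and path are assumptions. Note that the read timeout bounds the wait for response data; it does not by itself limit how long streaming a large request body may take.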

Comment 1 Larry O'Leary 2016-10-04 15:44:19 UTC
This issue seems to have been reported in the past and perhaps not totally fixed.

Comment 5 Michael Burman 2017-03-08 14:05:00 UTC
Fixed in the master:

commit 0a67b2da3772c3671e4088c3238e40b68719e68a
Author: Michael Burman <miburman>
Date:   Wed Mar 8 16:03:22 2017 +0200

    [BZ 1381640] Add configurable timeout for domain deployment server-group assignment

Comment 7 Michael Burman 2017-03-08 14:10:34 UTC
Created attachment 1261293
Assign to group example with timeout

Comment 9 Michael Burman 2017-03-08 15:04:33 UTC
Additional commit from the master:

commit 616c609dc51baac9ea31297da991f09146ddc959
Author: Michael Burman <miburman>
Date:   Wed Mar 8 17:02:23 2017 +0200

    Fix potential NPE in the DomainDeployment

Comment 18 Michael Burman 2017-04-19 15:08:17 UTC
In the master:

commit e59d8ba96237664a4cfcdb60ac0096e868084906
Author: Michael Burman <miburman>
Date:   Wed Apr 19 18:07:07 2017 +0300

    [BZ 1381640] Use userProvidedTimeoutMillis for ASUploadConnection timeout and increase the default timeout

Comment 20 Michael Burman 2017-04-26 10:51:43 UTC
And in the master:

commit 35f31d0d61f0efa3b923cb55c3666b5bcd53050f
Author: Michael Burman <miburman>
Date:   Wed Apr 26 13:49:53 2017 +0300

    [BZ 1381640] Add server connection settings to manage deploymentTimeouts
