Bug 951738

Summary: [hotfix] Use Commons HttpClient in ASConnection instead of JDK's HttpUrlConnection
Product: [JBoss] JBoss Operations Network Reporter: Larry O'Leary <loleary>
Component: Plugin -- JBoss EAP 6Assignee: Larry O'Leary <loleary>
Status: CLOSED ERRATA QA Contact: Libor Zoubek <lzoubek>
Severity: high Docs Contact:
Priority: urgent    
Version: JON 3.1.2CC: theute, tsegismo
Target Milestone: ---   
Target Release: JON 3.1.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 950660 Environment:
Last Closed: 2013-05-03 14:15:11 UTC Type: Support Patch
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 950660    
Bug Blocks: 951739, 952330, 952331, 952333, 958984    

Description Larry O'Leary 2013-04-12 22:41:21 UTC
Back-port to JBoss ON 3.1.2 as the current implementation used results in connection management/caching issues that can not be fixed due to the underlying implementation in the JVM.

This fix is necessary to resolve other AS7 plug-in issues in JBoss ON 3.1.2. For example Bug 918677 - [as7] ASConnection becomes invalid and stale if security realm is reset or has a partition event resulting in loss of management and monitoring of AS7 resources

+++ This bug was initially created as a clone of Bug #950660 +++

+++ This bug was initially created as a clone of Bug #929200 +++

+++ This bug was initially created as a clone of Bug #929198 +++

In AS7 plugin, ASConnection class currently uses JDK's HttpUrlConnection to communicate with an AS7 instance.

This class should be updated to use Commons HttpClient, like ASUploadConnection class does since release 4.6 (see BZ887320)

Commons HttpClient gives us more control on the communication process and has proven to be more reliable.

--- Additional comment from Thomas Segismont on 2013-03-29 14:28:38 CET ---

See BZ918677 for the type of problems we have with JDK's HttpUrlConnection

--- Additional comment from Charles Crouch on 2013-03-29 15:34:29 CET ---

Since this seems to be related more to fixing a bug than adding a feature I've removed the FutureFeature keyword. I've also unset the Target Release field to make sure this gets triaged.

A change in the http library used by the AS7 plugin will require a full regression test run, since http communication is used so widely to talk to AS7

--- Additional comment from Thomas Segismont on 2013-04-02 10:57:36 CEST ---

BZ918677 has been partially fixed (only works on some platforms). Moving ASConnection to Commons HttpClient will allow to get rid of the problem on all platforms.

ASConnection is the core communication component of AS7 plugin. Changing its implementation will involve extensive non regression testing from QE before the next JON release. But as ASUploadConnection has already been moved to commons HttpClient due to BZ887320, it has to be done anyway.

As for the benefits of this change:
* cohesion between ASConnection and ASUploadConnection implementations
* easier debugging as JDK's HttpUrlConnection is closed source (at least in Oracle's VM)
* Commons HttpClient gives the developer more control on the communication process (when/how open or close connections, configurable pooling, unshared authentication)

--- Additional comment from Thomas Segismont on 2013-04-02 11:56:28 CEST ---

And for the risks.

No instance of HttpUrlConnection is shared outside ASConnection. So components in AS7 plugin only rely on ASConnection contract.

Consequently, while ASConnection is the core communication component of AS7 plugin, the risk of regression is very low.

--- Additional comment from Thomas Segismont on 2013-04-08 23:42:52 CEST ---

Fixed in master - 21ec5e2

AS7 plugin itests passed succesfully

Beware that this touches the core connection part (ASConnection class) and hence requires a full non regresssion testing of AS7 plugin.

--- Additional comment from Thomas Segismont on 2013-04-09 12:08:13 CEST ---

Reworked to increased delay in AS7 itests to discover all resources
master - 3b18013

--- Additional comment from Thomas Segismont on 2013-04-10 11:17:22 EDT ---

Changes reported from master to release/jon3.1.x - 89d941d

All unit and integration tests pass on my box and on Jenkins:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/RHQ/job/rhq-master/org.rhq$rhq-jboss-as-7-plugin/

Comment 2 Larry O'Leary 2013-04-15 17:35:34 UTC
Needs verified from hotfix patch based on build org.jboss.on-jboss-on-parent-3.1.2.GA-11[1].



[1]: https://brewweb.devel.redhat.com//buildinfo?buildID=266751

Comment 3 Larry O'Leary 2013-04-15 17:48:43 UTC
The as-7 plug-in from build org.jboss.on-jboss-on-parent-3.1.2.GA-11 is contained as attachment 736012 [details] of Bug 952330 - EAP Plugin Pack Hotfix-02 for JBoss ON 3.1.2

Comment 4 Libor Zoubek 2013-04-17 09:33:27 UTC
Above patch causes regression.

I am randomly getting this error on various stuff I do with EAP 6.0 server

java.lang.Exception: The target server failed to respond

Reload operation, Creating/Deleting resources,deployments

Another thing I observe: it seems that new http connection instance cannot be shared among threads.

I run "discovery -f" on agent. While it's running, I run "reload" operation on AS7 .. such operation seems to be blocked for a long time (usually it takes abou 5-10 seconds). In this case it takes about 2 minutes and then it fails with "The target server failed to respond"

Comment 6 Thomas Segismont 2013-04-17 15:04:34 UTC
(In reply to comment #4)
> Above patch causes regression.
> 
> I am randomly getting this error on various stuff I do with EAP 6.0 server
> 
> java.lang.Exception: The target server failed to respond

This indeed may come from HttpClient library. I'll have to check why it happens.

> 
> Reload operation, Creating/Deleting resources,deployments

Do you mean you got the previous exception while doing these actions?

> 
> Another thing I observe: it seems that new http connection instance cannot
> be shared among threads.
> 
> I run "discovery -f" on agent. While it's running, I run "reload" operation
> on AS7 .. such operation seems to be blocked for a long time (usually it
> takes abou 5-10 seconds). In this case it takes about 2 minutes and then it
> fails with "The target server failed to respond"

How can I reproduce this? On my box, "discovery -f" completes before I can click the UI.

Comment 7 Thomas Segismont 2013-04-17 20:26:39 UTC
Partially fixed in master - 915ab3d (no more "The target server failed to respond" errors).

Keepalive timeout paramater was too high. The server could close connection server-side and the agent would still try to reuse it.
    
Now HttpClient uses the same value as HttpUrlConnection did to expire connections (5 seconds if the server does not send its keepalive timeout in reponse headers).

Deployments should just work as they don't use persistent connections from ASConnection class. My testing confirms that.

As for the reload operation problem, the operation will fail even if you don't trigger a full discovery on the agent at the same time. I tried the same steps with unpatched 3.1.2 build and it also fails. The message is just different.

Working on a complementary fix.

Comment 8 Thomas Segismont 2013-04-18 09:11:09 UTC
Pushed commit 1c01db5 to release/jon3.1.x branch (cherry-picked from master - 915ab3d) as integration tests passed:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/RHQ/job/rhq-master/org.rhq$rhq-jboss-as-7-plugin/2168/

Comment 12 Larry O'Leary 2013-05-03 14:15:11 UTC
Fixed in EAP Plugin Pack Hotfix-02 for JBoss ON 3.1.2.