Bug 1166383 - JBoss Remoting SSL transport fails when performing streaming due to stale connections
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Communications Subsystem
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER01
Target Release: JON 3.3.1
Assignee: John Mazzitelli
QA Contact: Mike Foley
URL:
Whiteboard:
Keywords: Regression
Depends On: 1176183
Blocks: 1175851
Reported: 2014-11-20 22:44 UTC by Larry O'Leary
Modified: 2019-02-15 13:53 UTC (History)
Clone Of:
: 1175851 (view as bug list)
Last Closed: 2015-02-27 19:58:13 UTC


Attachments
patch to add new transport param (13.73 KB, patch)
2014-11-21 19:08 UTC, John Mazzitelli


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 511833 None None None Never
Red Hat Bugzilla 1049009 None None None Never
Red Hat Bugzilla 1175957 None None None Never

Internal Trackers: 1049009 1175957

Description Larry O'Leary 2014-11-20 22:44:08 UTC
Description of problem:
If the remote connection between the JBoss ON server and agent uses SSL and has been idle for 1 minute, communication between the server and agent fails on the next request, for example when deploying a resource bundle.

This is due to certain SSLExceptions being interpreted as non-recoverable conditions.
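As a rough illustration of the distinction involved, the sketch below walks an exception's cause chain to see whether an SSLException is ultimately wrapping a SocketException (as in the "Broken pipe" trace reported in this bug). This is not the JBoss Remoting implementation; the class and method names are hypothetical and only standard JDK types are used.

```java
import java.net.SocketException;
import javax.net.ssl.SSLException;

// Hypothetical helper for illustration only; not part of JBoss Remoting or RHQ.
public class StaleConnectionCheck {

    /** Returns true if the throwable or any of its causes is a SocketException. */
    public static boolean hasSocketExceptionCause(Throwable t) {
        // Bound the walk so a self-referential cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof SocketException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        // Mirrors the nesting seen in the server.log excerpt of this report.
        SSLException stale = new SSLException("Connection has been shutdown",
                new SSLException(new SocketException("Broken pipe")));
        SSLException other = new SSLException("handshake failure");

        System.out.println(hasSocketExceptionCause(stale)); // prints true
        System.out.println(hasSocketExceptionCause(other)); // prints false
    }
}
```

A caller could treat the first case as a stale socket that is worth retrying on a fresh connection, and the second as a genuine SSL failure.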

Version-Release number of selected component (if applicable):
3.2.3

How reproducible:
Always

Steps to Reproduce:
1.  Install JBoss ON 3.1.2 system.
2.  Configure agent/server SSL encryption.
3.  Start JBoss ON system.
4.  Import platform resource into inventory.
5.  Add platform resource to resource group.
6.  Create helloworld-bundle bundle.
7.  Check agent to ensure server is not currently connected to agent:

        _count=0; while true; do netstat -anpt | grep 16163; sleep 2s; _count=$(($_count+2)); echo "$_count seconds"; done

8.  Invoke the *View Process List* platform operation.
9.  Wait about a minute and check to see that the server's network socket to the agent is in the state *CLOSE_WAIT*.
10. Deploy helloworld-bundle to platform resource group.

Actual results:
Bundle fails to deploy and server.log includes the following error:

    ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] (http-/0.0.0.0:7080-5) {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=EhtA72he/fcxoEpuDEQITmL04UgkR4+Jvmaqz9vcYSE3+8D9XOkC4HnvW/uKbSeMi8Y=, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: org.jboss.remoting.InvocationFailureException:Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:java.net.SocketException: Broken pipe -> java.net.SocketException:Broken pipe. Cause: org.jboss.remoting.InvocationFailureException: Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe

Expected results:
No errors and bundle gets deployed.

Additional Info:
This issue was fixed in JBoss Remoting 2.5.4 as identified in https://issues.jboss.org/browse/JBREM-1245.

Although JBoss ON is using this version of JBoss Remoting -- 2.5.4.SP5 as of JBoss ON 3.2 -- the fix provided in JBREM-1245 also requires the remoting transport parameter generalizeSocketException to be set to true.

I recommend that we add generalizeSocketException=true to the remoting transport-params if it isn't explicitly defined by user-provided configuration.
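A minimal sketch of that "add it only if the user has not set it" rule is shown below. It is not the attached patch: the class name, method name, and the example locator URIs (host, port, and the rhqtype parameter) are assumptions made for illustration.

```java
// Hypothetical helper for illustration only; not the code from the attached patch.
public class TransportParamDefaults {

    static final String PARAM = "generalizeSocketException";

    /** Appends generalizeSocketException=true unless the locator URI already sets it. */
    public static String withDefault(String locatorUri) {
        int q = locatorUri.indexOf('?');
        if (q < 0) {
            // No transport params at all: start the query string.
            return locatorUri + "?" + PARAM + "=true";
        }
        String query = locatorUri.substring(q + 1);
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (name.equals(PARAM)) {
                // Explicit user setting (true or false) is left untouched.
                return locatorUri;
            }
        }
        String separator = query.isEmpty() ? "" : "&";
        return locatorUri + separator + PARAM + "=true";
    }

    public static void main(String[] args) {
        // Example URIs are made up; real endpoints will differ.
        System.out.println(withDefault("sslsocket://agent.example.com:16163/?rhqtype=agent"));
        System.out.println(withDefault("sslsocket://agent.example.com:16163/?generalizeSocketException=false"));
    }
}
```

With these inputs the first URI gains `&generalizeSocketException=true`, while the second is returned unchanged because the user has overridden the value.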

Comment 2 John Mazzitelli 2014-11-21 19:08:14 UTC
Created attachment 959926
patch to add new transport param

Attaching a patch that should fix this. I have not run the replication procedure to test it, but I ran normally (using non-secure endpoints in server and agent) and it worked, so at least the defaults didn't break anything. I also added two unit tests to make sure the transport params get this new param added properly if the user has not already specified it.

Comment 3 Larry O'Leary 2014-11-22 00:03:13 UTC
Mazz, I built your patch and tested it locally with the reproducer steps mentioned above and this seems to resolve the issue.

I also tested overriding the value (i.e. setting it to false) and that too works as expected. Thank you very much.

I would say this should be good to get into master and queued up for a cherry-pick into JBoss ON 3.3.1 when we are ready.

Comment 4 John Mazzitelli 2014-11-22 00:08:46 UTC
committed to master branch:

commit 2a2ffa4dc443f0064365e7ef56deee4b9e3c688d
Author: John Mazzitelli <mazz@redhat.com>
Date:   Fri Nov 21 19:07:43 2014 -0500

    BZ 1166383 - ensure a transport param is added to the remoting locator URL

Comment 5 Giuseppe Bonocore 2014-12-02 12:37:19 UTC
Hello, is there any possibility of getting this fix in JON 3.2?

Comment 7 Larry O'Leary 2014-12-03 01:40:06 UTC
The fix for this is already available for 3.2 by applying the configuration update mentioned in the solution 511833[1]. If you need further assistance, please contact Red Hat Global Support Services.

[1]: https://access.redhat.com/solutions/511833

Comment 9 Michael Burman 2015-01-15 08:40:14 UTC
Cherry-picked to release/jon3.3.x:

commit 9ae3875a38f4b70bb3ecf04413c81a87609fb2dc
Author: John Mazzitelli <mazz@redhat.com>
Date:   Fri Nov 21 19:07:43 2014 -0500

    BZ 1166383 - ensure a transport param is added to the remoting locator URL
    
    (cherry picked from commit 2a2ffa4dc443f0064365e7ef56deee4b9e3c688d)
    
    Conflicts:
        modules/enterprise/agent/src/test/java/org/rhq/enterprise/agent/AgentConfigurationTest.java

Comment 10 Larry O'Leary 2015-01-29 16:41:08 UTC
To verify, in addition to the steps listed in comment 0, confirm that the generalizeSocketException=true parameter appears in the agent's endpoint address on the agent topology page (Administration > Topology > Agents, then view the agent details and verify that Remote Endpoint contains generalizeSocketException=true).
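If the Remote Endpoint value is copied out of the UI, the check can also be scripted. The sketch below parses the query portion of an endpoint string and reports whether generalizeSocketException is set to true. The class name and the sample endpoint are hypothetical; the real endpoint string may carry other parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical verification helper for illustration only.
public class EndpointCheck {

    /** Splits the part after '?' into name/value pairs, preserving order. */
    public static Map<String, String> queryParams(String endpoint) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = endpoint.indexOf('?');
        if (q < 0) {
            return params;
        }
        for (String pair : endpoint.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    /** True only when the endpoint explicitly carries generalizeSocketException=true. */
    public static boolean generalizesSocketExceptions(String endpoint) {
        return "true".equals(queryParams(endpoint).get("generalizeSocketException"));
    }

    public static void main(String[] args) {
        // Sample endpoint is made up for the example.
        String endpoint = "sslsocket://agent.example.com:16163/?generalizeSocketException=true&rhqtype=agent";
        System.out.println(generalizesSocketExceptions(endpoint)); // prints true
    }
}
```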

Comment 11 Larry O'Leary 2015-01-29 17:00:39 UTC
Moving this to ON_QA as this was in ER01 and ready for verification.
