Bug 1166383 - JBoss Remoting SSL transport fails when performing streaming due to stale connections
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Communications Subsystem
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER01
Target Release: JON 3.3.1
Assignee: John Mazzitelli
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On: 1176183
Blocks: 1175851
 
Reported: 2014-11-20 22:44 UTC by Larry O'Leary
Modified: 2019-02-15 13:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1175851
Environment:
Last Closed: 2015-02-27 19:58:13 UTC
Type: Bug
Embargoed:


Attachments
patch to add new transport param (13.73 KB, patch)
2014-11-21 19:08 UTC, John Mazzitelli


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1049009 1 None None None 2021-01-20 06:05:38 UTC
Red Hat Bugzilla 1175957 0 urgent CLOSED Retry streaming connection if it fails due to stale connection in pool 2021-02-22 00:41:40 UTC
Red Hat Knowledge Base (Solution) 511833 0 None None None Never

Internal Links: 1049009 1175957

Description Larry O'Leary 2014-11-20 22:44:08 UTC
Description of problem:
If the remote connection between the JBoss ON server and agent uses SSL and has been idle for 1 minute, the next request between the JON server and agent fails. One example is deploying a resource bundle.

This is due to certain SSLExceptions being interpreted as non-recoverable conditions.

Version-Release number of selected component (if applicable):
3.2.3

How reproducible:
Always

Steps to Reproduce:
1.  Install JBoss ON 3.1.2 system.
2.  Configure agent/server SSL encryption.
3.  Start JBoss ON system.
4.  Import platform resource into inventory.
5.  Add platform resource to resource group.
6.  Create helloworld-bundle bundle.
7.  Check agent to ensure server is not currently connected to agent:

        _count=0; while true; do netstat -anpt | grep 16163; sleep 2s; _count=$(($_count+2)); echo "$_count seconds"; done

8.  Invoke the *View Process List* platform operation.
9.  Wait about a minute and check to see that the server's network socket to the agent is in the state *CLOSE_WAIT*.
10. Deploy helloworld-bundle to platform resource group.

Actual results:
Bundle fails to deploy and server.log includes the following error:

    ERROR [org.rhq.enterprise.communications.command.client.ClientCommandSenderTask] (http-/0.0.0.0:7080-5) {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=EhtA72he/fcxoEpuDEQITmL04UgkR4+Jvmaqz9vcYSE3+8D9XOkC4HnvW/uKbSeMi8Y=, rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[schedule], targetInterfaceName=org.rhq.core.clientapi.agent.bundle.BundleAgentService}]]. Cause: org.jboss.remoting.InvocationFailureException:Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe -> javax.net.ssl.SSLException:java.net.SocketException: Broken pipe -> java.net.SocketException:Broken pipe. Cause: org.jboss.remoting.InvocationFailureException: Unable to perform invocation; nested exception is: 
        javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe
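
The nested exceptions in this log can be read as a cause chain: the outer SSLException wraps a java.net.SocketException ("Broken pipe"), which is what a stale pooled connection looks like at the socket level. The following is a minimal sketch of how a caller could detect that case by walking the chain. It is illustrative only and is not the JBoss Remoting implementation; the class and method names are hypothetical.

```java
import java.net.SocketException;
import javax.net.ssl.SSLException;

// Hypothetical helper, not part of JBoss Remoting or RHQ.
final class CauseChain {

    // Returns true if any throwable in the cause chain is a SocketException.
    // The depth limit guards against cyclic cause chains.
    static boolean hasSocketExceptionCause(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            if (c instanceof SocketException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild a chain shaped like the one in the server.log excerpt.
        Throwable stale = new SSLException("Connection has been shutdown",
                new SSLException(new SocketException("Broken pipe")));
        Throwable other = new SSLException("handshake failure");

        System.out.println(hasSocketExceptionCause(stale));
        System.out.println(hasSocketExceptionCause(other));
    }
}
```

A caller that sees a SocketException in the chain could treat the failure as a dead connection that is worth retrying on a fresh socket, rather than as a fatal SSL error.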

Expected results:
No errors and bundle gets deployed.

Additional Info:
This issue was fixed in JBoss Remoting 2.5.4 as identified in https://issues.jboss.org/browse/JBREM-1245.

Although JBoss ON is using this version of JBoss Remoting -- 2.5.4.SP5 as of JBoss ON 3.2 -- the fix provided in JBREM-1245 also requires the remoting transport parameter generalizeSocketException to be set to true.

I recommend that we add generalizeSocketException=true to the remoting transport-params if it isn't explicitly defined by user provided configuration.
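
As a rough illustration of that recommendation, the sketch below appends generalizeSocketException=true to a transport-params string only when the user has not already set the parameter. It assumes the params are a single string of name=value pairs joined by "&". The class name, method name, and the foo=bar placeholder are hypothetical, and this is not the code from the attached patch.

```java
// Hypothetical helper, not taken from the RHQ code base.
final class TransportParamDefaults {

    static final String PARAM = "generalizeSocketException";

    // Returns the params unchanged if PARAM is already present (with any
    // value), otherwise returns them with PARAM=true appended.
    static String withGeneralizeSocketException(String transportParams) {
        String params = (transportParams == null) ? "" : transportParams.trim();
        for (String pair : params.split("&")) {
            String name = pair.split("=", 2)[0].trim();
            if (name.equals(PARAM)) {
                return params; // an explicit user setting wins, even if false
            }
        }
        String added = PARAM + "=true";
        return params.isEmpty() ? added : params + "&" + added;
    }

    public static void main(String[] args) {
        System.out.println(withGeneralizeSocketException("foo=bar"));
        System.out.println(withGeneralizeSocketException("foo=bar&generalizeSocketException=false"));
        System.out.println(withGeneralizeSocketException(""));
    }
}
```

Leaving an explicit user value untouched keeps the default overridable, so setting the parameter to false still works.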

Comment 2 John Mazzitelli 2014-11-21 19:08:14 UTC
Created attachment 959926
patch to add new transport param

Attaching a patch that should fix this. I have not run the reproduction steps to test it, but I have run it normally (using non-secure endpoints in server and agent) and it worked, so at least the defaults didn't break anything. I also added two unit tests to make sure the transport params get this new param added properly if the user has not already specified it.

Comment 3 Larry O'Leary 2014-11-22 00:03:13 UTC
Mazz, I built your patch and tested it locally with the reproducer steps mentioned above and this seems to resolve the issue.

I also tested overriding the value (i.e. setting it to false) and that too works as expected. Thank you very much.

I would say this should be good to get into master and queued up for a cherry-pick into JBoss ON 3.3.1 when we are ready.

Comment 4 John Mazzitelli 2014-11-22 00:08:46 UTC
committed to master branch:

commit 2a2ffa4dc443f0064365e7ef56deee4b9e3c688d
Author: John Mazzitelli <mazz>
Date:   Fri Nov 21 19:07:43 2014 -0500

    BZ 1166383 - ensure a transport param is added to the remoting locator URL

Comment 5 Giuseppe Bonocore 2014-12-02 12:37:19 UTC
Hello, is there any possibility of getting this fix into JON 3.2?

Comment 7 Larry O'Leary 2014-12-03 01:40:06 UTC
The fix for this is already available for 3.2 by applying the configuration update mentioned in the solution 511833[1]. If you need further assistance, please contact Red Hat Global Support Services.

[1]: https://access.redhat.com/solutions/511833

Comment 9 Michael Burman 2015-01-15 08:40:14 UTC
Cherry-picked to release/jon3.3.x:

commit 9ae3875a38f4b70bb3ecf04413c81a87609fb2dc
Author: John Mazzitelli <mazz>
Date:   Fri Nov 21 19:07:43 2014 -0500

    BZ 1166383 - ensure a transport param is added to the remoting locator URL
    
    (cherry picked from commit 2a2ffa4dc443f0064365e7ef56deee4b9e3c688d)
    
    Conflicts:
        modules/enterprise/agent/src/test/java/org/rhq/enterprise/agent/AgentConfigurationTest.java

Comment 10 Larry O'Leary 2015-01-29 16:41:08 UTC
To verify, in addition to the steps listed in comment 0, confirm that the generalizeSocketException=true parameter appears in the agent's end-point address on the agent topology page. (Administration > Topology > Agents, then view the agent details and verify that Remote Endpoint contains generalizeSocketException=true.)
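
If the check needs to be scripted rather than done by eye, a sketch like the following could test an endpoint string copied from that page. It assumes the Remote Endpoint is a URI whose query string carries the transport params; the example host, port, and class name are placeholders, not values from this bug.

```java
import java.net.URI;

// Hypothetical verification helper, not part of RHQ.
final class EndpointCheck {

    // Returns true if the endpoint's query string contains name=value exactly.
    static boolean hasParam(String endpoint, String name, String value) {
        String query = URI.create(endpoint).getQuery();
        if (query == null) {
            return false; // no query string at all
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && kv[0].equals(name) && kv[1].equals(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Placeholder endpoint; paste the real Remote Endpoint value here.
        String endpoint = "socket://agent.example.com:16163/?foo=bar&generalizeSocketException=true";
        System.out.println(hasParam(endpoint, "generalizeSocketException", "true"));
    }
}
```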

Comment 11 Larry O'Leary 2015-01-29 17:00:39 UTC
Moving this to ON_QA as this was in ER01 and ready for verification.

