Bug 1383507

Summary: [GSS](6.4.z) AbstractHandleableCloseable.close infinite wait
Product: [JBoss] JBoss Enterprise Application Platform 6
Reporter: Robert Bost <rbost>
Component: Remoting
Assignee: Enrique Gonzalez Martinez <egonzale>
Status: CLOSED CURRENTRELEASE
QA Contact: Jitka Kozana <jkudrnac>
Severity: high
Docs Contact:
Priority: high
Version: 6.4.6
CC: bmaxwell, cdewolf, david.lloyd, rfoyle
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-17 07:39:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Robert Bost 2016-10-10 20:43:48 UTC
Description of problem: Copied from upstream bug report from JIRA (https://issues.jboss.org/browse/REM3-232):

When an Endpoint is closed while some connections are still in progress, we observe that the close method may never return. Log analysis has shown that the following scenario can happen (using the WildFly CLI process):
1. The CLI creates a new Endpoint.
2. The CLI attempts to connect to the server. It takes 10 seconds for the server to reply to the connection and for the client to receive the "authentication complete".
3. The CLI main thread detects that the connection failed (the connection timeout is 5 seconds) and throws an exception. This fires the shutdown hook.
4. The Endpoint is closed synchronously and the Remoting remote connection provider is closed (locking the connectionLock, then unlocking it). The Endpoint is now in the CLOSING state.
5. Just after the provider is closed, a trace shows that the .ClientConnectionOpenListener$Authentication is registered. This means that although the provider has been closed, it still accepts a new connection and updates its data structure. There is a check for whether the EndpointImpl has been closed (isCloseFlagSet), but the EndpointImpl is in the CLOSING state, so the new incoming connection is not closed.
This scenario is the likely cause of the hang that we are observing.
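
For illustration, the race described above can be modeled in a few lines of Java. This is a minimal sketch under stated assumptions, not the JBoss Remoting implementation: EndpointCloseSketch, register, beginClose, tryFinishClose and the rejectWhileClosing flag are hypothetical names, and the sketch assumes that the close check only refuses connections once the state is CLOSED. With that lenient check, a connection registered while the endpoint is CLOSING is never closed, so the close can never finish. Refusing registrations as soon as the state leaves OPEN lets it complete. Whether the actual fix works this way is not established here.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the reported race; NOT the JBoss Remoting code.
final class EndpointCloseSketch {
    enum State { OPEN, CLOSING, CLOSED }

    private final Set<String> connections = new HashSet<>();
    // Assumption: a possible remedy is to treat CLOSING like CLOSED when registering.
    private final boolean rejectWhileClosing;
    private State state = State.OPEN;

    EndpointCloseSketch(boolean rejectWhileClosing) {
        this.rejectWhileClosing = rejectWhileClosing;
    }

    // Registers a connection; returns false if it was refused
    // (a refused connection would be closed immediately by the caller).
    synchronized boolean register(String id) {
        boolean refuse = rejectWhileClosing
                ? state != State.OPEN      // strict: refuse once closing has begun
                : state == State.CLOSED;   // lenient: refuse only when fully closed
        if (refuse) {
            return false;
        }
        connections.add(id);
        return true;
    }

    // Models the synchronous close: the connections known right now are closed
    // and the endpoint enters CLOSING.
    synchronized void beginClose() {
        if (state == State.OPEN) {
            state = State.CLOSING;
            connections.clear();
        }
    }

    // A blocking close() could only return once this yields true,
    // i.e. once no registered connection is left open.
    synchronized boolean tryFinishClose() {
        if (state == State.CLOSING && connections.isEmpty()) {
            state = State.CLOSED;
        }
        return state == State.CLOSED;
    }

    // Replays the scenario: close begins, then a late connection finishes authentication.
    static boolean closeCompletes(boolean rejectWhileClosing) {
        EndpointCloseSketch endpoint = new EndpointCloseSketch(rejectWhileClosing);
        endpoint.beginClose();
        endpoint.register("late-connection");
        return endpoint.tryFinishClose();
    }

    public static void main(String[] args) {
        System.out.println("lenient check: close completes = " + closeCompletes(false));
        System.out.println("strict check:  close completes = " + closeCompletes(true));
    }
}
```

In this model the lenient check reports false (the close would wait forever) and the strict check reports true.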


Version-Release number of selected component (if applicable): 3.3.4.Final


Actual results: Thread hang in AbstractHandleableCloseable


Expected results: No hang


Additional info:
The workaround seems to be to increase the client timeout (--timeout).

Comment 1 Enrique Gonzalez Martinez 2016-10-24 09:19:16 UTC
The customer is using "3.3.7.Final-redhat-1". Version 3.3.8 contains fixes for a race condition that could be related to this problem. The 6.4.z stream has been upgraded to 3.3.8; the upgrade is already included in 6.4.9 and later.
The changes from 3.3.7 to 3.3.8 include a couple of fixes for this sort of race condition:


https://github.com/jboss-remoting/jboss-remoting/compare/3.3.7.Final...3.3.8.Final

If the customer has a reproducer, the way to proceed would be to upgrade JBoss EAP and retest.

Comment 2 Richard Foyle 2016-10-25 19:25:17 UTC
What EAP version would the customer need to test with? They are able to reproduce at will in their environment.

Comment 3 Richard Foyle 2016-10-25 20:29:37 UTC
I asked the customer to reproduce the issue on EAP 6.4.11.