Bug 1603973

Summary: [GSS](6.4.z) HornetQ cannot failover with network disconnected
Product: [JBoss] JBoss Enterprise Application Platform 6
Reporter: Clebert Suconic <csuconic>
Component: HornetQ
Assignee: jboss-set
Status: CLOSED CURRENTRELEASE
QA Contact: Peter Mackay <pmackay>
Severity: unspecified
Docs Contact:
Priority: high
Version: 6.4.21
CC: bmaxwell, csuconic, dcihak, mrobson, msvehla, rstancel
Target Milestone: CR1
Target Release: EAP 6.4.21
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-19 12:42:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1567790, 1610346, 1610355

Description Clebert Suconic 2018-07-19 19:44:46 UTC
Description of problem:

HornetQ uses Netty 3 with OIO, i.e. blocking I/O, where a write is basically a call on a SocketOutputStream.
When the cable is disconnected or the network is disabled, the writer blocks and locks out the Pinger, so the Pinger is not able to clean up the connection.
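
For illustration only, here is a minimal sketch of the lock-up described above. It does not use HornetQ or Netty code: the class, the lock, and the method names are hypothetical, and the stalled socket write is simulated with a latch. It assumes the writer holds a per-connection lock while it is blocked, and shows that a cleanup attempt needing the same lock cannot proceed until the write returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BlockedWriterSketch {
    // Hypothetical stand-in for a per-connection write lock (an assumption, not HornetQ's field).
    static final ReentrantLock writeLock = new ReentrantLock();

    /** Simulates a blocking write: takes the lock and stalls until 'networkBack' is counted down. */
    static Thread startStalledWriter(CountDownLatch lockHeld, CountDownLatch networkBack) {
        Thread t = new Thread(() -> {
            writeLock.lock();
            try {
                lockHeld.countDown();
                // Stands in for a blocking socket write that does not return.
                networkBack.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                writeLock.unlock();
            }
        }, "writer");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Pinger-side cleanup attempt: true only if the lock was obtained within the timeout. */
    static boolean pingerTryCleanup(long timeoutMillis) throws InterruptedException {
        if (!writeLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // the writer is still stuck holding the lock
        }
        try {
            return true; // connection cleanup would happen here
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch networkBack = new CountDownLatch(1);
        Thread writer = startStalledWriter(lockHeld, networkBack);
        lockHeld.await(); // wait until the writer owns the lock
        System.out.println("cleanup while writer stalled: " + pingerTryCleanup(200));
        networkBack.countDown(); // let the simulated write return
        writer.join();
        System.out.println("cleanup after writer returns: " + pingerTryCleanup(200));
    }
}
```

In this sketch the first cleanup attempt times out and the second succeeds. A cleanup path that waits on the lock without a timeout would instead hang for as long as the write stays blocked.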

Note that this does not apply to Artemis, which is 100% non-blocking, so I'm not sure these fixes will be applied upstream in Artemis.

Version-Release number of selected component (if applicable):


How reproducible:
100% reproducible.

Steps to Reproduce:
1. Have a producer send so much on a connection that it flow controls the server. (Disable flow control to make this easier to reproduce.)
2. Disconnect the network (ifconfig down or pull the cable).

or:

1. I have developed NettyManualFailoverTest, which will be part of the fix. Follow the steps of the test and it will show the issue.

Actual results:

The connection writer hangs forever, and the Pinger can't disconnect the connection.

Expected results:

The connection fails, or a retry happens regularly.

Additional info: