Bug 959789 - HQ core bridge does not failover
Summary: HQ core bridge does not failover
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER5
Target Release: EAP 6.1.1
Assignee: Clebert Suconic
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 994214
Reported: 2013-05-05 18:42 UTC by Miroslav Novak
Modified: 2013-09-16 20:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
In previous versions of JBoss EAP 6, a HornetQ core bridge server did not fail over properly to a backup HornetQ server when the primary HornetQ server became unavailable. This issue occurred because the HornetQ core bridge server attempted to reconnect to any other server node rather than to the correct backup HornetQ server. This issue has been fixed in this release of JBoss EAP 6: a HornetQ core bridge server now always retries connecting to the backup HornetQ server when the primary HornetQ server becomes unavailable.
Clone Of:
Environment:
Last Closed: 2013-09-16 20:27:21 UTC
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 900764 | 0 | high | CLOSED | Core bridge doesn't failover in HornetQ live-backup setup | 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker | HORNETQ-1218 | 0 | None | None | None | Never

Internal Links: 900764

Description Miroslav Novak 2013-05-05 18:42:22 UTC
The HornetQ core bridge does not fail over from live to backup. I wrote this test while trying to verify bz#900764, but this appears to be a slightly different scenario.

Test scenario:
1. Start two EAP 6.1.0.ER6 servers as a HornetQ live/backup pair, with OutQueue deployed.
2. Start a third EAP 6.1.0.ER6 server that has the HQ core bridge and InQueue deployed:
                <bridges>
                    <bridge name="myBridge">
                        <queue-name>jms.queue.InQueue</queue-name>
                        <forwarding-address>jms.queue.OutQueue</forwarding-address>
                        <ha>true</ha>
                        <reconnect-attempts>-1</reconnect-attempts>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <discovery-group-ref discovery-group-name="dg-group1"/>
                    </bridge>
                </bridges>
3. Start a producer that sends messages to InQueue on the third server.
4. Start a consumer that reads messages from OutQueue on the first (live) server.
5. Kill the first (live) server.
6. Check whether the consumer from step 4 is still receiving messages from OutQueue. This verifies that both the HQ core bridge and the consumer failed over to the backup.
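
The check in step 6 could be automated by comparing what the producer sent with what the consumer received once the live server has been killed. The following is a minimal sketch in plain Java and is not part of the original test: the class name `FailoverCheck`, its method names, and the use of string message IDs are assumptions for illustration, and the wiring to a real JMS producer and consumer is left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not from the original test): compares the IDs the
// producer sent with the IDs the consumer received during the test run.
final class FailoverCheck {

    // IDs that were sent but never received, in send order.
    static List<String> lost(List<String> sent, List<String> received) {
        Set<String> seen = new HashSet<>(received);
        List<String> missing = new ArrayList<>();
        for (String id : sent) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    // IDs that were received more than once, each reported a single time.
    static List<String> duplicates(List<String> received) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> dups = new ArrayList<>();
        for (String id : received) {
            if (!seen.add(id) && reported.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }

    // True when every sent message arrived exactly once (order is ignored).
    static boolean allDeliveredOnce(List<String> sent, List<String> received) {
        return lost(sent, received).isEmpty() && duplicates(received).isEmpty();
    }
}
```

With the bridge stuck on the dead live server, `lost` would be expected to return every message sent after step 5; since the bridge is configured with `use-duplicate-detection`, a non-empty `duplicates` result would also be worth reporting.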

Result:
After step 6, the consumer has failed over to the backup but cannot read any more messages. The HQ core bridge did not fail over.
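
The expected behaviour, as opposed to reconnecting to an arbitrary node, can be pictured with a toy model. This sketch is not HornetQ code and does not reflect its internal API: `BridgeTargetModel`, `Node`, and both selection methods are hypothetical names, and pairing a live server with its backup through a shared `pairId` is an assumption made only to show what "reconnect to the backup of the failed live server" means compared with "reconnect to any other node".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types exist in HornetQ.
final class BridgeTargetModel {

    static final class Node {
        final String name;
        final String pairId;   // assumption: a live server and its backup share one pairId
        final boolean backup;

        Node(String name, String pairId, boolean backup) {
            this.name = name;
            this.pairId = pairId;
            this.backup = backup;
        }
    }

    // Expected choice: the backup that belongs to the failed live server, or null if none.
    static Node backupOf(Node failedLive, List<Node> topology) {
        for (Node n : topology) {
            if (n.backup && n.pairId.equals(failedLive.pairId)) {
                return n;
            }
        }
        return null;
    }

    // Candidate set when any other node is considered acceptable.
    static List<Node> anyOtherNode(Node failedLive, List<Node> topology) {
        List<Node> others = new ArrayList<>();
        for (Node n : topology) {
            if (n != failedLive) {
                others.add(n);
            }
        }
        return others;
    }
}
```

In the three-server scenario above, `anyOtherNode` would also contain the third server, which does not host OutQueue, whereas `backupOf` yields only the backup of the killed live server.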

Console log from third server:
20:38:47,765 INFO  [org.hornetq.core.server] (Thread-2 (HornetQ-server-HornetQServerImpl::serverUUID=918186ea-b5b1-11e2-81e1-77319a9992e3-2075307577)) HQ221027: Bridge BridgeImpl@1a349871 [name=myBridge, queue=QueueImpl[name=jms.queue.InQueue, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=918186ea-b5b1-11e2-81e1-77319a9992e3]]@73043027 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=192-168-40-1], discoveryGroupConfiguration=DiscoveryGroupConfiguration{name='dg-group1', refreshTimeout=10000, discoveryInitialWaitTimeout=10000}]] is connected
20:39:29,976 WARN  [org.hornetq.core.server] (Thread-1 (HornetQ-client-global-threads-824748279)) HQ222095: Connection failed with failedOver=false: HornetQException[errorType=INTERNAL_ERROR message=HQ119005: Exception in Netty transport]
	at org.hornetq.core.remoting.impl.netty.HornetQChannelHandler.exceptionCaught(HornetQChannelHandler.java:107) [hornetq-core-client-2.3.0.Final-redhat-1.jar:2.3.0.Final-redhat-1]
	at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:130) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.exceptionCaught(SimpleChannelUpstreamHandler.java:153) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:112) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.socket.oio.AbstractOioWorker.run(AbstractOioWorker.java:77) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.socket.oio.OioWorker.run(OioWorker.java:51) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:175) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_15]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_15]
	at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_15]
Caused by: java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:189) [rt.jar:1.7.0_15]
	at java.net.SocketInputStream.read(SocketInputStream.java:121) [rt.jar:1.7.0_15]
	at java.net.SocketInputStream.read(SocketInputStream.java:203) [rt.jar:1.7.0_15]
	at java.io.FilterInputStream.read(FilterInputStream.java:83) [rt.jar:1.7.0_15]
	at java.io.PushbackInputStream.read(PushbackInputStream.java:139) [rt.jar:1.7.0_15]
	at org.jboss.netty.channel.socket.oio.OioWorker.process(OioWorker.java:64) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.socket.oio.AbstractOioWorker.run(AbstractOioWorker.java:73) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	at org.jboss.netty.channel.socket.oio.OioWorker.run(OioWorker.java:51) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
	... 4 more

20:39:29,985 WARN  [org.hornetq.jms.server] (Thread-3 (HornetQ-client-global-threads-824748279)) Notified of connection failure in xa discovery, we will retry on the next recovery: HornetQException[errorType=NOT_CONNECTED message=HQ119006: Channel disconnected]
	at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connectionDestroyed(ClientSessionFactoryImpl.java:418) [hornetq-core-client-2.3.0.Final-redhat-1.jar:2.3.0.Final-redhat-1]
	at org.hornetq.core.remoting.impl.netty.NettyConnector$Listener$1.run(NettyConnector.java:882) [hornetq-core-client-2.3.0.Final-redhat-1.jar:2.3.0.Final-redhat-1]
	at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106) [hornetq-core-client-2.3.0.Final-redhat-1.jar:2.3.0.Final-redhat-1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_15]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_15]
	at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_15]
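
When repeating this test, the HQ222095 warning from the log above can be picked out automatically. The sketch below is plain Java with a regular expression built from the exact message text shown in this log; the class name `BridgeLogScan` and its method are hypothetical, and the wording of the message in other HornetQ versions is not verified here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper: extracts the failedOver flag from an HQ222095 warning.
final class BridgeLogScan {

    // Built from the message text in the console log above.
    private static final Pattern CONNECTION_FAILED =
            Pattern.compile("HQ222095: Connection failed with failedOver=(true|false)");

    // Returns the flag for an HQ222095 line, or null for any other line.
    static Boolean failedOverFlag(String logLine) {
        Matcher m = CONNECTION_FAILED.matcher(logLine);
        return m.find() ? Boolean.valueOf(m.group(1)) : null;
    }
}
```

Applied to the 20:39:29,976 line above, `failedOverFlag` returns `Boolean.FALSE`; lines without the HQ222095 code yield `null`.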

Comment 1 Clebert Suconic 2013-05-29 23:59:31 UTC
https://github.com/hornetq/hornetq/pull/1099

Comment 2 Dimitris Andreadis 2013-08-01 11:29:34 UTC
Which HQ release will this fix be part of, so that it can be included in EAP 6.1.1?

https://issues.jboss.org/browse/HORNETQ-1218 points to 2.4.0.Alpha2

Comment 3 Francisco Borges 2013-08-01 11:43:40 UTC
(In reply to Dimitris Andreadis from comment #2)
> Which HQ release will this fix be part of, so that it can be included in EAP 6.1.1?
> 
> https://issues.jboss.org/browse/HORNETQ-1218 points to 2.4.0.Alpha2

Indeed, you have a point.

If you look at the "Source" tab of https://issues.jboss.org/browse/HORNETQ-1218 you'll notice that commits addressing the issue were applied to 2.2.eap5, 2.2.x, and master but NOT to 2.3.x (which is the branch from which we will create 2.3.3). I'll ask Clebert to verify why there was no commit to 2.3.x.

Comment 4 Clebert Suconic 2013-08-01 13:30:36 UTC
You're right. I made a mistake here. I just cherry-picked it and submitted a PR.

Comment 7 Miroslav Novak 2013-08-16 12:56:25 UTC
Failover of HornetQ core bridge is ok. Verified in EAP 6.1.1.ER6. Nice work!

