Bug 1017559 - lack of synchronization on cluster.copy versus message delivery
Summary: lack of synchronization on cluster.copy versus message delivery
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ
Version: 6.1.0
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ER6
Target Release: EAP 6.2.0
Assignee: Clebert Suconic
QA Contact: Martin Svehla
Docs Contact: Russell Dickenson
URL:
Whiteboard:
Depends On: 1016141
Blocks: 1392427
 
Reported: 2013-10-10 07:31 UTC by Tom Ross
Modified: 2018-12-03 20:14 UTC
CC List: 4 users

Fixed In Version:
Clone Of:
: 1392427
Environment:
Last Closed: 2013-12-15 16:15:52 UTC
Type: Bug
Embargoed:




Links
System ID: Red Hat Issue Tracker HORNETQ-1264
Private: 0
Priority: Blocker
Status: Closed
Summary: org.jboss.netty.buffer.AbstractChannelBuffer.setIndex(AbstractChannelBuffer.java:63)
Last Updated: 2019-08-30 09:47:12 UTC

Description Tom Ross 2013-10-10 07:31:45 UTC
Customer sees the following exception in the log file:

2013-10-03 17:11:26,941 WARN  [org.hornetq.core.server] (Thread-22 (HornetQ-server-HornetQServerImpl::serverUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19-6652469)) HQ222151: removing consumer which did not handle a message, consumer=ClusterConnectionBridge@5277a49f [name=sf.CON_IVUrail.9e8e06ce-2c3a-11e3-a54e-e380885b2d1e, queue=QueueImpl[name=sf.CON_IVUrail.9e8e06ce-2c3a-11e3-a54e-e380885b2d1e, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19]]@55b0fa43 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@5277a49f [name=sf.CON_IVUrail.9e8e06ce-2c3a-11e3-a54e-e380885b2d1e, queue=QueueImpl[name=sf.CON_IVUrail.9e8e06ce-2c3a-11e3-a54e-e380885b2d1e, postOffice=PostOfficeImpl [server=HornetQServerImpl::serverUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19]]@55b0fa43 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=21060&host=10-248-144-184], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@1264950074[nodeUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19, connector=TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=21010&host=10-248-144-184, address=jms, server=HornetQServerImpl::serverUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19])) [initialConnectors=[TransportConfiguration(name=netty, factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=21060&host=10-248-144-184], discoveryGroupConfiguration=null]], message=Reference[253234]:RELIABLE:ServerMessage[messageID=253234,priority=4, bodySize=3000,expiration=0, durable=true, address=jms.topic.topic/NotificationDpl-MB.31301201,properties=TypedProperties[{_JBM_DUPL_ID=68a62d2f-7c23-4bd6-bfd4-cdcebef5ed8b, __HQ_CID=8cb065ed-2c3b-11e3-8592-fb893fe81f19, _HQ_ROUTE_TOsf.CON_IVUrail.9e8e06ce-2c3a-11e3-a54e-e380885b2d1e=[B@689692e4}]]@1971711072: java.lang.IndexOutOfBoundsException: Invalid writerIndex: 1506 - Maximum is 0 or 1500
        at org.jboss.netty.buffer.AbstractChannelBuffer.setIndex(AbstractChannelBuffer.java:63) [netty-3.6.2.Final-redhat-1.jar:3.6.2.Final-redhat-1]
        at org.hornetq.core.buffers.impl.ChannelBufferWrapper.setIndex(ChannelBufferWrapper.java:484) [hornetq-commons-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.message.impl.MessageImpl.<init>(MessageImpl.java:170) [hornetq-core-client-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.message.impl.MessageImpl.<init>(MessageImpl.java:134) [hornetq-core-client-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.ServerMessageImpl.<init>(ServerMessageImpl.java:89) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.ServerMessageImpl.copy(ServerMessageImpl.java:209) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.cluster.impl.ClusterConnectionBridge.beforeForward(ClusterConnectionBridge.java:182) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:548) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl.handle(QueueImpl.java:2435) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl.deliver(QueueImpl.java:1922) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl.access$1200(QueueImpl.java:79) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2695) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106) [hornetq-core-client-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_25]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_25]
        at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_25]

2013-10-03 17:11:26,943 ERROR [org.hornetq.core.server] (Thread-22 (HornetQ-server-HornetQServerImpl::serverUUID=f9b84d7b-2c39-11e3-8592-fb893fe81f19-6652469)) HQ224041: Failed to deliver: java.util.NoSuchElementException
        at org.hornetq.utils.PriorityLinkedListImpl$PriorityLinkedListIterator.repeat(PriorityLinkedListImpl.java:181) [hornetq-core-client-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl.deliver(QueueImpl.java:1944) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl.access$1200(QueueImpl.java:79) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2695) [hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at org.hornetq.utils.OrderedExecutorFactory$OrderedExecutor$1.run(OrderedExecutorFactory.java:106) [hornetq-core-client-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_25]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_25]
        at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_25]

Comment 1 Miroslav Novak 2013-10-10 09:14:55 UTC
I believe this could be related to bz#997890. In that test, a mix of small and large messages is received from a topic by one subscriber (a standalone JMS client) when the IndexOutOfBoundsException is thrown.

It was hit in a failover test with 2 servers in a collocated HA topology with a replicated journal, but the IndexOutOfBoundsException was thrown after a successful failback of the subscriber, when no crash was in progress.

Stack trace from the subscriber (it's different from the one you posted). I didn't have time to look at the code, but there could be some race condition:
12:32:38,562 Thread-253 ERROR [org.jboss.qa.hornetq.apps.clients.SubscriberTransAck:138] Exception was thrown during receiving messages:
java.lang.IndexOutOfBoundsException: Not enough readable bytes - Need 10, maximum is 6
	at org.jboss.netty.buffer.AbstractChannelBuffer.checkReadableBytes(AbstractChannelBuffer.java:661)
	at org.jboss.netty.buffer.AbstractChannelBuffer.readBytes(AbstractChannelBuffer.java:338)
	at org.jboss.netty.buffer.AbstractChannelBuffer.readBytes(AbstractChannelBuffer.java:344)
	at org.hornetq.core.buffers.impl.ChannelBufferWrapper.readBytes(ChannelBufferWrapper.java:334)
	at org.hornetq.utils.TypedProperties.decode(TypedProperties.java:472)
	at org.hornetq.jms.client.HornetQMapMessage.doBeforeReceive(HornetQMapMessage.java:440)
	at org.hornetq.jms.client.HornetQMessageConsumer.getMessage(HornetQMessageConsumer.java:232)
	at org.hornetq.jms.client.HornetQMessageConsumer.receive(HornetQMessageConsumer.java:129)
	at org.jboss.qa.hornetq.apps.clients.SubscriberTransAck.receiveMessage(SubscriberTransAck.java:391)
	at org.jboss.qa.hornetq.apps.clients.SubscriberTransAck.run(SubscriberTransAck.java:110)

Comment 2 Clebert Suconic 2013-10-10 14:46:17 UTC
We can't release without this.

Comment 3 Clebert Suconic 2013-10-10 14:51:02 UTC
(In reply to Miroslav Novak from comment #1)

I don't think it would be related. There's no message.copy called and no cluster in the test you mentioned.

Comment 4 JBoss JIRA Server 2013-10-16 16:21:06 UTC
Clebert Suconic <clebert.suconic> updated the status of jira HORNETQ-1264 to Closed

Comment 5 Martin Svehla 2013-10-17 12:07:45 UTC
Tested with EAP 6.2.0.ER6 / HornetQ 2.3.9.Final

Comment 7 Vincent Kirsch 2015-08-14 12:08:49 UTC
Hello,

I have seen this kind of behavior happen in HornetQ 2.2.18.

Is there any chance the fix for this problem could be backported to a 2.2.x version? We have this happening at a customer's production site. Long story short, they prefer not to perform what they consider hazardous major upgrades of the installed components, so basically we're stuck with HornetQ 2.2 :(

Thanks,
Vincent.

