Bug 1310537 - Lost large messages if backup is shutdown during synchronization
Summary: Lost large messages if backup is shutdown during synchronization
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ
Version: 6.4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: baranowb
QA Contact: Miroslav Novak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-02-22 07:59 UTC by Miroslav Novak
Modified: 2016-09-19 13:13 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-09-19 13:13:26 UTC
Type: Bug
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1169418 | 0 | unspecified | CLOSED | [QA](6.4.z) Calling last session.commit() does not get a response and throws "javax.jms.JMSException: HQ119014: Timed ou... | 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker | JBEAP-3419 | 0 | Critical | Verified | (7.0.z) Lost large messages if backup is shutdown during synchronization | 2020-04-03 14:53:24 UTC

Internal Links: 1169418

Description Miroslav Novak 2016-02-22 07:59:47 UTC
Cloned from https://issues.jboss.org/browse/JBEAP-3419:

Test scenario:
1. Start live server with replicated journal and queue testQueue0
2. Send 500 large messages to testQueue0 on live
3. Start backup server and start receiving messages from testQueue0 (session CLIENT_ACKNOWLEDGE)
4. Before backup is announced/synchronized with live, cleanly shut down backup
5. Wait until receiver consumes all messages

Expected result:
Receiver consumed 500 messages. No losses or duplicates.

Actual result:
There are lost messages. Client did not receive all messages. Messages are not in the journal of live server after the test.
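
The expected/actual comparison above can be checked mechanically from the IDs the sender and receiver logged. The following is only a minimal sketch: the class name DeliveryCheck, its methods, and the use of plain string IDs are illustrative assumptions, not part of the actual QA test suite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not taken from the QA test suite.
final class DeliveryCheck {

    /** Returns the IDs that were sent but never received, in send order. */
    static List<String> lost(List<String> sent, List<String> received) {
        Set<String> seen = new HashSet<>(received);
        List<String> result = new ArrayList<>();
        for (String id : sent) {
            if (!seen.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    /** Returns the IDs that were received more than once, each listed once. */
    static Set<String> duplicates(List<String> received) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new LinkedHashSet<>();
        for (String id : received) {
            if (!seen.add(id)) { // add() returns false if the ID was already seen
                dups.add(id);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Made-up IDs for illustration only.
        List<String> sent = Arrays.asList("m1", "m2", "m3", "m4");
        List<String> received = Arrays.asList("m1", "m3", "m3");
        System.out.println("lost: " + lost(sent, received));
        System.out.println("duplicates: " + duplicates(received));
    }
}
```

For the scenario in this report, the expected result corresponds to both collections being empty; the actual result corresponds to a non-empty lost list.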

Tracking the message ID of a lost message shows that the message was sent to the receiver. Because it is a large message, the receiver tries to ack it right away. As the backup is already shut down (step 4) and live cannot sync the message acknowledgement with the backup, live does not respond to the client until the connection with the backup times out. If this cluster connection timeout is longer than the receiver's call-timeout, the receiver gets a JMSException like the following from the consumer.receive() method:

16:26:12,983 Thread-27 ERROR [org.jboss.qa.hornetq.apps.clients.ReceiverClientAck:341] RETRY receive for host: 127.0.0.1, Trying to receive message with count: 57
javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41
	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:350)
	at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendACK(ActiveMQSessionContext.java:421)
	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.acknowledge(ClientSessionImpl.java:696)
	at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.doAck(ClientConsumerImpl.java:1035)
	at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.acknowledge(ClientConsumerImpl.java:702)
	at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:96)
	at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:38)
	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.getMessage(ActiveMQMessageConsumer.java:212)
	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:119)
	at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.receiveMessage(ReceiverClientAck.java:333)
	at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.run(ReceiverClientAck.java:169)
Caused by: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41]
	... 11 more

The problem is that the message was already acked on the live server, so it is never redelivered to the consumer.
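
The timing relationship described above can be illustrated with a toy model. This is not HornetQ or Artemis code: the class AckTimeoutSketch, the acknowledge method, and both timeout parameters are hypothetical, and the ordering (ack applied on live first, response held back while live waits on the backup) is an assumption inferred from the behaviour observed in this report.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only; names and ordering are assumptions, not broker internals.
final class AckTimeoutSketch {

    /** Outcome of one simulated blocking acknowledge call. */
    static final class Outcome {
        final boolean clientTimedOut;
        final boolean ackAppliedOnLive;

        Outcome(boolean clientTimedOut, boolean ackAppliedOnLive) {
            this.clientTimedOut = clientTimedOut;
            this.ackAppliedOnLive = ackAppliedOnLive;
        }
    }

    static Outcome acknowledge(long callTimeoutMs, long backupTimeoutMs) {
        ExecutorService live = Executors.newSingleThreadExecutor();
        AtomicBoolean acked = new AtomicBoolean(false);
        CountDownLatch ackApplied = new CountDownLatch(1);
        try {
            Future<?> response = live.submit(() -> {
                acked.set(true);               // live applies the ack
                ackApplied.countDown();
                Thread.sleep(backupTimeoutMs); // live waits for the stopped backup
                return null;                   // only now is the response sent
            });
            ackApplied.await();
            boolean timedOut = false;
            try {
                // The client blocks for at most its call-timeout.
                response.get(callTimeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                timedOut = true;               // corresponds to the JMSException in the log
            } catch (ExecutionException e) {
                throw new IllegalStateException(e);
            }
            return new Outcome(timedOut, acked.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            live.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Outcome o = acknowledge(50, 500); // call-timeout shorter than the backup wait
        System.out.println("client timed out: " + o.clientTimedOut
                + ", ack applied on live: " + o.ackAppliedOnLive);
    }
}
```

When the call-timeout is shorter than the wait on the backup, the model ends with the client seeing a timeout while the ack has already taken effect, which matches the lost-message symptom. With the timeouts reversed, the client gets its response and no timeout occurs.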

Comment 2 JBoss JIRA Server 2016-02-24 08:25:42 UTC
Andy Taylor <ataylor> updated the status of jira JBEAP-3419 to Resolved

Comment 3 JBoss JIRA Server 2016-02-26 08:41:18 UTC
Miroslav Novak <mnovak> updated the status of jira JBEAP-3419 to Reopened

Comment 5 JBoss JIRA Server 2016-07-26 07:20:29 UTC
Andy Taylor <ataylor> updated the status of jira JBEAP-3419 to Resolved

Comment 6 Petr Penicka 2016-09-19 13:13:26 UTC
Triage: closing as this one is for Artemis, fixed in 7.0.2, not applicable for 6.4.

