Bug 1310537 - Lost large messages if backup is shutdown during synchronization
Status: CLOSED WONTFIX
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ
Version: 6.4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: baranowb
QA Contact: Miroslav Novak
Reported: 2016-02-22 02:59 EST by Miroslav Novak
Modified: 2016-09-19 09:13 EDT (History)
Doc Type: Bug Fix
Last Closed: 2016-09-19 09:13:26 EDT
Type: Bug

External Trackers:
Tracker: JBoss Issue Tracker
ID: JBEAP-3419
Priority: Critical
Status: Verified (7.0.z)
Summary: Lost large messages if backup is shutdown during synchronization
Last Updated: 2017-06-28 02:37 EDT

Description Miroslav Novak 2016-02-22 02:59:47 EST
Cloned from https://issues.jboss.org/browse/JBEAP-3419:

Test scenario:
1. Start the live server with a replicated journal and queue testQueue0
2. Send 500 large messages to testQueue0 on the live server
3. Start the backup server and start receiving messages from testQueue0 (session in CLIENT_ACKNOWLEDGE mode)
4. Before the backup is announced/synchronized with live, cleanly shut down the backup
5. Wait until the receiver has consumed all messages

Expected result:
The receiver consumes all 500 messages, with no losses or duplicates.
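
The "no losses or duplicates" check can be sketched by comparing the IDs the sender recorded with the IDs the receiver recorded. This is a minimal illustration only: DeliveryCheck and its methods are hypothetical names and are not taken from the test suite referenced in this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the actual test clients.
final class DeliveryCheck {

    /** Returns IDs that were sent but never received, in send order. */
    static List<String> lost(List<String> sent, List<String> received) {
        Set<String> seen = new HashSet<>(received);
        List<String> missing = new ArrayList<>();
        for (String id : sent) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    /** Returns IDs received more than once; each duplicate ID is listed once. */
    static List<String> duplicates(List<String> received) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> dups = new ArrayList<>();
        for (String id : received) {
            // seen.add() returns false when the ID was already received.
            if (!seen.add(id) && reported.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        List<String> sent = Arrays.asList("m1", "m2", "m3", "m4");
        List<String> received = Arrays.asList("m1", "m3", "m3");
        System.out.println("lost=" + lost(sent, received));         // lost=[m2, m4]
        System.out.println("duplicates=" + duplicates(received));   // duplicates=[m3]
    }
}
```

The test passes only if both lists are empty once the receiver has finished.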

Actual result:
Messages are lost: the client does not receive all 500 messages, and the missing messages are not in the live server's journal after the test.

Tracking the message ID of a lost message shows that the message was sent to the receiver. Because it is a large message, the receiver tries to ack it right away. Since the backup is already shut down (step 4) and live cannot sync the message acknowledgement with the backup, live does not respond to the client until its connection to the backup times out. If this cluster connection timeout is longer than the receiver's call-timeout, the receiver gets a JMSException like the following from the consumer.receive() method:

16:26:12,983 Thread-27 ERROR [org.jboss.qa.hornetq.apps.clients.ReceiverClientAck:341] RETRY receive for host: 127.0.0.1, Trying to receive message with count: 57
javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41
	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:350)
	at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendACK(ActiveMQSessionContext.java:421)
	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.acknowledge(ClientSessionImpl.java:696)
	at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.doAck(ClientConsumerImpl.java:1035)
	at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.acknowledge(ClientConsumerImpl.java:702)
	at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:96)
	at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:38)
	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.getMessage(ActiveMQMessageConsumer.java:212)
	at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:119)
	at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.receiveMessage(ReceiverClientAck.java:333)
	at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.run(ReceiverClientAck.java:169)
Caused by: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41]
	... 11 more

The problem is that the message was already acked on the live server, so it is never redelivered to the consumer.
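
The race described above can be captured in a toy model. It is a sketch under stated assumptions, not broker code: AckTimeoutModel, Outcome and ackWhileBackupIsDown are hypothetical names, and the 60,000 ms backup timeout is an assumed value (only the 30,000 ms call-timeout appears in the log above).

```java
// Toy model of the timing issue; all names here are hypothetical.
final class AckTimeoutModel {

    enum Outcome { ACK_CONFIRMED, ACK_TIMED_OUT_MESSAGE_LOST }

    /**
     * Models one ack sent while the backup is already shut down.
     *
     * @param callTimeoutMs   how long the client blocks waiting for the ack response
     * @param backupTimeoutMs how long live waits on the dead backup before it responds
     */
    static Outcome ackWhileBackupIsDown(long callTimeoutMs, long backupTimeoutMs) {
        // Per the description, live responds only after its connection
        // to the backup has timed out.
        long responseDelayMs = backupTimeoutMs;
        if (responseDelayMs <= callTimeoutMs) {
            // The response arrives in time; the client sees a normal ack.
            return Outcome.ACK_CONFIRMED;
        }
        // The client gives up first and receive() throws, but the ack is
        // already recorded on live, so the message is not redelivered.
        return Outcome.ACK_TIMED_OUT_MESSAGE_LOST;
    }

    public static void main(String[] args) {
        // 30,000 ms comes from the log; 60,000 ms and 10,000 ms are assumed values.
        System.out.println(ackWhileBackupIsDown(30_000, 60_000)); // ACK_TIMED_OUT_MESSAGE_LOST
        System.out.println(ackWhileBackupIsDown(30_000, 10_000)); // ACK_CONFIRMED
    }
}
```

In this model the loss only occurs when the backup timeout exceeds the client's call-timeout, which matches the condition stated in the description.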
Comment 2 JBoss JIRA Server 2016-02-24 03:25:42 EST
Andy Taylor <ataylor@redhat.com> updated the status of jira JBEAP-3419 to Resolved
Comment 3 JBoss JIRA Server 2016-02-26 03:41:18 EST
Miroslav Novak <mnovak@redhat.com> updated the status of jira JBEAP-3419 to Reopened
Comment 5 JBoss JIRA Server 2016-07-26 03:20:29 EDT
Andy Taylor <ataylor@redhat.com> updated the status of jira JBEAP-3419 to Resolved
Comment 6 Petr Penicka 2016-09-19 09:13:26 EDT
Triage: closing as this one is for Artemis, fixed in 7.0.2, not applicable for 6.4.
