1310537 – Lost large messages if backup is shutdown during synchronization

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	HornetQ
Sub Component:
Version:	6.4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	baranowb
QA Contact:	Miroslav Novak
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-02-22 07:59 UTC by Miroslav Novak
Modified:	2016-09-19 13:13 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-09-19 13:13:26 UTC
Type:	Bug
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1169418	0	unspecified	CLOSED	[QA](6.4.z) Calling last session.commit() does not get a response and throws "javax.jms.JMSException: HQ119014: Timed ou...	2021-02-22 00:41:40 UTC
Red Hat Issue Tracker	JBEAP-3419	0	Critical	Verified	(7.0.z) Lost large messages if backup is shutdown during synchronization	2020-04-03 14:53:24 UTC

Internal Links: 1169418

Description Miroslav Novak 2016-02-22 07:59:47 UTC

Cloned from https://issues.jboss.org/browse/JBEAP-3419:

Test scenario:
1. Start live server with replicated journal and queue testQueue0
2. Send 500 large messages to testQueue0 on live
3. Start backup server and start receiving messages from testQueue0 (CLIENT_ACKNOWLEDGE session)
4. Before backup is announced/synchronized with live, cleanly shut down backup
5. Wait until receiver consumes all messages

Expected result:
Receiver consumed 500 messages. No losses or duplicates.

Actual result:
Messages are lost. The client does not receive all messages, and the missing messages are not in the journal of the live server after the test.
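
The "no losses or duplicates" check can be sketched as a comparison of sent and received message IDs. This is a minimal illustration only: the class name DeliveryCheck, its methods, and the sample IDs are hypothetical and are not taken from the QA test suite referenced in this report.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HornetQ/Artemis or the QA test suite.
final class DeliveryCheck {

    /** Returns IDs that were sent but never received, in send order. */
    static List<String> lost(List<String> sent, List<String> received) {
        Set<String> seen = new HashSet<>(received);
        List<String> result = new ArrayList<>();
        for (String id : sent) {
            if (!seen.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    /** Returns IDs that were received more than once, each reported once. */
    static Set<String> duplicates(List<String> received) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new LinkedHashSet<>();
        for (String id : received) {
            // Set.add returns false when the ID was already present.
            if (!seen.add(id)) {
                dups.add(id);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Made-up sample data: m2 is lost, m3 is delivered twice.
        List<String> sent = List.of("m1", "m2", "m3");
        List<String> received = List.of("m1", "m3", "m3");
        System.out.println("lost=" + lost(sent, received));
        System.out.println("duplicates=" + duplicates(received));
    }
}
```

In the scenario above, the test passes only if both collections are empty after the receiver finishes.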

Tracking the message ID of a lost message shows that the message was sent to the receiver. Because it is a large message, the receiver tries to ack it right away. As the backup is already shut down (step 4) and live cannot sync the message acknowledgement with the backup, live does not respond to the client until its connection with the backup times out. If this cluster connection timeout is longer than the receiver's call-timeout, the receiver gets a JMSException like the following from the consumer.receive() method:

16:26:12,983 Thread-27 ERROR [org.jboss.qa.hornetq.apps.clients.ReceiverClientAck:341] RETRY receive for host: 127.0.0.1, Trying to receive message with count: 57
javax.jms.JMSException: AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41
at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.sendBlocking(ChannelImpl.java:350)
at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendACK(ActiveMQSessionContext.java:421)
at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.acknowledge(ClientSessionImpl.java:696)
at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.doAck(ClientConsumerImpl.java:1035)
at org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.acknowledge(ClientConsumerImpl.java:702)
at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:96)
at org.apache.activemq.artemis.core.client.impl.ClientMessageImpl.acknowledge(ClientMessageImpl.java:38)
at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.getMessage(ActiveMQMessageConsumer.java:212)
at org.apache.activemq.artemis.jms.client.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:119)
at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.receiveMessage(ReceiverClientAck.java:333)
at org.jboss.qa.hornetq.apps.clients.ReceiverClientAck.run(ReceiverClientAck.java:169)
Caused by: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119014: Timed out after waiting 30,000 ms for response when sending packet 41]
... 11 more

The problem is that the message was already acked on the live server, so it is never redelivered to the consumer.
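
The timing condition described above can be written down as a small deterministic model. This is a sketch of the reasoning in this report, not broker code: the class AckTimingModel, its names, and the 60,000 ms cluster connection timeout are assumptions for illustration. Only the 30,000 ms call timeout comes from the stack trace. Treating equal timeouts as a client-side timeout is also a modelling choice.

```java
// Toy model of the race described in this report. Hypothetical names;
// this is not HornetQ/Artemis code.
final class AckTimingModel {

    enum Outcome {
        /** Live replied before the client's blocking call timed out. */
        ACK_CONFIRMED,
        /** Client saw a timeout, but live had already recorded the ack. */
        ACK_TIMED_OUT_MESSAGE_LOST
    }

    /**
     * @param callTimeoutMs             how long the client waits for the ack response
     * @param backupConnectionTimeoutMs how long live waits on the dead backup
     *                                  before it replies to the client
     */
    static Outcome acknowledge(long callTimeoutMs, long backupConnectionTimeoutMs) {
        // Per the report, live records the ack first and only then tries to
        // sync it with the backup, so the reply is delayed by the full
        // backup connection timeout.
        long replyDelayMs = backupConnectionTimeoutMs;
        return replyDelayMs < callTimeoutMs
                ? Outcome.ACK_CONFIRMED
                : Outcome.ACK_TIMED_OUT_MESSAGE_LOST;
    }

    public static void main(String[] args) {
        // 30,000 ms is the call timeout from the trace; 60,000 ms is assumed.
        System.out.println(acknowledge(30_000, 60_000));
        // A call timeout longer than the assumed cluster timeout avoids the error.
        System.out.println(acknowledge(90_000, 60_000));
    }
}
```

Under this model the loss is visible to the client only when the call timeout is the shorter of the two. The ack is recorded on live in both cases.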

Comment 2 JBoss JIRA Server 2016-02-24 08:25:42 UTC

Andy Taylor <ataylor> updated the status of jira JBEAP-3419 to Resolved

Comment 3 JBoss JIRA Server 2016-02-26 08:41:18 UTC

Miroslav Novak <mnovak> updated the status of jira JBEAP-3419 to Reopened

Comment 5 JBoss JIRA Server 2016-07-26 07:20:29 UTC

Andy Taylor <ataylor> updated the status of jira JBEAP-3419 to Resolved

Comment 6 Petr Penicka 2016-09-19 13:13:26 UTC

Triage: closing as this one is for Artemis, fixed in 7.0.2, not applicable for 6.4.