Bug 988321 - WARN - Replication Large MessageID 164 is not available on backup server. Ignoring replication message
WARN - Replication Large MessageID 164 is not available on backup server. Ig...
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ (Show other bugs)
6.1.1
Unspecified Unspecified
unspecified Severity medium
: ER1
: EAP 6.2.0
Assigned To: Clebert Suconic
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-25 05:57 EDT by Miroslav Novak
Modified: 2013-12-15 11:14 EST (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
After a certain sequence of events, a large message deletion may be called twice, resulting in an erroneous warning message on a backup HornetQ server, similar to below: ---- 11:33:23,825 WARN [org.hornetq.core.server] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) HQ222090: Replication Large MessageID 164 is not available on backup server. Ignoring replication message ---- This issue was fixed by stopping the erroneous warning from appearing in the JBoss EAP 6 logs. As a result, on a backup HornetQ server there should no longer be erroneous warning messages related to large messages not being available.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-12-15 11:14:20 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
server_logs.zip (1.02 MB, application/zip)
2013-07-25 05:59 EDT, Miroslav Novak
no flags Details

  None (edit)
Description Miroslav Novak 2013-07-25 05:57:19 EDT
Description of problem:
There are warning messages:
11:33:23,825 WARN  [org.hornetq.core.server] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) HQ222090: Replication Large MessageID 164  is not available on backup server. Ignoring replication message

during sending/receiving messages to/from live server which has replicated backup. Warning messages are in log of backup server.

Test scenario:
1. Start live/backup pair - replicated journal
2. Start producer and consumer (with transacted session) which are connected to live and sending/receiving messages from queue. Producer sends mix of small and large messages.
3. Check logs of backup server

Additional info:
I added some trace logs to ReplicationEndpoint class and noticed that ReplicationEndpoint.handleLargeMessageEnd is called twice:
11:33:23,824 TRACE [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) Writing buffer for channelID=2
11:33:23,824 TRACE [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) handling packet PACKET(ReplicationLargeMessageEndMessage)[type=100, channelI
D=2, packetObject=ReplicationLargeMessageEndMessage]
11:33:23,825 DEBUG [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) Invocation of interceptor org.hornetq.core.server.impl.ReplicationError on P
ACKET(ReplicationLargeMessageEndMessage)[type=100, channelID=2, packetObject=ReplicationLargeMessageEndMessage] returned true
11:33:23,825 WARN  [org.hornetq.core.server] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) HQ222015: mnovak: large Message 164 was deleted from largeMessages on backup.
11:33:23,825 TRACE [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) Sending packet nonblocking PACKET(ReplicationResponseMessage)[type=90, channelID=2, packetObject=ReplicationResponseMessage] on channeID=2
11:33:23,825 TRACE [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) Writing buffer for channelID=2
11:33:23,825 TRACE [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) handling packet PACKET(ReplicationLargeMessageEndMessage)[type=100, channelID=2, packetObject=ReplicationLargeMessageEndMessage]
11:33:23,825 DEBUG [org.hornetq.core.client] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) Invocation of interceptor org.hornetq.core.server.impl.ReplicationError on PACKET(ReplicationLargeMessageEndMessage)[type=100, channelID=2, packetObject=ReplicationLargeMessageEndMessage] returned true
11:33:23,825 WARN  [org.hornetq.core.server] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) HQ222015: mnovak: large Message 164 was deleted from largeMessages on backup.
11:33:23,825 WARN  [org.hornetq.core.server] (Old I/O client worker ([id: 0x2b37b4ea, /192.168.40.1:60844 => /192.168.40.1:5445])) HQ222090: Replication Large MessageID 164  is not available on backup server. Ignoring replication message

Trace logs from live and backup server attached.

Expected results:
There should be no warnings.
Comment 1 Miroslav Novak 2013-07-25 05:59:59 EDT
Created attachment 778169 [details]
server_logs.zip
Comment 2 Clebert Suconic 2013-07-25 06:33:46 EDT
Francisco Borges is in Paternity leave this week. I'm asking him to look at this as soon as he's back on next week.
Comment 4 Clebert Suconic 2013-07-25 18:30:34 EDT
I just got a notice of this one.. and I understand it being new, right?


Francisco is back early next week.. I've asked him to work on it... I'm also out this week.
Comment 7 Francisco Borges 2013-08-21 10:06:26 EDT
I've looked into it and my conclusion is that this is not a bug. Ok. Let me rephrase it: "not a real bug".

What happens is that on certain circumstances HQ will try to delete the same large-message twice. The "local disk" part of the HQ code (i.e. the non-replicated part) just ignores these, while the replication code will log a warning.

Since HQ seems to have chosen to ignore these locally, I think the best option is to just remove this warning from the replication code.
Comment 8 Miroslav Novak 2013-08-21 10:14:16 EDT
Agreed. It's really called twice. It's just needed to get rid of this warning because it appears in EAP logs and customers would be nervous about it.
Comment 10 Miroslav Novak 2013-09-23 06:03:19 EDT
Fix is not present in 6.2.0.ER1/2. Will be verified with EAP 6.2.0.ER3.
Comment 11 Miroslav Novak 2013-10-04 05:42:33 EDT
Verified in EAP 6.2.0.ER3. There are no more "Replication Large MessageID..." warnings in the log. Setting as verified.

Note You need to log in before you can comment on or make changes to this bug.