Bug 780090 (SOA-2457)

Summary: MessageSucker failures cause the delivery of the failed message to stall
Product: [JBoss] JBoss Enterprise SOA Platform 5 Reporter: david.boeren <david.boeren>
Component: JBoss Messaging Assignee: Rick Wagner <rwagner>
Status: CLOSED NOTABUG QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 5.0.0 GA CC: kevin.conner
Target Milestone: ---   
Target Release: One Off Releases   
Hardware: Unspecified   
OS: Unspecified   
URL: http://jira.jboss.org/jira/browse/SOA-2457
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-11-30 20:02:58 UTC Type: Support Patch
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
helloworld.zip none

Description david.boeren 2010-10-21 15:03:58 UTC
Support Case Reference: https://c.na7.visual.force.com/apex/Case_View?id=500A00000044AOYIA2&sfdc.override=1
project_key: SOA

Note that the customer is already using a patch that may involve overlapping class files: 
https://jira.jboss.org/browse/JBPAPP-5224 

So this patch would have to go on top of the existing patch. 

---- 

The MessageSucker is responsible for migrating messages between different members of a cluster. It is a consumer on the remote queue, from which it receives messages destined for the queue on the local cluster member. 

The onMessage routine, at its most basic, does the following: 

- bookkeeping for the incoming message, including expiry 
- acknowledge the incoming message 
- attempt to deliver to the local queue 
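
The ordering of those steps can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the JBoss Messaging implementation: the class SuckerSketch, the lists localQueue and stranded, and the method sendToLocalQueue are all hypothetical stand-ins, and the every-fifth-send failure simply imitates the fault injected later in this report.

```java
import java.util.ArrayList;
import java.util.List;

public class SuckerSketch {
    // Hypothetical stand-ins; these are not JBoss Messaging classes.
    static final List<String> localQueue = new ArrayList<>();
    static final List<String> stranded = new ArrayList<>();
    static int sendCount = 0;

    // Simulates a local send that fails on every fifth call.
    static void sendToLocalQueue(String msg) {
        sendCount++;
        if (sendCount % 5 == 0) {
            throw new IllegalStateException("Deliberate exception");
        }
        localQueue.add(msg);
    }

    // Follows the three steps listed above, in the same order.
    static void onMessage(String msg) {
        // 1. Bookkeeping (expiry checks and so on) is omitted in this model.
        // 2. Acknowledge: the remote queue will not hand the message out again.
        boolean acknowledged = true;
        // 3. Attempt delivery to the local queue.
        try {
            sendToLocalQueue(msg);
        } catch (IllegalStateException e) {
            // The message was already acknowledged, so nothing triggers a redelivery.
            if (acknowledged) {
                stranded.add(msg);
            }
        }
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 300; i++) {
            onMessage("msg-" + i);
        }
        System.out.println("delivered=" + localQueue.size()
                + " stranded=" + stranded.size());
    }
}
```

In this model, 300 messages end up as 240 delivered and 60 stranded. The stranded list plays the role of messages that remain stored but are never handed to a consumer.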

When the delivery fails, the result is the *appearance* of lost messages: messages that are being processed when the failure occurs are not redelivered, but they still exist in the database. 

The only way I have found to trigger the redelivery of those messages is to redeploy the queue containing the messages and/or restart the app server. Obviously neither approach is acceptable. 

In order to trigger the error I created a SOA cluster whose members share *only* the JMS database and nothing else. I modified the helloworld quickstart to display a counter of messages consumed, clustered the *esb* queue, and then used byteman to trigger the faults. 

The byteman rule is as follows; the quickstart will be attached. 

RULE throw every fifth send 
INTERFACE ProducerDelegate 
METHOD send 
AT ENTRY 
IF callerEquals("MessageSucker.onMessage", true) && (incrementCounter("throwException") % 5 == 0) 
DO THROW new IllegalStateException("Deliberate exception") 
ENDRULE 

This results in an exception being thrown for every fifth message. Once the delivery has quiesced, examine the JBM_MSG and JBM_MSG_REF tables to see the messages which have not been delivered. 

The cluster members are ports-default and ports-01; the client seeds the gateway by sending 300 messages to the default member. 

Adding up the counter from each server *plus* the message count from JBM_MSG results in 300 (or multiples thereof for more executions).
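
The accounting check above can be written down as a small helper. This is a sketch only: the class DeliveryAccounting and the method accountsForAll are hypothetical, and the numbers in main are illustrative values, not measurements from this report.

```java
public class DeliveryAccounting {
    // True when the per-server consumed counters plus the rows still in
    // JBM_MSG add up to the number of messages the client sent.
    static boolean accountsForAll(long sent, long consumedDefault,
                                  long consumedPorts01, long rowsInJbmMsg) {
        return consumedDefault + consumedPorts01 + rowsInJbmMsg == sent;
    }

    public static void main(String[] args) {
        // Illustrative values only: 300 sent, split across two servers,
        // with the remainder still sitting in JBM_MSG.
        System.out.println(accountsForAll(300, 130, 110, 60));
    }
}
```

If the check holds, no message has actually been lost; the gap between sent and consumed is fully explained by the undelivered rows.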

Comment 1 david.boeren 2010-10-21 15:04:15 UTC
Attachment: Added: helloworld.zip


Comment 2 david.boeren 2010-10-21 15:04:50 UTC
Link: Added: This issue is related to JBPAPP-5280


Comment 3 Justin Bertram 2011-01-05 20:39:53 UTC
This was opened in the wrong project.  The real issue should have been opened in the JBMESSAGING project.  See JBMESSAGING-1822, and the backport of this for SOA at SOA-2526.

Comment 4 Kevin Conner 2011-01-06 09:17:53 UTC
This was opened in the correct project as it is logged against the platform.  SOA-2526 duplicates this issue and was where the backport work was handled.

Comment 5 Kevin Conner 2011-01-06 09:18:35 UTC
Link: Added: This issue is duplicated by SOA-2526


Comment 6 Rick Wagner 2011-11-30 20:02:58 UTC
Resolved.