Bug 780088 (SOA-2455) - MessageSucker failures cause the delivery of the failed message to stall
Summary: MessageSucker failures cause the delivery of the failed message to stall
Keywords:
Status: CLOSED NEXTRELEASE
Alias: SOA-2455
Product: JBoss Enterprise SOA Platform 5
Classification: JBoss
Component: JBoss Messaging
Version: 5.0.0 GA
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 5.1.0 GA
Assignee: Kevin Conner
QA Contact:
URL: http://jira.jboss.org/jira/browse/SOA...
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-10-21 14:09 UTC by Kevin Conner
Modified: 2011-02-12 13:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-12 13:29:11 UTC
Type: Bug
Embargoed:


Attachments
helloworld.zip (12.50 KB, application/zip)
2010-10-21 14:21 UTC, Kevin Conner


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 780197 0 urgent CLOSED Exception "javax.jms.IllegalStateException: Cannot find session with id ..." 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker SOA-2455 0 Blocker Closed MessageSucker failures cause the delivery of the failed message to stall 2012-07-18 18:11:12 UTC

Internal Links: 780197

Description Kevin Conner 2010-10-21 14:09:46 UTC
project_key: SOA

The MessageSucker is responsible for migrating messages between different members of a cluster. It is a consumer of the remote queue, from which it receives messages destined for the queue on the local cluster member.

The onMessage routine, at its most basic, does the following:

- bookkeeping for the incoming message, including expiry
- acknowledge the incoming message
- attempt to deliver to the local queue
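
The ordering above can be sketched as a small in-memory model. This is not JBoss Messaging code: the class, fields and the every-fifth-send failure are hypothetical stand-ins, chosen only to show why acknowledging before delivering can leave a message in limbo when the local send throws.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the ordering described above; all names are illustrative.
final class AckBeforeDeliverModel {
    final Deque<String> remoteQueue = new ArrayDeque<>(); // messages waiting on the remote member
    final List<String> localQueue = new ArrayList<>();    // messages delivered to the local member
    final List<String> stranded = new ArrayList<>();      // acknowledged but never delivered
    private int sends = 0;

    // Stand-in for the local send; throws on every fifth call to simulate a fault.
    private void sendToLocalQueue(String msg) {
        if (++sends % 5 == 0) {
            throw new IllegalStateException("Deliberate exception");
        }
        localQueue.add(msg);
    }

    // Same order as the list above: bookkeeping, acknowledge, then deliver.
    void onMessage(String msg) {
        // 1. bookkeeping (expiry and so on) is omitted in this model
        // 2. acknowledge: the remote queue no longer offers the message
        remoteQueue.remove(msg);
        try {
            // 3. attempt delivery to the local queue
            sendToLocalQueue(msg);
        } catch (IllegalStateException e) {
            // Nothing hands the message back, so it is on neither queue.
            stranded.add(msg);
        }
    }

    public static void main(String[] args) {
        AckBeforeDeliverModel model = new AckBeforeDeliverModel();
        for (int i = 1; i <= 10; i++) {
            model.remoteQueue.add("msg-" + i);
        }
        // Iterate over a snapshot because onMessage removes from remoteQueue.
        for (String msg : new ArrayList<>(model.remoteQueue)) {
            model.onMessage(msg);
        }
        System.out.println("delivered=" + model.localQueue.size()
                + " stranded=" + model.stranded.size()
                + " remote=" + model.remoteQueue.size());
    }
}
```

In this model the stranded list plays the role of the rows that remain in the database: the messages still exist, but nothing will deliver them again.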

When the delivery fails, the result is the *appearance* of lost messages. Messages that are being processed when the failure occurs are not redelivered, but they still exist in the database.

The only way I have found to trigger the redelivery of those messages is to redeploy the queue containing the messages and/or restart the app server. Obviously neither approach is acceptable.

To trigger the error, I created an SOA cluster whose members shared *only* the JMS database and nothing else. I modified the helloworld quickstart to display a counter of messages consumed, clustered the *esb* queue, and then used Byteman to trigger the faults.

The Byteman rule is as follows; the quickstart will be attached.

RULE throw every fifth send
INTERFACE ProducerDelegate
METHOD send
AT ENTRY
IF callerEquals("MessageSucker.onMessage", true) && (incrementCounter("throwException") % 5 == 0)
DO THROW new IllegalStateException("Deliberate exception")
ENDRULE

This results in an exception being thrown for every fifth message. Once the delivery has quiesced, examine the JBM_MSG and JBM_MSG_REF tables to see the messages which have not been delivered.

The cluster members are ports-default and ports-01; the client seeds the gateway by sending 300 messages to the default member.

Adding up the counter from each server *plus* the message count from JBM_MSG results in 300 (or multiples thereof for more executions).
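
The accounting check can be written down as a small helper. This is only a sketch: the class and method names are hypothetical, and the figures in `main` are made up for illustration, not taken from an actual run.

```java
import java.util.stream.IntStream;

// Hypothetical helper for the check described above; not part of the quickstart.
final class DeliveryAccounting {
    // Messages not explained by any server's consumed counter or by rows left in JBM_MSG.
    static int unaccounted(int sent, int rowsInJbmMsg, int... consumedPerServer) {
        int consumed = IntStream.of(consumedPerServer).sum();
        return sent - (consumed + rowsInJbmMsg);
    }

    public static void main(String[] args) {
        // Made-up figures: 300 sent, 28 rows left in JBM_MSG,
        // counters of 140 (ports-default) and 132 (ports-01).
        System.out.println(unaccounted(300, 28, 140, 132)); // prints 0
    }
}
```

A result of zero means every message was either consumed or is still sitting in the database, which is the observation this report relies on: the messages are stalled, not lost.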

Comment 1 Kevin Conner 2010-10-21 14:13:24 UTC
The problem seems to exist in SOA 5.0.0, SOA 5.0.2 and SOA 5.1.0 ER3

Comment 2 Kevin Conner 2010-10-21 14:21:33 UTC
quickstart

Comment 3 Kevin Conner 2010-10-21 14:21:33 UTC
Attachment: Added: helloworld.zip


Comment 4 Kevin Conner 2010-10-21 14:39:01 UTC
Link: Added: This issue depends JBPAPP-5280


Comment 5 Anne-Louise Tangring 2010-10-27 19:14:24 UTC
The JBoss Messaging team is working on a resolution. If that is a one-off patch, we will include it in SOA 5.1.0.

Comment 6 Kevin Conner 2010-11-26 10:03:14 UTC
The current issues that need checking and porting, if not already present in 1.4.7, are the following.

JBMESSAGING-1822, JBMESSAGING-1805, JBMESSAGING-1809, and JBMESSAGING-1774.

There are also two races which need to be included, JBMESSAGING-1828 and JBMESSAGING-1831

Comment 7 Kevin Conner 2010-12-03 15:44:05 UTC
Link: Added: This issue depends JBPAPP-5505


Comment 8 Kevin Conner 2010-12-03 15:59:55 UTC
Link: Added: This issue related SOA-2578


Comment 9 Julian Coleman 2010-12-05 16:37:51 UTC
Resolved with revision 7524 of:

  build-tools/builders/eap/post-patch/patch_db_persistance_conf.xml

Commit message:

  SOA-2455/SOA-2578
  Add the patches from JBPAPP-5505 to jboss-messaging-client.jar,
  jboss-messaging.jar and the *-persistence-service.xml files.


Comment 10 Julian Coleman 2010-12-05 16:37:51 UTC
Labels: Added: rn-open


Comment 11 Laura Bailey 2010-12-17 01:10:55 UTC
Release Notes Docs Status: Added: Not Yet Documented


Comment 13 Dana Mison 2011-01-05 00:14:27 UTC
Writer: Added: dlesage


Comment 16 Kevin Conner 2011-01-06 09:41:45 UTC
Reopening until we get confirmation from EAP QE team.

Comment 17 Kevin Conner 2011-01-24 12:39:10 UTC
QA passed, no regressions.

Comment 18 David Le Sage 2011-02-11 00:11:01 UTC
Temporarily reopening to update release note status.

Comment 19 David Le Sage 2011-02-11 00:13:54 UTC
Release Notes Docs Status: Removed: Not Yet Documented Added: Documented as Resolved Issue
Release Notes Text: Added: https://issues.jboss.org/browse/SOA-2455

The MessageSucker is responsible for migrating messages between different members of a cluster, it is a consumer of the remote queue from which it receives messages destined for the queue on the local cluster member. If it stalls, failed messages are not redelivered. They remain in the database.  Fixes have been made to JBoss Messaging so that a stall will trigger the redelivery of these messages.




Comment 20 Laura Bailey 2011-02-12 13:27:00 UTC
Reopening to modify release note text, will set back to Closed -> Done shortly.

Comment 21 Laura Bailey 2011-02-12 13:29:11 UTC
Setting back to Closed -> Done, having modified release note text.

Comment 22 Laura Bailey 2011-02-12 13:29:11 UTC
Release Notes Text: Removed: https://issues.jboss.org/browse/SOA-2455

The MessageSucker is responsible for migrating messages between different members of a cluster, it is a consumer of the remote queue from which it receives messages destined for the queue on the local cluster member. If it stalls, failed messages are not redelivered. They remain in the database.  Fixes have been made to JBoss Messaging so that a stall will trigger the redelivery of these messages.

 Added: The MessageSucker migrates messages between different members of a cluster. It consumes from a remote queue, from which it receives messages destined for the queue on the local cluster member. If it stalls, failed messages are not redelivered, and remain in the database. JBoss Messaging has been modified so that a stall triggers message redelivery.




