Bug 778350 (SOA-835) - JBM stops processing message due to switch to PULL ServerInvokerCallbackHandler
Summary: JBM stops processing message due to switch to PULL ServerInvokerCallbackHandler
Keywords:
Status: CLOSED NEXTRELEASE
Alias: SOA-835
Product: JBoss Enterprise SOA Platform 4
Classification: JBoss
Component: JBoss Messaging, EAP, 3rd Party
Version: 4.3 IR4
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.3 CP01
Assignee: Tim Fox
QA Contact:
URL: http://jira.jboss.org/jira/browse/SOA...
Whiteboard:
Depends On:
Blocks:
Reported: 2008-09-25 13:24 UTC by Jiri Pechanec
Modified: 2009-03-06 06:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-03-06 06:36:45 UTC
Type: Bug


Attachments
jbm-behaviour.txt (20.52 KB, text/plain)
2008-09-26 04:51 UTC, Jiri Pechanec
client.log.gz (585.22 KB, application/x-gzip)
2008-09-29 07:31 UTC, Jiri Pechanec


Links
System ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker SOA-835 | 0 | None | Closed | JBM stops processing message due to switch to PULL ServerInvokerCallbackHandler | 2012-01-13 14:33:32 UTC

Description Jiri Pechanec 2008-09-25 13:24:48 UTC
Date of First Response: 2008-09-25 09:55:38
project_key: SOA

On different machines and under heavier load, JBM can stop processing messages without any visible reason.
JBM is configured to use the bisocket transport, for which a PUSH callback is used. Suddenly the PUSH callback is switched to a PULL callback, which means that when JMS sends a message, the sending process does not complete.
The probable root cause is in JBoss Remoting: from the investigation it seems that sometimes, when a remoting session is closed, it closes the callback handler client of a different session.
I have contacted Ron S. with a request for help and am continuing with the investigation.

Comment 1 Jiri Pechanec 2008-09-25 13:25:27 UTC
The component is intentionally not set because it is still not clear whether the problem is in Remoting or JBM.

Comment 2 Aleksandar Kostadinov 2008-09-25 13:55:38 UTC
Are you sure the correct classes are used for client and server? Could this be a side effect of jbossall-client.jar containing the wrong Remoting version? Do you see the same with EAP?

Comment 3 Jiri Pechanec 2008-09-25 14:13:36 UTC
Yes, I am sure the correct classes are used.

I do not see the same with EAP - these are ESB tests

Comment 4 Jiri Pechanec 2008-09-26 04:51:14 UTC
The analysis was sent to Ron (see the attached file). From the log files it seems that the server is closing a session that is still in use on the client.

Comment 5 Jiri Pechanec 2008-09-26 04:51:14 UTC
Attachment: Added: jbm-behaviour.txt


Comment 6 Mark Little 2008-09-27 08:55:17 UTC
Please create a linked JIRA in EAP since all JBM fixes must come through that route.

Comment 7 Jiri Pechanec 2008-09-29 07:30:52 UTC
Len has analysed the provided client log; unfortunately I sent the wrong one :-(
So the investigation is still in progress.
There is no workaround available yet.

Comment 8 Jiri Pechanec 2008-09-29 07:31:47 UTC
Correct client log file

Comment 9 Jiri Pechanec 2008-09-29 07:31:47 UTC
Attachment: Added: client.log.gz


Comment 10 Jiri Pechanec 2008-09-29 07:35:55 UTC
Link: Added: This issue depends on JBPAPP-1219


Comment 11 Jiri Pechanec 2008-10-01 05:09:25 UTC
Mail from Ron
Hi Jiri,
 
Yes, I think you've put your finger on it.  Line
 
2008-09-24 05:25:54,111 DEBUG [Timer-10][org.jboss.remoting.ConnectionValidator] ConnectionValidator[SocketClientInvoker[e56328, bisocket://localhost:4457], pingPeriod=2000 ms]'s connections is invalid
 
indicates that the ConnectionValidator ping has timed out, at which point it will
 
1. notify all registered listeners, and, by default,
2. shut down the related LeasePinger,
 
Once the LeasePinger shuts down, the following will happen on the server side:
 
1. the server side Lease will notify its registered listeners of a connection failure, and, by default,
2. the ServerInvokerCallbackHandler will close its client.
 
The latter is what turns push callbacks into apparent pull callbacks.
 
Notice that the ConnectionValidator ping failure occurs about 1 second after the ping is sent. That's the default ping timeout, and it's too small, which is causing false timeouts. I would suggest:
 
1. adding something like
 
    <attribute name="validatorPingPeriod" isParam="true">10000</attribute>
    <attribute name="validatorPingTimeout" isParam="true">5000</attribute>
 
to remoting-bisocket-service.xml, which will extend the ping timeout and reduce the likelihood of false timeouts, and
 
2. like I mentioned before, add
 
    <attribute name="registerCallbackListener">false</attribute>
 
(if Tim agrees).
 
-Ron

It thus seems that the default values are too small. I will re-run the tests with the new values, and if this fixes the issue we should consider changing the defaults or recommending the change in the Release Notes.
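
The chain of events in Ron's mail can be sketched as a toy model. This is only an illustration, not the JBoss Remoting API: the class names mirror the roles named in the mail, the response times are made-up values, and the lease expiry that follows a LeasePinger shutdown is collapsed into a single immediate step.

```python
# Toy model of the failure chain described in Ron's mail.
# ASSUMPTION: none of these classes are real JBoss Remoting classes;
# they only mirror the roles (validator, lease, callback handler).


class CallbackHandler:
    """Plays the role of the server-side ServerInvokerCallbackHandler."""

    def __init__(self):
        self.callback_client_open = True

    def on_lease_failure(self):
        # Default behaviour per the mail: the handler closes its client.
        self.callback_client_open = False

    def mode(self):
        # With the callback client closed, push callbacks look like pull.
        return "PUSH" if self.callback_client_open else "PULL"


class Lease:
    """Plays the role of the server-side Lease."""

    def __init__(self):
        self.listeners = []

    def expire(self):
        # Notify every registered listener of the connection failure.
        for listener in self.listeners:
            listener()


class ClientConnection:
    """Plays the role of the client-side ConnectionValidator and LeasePinger."""

    def __init__(self, lease, ping_timeout_ms):
        self.lease = lease
        self.ping_timeout_ms = ping_timeout_ms
        self.lease_pinger_running = True

    def ping(self, response_time_ms):
        if response_time_ms <= self.ping_timeout_ms:
            return True
        # Ping timed out: the LeasePinger is shut down. In reality the
        # server-side lease fails only after it stops being renewed; the
        # model treats that as immediate.
        self.lease_pinger_running = False
        self.lease.expire()
        return False


def simulate(ping_timeout_ms, response_times_ms):
    """Return the callback mode after a series of ping response times."""
    handler = CallbackHandler()
    lease = Lease()
    lease.listeners.append(handler.on_lease_failure)
    connection = ClientConnection(lease, ping_timeout_ms)
    for response_time_ms in response_times_ms:
        if not connection.ping(response_time_ms):
            break
    return handler.mode()


if __name__ == "__main__":
    loaded_server = [200, 1500, 300]  # hypothetical response times in ms
    print(simulate(1000, loaded_server))  # roughly 1 s timeout, as observed
    print(simulate(5000, loaded_server))  # suggested validatorPingTimeout
```

In this model a single slow ping under the 1 second timeout is enough to flip the handler to PULL permanently, while the same response times stay within a 5000 ms timeout.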

Comment 12 Aleksandar Kostadinov 2008-10-02 13:33:31 UTC
The following is not relevant if the transport in question is reliable: I just think it is generally wiser to require a few missed pings before marking a connection as broken.
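
A minimal sketch of that idea, assuming a hypothetical detector that declares a connection broken only after several consecutive missed pings. `MissedPingDetector` and `max_missed` are illustrative names; nothing in this report indicates that JBoss Remoting offers such a setting.

```python
class MissedPingDetector:
    """Marks a connection broken after max_missed consecutive failed pings.

    ASSUMPTION: hypothetical helper for illustration only, not part of
    JBoss Remoting or JBoss Messaging.
    """

    def __init__(self, max_missed):
        if max_missed < 1:
            raise ValueError("max_missed must be at least 1")
        self.max_missed = max_missed
        self.consecutive_missed = 0
        self.broken = False

    def record(self, ping_ok):
        """Record one ping result and return True once the connection is broken."""
        if self.broken:
            return True  # a broken connection stays broken
        if ping_ok:
            self.consecutive_missed = 0  # any success resets the count
        else:
            self.consecutive_missed += 1
            if self.consecutive_missed >= self.max_missed:
                self.broken = True
        return self.broken


if __name__ == "__main__":
    detector = MissedPingDetector(max_missed=3)
    for ok in [True, False, True, False, False, False]:
        print(ok, detector.record(ok))
```

With `max_missed=1` the detector reduces to the behaviour described in the mail, where one timed-out ping invalidates the connection.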

Comment 13 Dana Mison 2009-02-06 02:27:38 UTC
Release Note:
To allow greater load tolerance in the default configuration, the attributes validatorPingPeriod and validatorPingTimeout are now set to 10000 and 5000 milliseconds respectively, and registerCallbackListener is set to false.
These are defined in remoting-bisocket-service.xml.

Comment 14 Jiri Pechanec 2009-03-06 06:36:45 UTC
Verified in CR3
 - Doc is OK
 - The issue has not been observed in CP1 so far

