Bug 745882 (EDG-116) - HotRod server refuses connections shortly after start
Summary: HotRod server refuses connections shortly after start
Keywords:
Status: CLOSED NEXTRELEASE
Alias: EDG-116
Product: JBoss Data Grid 5
Classification: JBoss
Component: Infinispan
Version: EAP 5.1.0 EDG TP
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: EAP 5.1.0 EDG TP
Assignee: Default User
QA Contact:
URL: http://jira.jboss.org/jira/browse/EDG...
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-11 11:25 UTC by Michal Linhard
Modified: 2014-03-17 04:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-09-26 19:31:58 UTC
Type: Bug


Attachments
results.ods (40.82 KB, application/vnd.oasis.opendocument.spreadsheet)
2011-05-11 11:26 UTC, Michal Linhard


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker EDG-116 0 None None None Never

Description Michal Linhard 2011-05-11 11:25:09 UTC
project_key: EDG

In resilience tests we're seeing "connection refused" errors shortly after a node is restarted.
We have 4 nodes, perf17-perf20, and we're failing perf19.
Right after perf19 finishes its join rehash ([JBoss] 05:00:07,062 INFO  [JoinTask] perf19-58461 completed join rehash in 16.22 seconds!),
the driver nodes (perf02-perf10) start trying to connect to it, but it's not yet ready to accept connections.

Is there a period of time between the moment the new node is officially in the cluster (so that HotRod clients learn about it via topology change piggybacking) and the moment its HotRod server is started?

Shouldn't we eliminate this period?

the affected run is:
http://hudson.qa.jboss.com/hudson/view/EDG/job/edg-51x-resilience-client-size4-hotrod/58/
I realized that there are sampling errors not only during node failure but also during node recovery (even more than during failure), and these are the connection refused exceptions mentioned above.
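
One way the suspected window could be measured is to poll the restarted node's HotRod port from a driver machine and record how long it keeps refusing connections. The following is only a minimal sketch under assumptions: the class name HotRodPortProbe and its methods are hypothetical, it uses plain TCP connects rather than the HotRod client, and port 11222 is the usual HotRod default, which may differ from the port used in this test setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Infinispan or the resilience test suite.
public class HotRodPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean accepts(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "connection refused" as well as connect timeouts.
            return false;
        }
    }

    /**
     * Polls host:port every pollMs until it accepts a connection.
     * Returns the elapsed time in milliseconds, or -1 if maxWaitMs passes first.
     */
    public static long waitUntilAccepting(String host, int port, long maxWaitMs, long pollMs)
            throws InterruptedException {
        long start = System.nanoTime();
        long deadline = start + maxWaitMs * 1_000_000L;
        do {
            if (accepts(host, port, 500)) {
                return (System.nanoTime() - start) / 1_000_000L;
            }
            Thread.sleep(pollMs);
        } while (System.nanoTime() - deadline < 0);
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // Defaults are assumptions: perf19 is the restarted node, 11222 the usual HotRod port.
        String host = args.length > 0 ? args[0] : "perf19";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 11222;
        long ms = waitUntilAccepting(host, port, 60_000, 100);
        System.out.println(ms < 0 ? "not accepting after 60 s" : "accepting after " + ms + " ms");
    }
}
```

Started at the moment the "completed join rehash" message is logged, the reported time would approximate how long clients that already received the new topology can hit connection refused.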

Comment 1 Michal Linhard 2011-05-11 11:26:30 UTC
results.ods: attaching compiled data from the Hudson run. The approximate times of the fail and restore events are marked in the table.

Comment 2 Michal Linhard 2011-05-11 11:26:30 UTC
Attachment: Added: results.ods


Comment 3 Galder Zamarreño 2011-08-03 07:11:27 UTC
Michal, does this need looking into?

Comment 4 Michal Linhard 2011-08-03 08:38:04 UTC
I'll verify this one, it might be applicable also to EDG6 Alpha

Comment 5 Michal Linhard 2011-08-03 15:57:19 UTC
This will take a bit longer, I'll need to get resilience tests going.

Comment 6 Michal Linhard 2011-09-26 19:31:58 UTC
This is now obsolete. When a similar thing occurs for EDG6, we'll create a new JIRA.

Comment 7 Anne-Louise Tangring 2011-10-11 17:09:35 UTC
Docs QE Status: Removed: NEW 


