Bug 745910 (EDG-18)

Summary: poor performance when using virtual nodes
Product: [JBoss] JBoss Data Grid 6    Reporter: Michal Linhard <mlinhard>
Component: unspecified    Assignee: Default User <jbpapp-maint>
Status: CLOSED NEXTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: unspecified    CC: bela, galder.zamarreno, jdg-bugs, mlinhard, nobody, prabhat.jha
Target Milestone: ---   
Target Release: 6.0.0   
Hardware: Unspecified   
OS: Unspecified   
URL: http://jira.jboss.org/jira/browse/EDG-18
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-09-01 08:24:21 UTC
Type: Feature Request

Description Michal Linhard 2011-07-27 19:53:50 UTC
project_key: EDG

a run with numVirtualNodes=1 achieves a max throughput of approximately 180K ops/sec
(see https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-stress-client-hotrod-size4/66/artifact/report/log.txt)
a run with numVirtualNodes=500 stabilizes at approximately 45K ops/sec
(see https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-stress-client-hotrod-size4/68/artifact/report/log.txt)

Comment 1 Michal Linhard 2011-07-27 21:01:57 UTC
JProfiler snapshot, four nodes, perf17 profiled, 1000 clients, 1 iteration
https://hudson.qa.jboss.com/hudson/view/EDG6/job/edg-60-stress-client-hotrod-size4/70/artifact/report/jprofiler-snapshot.jps

Comment 2 Michal Linhard 2011-07-27 21:25:01 UTC
there's something very suspicious here:
almost all the traffic comes via the JGroups replication channel:
19.7% - 811 s - 139,656 inv. org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle
and almost none via the Netty server interface:
0.3% - 13,367 ms - 484 inv. org.infinispan.server.core.AbstractProtocolDecoder.messageReceived

my first suspect is the Hot Rod client not routing properly - perf17 isn't getting any requests.

Comment 3 Michal Linhard 2011-07-28 07:59:20 UTC
on my laptop I ran 4 instances bound to test1-test4, and the Hot Rod client routes to only two of them: test1 and test4
digging further ...

Comment 4 Galder Zamarreño 2011-07-28 08:02:43 UTC
I haven't checked the profiler data or anything, but as an FYI: as noted in https://issues.jboss.org/browse/ISPN-1090?focusedCommentId=12612485&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12612485, the virtual topologies sent back in this version of the protocol are not the most efficient, and we have some improvements coming up in version 2.

Comment 5 Michal Linhard 2011-07-28 08:16:49 UTC
sending the topologies shouldn't be a performance issue; they're sent only once per topology change, right?

Comment 6 Galder Zamarreño 2011-07-28 08:25:01 UTC
Indeed. That's why I noted it as an FYI, since I don't think it's particularly relevant here. If the topology were being sent too often, you'd be able to spot it by too many calls to the writeHashTopologyHeader() method in HotRodEncoder.

Comment 7 Michal Linhard 2011-07-28 09:03:02 UTC
Link: Added: This issue depends on ISPN-1273


Comment 8 Michal Linhard 2011-07-28 09:22:24 UTC
the performance effect of ISPN-1273 is clear to me now:
the Hot Rod client gets only the last hash id for each server - this means the hash ids are likely to end up very close to each other on the hash wheel - so picking the first two owners lands on the same servers regardless of the server count.

this means the Hot Rod client contacts only two of the four servers -> ending up with very inefficient, unnecessary network hops.
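A toy simulation makes the effect described above concrete. This is a hedged sketch, not Infinispan's actual implementation: md5 mod 2^10 stands in for the real hash function, the server positions are invented to mimic the two cases (bunched-up hash ids from the buggy topology vs. evenly spread ones), and `owners` is a simplified consistent-hash owner pick:

```python
import hashlib
from collections import Counter

SPACE = 2**10  # size of the hash wheel (illustrative only)

def wheel_pos(key: str) -> int:
    # Deterministic position on the wheel; md5 stands in for the real hash.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % SPACE

def owners(key: str, server_pos: dict, num_owners: int = 2) -> tuple:
    # Walk clockwise from the key's position, take the first num_owners servers.
    k = wheel_pos(key)
    ring = sorted(server_pos.items(), key=lambda sp: sp[1])
    ordered = [s for s, p in ring if p >= k] + [s for s, p in ring if p < k]
    return tuple(ordered[:num_owners])

# Buggy view (ISPN-1273): only the last virtual-node hash id per server is
# reported, so all four servers' positions end up bunched together.
clustered = {"test1": 100, "test2": 101, "test3": 102, "test4": 103}
# Fixed view: positions spread roughly evenly around the wheel.
spread = {"test1": 0, "test2": 256, "test3": 512, "test4": 768}

keys = [f"key-{i}" for i in range(1000)]
for name, positions in [("clustered", clustered), ("spread", spread)]:
    counts = Counter(owners(k, positions) for k in keys)
    pair, n = counts.most_common(1)[0]
    print(f"{name}: {n / len(keys):.0%} of keys go to {pair}")
```

With the clustered positions, nearly every key's position falls in the long empty arc of the wheel, so the clockwise walk almost always hits the same two servers first; with spread positions the owner pairs rotate evenly. That matches the observation that only test1 and test4 (here, two of the four servers) receive requests.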

Comment 9 Galder Zamarreño 2011-08-02 07:01:46 UTC
ISPN-1273 is resolved now, so this should be closed?

Comment 10 Michal Linhard 2011-08-02 08:03:38 UTC
I'm gonna try a run with the snapshot and then I'll close it...

Comment 11 Anne-Louise Tangring 2011-09-26 19:41:26 UTC
Docs QE Status: Removed: NEW