We have an internal need to improve the performance of GetAll and PutAll for existing JDG features. 1) GetAll: ensure there is a way to fetch multiple objects in a single batch, both in Library mode and over Hot Rod. 2) PutAll: ensure customers can load massive amounts of data into the grid quickly, both in Library mode and over Hot Rod. Additionally, we have requests for this performance improvement from customers/prospects.
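For context, a minimal sketch of the two bulk operations over Hot Rod with the Infinispan Java client; the host, port, entry count, and key/value naming are illustrative assumptions, not part of this report:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class BulkOpsExample {
   public static void main(String[] args) {
      // Placeholder endpoint; point this at a real JDG/Infinispan server.
      RemoteCacheManager rcm = new RemoteCacheManager(
            new ConfigurationBuilder().addServer().host("127.0.0.1").port(11222).build());
      RemoteCache<String, String> cache = rcm.getCache();

      // PutAll: send a whole batch of entries in one operation instead of
      // issuing one put() per entry.
      Map<String, String> batch = new HashMap<>();
      for (int i = 0; i < 1000; i++) {
         batch.put("key-" + i, "value-" + i);
      }
      cache.putAll(batch);

      // GetAll: fetch several keys together in a single round trip.
      Set<String> keys = new HashSet<>();
      keys.add("key-1");
      keys.add("key-2");
      Map<String, String> found = cache.getAll(keys);
      System.out.println(found);

      rcm.stop();
   }
}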
William Burns <wburns> updated the status of jira ISPN-5264 to Coding In Progress
William Burns <wburns> updated the status of jira ISPN-5265 to Coding In Progress
William Burns <wburns> updated the status of jira ISPN-5265 to Open
William Burns <wburns> updated the status of jira ISPN-5266 to Coding In Progress
Tristan Tarrant <ttarrant> updated the status of jira ISPN-5263 to Closed
William Burns <wburns> updated the status of jira ISPN-5264 to Reopened
PR for putAll: https://github.com/infinispan/jdg/pull/613
This issue is to handle the putAll improvements for both embedded and remote caches.
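For the embedded (Library mode) side, a comparable sketch, assuming the Infinispan programmatic API: putAll comes from the java.util.Map contract, while the bulk read is exposed as getAll on AdvancedCache. The cache manager setup and entries are placeholders.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;

public class EmbeddedBulkOps {
   public static void main(String[] args) {
      // Local (non-clustered) manager, just for illustration.
      DefaultCacheManager cm = new DefaultCacheManager();
      Cache<String, String> cache = cm.getCache();

      // Bulk write via the plain java.util.Map contract.
      Map<String, String> batch = new HashMap<>();
      batch.put("a", "1");
      batch.put("b", "2");
      cache.putAll(batch);

      // Bulk read; getAll lives on AdvancedCache in the embedded API.
      Set<Object> keys = new HashSet<>();
      keys.add("a");
      keys.add("b");
      Map<String, String> found = cache.getAdvancedCache().getAll(keys);
      System.out.println(found);

      cm.stop();
   }
}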
Here are some measurements comparing the putAll() operation between ER1 and ER2. It looks fast when the data is loaded on a single node. I tried the following configurations:

a) HR client, 500 MB of data, single node: ER2 is 3-4x faster than ER1. One entry is 1024 bytes: 20 bytes for the key, the rest for the value.

b) HR client, 200 MB of data, cluster of 2 nodes: NO PERFORMANCE IMPROVEMENT FOUND
ON ER1 [Data loading took 0 hours-0 minutes-58 seconds-325 milliseconds]
ON ER2 [Data loading took 0 hours-0 minutes-58 seconds-197 milliseconds]

c) HR client, 500 MB, heap size increased to 3 GB: the test fails with OOM on ER2, so NO PERFORMANCE IMPROVEMENT
ON ER1 [Data loading took 0 hours-1 minutes-55 seconds-678 milliseconds]
ON ER2 OOM occurs:
ERROR [org.jgroups.protocols.UNICAST3] (OOB-21,shared=udp) JGRP000039: JDG2/clustered: failed to deliver OOB message [dst: JDG2/clustered, src: JDG1/clustered (3 headers), size=60000 bytes, flags=OOB|DONT_BUNDLE|NO_TOTAL_ORDER]: java.lang.OutOfMemoryError: Java heap space
ERROR [org.jgroups.protocols.UNICAST3] (OOB-16,shared=udp) JGRP000039: JDG1/clustered: failed to deliver OOB message [dst: JDG1/clustered, src: JDG2/clustered (3 headers), size=60000 bytes, flags=OOB|DONT_BUNDLE|NO_TOTAL_ORDER]: java.lang.OutOfMemoryError: Java heap space

d) Library mode with EAP, 200 MB: NO PERFORMANCE IMPROVEMENT
ON ER1 [Data loading took 0 hours-0 minutes-0 seconds-869 milliseconds]
ON ER2 [Data loading took 0 hours-0 minutes-0 seconds-897 milliseconds]

e) Library mode with EAP, 500 MB: NO PERFORMANCE IMPROVEMENT
ON ER1 [Data loading took 0 hours-0 minutes-2 seconds-930 milliseconds]
ON ER2 [Data loading took 0 hours-0 minutes-2 seconds-853 milliseconds]

Response from wburns:

If Library mode is with only 2 nodes, that is to be expected. The embedded changes should only improve performance for DIST, and only when you have more nodes than numOwners.

The remote result is odd; it should be substantially faster in pretty much all cases when there is a good number of entries (> 10).

The extra memory usage is to be expected, unfortunately. ER1 sent each entry sequentially, so there was a smaller message overhead, whereas now the entire map has to be held in one message. So 3 GB for both servers and the client with 500 MB being inserted is not going to be enough. I will dig a little to see why it wasn't performing properly in the 200 MB range, though.

Martin: We'll probably need to run a few more tests with the number of nodes > numOwners.
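To make the "more nodes than numOwners" point concrete, a hedged configuration sketch (programmatic Infinispan API assumed, names illustrative): with numOwners=2, a 2-node cluster makes every node an owner of every key, so the embedded putAll optimization has nothing to skip; the improvement would only be expected once the cluster grows to 3+ nodes.

import java.util.HashMap;
import java.util.Map;

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class DistConfigSketch {
   public static void main(String[] args) {
      // Distributed cache keeping 2 copies of each entry; run this on a
      // cluster of 3 or more nodes so that nodes > numOwners holds.
      Configuration cfg = new ConfigurationBuilder()
            .clustering().cacheMode(CacheMode.DIST_SYNC)
            .hash().numOwners(2)
            .build();
      DefaultCacheManager cm = new DefaultCacheManager(
            GlobalConfigurationBuilder.defaultClusteredBuilder().build(), cfg);

      // Bulk write as before; entries are routed to their owners.
      Map<String, String> batch = new HashMap<>();
      batch.put("k", "v");
      cm.getCache().putAll(batch);

      cm.stop();
   }
}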
Hi Will, in order to proceed with verification of this BZ, could you please specify more precisely what the expected outcome of this issue is, ideally under which circumstances the performance should increase, along with a rough estimate of the improvement you want to achieve? Thanks, Vojta
(In reply to Martin Gencur from comment #10)
To further clarify the 3 GB note I made above: the issue was that both servers and the client were run in the same JVM, so in that case 3 GB for all of these combined is not enough. Normally these would run in separate JVMs and a much smaller heap could be used. Just a bit more than double the target putAll map size should be sufficient (this allows for the map itself plus the message containing the fully serialized map).
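A hypothetical client-side workaround, not proposed in this thread: since the whole map now travels in a single message, splitting a large load into fixed-size putAll batches caps peak memory, and the "a bit more than double the map size" heap guidance then applies per batch rather than to the whole dataset. Both RemoteCache and the embedded Cache implement java.util.Map, so a helper like this works for either; putAllInBatches and batchSize are illustrative names.

import java.util.HashMap;
import java.util.Map;

public final class BatchedPutAll {
   // Hypothetical helper: bounds peak memory by flushing putAll in chunks
   // of batchSize entries instead of sending one giant map/message.
   public static <K, V> void putAllInBatches(Map<K, V> cache, Map<K, V> data, int batchSize) {
      Map<K, V> batch = new HashMap<>(batchSize * 2);
      for (Map.Entry<K, V> e : data.entrySet()) {
         batch.put(e.getKey(), e.getValue());
         if (batch.size() >= batchSize) {
            cache.putAll(batch);
            batch.clear();
         }
      }
      if (!batch.isEmpty()) {
         cache.putAll(batch); // flush the final partial batch
      }
   }
}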