1175382 – Async indexing: one node has substantially lower performance

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Data Grid 6
Classification:	JBoss
Component:	Infinispan
Sub Component:
Version:	6.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	GA
Target Release:	6.4.0
Assignee:	Gustavo Fernandes
QA Contact:	Martin Gencur
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-12-17 15:46 UTC by Radim Vansa
Modified:	2015-04-22 12:36 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-04-22 12:36:33 UTC
Type:	Bug
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	ISPN-5103	0	Major	Resolved	Inefficient index updates cause high cost merges and increase overall latency	2015-04-22 13:39:08 UTC

Description Radim Vansa 2014-12-17 15:46:49 UTC

I was running an indexing performance test with asynchronous indexing (using 1 or 8 indexing threads) with JDG 6.4. On one of the nodes, the performance is substantially lower than on the others:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-perf-query-indexing/26/artifact/results/html/test_test.html

(Click the configuration name in the first column of the table to show the statistics for each node separately rather than the average over all nodes.)

Note that the histograms are not displayed correctly (a bug in the reporting), but it seems that about 5% of the requests take 5-10 seconds and another 5% or so take about 1 second. This raises the average; the rest of the operations have the expected latency distribution.

Comment 2 Gustavo Fernandes 2014-12-18 22:24:19 UTC

I've done some investigation, and this is likely caused by contention arising from the combination of:

* a bounded queue to store async jobs 
* the default configuration of the IndexWriter, which triggers frequent, expensive merges of the index

So in theory, under certain load the queue fills up because merges block any subsequent index work. As the rejection policy is 'block', this can slow down the writing thread.

I'm saying *likely* because in my tests I did observe some slowdown in the indexing node, but far from being as drastic as yours. 

So could you please re-run the test with a 'tuned' configuration to confirm or refute this assumption?

default.worker.execution "async"
default.max_queue_length "10000"
default.indexwriter.merge_factor "30"
default.index_flush_interval "2000"
default.indexwriter.merge_max_size "1024"
default.indexwriter.ram_buffer_size "512"
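
The contention hypothesis in this comment can be illustrated with a minimal, self-contained sketch. It is not the Hibernate Search or Infinispan implementation: the class name, the method name and the numbers are hypothetical, a plain ThreadPoolExecutor stands in for the async indexing backend, and a sleeping task stands in for an expensive index merge. It only shows how a bounded queue with a 'block' rejection policy pushes the cost of one slow job back onto the submitting (writing) thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not code from Hibernate Search or Infinispan.
class BoundedQueueContention {

    // Submits `jobs` tasks to a single worker backed by a bounded queue and
    // returns the longest time (in ms) the submitting thread spent in execute().
    static long maxSubmitMillis(int queueLength, int jobs, long slowJobMillis) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueLength),
                // 'block' rejection policy: the caller waits for free space in the queue
                (task, pool) -> {
                    try {
                        pool.getQueue().put(task);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new RejectedExecutionException(e);
                    }
                });
        long worst = 0;
        for (int i = 0; i < jobs; i++) {
            final boolean slow = (i == 0); // the first job stands in for an expensive merge
            long start = System.nanoTime();
            executor.execute(() -> {
                if (slow) {
                    pause(slowJobMillis);
                }
            });
            long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            worst = Math.max(worst, elapsed);
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worst;
    }

    // Sleeps for the given time, restoring the interrupt flag if interrupted.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Small queue: the writer stalls until the slow job finishes.
        System.out.println("queue=2:   worst submit ms = " + maxSubmitMillis(2, 10, 500));
        // Large queue: the same slow job is absorbed without blocking the writer.
        System.out.println("queue=100: worst submit ms = " + maxSubmitMillis(100, 10, 500));
    }
}
```

With the small queue, the submitting thread is held up for roughly the duration of the slow job; with the larger queue the same job is absorbed. This is the effect the larger max_queue_length and the less frequent merges in the tuned settings above are meant to reduce.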

Comment 3 Radim Vansa 2014-12-19 14:19:58 UTC

Did not help, at least not much.

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-perf-query-indexing/lastSuccessfulBuild/artifact/results/html/test_test.html

Comment 5 Richa 2015-04-22 12:36:33 UTC

This bug is fixed in JDG 6.4.0