Bug 1175382

Summary: Async indexing: one node has substantially lower performance
Product: [JBoss] JBoss Data Grid 6 Reporter: Radim Vansa <rvansa>
Component: Infinispan Assignee: Gustavo Fernandes <gfernand>
Status: CLOSED CURRENTRELEASE QA Contact: Martin Gencur <mgencur>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.4.0 CC: gfernand, jdg-bugs, jpallich, pzapataf, rmarwaha
Target Milestone: GA   
Target Release: 6.4.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-04-22 12:36:33 UTC Type: Bug

Description Radim Vansa 2014-12-17 15:46:49 UTC
I was running an indexing performance test with asynchronous indexing (using 1 or 8 indexing threads) on JDG 6.4. On one of the nodes, performance is substantially lower than on the others:


(In the table, click the configuration name in the first column to show the statistics for each node separately rather than the overall average.)

Note that the histograms are not displayed correctly (a bug in reporting), but it seems that some of the requests (about 5%) take 5-10 seconds and others (again, about 5%) take about 1 second. This raises the average; the rest of the operations have the expected latency distribution.
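To illustrate how much such a small fraction of slow requests can move the average, here is a rough back-of-the-envelope sketch. The 5% fractions come from the observation above; the 7500 ms figure is simply the midpoint of the 5-10 second range, and the 10 ms baseline for the remaining 90% is a made-up placeholder, not a measured value. The class and method names are hypothetical.

```java
public class LatencyMix {
    /** Weighted mean of latencies (ms); fractions are expected to sum to 1. */
    public static double mean(double[] fractions, double[] latenciesMs) {
        double sum = 0.0;
        for (int i = 0; i < fractions.length; i++) {
            sum += fractions[i] * latenciesMs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // ~5% very slow, ~5% at about 1 s, the rest normal.
        double[] fractions = {0.05, 0.05, 0.90};
        // 7500 ms = midpoint of 5-10 s; 10 ms baseline is an ASSUMED value.
        double[] latenciesMs = {7500.0, 1000.0, 10.0};
        System.out.printf("mean latency: %.1f ms%n", mean(fractions, latenciesMs));
    }
}
```

Under these assumed numbers the mean comes out at roughly 434 ms, almost all of it contributed by the 10% of slow requests, so the average says very little about the typical operation.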

Comment 2 Gustavo Fernandes 2014-12-18 22:24:19 UTC
I've done some investigation, and this is likely caused by contention due to the combination of:

* a bounded queue to store async jobs
* the default configuration of the IndexWriter, which triggers frequent, expensive merges of the index

So in theory, under certain load the queue fills up because merges block any subsequent index work. As the rejection policy is 'block', this can cause a slowdown in the writing thread.
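The mechanism can be reproduced in isolation with a minimal sketch like the one below. It is not the actual indexing backend code: it uses a plain `ArrayBlockingQueue` whose `put()` blocks when the queue is full (standing in for the 'block' rejection policy) and a consumer thread that sleeps once to simulate a long merge. The class name, method name, and all numbers are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueStall {
    /** Returns the longest time (ms) the submitting thread spent blocked in put(). */
    public static long maxSubmitMillis(int capacity, int jobs, long stallMs) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        // Consumer stands in for the indexing thread.
        Thread indexer = new Thread(() -> {
            try {
                for (int i = 0; i < jobs; i++) {
                    int job = queue.take();
                    if (job == 0) {
                        Thread.sleep(stallMs); // simulated expensive merge
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        indexer.start();
        long worst = 0;
        try {
            for (int i = 0; i < jobs; i++) {
                long start = System.nanoTime();
                queue.put(i); // blocks while the queue is full
                worst = Math.max(worst, (System.nanoTime() - start) / 1_000_000);
            }
            indexer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("small queue: " + maxSubmitMillis(2, 10, 200) + " ms");
        System.out.println("large queue: " + maxSubmitMillis(100, 10, 200) + " ms");
    }
}
```

With the small queue, the submitting thread should end up waiting for most of the simulated merge, while the larger queue absorbs the burst and the submitter is not held up. A larger queue only hides stalls shorter than the time it takes to fill it, which is why reducing merge frequency matters as well.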

I'm saying *likely* because in my tests I did observe some slowdown in the indexing node, but far from being as drastic as yours. 

So could you please re-run the test with a 'tuned' configuration to confirm or refute this assumption?

default.worker.execution "async"
default.max_queue_length "10000"
default.indexwriter.merge_factor "30"
default.index_flush_interval "2000"
default.indexwriter.merge_max_size "1024"
default.indexwriter.ram_buffer_size "512"

Comment 5 Richa 2015-04-22 12:36:33 UTC
This bug is fixed in JDG 6.4.0.