Bug 1377299

Summary: (6.4.z) Clustering performance regression when using invalidation cache and shared cache store
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Michal Vinkler <mvinkler>
Component: Clustering Assignee: jboss-set
Status: CLOSED WONTFIX QA Contact: Michal Vinkler <mvinkler>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.4.8 CC: afield, bmaxwell, paul.ferraro
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-03-01 12:28:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Attachments:
Description Flags
performance comparison none

Description Michal Vinkler 2016-09-19 11:49:40 UTC
Created attachment 1202456
performance comparison

Description of problem:

We can see a performance regression in our stress tests, which test performance of the cluster under increasing load (number of concurrent clients). The regression can be seen when using an invalidation cache + a shared cache store compared to using distributed cache without any cache store.

Scenario description:
HTTP traffic accesses a clustered web application with replicated sessions, behind a mod_cluster load balancer. Each client waits 100 ms after receiving a response before sending its next request. Session size is 34 KB.
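
As a rough aid for reading the throughput graphs, the 100 ms delay puts a ceiling on how many requests a given number of clients can generate. The sketch below assumes a closed-loop model in which each client has at most one outstanding request (the report does not state this explicitly); the function names are illustrative only and are not part of the test harness.

```python
# Closed-loop load model: each client sends a request, waits for the
# response, then pauses for the think time before sending the next one.
# Assumption (not stated in the report): one outstanding request per client.

THINK_TIME_MS = 100  # delay between a response and the next request (from the scenario)


def max_tps(clients: int, response_ms: int = 0) -> float:
    """Upper bound on requests per second for a given number of clients."""
    return clients * 1000 / (THINK_TIME_MS + response_ms)


def min_clients(target_tps: int, response_ms: int = 0) -> int:
    """Smallest client count that can reach target_tps (integer ceiling)."""
    cycle_ms = THINK_TIME_MS + response_ms
    return (target_tps * cycle_ms + 999) // 1000


if __name__ == "__main__":
    for tps in (4000, 12000, 23000, 31000):
        print(f"{tps} TPS needs at least {min_clients(tps)} clients")
```

Under this model, even with instantaneous responses, 1000 clients cannot exceed 10000 TPS, so throughput figures only make sense relative to the client count on the horizontal axis.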

Configuration:
 - 4-node EAP cluster with an invalidation cache + a shared cache store (remote JDG cluster)
 - 4-node JDG cluster with distributed cache
 - 4 nodes generating load
 - cache mode: ASYNC or SYNC (applied to both the invalidation and distributed caches; the "write-behind" element of the "remote-store" element is set accordingly)

Links to configuration files:
SYNC scenario:
EAP http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stress-remote-jdg-session-invalidation-sync-4nodes-perf17/3/artifact/report/config/jboss-perf18/standalone-ha.xml
JDG http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stress-remote-jdg-session-invalidation-sync-4nodes-perf17/3/artifact/report/config/jboss-perf22/clustered.xml

ASYNC scenario:
EAP http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stress-remote-jdg-session-invalidation-async-4nodes-perf17/3/artifact/report/config/jboss-perf18/standalone-ha.xml
JDG http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-stress-remote-jdg-session-invalidation-async-4nodes-perf17/3/artifact/report/config/jboss-perf22/clustered.xml


See the attachment "stress-test.png" for graphs with results. There are two scenarios (#1 SYNC, #2 ASYNC). Each scenario compares two runs: the graph on the left is the run with the invalidation cache, and the graph on the right is the run with the distributed cache.

Link to full report with extended CPU and memory statistics:
http://download.eng.brq.redhat.com/scratch/mvinkler/reports/2016-09-19_13-48-52/stress.html

The horizontal axis shows the number of concurrent clients generating load.
The vertical axis shows throughput in TPS (number of requests processed per second).
The red line shows total TPS. The blue line shows the number of requests processed in less than 3 seconds.

In short:
SYNC scenario: the run with the distributed cache reached 23000 TPS, compared to ~4000 TPS for the run with the invalidation cache.
ASYNC scenario: the run with the distributed cache reached 31000 TPS, compared to ~12000 TPS for the run with the invalidation cache.
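
To put the gap in relative terms, the peak figures above can be compared directly. This is only a back-of-the-envelope sketch: the invalidation cache values are approximate in the report ("~"), and the dictionary layout and function name are illustrative.

```python
# Peak TPS figures quoted in the summary above; the invalidation cache
# values are approximate ("~") in the report.
PEAK_TPS = {
    "SYNC": {"distributed": 23000, "invalidation": 4000},
    "ASYNC": {"distributed": 31000, "invalidation": 12000},
}


def throughput_ratio(scenario: str) -> float:
    """How many times higher the distributed cache peak is than the invalidation cache peak."""
    peaks = PEAK_TPS[scenario]
    return peaks["distributed"] / peaks["invalidation"]


if __name__ == "__main__":
    for name in PEAK_TPS:
        print(f"{name}: distributed cache peak is {throughput_ratio(name):.2f}x the invalidation cache peak")
```

With these figures the distributed cache run peaks at about 5.75x (SYNC) and about 2.6x (ASYNC) the throughput of the invalidation cache run.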

Comment 1 Paul Ferraro 2016-09-19 13:33:48 UTC
"We can see a performance regression in our stress tests, which test performance of the cluster under increasing load (number of concurrent clients). The regression can be seen when using an invalidation cache + a shared cache store compared to using distributed cache without any cache store."

This is not a regression. This is the first time performance testing has been done on this configuration, and there is no reason to expect performance comparable to our default configuration. In fact, we expected this configuration to perform much worse than the default configuration in EAP6.