Bug 1528566 - Performance Drop observed when cluster.eager-lock turned on [NEEDINFO]
Summary: Performance Drop observed when cluster.eager-lock turned on
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: replicate
Version: rhhi-1.1
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Pranith Kumar K
QA Contact: Nikhil Chawla
URL:
Whiteboard:
Depends On: 1499644
Blocks: 1503137
 
Reported: 2017-12-22 07:15 UTC by Nikhil Chawla
Modified: 2018-09-17 14:24 UTC (History)
12 users

Fixed In Version: glusterfs-3.12.2-6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-04 06:40:20 UTC
ravishankar: needinfo? (bturner)
pkarampu: needinfo? (ksandha)




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 None None None 2018-09-04 06:42:11 UTC

Description Nikhil Chawla 2017-12-22 07:15:46 UTC
Description of problem:

On my RHHI setup, cluster.eager-lock was enabled by default on all the volumes. When I disabled it and recorded the numbers, throughput rose by approximately 20-25%. Here is the data:

Database for Bench-marking used: MySQL
Database Size: 100GB


16 vCPU / 32G memory in the database VM
5-minute run, 8G buffer pool. Left pair of columns: cluster.eager-lock on (default); right pair: cluster.eager-lock off on all volumes.

No. of clients   n_overall   opm   |   n_overall   opm
10               1640        161   |   2501        221
20               1828        180   |   2957        291
40               1831        112   |   2926        288
80               1991        196   |   2833        279
99               2763        272   |   2714        267


Link to the Gluster profiling data:

http://perf1.perf.lab.eng.bos.redhat.com/Nikhil/gluster_profile/EagerLockOFF/

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 5 Ravishankar N 2017-12-26 06:20:00 UTC
(In reply to Nikhil Chawla from comment #0)
 
> 16 vcpu / 32G memory in the Database VM				
> 5 minute run - 8G Bufferpool         |with cluster.eager-lock off on
> allvolumes					
> No of clients	n_overall	opm  |	n_overall	opm
> 10	            1640	161  |	2501	        221
> 20	            1828	180  |	2957	        291
> 40	            1831	112  |	2926     	288
> 80	            1991	196  |	2833	        279
> 99	            2763	272  |	2714	        267
> 
> 

Sorry, I don't understand this metric. What is n_overall and opm? 
Just to be clear, {1640, 161} is with eager-lock on and {2501, 221} is with eager-lock off?

> Link to the Gluster profiling  Data:
> 
> http://perf1.perf.lab.eng.bos.redhat.com/Nikhil/gluster_profile/EagerLockOFF/
> 

I see profile info for different no. of clients, but is it for eager lock off or on? We would need both to compare and contrast.

Comment 6 Nikhil Chawla 2017-12-26 06:37:57 UTC
(In reply to Ravishankar N from comment #5)
> (In reply to Nikhil Chawla from comment #0)
>  
> > 16 vcpu / 32G memory in the Database VM				
> > 5 minute run - 8G Bufferpool         |with cluster.eager-lock off on
> > allvolumes					
> > No of clients	n_overall	opm  |	n_overall	opm
> > 10	            1640	161  |	2501	        221
> > 20	            1828	180  |	2957	        291
> > 40	            1831	112  |	2926     	288
> > 80	            1991	196  |	2833	        279
> > 99	            2763	272  |	2714	        267
> > 
> > 
> 
> Sorry, I don't understand this metric. What is n_overall and opm?
> Just to be clear, {1640, 161} is with eager-lock on and {2501, 221} is with
> eager-lock off?

I am using a MySQL simulator that runs MySQL queries against the DB. n_overall is the total number of transactions completed during the 5-minute run, and "opm" is the average transactions per minute. Yes, the higher numbers were obtained after disabling eager-lock.
> > Link to the Gluster profiling  Data:
> > 
> > http://perf1.perf.lab.eng.bos.redhat.com/Nikhil/gluster_profile/EagerLockOFF/
> > 
> 
> I see profile info for different no. of clients, but is it for eager lock
> off or on? We would need both to compare and contrast.

This is for eager-lock OFF. I don't have the profiling data with eager-lock ON, and I can't generate it right away because I would have to regenerate the database, which takes 3-4 days. I am, however, trying my best to clear the MySQL caches from all sources and regenerate the profiling data with eager-lock ON.

Comment 7 Ravishankar N 2017-12-26 07:04:59 UTC
(In reply to Nikhil Chawla from comment #6)
> Since I am using a MySQL simulator which MySQL queries in the DB. n_overall
> is the total number of transactions done in 5 minutes of Run-time and "opm"
> is the average transactions per minute. Yes! The higher number obtained
> after disabling the eager lock.

Thanks Nikhil. It is strange, though, that n_overall divided by 5 does not come anywhere near the opm values.
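Ravishankar's arithmetic can be spot-checked directly; taking the 10-client, eager-lock-on row from comment 0 (n_overall=1640, opm=161) as an example:

```shell
# If "opm" were simply transactions per minute, n_overall / 5 should match it.
# Using the 10-client eager-lock-on row from comment 0: n_overall=1640, opm=161.
awk 'BEGIN { printf "%.0f\n", 1640 / 5 }'   # prints 328, about double the reported opm of 161
```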

> This is for Eager Lock OFF. I don't have the profiling data with eager lock
> ON, and I can't generate it, because I've to regenerate the database for
> this, which will take 3-4 days.

Cool, I will wait for the profile info from both runs. You could just run the test for '10 clients' with both eager lock on and off. Maybe something like:

1. gluster volume set volname cluster.eager-lock off
2. gluster volume profile volname info clear
3. Run the test for 10 clients, let it complete.
4. Capture `gluster volume profile volname info`.
5. gluster volume set volname cluster.eager-lock on
6. Repeat steps 2 to 4.

> Although I am trying my best to remove the MySQL caches from all the
> sources, and regenerate the profiling data with Eager Lock ON.

Maybe restarting the VM and the gluster volumes on each run might help to remove caching.

Comment 8 Nikhil Chawla 2017-12-27 06:58:09 UTC
(In reply to Ravishankar N from comment #7)
> (In reply to Nikhil Chawla from comment #6)
> > Since I am using a MySQL simulator which MySQL queries in the DB. n_overall
> > is the total number of transactions done in 5 minutes of Run-time and "opm"
> > is the average transactions per minute. Yes! The higher number obtained
> > after disabling the eager lock.
> 
> Thanks Nikhil. It is strange though, that (n_overall divided by 5) does not
> seem to be near the opm values.
>
Yes, that's strange; I am also observing this for the first time.
> > This is for Eager Lock OFF. I don't have the profiling data with eager
> > lock ON, and I can't generate it, because I've to regenerate the database
> > for this, which will take 3-4 days.
> 
> Cool, I will wait for the profile info from both runs. You could just run
> the test for '10 clients' with both eager lock on and off. Maybe something
> like:
> 
> 1. gluster volume set volname cluster.eager-lock off
> 2. gluster volume profile volname info clear
> 3. Run the test for 10 clients, let it complete.
> 4. Capture `gluster volume profile volname info`.
> 5. gluster volume set volname cluster.eager-lock on
> 6. Repeat steps 2 to 4.
> 
> > Although I am trying my best to remove the MySQL caches from all the
> > sources, and regenerate the profiling data with Eager Lock ON.
> 
> Maybe restarting the VM and the gluster volumes on each run might help to
> remove caching.

Restarting the VM didn't help. 

So, I created a new DB with 10 GB data and here are the performance numbers:

eager lock ON:
n_overall : 18570 
opm : 1661

eager lock OFF:
n_overall :  23516
opm : 4634
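As an aside, the same n_overall/5 cross-check discussed earlier can be applied to these 10 GB DB figures (a quick awk calculation):

```shell
# Cross-check n_overall / 5 against the reported opm for the 10 GB DB run.
awk 'BEGIN { printf "on: %.0f  off: %.0f\n", 18570 / 5, 23516 / 5 }'
# The OFF figure (4703) lands close to the reported opm of 4634,
# while the ON figure (3714) is far from the reported 1661.
```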

And the Gluster Profile Info for both the cases is available here:

http://perf1.perf.lab.eng.bos.redhat.com/Nikhil/gluster_profile/

Comment 9 Ravishankar N 2018-01-19 06:23:07 UTC
(In reply to Nikhil Chawla from comment #8)

> And the Gluster Profile Info for both the cases is available here:
> 
> http://perf1.perf.lab.eng.bos.redhat.com/Nikhil/gluster_profile/

Sorry for the delay, Nikhil. I compared the profile info for eager-lock ON vs OFF; there is no significant difference in the *percentage* change in numbers for a given FOP between the two cases. However, in both cases FINODELK takes the maximum % latency (around 77%), which should not be the case for a write-intensive workload (MySQL).

Was the VM disk on which the test was run preallocated or thinly provisioned? If thinly provisioned, can you re-run the test with a preallocated disk and see whether that makes a difference in performance? I think the shard xlator sends fxattrops (involving FINODELKs) to update the size and block count when creating individual shards, so preallocating the image beforehand could mitigate that behaviour.
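Preallocation can be illustrated with a quick sketch (the path and the tiny 1 MiB size are placeholders for this demo; a real VM image would use the full disk size and would typically be preallocated at creation time):

```shell
# Preallocate a file so its blocks and size metadata exist up front;
# the idea is that shard creation then needs fewer size/xattr updates.
fallocate -l 1M /tmp/prealloc-demo.img
stat -c '%s' /tmp/prealloc-demo.img   # prints 1048576
rm -f /tmp/prealloc-demo.img          # clean up the demo file
```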

Comment 15 Pranith Kumar K 2018-03-23 13:15:04 UTC
https://code.engineering.redhat.com/gerrit/133659 storage/posix: Add active-fd-count option in gluster
https://code.engineering.redhat.com/gerrit/133660 cluster/afr: Switch to active-fd-count for open-fd checks
https://code.engineering.redhat.com/gerrit/131944 cluster/afr: Remove unused code paths
https://code.engineering.redhat.com/gerrit/131945 cluster/afr: Make AFR eager-locking similar to EC

Comment 18 Nikhil Chawla 2018-04-05 07:02:04 UTC
fio sequential write results:
 
from VM with eager-lock on:
write throughput: 2MB/s
 
from VM with eager-lock off:
write throughput: 351 MB/s
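The exact fio job used is not recorded in this bug; a sequential-write job of the kind described might look like the following sketch (the file path, sizes, and engine settings are assumptions, not the actual test configuration):

```ini
; Hypothetical fio job: sequential 1 MiB direct writes from inside the VM.
[seq-write]
rw=write
bs=1M
size=4g
ioengine=libaio
iodepth=16
direct=1
filename=/var/tmp/fio-seqwrite.dat
```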

Comment 21 Nikhil Chawla 2018-04-30 06:32:37 UTC
Latest performance numbers with RHHI 1.1 and RHHI 2.0 (with the eager-lock fixes):

---------------
RHHI 1.1 

Eager Lock ON : 2.037MB/s
Eager Lock OFF :  107MB/s

---------------

RHHI 2.0 

Eager Lock ON : 291MB/s 
Eager Lock OFF : 34.0MB/s
---------------

Iostat logs and FIO results are available here :
 
http://file.blr.redhat.com/nichawla/work/RHHI_Seq_RUNS/

Comment 22 Pranith Kumar K 2018-04-30 10:27:00 UTC
(In reply to Nikhil Chawla from comment #21)
> Latest Performance numbers with RHHI 1.1 and RHHI 2.0(with Eager-lock fixes)
> 
> ---------------
> RHHI 1.1 
> 
> Eager Lock ON : 2.037MB/s
> Eager Lock OFF :  107MB/s
> 
> ---------------
> 
> RHHI 2.0 
> 
> Eager Lock ON : 291MB/s 
> Eager Lock OFF : 34.0MB/s
> ---------------

It looks like there is a regression with eager-lock off. I suspect this is due to the cluster.full-lock option that we introduced to prevent split-brains. Could you do a test with RHHI 2.0 with both eager-lock off and cluster.full-lock off and post the result?

> 
> Iostat logs and FIO results are available here :
>  
> http://file.blr.redhat.com/nichawla/work/RHHI_Seq_RUNS/

Comment 28 Karan Sandha 2018-08-24 06:43:50 UTC
Tested this BZ on 3.12.2.16, running sequential writes over a 3-way replicated volume.

gluster v set testvol cluster.eager-lock on

iteration 1:- 1703305.84 KB/sec
iteration 2:- 1693176.39 KB/sec

gluster v set testvol cluster.eager-lock off

iteration 1: 1715066.19 KB/sec
iteration 2: 1703044.18 KB/sec

Hence marking it verified.
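The on/off parity in these numbers can be quantified from the iteration-1 figures above (a quick awk calculation, nothing more):

```shell
# Relative throughput difference, eager-lock on vs off, iteration 1.
awk 'BEGIN { printf "%.2f%%\n", (1715066.19 - 1703305.84) / 1703305.84 * 100 }'
# prints 0.69% -- well within run-to-run noise, unlike the earlier 2 MB/s vs 351 MB/s gap
```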

Comment 29 Pranith Kumar K 2018-08-24 06:51:59 UTC
(In reply to Karan Sandha from comment #28)
> tested this bz on 3.12.2.16 , running sequential writes over 3 way
> replicated volume. 
> 
> gluster v set testvol cluster.eager-lock on
> 
> iteration 1:- 1703305.84 KB/sec
> iteration 2:- 1693176.39 KB/sec
> 
> gluster v set testvol cluster.eager-lock off
> 
> iteration 1: 1715066.19 KB/sec
> iteration 2: 1703044.18 KB/sec
> 
> Hence marking it verified.

Is the test done in a VM on a RHHI setup?

Comment 33 errata-xmlrpc 2018-09-04 06:40:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

