Bug 1502455 - disperse eager-lock degrades performance for file create workloads
Summary: disperse eager-lock degrades performance for file create workloads
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: 3.12
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On: 1502610 1512460 1530519
Blocks:
Reported: 2017-10-16 05:22 UTC by Manoj Pillai
Modified: 2018-01-11 10:09 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1502610 (view as bug list)
Environment:
Last Closed: 2018-01-11 10:09:15 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Manoj Pillai 2017-10-16 05:22:47 UTC
Description of problem:
The current behavior of the option disperse.eager-lock is not optimal:

disperse.eager-lock on: good performance on large-file read/write, but actually degrades performance for many file create workloads. The degradation for file create workloads seems to be due to lock contention on directories.

disperse.eager-lock off: loses the performance advantages of eager-locking for large-file access, but better performance on file create workloads than with disperse.eager-lock on.

We should fix eager locking so that it can be kept on without incurring a performance penalty on file create workloads.

Version-Release number of selected component (if applicable):
glusterfs*-3.12.1-2.el7.x86_64

How reproducible:
Consistently

Comment 1 Manoj Pillai 2017-10-16 05:32:52 UTC
IMO, a solution that adds a separate option to control eager locking in the case of directories would be acceptable, and probably simple. 

So the default could be:
disperse.dir-eager-lock off: applies to directories
disperse.eager-lock on: applies to regular files

Would that work?

Comment 2 Xavi Hernandez 2017-10-16 08:12:26 UTC
I guess this problem happens when multiple clients are creating files in the same directory, right? Otherwise, eager-locking shouldn't interfere with file creation (in fact it should be faster).

In cases where multiple clients access the same directory, then yes, we could keep a separate configuration for this purpose. However, is it really necessary to have it disabled by default? I think that a scenario where multiple clients are writing to the same directory is less probable than one where all writes to a single directory come from the same client.

Comment 3 Manoj Pillai 2017-10-16 10:59:40 UTC
BTW, I realize that the norm would be to provide performance results as supporting evidence. I'm currently waiting on some systems to become available, so I can't do that right away. I will do so as soon as I can.

But this problem has been seen multiple times in the past, in customer cases as well as in our internal testing, and this bz has been long pending, so I decided to go ahead and open it to avoid further delay.

(In reply to Xavier Hernandez from comment #2)
> I guess this problem happens when multiple clients are creating files on the
> same directory, right ? otherwise, eager-locking shouldn't interfere with
> file creation (in fact it should be faster).

Yes, with multiple clients. But we have seen degradation even when each client is creating its data set in its own private directory. It seemed like there was contention on directories in the path.

Comment 4 Xavi Hernandez 2017-10-16 12:06:16 UTC
I've uploaded a patch on branch master (Bug #1502610) to create the new option. I've named the option 'other-eager-lock' since it will control eager locking for entries other than regular files (directories, symbolic links, pipes, ...).

For now I've left the default value as 'on', but you can change it via 'gluster volume set' to run tests. Depending on the results, we can decide the final default value.
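
For anyone wanting to try this, a minimal sketch of how the two settings could be toggled follows. It assumes the new option is exposed under the same prefix as the existing one, i.e. as disperse.other-eager-lock, which should be verified against the build in use. The function name and the volume name 'testvol' are hypothetical. By default the function only prints the commands, so nothing is changed until GLUSTER=gluster is set.

```shell
#!/bin/sh
# Sketch only: keep eager locking on for regular files and set it
# separately for other entries (directories, symlinks, pipes, ...).
# Assumption: the new option key is "disperse.other-eager-lock".
# $1 = volume name (hypothetical), $2 = on|off for non-regular entries.
set_eager_lock_options() {
    vol="$1"
    other="$2"
    # Dry run unless GLUSTER=gluster is set; the expansion is left
    # unquoted on purpose so that "echo gluster" splits into two words.
    ${GLUSTER:-echo gluster} volume set "$vol" disperse.eager-lock on
    ${GLUSTER:-echo gluster} volume set "$vol" disperse.other-eager-lock "$other"
}

# Example: print the commands for a volume called "testvol".
set_eager_lock_options testvol off
```

Running the create workload once with the second argument 'on' and once with 'off' would give the comparison needed to choose the default.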

Comment 5 Manoj Pillai 2017-11-08 05:04:45 UTC
I have built the rpms from master. I will try it out once I take care of some other work.

Comment 6 Xavi Hernandez 2018-01-11 10:09:15 UTC
After some discussion, it was decided that this option won't be backported to 3.12 because it's considered a new feature. It's present on 3.13+.

