Bug 1320412 - disperse: Provide an option to enable/disable eager lock
Summary: disperse: Provide an option to enable/disable eager lock
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: disperse
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Ashish Pandey
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On: 1314649
Blocks: 1311817 1318965
 
Reported: 2016-03-23 07:29 UTC by Ashish Pandey
Modified: 2018-01-23 12:06 UTC (History)
8 users

Fixed In Version: glusterfs-3.7.9-2
Doc Type: Enhancement
Doc Text:
Before a file operation starts, a lock is placed on the file. The lock remains in place until the file operation is complete. After the file operation completed, the lock remained in place either until lock contention was detected, or for 1 second in order to check for another request for that file from the same client. This reduced performance, but improved access efficiency. This update provides a new volume option, disperse.eager-lock, to give users more control over lock time. If eager-lock is on (default), the previous behavior applies. If eager-lock is off, locks release immediately after file operations complete, improving performance for some operations, but reducing access efficiency.
Clone Of: 1314649
Environment:
Last Closed: 2016-06-23 05:13:53 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1240 0 normal SHIPPED_LIVE Red Hat Gluster Storage 3.1 Update 3 2016-06-23 08:51:28 UTC

Description Ashish Pandey 2016-03-23 07:29:47 UTC
+++ This bug was initially created as a clone of Bug #1314649 +++

Description of problem:

When a fop takes a lock and completes its operation, it waits for 1 second before releasing the lock. However, if ec finds any lock contention within this time period, it releases the lock immediately, before the timer expires.

Because the lock is taken on the first brick, for some operations, like read, discovery of lock contention may take a long time and can degrade performance.

Provide an option to enable/disable eager lock.
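The release decision described above can be sketched as a toy model (the function and parameter names here are hypothetical, for illustration only; this is not the actual cluster/ec translator code):

```python
# Toy model of ec's eager-lock release decision (illustrative only).

EAGER_LOCK_TIMEOUT = 1.0  # seconds the lock is retained after a fop completes


def should_release(eager_lock, elapsed, contention):
    """Return True when the brick lock should be released.

    eager_lock -- state of the disperse.eager-lock volume option
    elapsed    -- seconds since the fop completed
    contention -- True if another client is waiting on this lock
    """
    if not eager_lock:
        # disperse.eager-lock off: release as soon as the fop completes
        return True
    # eager-lock on: release early on contention, otherwise after 1 second
    return contention or elapsed >= EAGER_LOCK_TIMEOUT


# With eager-lock on, the lock is held while waiting for a follow-up fop...
assert should_release(eager_lock=True, elapsed=0.2, contention=False) is False
# ...but contention triggers an immediate release,
assert should_release(eager_lock=True, elapsed=0.2, contention=True) is True
# and with eager-lock off the lock never lingers.
assert should_release(eager_lock=False, elapsed=0.0, contention=False) is True
```

This captures the trade-off in the doc text: disabling eager lock removes the 1-second hold (better for workloads where contention is discovered slowly), at the cost of re-acquiring the lock for back-to-back fops from the same client.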




--- Additional comment from Vijay Bellur on 2016-03-04 03:06:06 EST ---

REVIEW: http://review.gluster.org/13605 (cluster/ec: Provide an option to enable/disable eager lock) posted (#1) for review on master by Ashish Pandey (aspandey@redhat.com)

--- Additional comment from Vijay Bellur on 2016-03-11 02:44:55 EST ---

REVIEW: http://review.gluster.org/13605 (cluster/ec: Provide an option to enable/disable eager lock) posted (#2) for review on master by Ashish Pandey (aspandey@redhat.com)

--- Additional comment from Vijay Bellur on 2016-03-14 03:13:38 EDT ---

REVIEW: http://review.gluster.org/13605 (cluster/ec: Provide an option to enable/disable eager lock) posted (#3) for review on master by Ashish Pandey (aspandey@redhat.com)

--- Additional comment from Vijay Bellur on 2016-03-15 01:41:30 EDT ---

REVIEW: http://review.gluster.org/13605 (cluster/ec: Provide an option to enable/disable eager lock) posted (#4) for review on master by Ashish Pandey (aspandey@redhat.com)

--- Additional comment from Vijay Bellur on 2016-03-16 00:54:35 EDT ---

COMMIT: http://review.gluster.org/13605 committed in master by Pranith Kumar Karampuri (pkarampu@redhat.com) 
------
commit 23ccabbeb7879fd05f415690124bd7b4a74d4d33
Author: Ashish Pandey <aspandey@redhat.com>
Date:   Fri Mar 4 13:05:09 2016 +0530

    cluster/ec: Provide an option to enable/disable eager lock
    
    Problem: If a fop takes a lock and completes its operation,
    it waits for 1 second before releasing the lock. However,
    if ec finds any lock contention within this time period,
    it releases the lock immediately, before the timer expires.
    As the lock is taken on the first brick, for a few operations,
    like read, discovery of lock contention might take a long
    time and can degrade performance.
    
    Solution: Provide an option to enable/disable eager lock.
    If eager lock is disabled, lock will be released as soon
    as fop completes.
    
    gluster v set <VOLUME NAME> disperse.eager-lock on
    gluster v set <VOLUME NAME> disperse.eager-lock off
    
    Change-Id: I000985a787eba3c190fdcd5981dfbf04e64af166
    BUG: 1314649
    Signed-off-by: Ashish Pandey <aspandey@redhat.com>
    Reviewed-on: http://review.gluster.org/13605
    Smoke: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>

Comment 5 Nag Pavan Chilakam 2016-05-03 11:18:44 UTC
Following is the QATP for validating this bug.
TC#1 and TC#6 are the main cases that are expected to pass in order to confirm that the fix is working.
TC#1, 2, 3, and 7 are CLI validation.
TC#4 and 5 check for regression of the options.
TC#6 is the main case to validate whether the fix is really working.

QATP:

    TC#1: User must be able to enable/disable the eager lock option on an EC volume

    1. Create an EC volume and start it

    2. Try to enable the disperse eager lock on the volume using: gluster v set <VOLUME NAME> disperse.eager-lock on

    3. Get the volume's options and check for the above option

    The option must be enabled on the volume

    4. Disable the option using: gluster v set <VOLUME NAME> disperse.eager-lock off

    The option must get disabled


    TC#2: By default, an EC volume must have disperse.eager-lock enabled

    1. Create an EC volume and start it

    2. Get the volume's options and check for the disperse.eager-lock option

    The option must be enabled on the volume by default


    TC#3: By default, an EC volume must have disperse.eager-lock enabled

    1. Create an EC volume and start it

    2. Get the volume's options and check for the disperse.eager-lock option

    The option must be enabled on the volume by default

             

    TC#4: The AFR eager lock option must have no effect on an EC volume

    1. Create an EC volume and start it

    2. Set the eager lock option meant for AFR volumes on the EC volume by turning on cluster.eager-lock

    The option must not be allowed ---> but this works; will file a bug

    3. Mount the volume and use the dd command to create a 1GB file

    4. Keep viewing the profile of the volume

    Expected behavior: The INODELK count must keep increasing rapidly


    TC#5: The EC eager lock option must have no effect on an AFR volume

    1. Create an AFR volume and start it

    2. Set the eager lock option meant for EC volumes on the AFR volume by turning on disperse.eager-lock

    The option must not be allowed ---> but this works; will file a bug

    3. Mount the volume and use the dd command to create a 1GB file

    4. Keep viewing the profile of the volume

    Expected behavior: The INODELK count must keep increasing rapidly


    TC#6: Eager lock functional validation: Eager lock must reduce the number of locks taken when writing to a file continuously

    1. Create an EC volume and start it

    2. Set the eager lock option by turning on disperse.eager-lock

    3. Mount the volume and use the dd command to create a 1GB file

    4. Keep viewing the profile of the volume

    Expected behavior: The INODELK count must increment at a very low pace; in total there should be about 10 locks per brick (compared to INODELK counts in the range of about 9000 when the option is turned off)


    TC#7: CLI sanity of disperse eager lock

    1. Create an EC volume

    2. Try to set the eager lock option by turning on disperse.eager-lock using different inputs

    Only booleans such as 1/0, true/false, and on/off must be allowed

    3. Try to enable the option on a volume where it is already enabled

    It must fail, saying it is already enabled

    4. Try to disable it on a volume where it is already disabled

    It must fail, saying it is already disabled

    5. Check the help

    The help must have sufficient data to be useful
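The input validation exercised in TC#7 can be sketched as a small Python check. This is a hypothetical stand-in for the CLI's boolean parsing, written only to make the accepted input set concrete; it is not gluster's actual validator:

```python
# Accepted boolean spellings for disperse.eager-lock per TC#7
# (hypothetical stand-in for the CLI parser, not gluster code).
VALID_BOOLEANS = {"1", "0", "true", "false", "on", "off"}


def is_valid_bool(value):
    """Return True if `value` is an accepted boolean spelling."""
    return value.strip().lower() in VALID_BOOLEANS


# on/off, 1/0, and true/false must all be accepted...
assert all(is_valid_bool(v) for v in ("on", "off", "1", "0", "true", "false"))
# ...and anything else must be rejected with a CLI error.
assert not is_valid_bool("maybe")
assert not is_valid_bool("enable")
```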

Comment 6 Nag Pavan Chilakam 2016-05-03 11:50:42 UTC
Results of executing the QATP:
=============================
TC#1-->PASSED
TC#2-->PASSED
TC#3-->PASSED
TC#4-->PASSED (step #2 failed; raised bug 1332523 - do not allow cluster.eager-lock option to be enabled on EC volume)
TC#5-->PASSED (step #2 failed, which is covered by the bug raised on TC#7)
TC#6-->PASSED
TC#7-->FAILED (raised bug #1332516 - disperse: User experience improvements: disperse eager lock needs better help, finetune input options and meant for only EC volume)


As mentioned in the QATP, the main criteria for moving this bug to verified is that TC#1 and TC#6 must pass.
As both passed, moving to verified.


Versions tested:
[root@dhcp35-191 ~]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-2.el7rhgs.x86_64
glusterfs-server-3.7.9-2.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.9-2.el7rhgs.x86_64
glusterfs-api-3.7.9-2.el7rhgs.x86_64
glusterfs-cli-3.7.9-2.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-2.el7rhgs.x86_64
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
glusterfs-libs-3.7.9-2.el7rhgs.x86_64
glusterfs-fuse-3.7.9-2.el7rhgs.x86_64
glusterfs-rdma-3.7.9-2.el7rhgs.x86_64

Comment 8 Ashish Pandey 2016-06-10 04:42:44 UTC
Laura,

I have already verified the doc text.

Comment 10 errata-xmlrpc 2016-06-23 05:13:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240

