Bug 1100567 - [barrier] reconfiguration of barrier time out does not work
Summary: [barrier] reconfiguration of barrier time out does not work
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.0
Hardware: Unspecified
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Atin Mukherjee
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1085671
Blocks:
Reported: 2014-05-23 06:15 UTC by Atin Mukherjee
Modified: 2016-09-17 14:38 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.6.0.3-1.el6rhs
Doc Type: Bug Fix
Doc Text:
Reconfiguring the barrier timeout through the volume set command did not work. gluster volume set reported success, but the timeout never changed from its default value of 120 seconds. The cause was in reconfigure(): its first check, whether barrier is already enabled or disabled, always failed because a timeout-only request does not modify the barrier option, so the timeout was never updated. Fix: notify() was introduced in the barrier translator to handle the RPC request to enable/disable barrier, and reconfigure() now simply sets the barrier enable/disable and timeout options without any validation.
Clone Of: 1085671
Environment:
Last Closed: 2014-09-22 19:39:12 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2014:1278
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage Server 3.0 bug fix and enhancement update
Last Updated: 2014-09-22 23:26:55 UTC

Description Atin Mukherjee 2014-05-23 06:15:54 UTC
+++ This bug was initially created as a clone of Bug #1085671 +++

Description of problem:

When setting the barrier timeout to x seconds, the timeout event still relies on the default timeout value.
Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Set the barrier timeout to 18000 seconds. 
2. Take a statedump of the volume and verify the timeout


Actual results:
The timeout still reflects the default value.

Expected results:
The timeout should be reconfigured to the value that was set.

Additional info:

--- Additional comment from Anand Avati on 2014-04-09 02:32:24 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#1) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-09 03:11:40 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#2) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-09 04:47:58 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#3) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-09 05:17:34 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#4) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-09 05:19:08 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#5) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-10 00:58:55 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#6) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-10 04:59:59 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#7) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-10 05:08:50 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#8) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-10 06:00:11 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#9) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-11 02:19:21 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#10) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-15 02:33:03 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#11) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-15 02:55:27 EDT ---

REVIEW: http://review.gluster.org/7428 (glusterfs-server : barrier timeout tuning fix) posted (#12) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Anand Avati on 2014-04-22 16:03:18 EDT ---

COMMIT: http://review.gluster.org/7428 committed in master by Vijay Bellur (vbellur) 
------
commit b6cc23204f1941184cb08ec3d84beecd2d06fd91
Author: Atin Mukherjee <amukherj>
Date:   Wed Apr 9 11:53:33 2014 +0530

    glusterfs-server : barrier timeout tuning fix
    
    Problem : Reconfiguration of barrier timeout through gluster volume set shows a
    success but it never changes the default timeout value which is 120 seconds.
    After digging into the code deeper, it was found that timeout is never modified
    in reconfigure() as the first check i.e. whether barrier is already enabled or
    disabled always fails since barrier option is not modified in this request.
    
    Fix : Introduced notify() in barrier translator which will take care of the rpc
    request to enable/disable barrier. reconfigure() will simply set barrier
    enable/disable and timeout options blindly without any validation.
    
    Please note this patch only contains the changes in barrier translator however
    from complete code flow perspective the caller in the glusterfsd mgmt should
    call notify instead of reconfigure to fix this problem.
    
    Change-Id: I1371b294935f6054da7c1dc6a9a19f1d861e60fb
    BUG: 1085671
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/7428
    Reviewed-by: Varun Shastry <vshastry>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 1 Atin Mukherjee 2014-05-23 06:20:21 UTC
RCA
---
Reconfiguration of barrier timeout through gluster volume set reports success, but it never changes the timeout from its default value of 120 seconds. On digging deeper into the code, it was found that the timeout is never modified in reconfigure(): the first check, i.e. whether barrier is already enabled or disabled, always fails because the barrier option is not modified in this request.

Fix
---
Introduced notify() in the barrier translator, which takes care of the RPC request to enable/disable barrier. reconfigure() simply sets the barrier enable/disable and timeout options blindly, without any validation.

Please note that this patch only contains the changes in the barrier translator; from a complete code flow perspective, the caller in the glusterfsd mgmt code should call notify() instead of reconfigure() to fix this problem.

Fix http://review.gluster.org/7428 is backported in downstream.
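
The RCA and fix above can be illustrated with a small model. This is only a sketch in Python, not the translator's C code: the class `BarrierState`, the function names, and the dictionary keys `"barrier"` and `"barrier-timeout"` are hypothetical stand-ins chosen for illustration. It assumes the pre-fix reconfigure() returned early when the requested barrier state matched the current one, which is how the RCA describes the failing first check.

```python
from dataclasses import dataclass

DEFAULT_TIMEOUT_SECS = 120  # default timeout stated in this bug report


@dataclass
class BarrierState:
    """Hypothetical stand-in for the translator's private state."""
    enabled: bool = False
    timeout: int = DEFAULT_TIMEOUT_SECS


def reconfigure_buggy(state, options):
    """Model of the pre-fix flow: the enable/disable check runs first."""
    requested = options.get("barrier", state.enabled)
    if requested == state.enabled:
        # A timeout-only request does not change the barrier option, so
        # this check always trips and the timeout update is never reached.
        return False
    state.enabled = requested
    state.timeout = options.get("barrier-timeout", state.timeout)
    return True


def reconfigure_fixed(state, options):
    """Model of the post-fix flow: set both options without validation."""
    state.enabled = options.get("barrier", state.enabled)
    state.timeout = options.get("barrier-timeout", state.timeout)
    return True


def notify_barrier(state, enable):
    """Model of notify(): the enable/disable request is validated here."""
    if enable == state.enabled:
        return False  # already in the requested state
    state.enabled = enable
    return True


before = BarrierState()
reconfigure_buggy(before, {"barrier-timeout": 18000})

after = BarrierState()
reconfigure_fixed(after, {"barrier-timeout": 18000})

print(before.timeout, after.timeout)
```

In this model the buggy path leaves the timeout at the default, while the fixed path applies the requested value; the "already enabled/disabled" validation moves to the notify() path.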

Comment 3 SATHEESARAN 2014-05-28 08:39:50 UTC
Verified with glusterfs-3.6.0.8-1.el6rhs

Followed these steps:
1. Set barrier-timeout to 600 seconds
(i.e.) gluster volume set <vol-name> barrier-timeout 600
2. Enable barrier on the volume
3. Take a statedump of the volume
(i.e.) gluster volume statedump <vol-name>
4. Remove a file from the mount and measure the time taken
(i.e.) time rm -rf <file-on-mount>

Result :
1. The statedump showed the barrier-timeout value as 600

2. The unlink operation hung for ~10 min
[root@rhs-client10 test]# time rm -rf file5

real    9m52.024s
user    0m0.001s
sys     0m0.001s

Repeated the above test for various values of barrier-timeout and found that each was set correctly
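
The timing check in the result above can be made explicit with a short calculation. This is a minimal sketch: the helper names and the 30-second tolerance are assumptions for illustration, not part of the verification that was run. It converts the `real` value printed by `time` into seconds and compares it with the configured timeout.

```python
import re


def parse_real_time(text):
    """Convert a `time` value such as '9m52.024s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def matches_timeout(elapsed_secs, timeout_secs, tolerance_secs=30.0):
    """True if the measured hang is within tolerance of the timeout.

    The tolerance is an assumed value; it allows for the gap between
    enabling barrier and issuing the unlink.
    """
    return abs(timeout_secs - elapsed_secs) <= tolerance_secs


elapsed = parse_real_time("9m52.024s")
print(round(elapsed, 3))              # seconds the unlink was blocked
print(matches_timeout(elapsed, 600))  # compared with the configured value
print(matches_timeout(elapsed, 120))  # compared with the old default
```

The measured time is about 8 seconds short of 600, which would be consistent with barrier having been enabled shortly before the rm was issued; that reading is an inference and is not stated in the test notes.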

Comment 5 errata-xmlrpc 2014-09-22 19:39:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html

