Bug 1100567
Summary: | [barrier] reconfiguration of barrier time out does not work | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Atin Mukherjee <amukherj> |
Component: | core | Assignee: | Atin Mukherjee <amukherj> |
Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | rhgs-3.0 | CC: | gluster-bugs, kparthas, nsathyan, rhs-bugs, sasundar, sdharane, storage-qa-internal |
Target Milestone: | --- | ||
Target Release: | RHGS 3.0.0 | ||
Hardware: | Unspecified | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.6.0.3-1.el6rhs | Doc Type: | Bug Fix |
Doc Text: |
Reconfiguring the barrier timeout through the volume set command did not work. Investigation revealed the following:
Reconfiguring the barrier timeout through gluster volume set reports success, but the timeout never changes from its default value of 120 seconds. Closer inspection of the code showed that the timeout is never modified in reconfigure(): the first check, i.e. whether barrier is already enabled or disabled, always fails because the barrier option is not modified in this request.
Fix
---
Introduced notify() in the barrier translator, which takes care of the RPC request to enable/disable barrier. reconfigure() now simply sets the barrier enable/disable and timeout options without any validation.
|
Story Points: | --- |
Clone Of: | 1085671 | Environment: | |
Last Closed: | 2014-09-22 19:39:12 UTC | Type: | Bug |
Bug Depends On: | 1085671 | ||
Bug Blocks: |
Description
Atin Mukherjee
2014-05-23 06:15:54 UTC
RCA
---
Reconfiguration of barrier timeout through gluster volume set reports success, but the timeout never changes from its default value of 120 seconds. Closer inspection of the code showed that the timeout is never modified in reconfigure(): the first check, i.e. whether barrier is already enabled or disabled, always fails because the barrier option is not modified in this request.

Fix
---
Introduced notify() in the barrier translator, which takes care of the RPC request to enable/disable barrier. reconfigure() now simply sets the barrier enable/disable and timeout options without any validation.

Please note that this patch only contains the changes in the barrier translator. From a complete code flow perspective, the caller in the glusterfsd mgmt should call notify instead of reconfigure to fix this problem.

Fix http://review.gluster.org/7428 is backported in downstream.

Verified with glusterfs-3.6.0.8-1.el6rhs

Followed these steps:
1. Set the barrier-timeout to 600 seconds, i.e. gluster volume set <vol-name> barrier-timeout 600
2. Enable barrier on the volume
3. Take the statedump of the volume, i.e. gluster volume statedump <vol-name>
4. Remove a file from the mount and measure the time taken, i.e. time rm -rf <file-on-mount>

Result:
1. Statedump had barrier-timeout value as 600
2. The unlink operation hung for ~10 min

[root@rhs-client10 test]# time rm -rf file5
real 9m52.024s
user 0m0.001s
sys 0m0.001s

Repeated the above test for various values of barrier-timeout and found that it was set correctly.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHEA-2014-1278.html
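
For readers who want to see why a timeout-only request was silently dropped, the root cause described in the RCA can be modelled in a few lines. This is only an illustrative Python sketch, not the barrier translator's C code: the names BarrierState, reconfigure_buggy and reconfigure_fixed, and the "barrier" option key, are hypothetical. Only the 120-second default, the barrier-timeout option name and the value 600 come from this report.

```python
# Illustrative model only; NOT the actual GlusterFS barrier translator code.
# BarrierState, reconfigure_buggy, reconfigure_fixed and the "barrier" key
# are hypothetical names used for this sketch.
from dataclasses import dataclass

DEFAULT_TIMEOUT = 120  # seconds, the default cited in this report


@dataclass
class BarrierState:
    enabled: bool = False
    timeout: int = DEFAULT_TIMEOUT


def reconfigure_buggy(state, options):
    """Pre-fix behaviour as described: stop early if the barrier flag is unchanged."""
    new_enabled = options.get("barrier", state.enabled)
    if new_enabled == state.enabled:
        # A timeout-only request does not modify the barrier flag, so it
        # always ends up here and the new timeout is never applied.
        return state
    state.enabled = new_enabled
    state.timeout = options.get("barrier-timeout", state.timeout)
    return state


def reconfigure_fixed(state, options):
    """Post-fix behaviour as described: set both options without validation."""
    state.enabled = options.get("barrier", state.enabled)
    state.timeout = options.get("barrier-timeout", state.timeout)
    return state


# Models: gluster volume set <vol-name> barrier-timeout 600
request = {"barrier-timeout": 600}
before = reconfigure_buggy(BarrierState(), request)
after = reconfigure_fixed(BarrierState(), request)
print(before.timeout, after.timeout)
```

In this model the buggy path keeps the 120-second default while the fixed path ends up with 600, which matches the statedump value observed during verification.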