Bug 1426059

Summary: gluster fuse client losing connection to gluster volume frequently
Product: [Community] GlusterFS
Component: rpc
Version: mainline
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: urgent
Reporter: Milind Changire <mchangir>
Assignee: Milind Changire <mchangir>
CC: ahatfiel, amukherj, bkunal, bugs, csaba, knakai, mchangir, moagrawa, nbalacha, nchilaka, pdhange, rcyriac, rgowdapp, rhinduja, rhs-bugs, storage-qa-internal
Fixed In Version: glusterfs-3.11.0
Type: Bug
Clone Of: 1408354
Clones: 1452038, 1452132
Bug Blocks: 1408354, 1452038, 1452132
Last Closed: 2017-05-30 18:45:26 UTC

Comment 1 Milind Changire 2017-02-23 07:16:19 UTC
Some applications, such as database management systems, require aggressive socket timeouts so that they can fail fast and re-establish connections to alternate servers.

Socket-level TCP tunables need to be used to fail fast.
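
For illustration, the socket-level tunables in question are the keepalive options documented in tcp(7). The following is a minimal Python sketch, not GlusterFS code: the helper name `enable_fast_keepalive` and the timer values are hypothetical, chosen only to show how an application might tighten keepalive on a Linux socket.

```python
import socket


def enable_fast_keepalive(sock, idle_s=20, interval_s=2, probes=9):
    """Enable TCP keepalive on sock and tighten its timers.

    The default argument values are illustrative assumptions, not
    values recommended by this bug report or used by GlusterFS.
    """
    # SO_KEEPALIVE is portable; it switches keepalive probing on.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timers below come from tcp(7) and are Linux-specific,
    # so each one is applied only if this platform exposes it.
    tunables = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

A caller would apply this to a TCP socket before or after connecting, for example `enable_fast_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))`.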

Comment 2 Worker Ant 2017-02-23 07:25:01 UTC
REVIEW: https://review.gluster.org/16731 (rpc: add options to manage socket keepalive lifespan) posted (#1) for review on master by Milind Changire (mchangir)

Comment 3 Worker Ant 2017-04-11 07:02:04 UTC
REVIEW: https://review.gluster.org/16731 (rpc: add options to manage socket keepalive lifespan) posted (#2) for review on master by Milind Changire (mchangir)

Comment 4 Worker Ant 2017-04-11 07:16:37 UTC
REVIEW: https://review.gluster.org/16731 (rpc: add options to manage socket keepalive lifespan) posted (#3) for review on master by Milind Changire (mchangir)

Comment 5 Worker Ant 2017-04-12 09:44:03 UTC
COMMIT: https://review.gluster.org/16731 committed in master by Raghavendra G (rgowdapp) 
------
commit 6b8df081b46ac4f485c86a5052fc30472e74bfbb
Author: Milind Changire <mchangir>
Date:   Tue Apr 11 12:30:06 2017 +0530

    rpc: add options to manage socket keepalive lifespan
    
    Problem:
    Default values for handling socket timeouts for brick responses are
    insufficient for aggressive applications such as databases.
    
    Solution:
    Add gluster options for keepalive, keepalive-idle,
    keepalive-interval and keepalive-timeout that map 1:1 to the
    socket-level options described in the tcp(7) man page.
    
    Default values for the options are NOT aggressive; they remain the
    values that produce the default timeout when only the keepalive
    option is turned on.
    
    These options are Linux specific and will not be applicable to the
    *BSDs.
    
    Change-Id: I2a08ecd949ca8ceb3e090d336ad634341e2dbf14
    BUG: 1426059
    Signed-off-by: Milind Changire <mchangir>
    Reviewed-on: https://review.gluster.org/16731
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
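
As a rough illustration of why the default timers are too slow for databases: with keepalive enabled, an unresponsive peer is declared dead after roughly the idle time plus the probe interval multiplied by the probe count. The sketch below uses the stock Linux sysctl defaults (7200 s idle, 75 s interval, 9 probes). The function name and the "aggressive" values are hypothetical and are not taken from the patch.

```python
def keepalive_detection_seconds(idle_s, interval_s, probes):
    """Approximate time until TCP keepalive gives up on a silent peer."""
    return idle_s + interval_s * probes


# Stock Linux defaults: tcp_keepalive_time, tcp_keepalive_intvl,
# tcp_keepalive_probes.
print(keepalive_detection_seconds(7200, 75, 9))  # 7875 seconds, about 2 h 11 min

# Hypothetical aggressive settings, for comparison only.
print(keepalive_detection_seconds(20, 2, 9))     # 38 seconds
```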

Comment 6 Shyamsundar 2017-05-30 18:45:26 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.0, please open a new bug report.

glusterfs-3.11.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-May/000073.html
[2] https://www.gluster.org/pipermail/gluster-users/