Bug 1354405 - process glusterd set TCP_USER_TIMEOUT failed
Summary: process glusterd set TCP_USER_TIMEOUT failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rpc
Version: 3.8.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On: 1349657
Blocks:
 
Reported: 2016-07-11 09:12 UTC by Niels de Vos
Modified: 2016-08-12 09:47 UTC (History)

Fixed In Version: glusterfs-3.8.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1349657
Environment:
Last Closed: 2016-08-12 09:47:17 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Niels de Vos 2016-07-11 09:12:03 UTC
+++ This bug was initially created as a clone of Bug #1349657 +++

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Zhou Zhengping on 2016-06-24 00:29:40 CEST ---

Description of problem:
glusterd logs the following warning and error at startup:
[2016-06-24 04:34:30.252200] I [MSGID: 106544] [glusterd.c:155:glusterd_uuid_init] 0-management: retrieved UUID: b49bd7b5-9cb5-4e87-8e23-8997bfb9c479
[2016-06-24 04:34:30.275351] I [MSGID: 106498] [glusterd-handler.c:3644:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2016-06-24 04:34:30.275494] I [rpc-clnt.c:991:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-06-24 04:34:30.276462] W [socket.c:979:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2016-06-24 04:34:30.276527] E [socket.c:3087:socket_connect] 0-management: Failed to set keep-alive: Invalid argument

Additional info:
The Linux kernel function do_tcp_setsockopt contains the following code:
        case TCP_USER_TIMEOUT:
                /* Cap the max timeout in ms TCP will retry/retrans
                 * before giving up and aborting (ETIMEDOUT) a connection.
                 */
                if (val < 0)
                        err = -EINVAL;
                else
                        icsk->icsk_user_timeout = msecs_to_jiffies(val);
                break;

This code means the timeout must not be negative. However, in the function glusterd_transport_keepalive_options_get, if the option "transport.tcp-user-timeout" is not set, priv->timeout is -1. That negative value is passed to setsockopt (as -1000 in the log above), which fails with EINVAL (Invalid argument).
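
For illustration only, here is a minimal sketch of the kind of guard that avoids the failure. It is not the patch under review. The helper name set_user_timeout and its seconds-based parameter are hypothetical, and the conversion to milliseconds is an assumption inferred from the -1000 in the log above. The idea is to skip the setsockopt call when the option was never configured, so the kernel default stays in effect.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a GlusterFS function).
 * timeout_sec < 0 is taken to mean "option not configured".
 * Returns 0 on success or when the option is skipped, -1 on error
 * with errno set by setsockopt().
 */
static int set_user_timeout(int sock, int timeout_sec)
{
    unsigned int timeout_ms;

    /* The kernel rejects negative values with EINVAL, so do not
     * call setsockopt() at all when no timeout was configured. */
    if (timeout_sec < 0)
        return 0;

    /* Assumption: the configured value is in seconds, while
     * TCP_USER_TIMEOUT expects milliseconds. */
    timeout_ms = (unsigned int)timeout_sec * 1000u;

    return setsockopt(sock, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}
```

With such a guard, an unset option (-1) no longer reaches the kernel, while any non-negative configured value is still applied to the socket.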

--- Additional comment from Vijay Bellur on 2016-06-24 00:44:19 CEST ---

REVIEW: http://review.gluster.org/14785 (rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT) posted (#1) for review on master by Zhou Zhengping (johnzzpcrystal)

--- Additional comment from Vijay Bellur on 2016-06-25 14:12:23 CEST ---

REVIEW: http://review.gluster.org/14785 (rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT) posted (#2) for review on master by Zhou Zhengping (johnzzpcrystal)

--- Additional comment from Vijay Bellur on 2016-07-11 03:59:08 CEST ---

COMMIT: http://review.gluster.org/14785 committed in master by Jeff Darcy (jdarcy) 
------
commit b2c73cbf423de6201f956f522b7429615c88869d
Author: Zhou Zhengping <johnzzpcrystal>
Date:   Fri Jun 24 06:33:16 2016 +0800

    rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT
    
    If option "transport.tcp-user-timeout" hasn't been setted, glusterd's
    priv->timeout will be -1, which will cause invalid argument when
    set TCP_USER_TIMEOUT.
    
    Change-Id: Ibc16264ceac0e69ab4a217ffa27c549b9fa21df9
    BUG: 1349657
    Signed-off-by: Zhou Zhengping <johnzzpcrystal>
    Reviewed-on: http://review.gluster.org/14785
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Jeff Darcy <jdarcy>

--- Additional comment from Oleksandr Natalenko on 2016-07-11 11:10:29 CEST ---

I believe this should be backported to the 3.7 branch, as we observe the same warnings with 3.7.11/12/13.

Comment 1 Vijay Bellur 2016-07-11 09:54:55 UTC
REVIEW: http://review.gluster.org/14888 (rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT) posted (#1) for review on release-3.8 by Niels de Vos (ndevos)

Comment 2 Vijay Bellur 2016-07-12 12:07:32 UTC
COMMIT: http://review.gluster.org/14888 committed in release-3.8 by Jeff Darcy (jdarcy) 
------
commit a2c96bebcda9d49a0fea9d3e0b284669f65d1b4b
Author: Niels de Vos <ndevos>
Date:   Mon Jul 11 11:52:41 2016 +0200

    rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT
    
    If option "transport.tcp-user-timeout" hasn't been setted, glusterd's
    priv->timeout will be -1, which will cause invalid argument when
    set TCP_USER_TIMEOUT.
    
    Cherry picked from commit b2c73cbf423de6201f956f522b7429615c88869d:
    > Change-Id: Ibc16264ceac0e69ab4a217ffa27c549b9fa21df9
    > BUG: 1349657
    > Signed-off-by: Zhou Zhengping <johnzzpcrystal>
    > Reviewed-on: http://review.gluster.org/14785
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > Reviewed-by: Jeff Darcy <jdarcy>
    
    Change-Id: Ibc16264ceac0e69ab4a217ffa27c549b9fa21df9
    BUG: 1354405
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/14888
    Reviewed-by: Zhou Zhengping <johnzzpcrystal>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>

Comment 3 Niels de Vos 2016-08-12 09:47:17 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.2, please open a new bug report.

glusterfs-3.8.2 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/announce/2016-August/000058.html
[2] https://www.gluster.org/pipermail/gluster-users/

