Bug 1349657
| Summary: | process glusterd set TCP_USER_TIMEOUT failed | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Zhou Zhengping <johnzzpcrystal> | |
| Component: | rpc | Assignee: | Zhou Zhengping <johnzzpcrystal> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | low | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | mainline | CC: | bugs, oleksandr, sarumuga | |
| Target Milestone: | --- | Keywords: | Triaged | |
| Target Release: | --- | |||
| Hardware: | All | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | 3.9.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1354404 1354405 (view as bug list) | Environment: | ||
| Last Closed: | 2017-01-03 11:22:02 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1354404, 1354405 | |||
|
Description
Zhou Zhengping
2016-06-23 22:21:50 UTC
Description of problem:
We can see glusterd's error log like this:
[2016-06-24 04:34:30.252200] I [MSGID: 106544] [glusterd.c:155:glusterd_uuid_init] 0-management: retrieved UUID: b49bd7b5-9cb5-4e87-8e23-8997bfb9c479
[2016-06-24 04:34:30.275351] I [MSGID: 106498] [glusterd-handler.c:3644:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2016-06-24 04:34:30.275494] I [rpc-clnt.c:991:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-06-24 04:34:30.276462] W [socket.c:979:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid argument
[2016-06-24 04:34:30.276527] E [socket.c:3087:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
Additional info:
In the Linux kernel function do_tcp_setsockopt, we can see code like this:
    case TCP_USER_TIMEOUT:
        /* Cap the max timeout in ms TCP will retry/retrans
         * before giving up and aborting (ETIMEDOUT) a connection.
         */
        if (val < 0)
            err = -EINVAL;
        else
            icsk->icsk_user_timeout = msecs_to_jiffies(val);
        break;
This code means the timeout must not be lower than 0. But in the function glusterd_transport_keepalive_options_get, if the option "transport.tcp-user-timeout" is not set, priv->timeout will be -1, which causes setsockopt to fail with EINVAL (Invalid argument).
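The failure can be reproduced outside glusterd, and one possible guard is sketched below. This is only a minimal illustration assuming Linux: the helper names are hypothetical, the seconds-to-milliseconds conversion is an assumption inferred from the -1000 in the log above, and this is not the patch from the review.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not GlusterFS code): pass a raw millisecond value
 * straight to the kernel, as the failing code path effectively did.
 * Returns 0 if setsockopt succeeded, otherwise the errno it reported. */
static int raw_set_user_timeout_errno(int fd, int timeout_ms)
{
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &timeout_ms, sizeof(timeout_ms)) == 0)
        return 0;
    return errno;
}

/* Hypothetical guard (not GlusterFS code): apply TCP_USER_TIMEOUT only when
 * a timeout was actually configured. A negative timeout_sec is treated as
 * "option not set" (the -1 default described above) and is skipped.
 * Returns 0 on success or when skipped, -1 with errno set on failure. */
static int set_user_timeout_if_configured(int fd, int timeout_sec)
{
    if (timeout_sec < 0)
        return 0; /* nothing configured: keep the kernel default */

    /* Assumed unit conversion: option in seconds, kernel expects ms. */
    unsigned int timeout_ms = (unsigned int)timeout_sec * 1000u;
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}
```

On a TCP socket, raw_set_user_timeout_errno(fd, -1000) should report EINVAL, matching the kernel check quoted above, while set_user_timeout_if_configured(fd, -1) returns 0 without calling setsockopt.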
REVIEW: http://review.gluster.org/14785 (rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT) posted (#1) for review on master by Zhou Zhengping (johnzzpcrystal)
REVIEW: http://review.gluster.org/14785 (rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT) posted (#2) for review on master by Zhou Zhengping (johnzzpcrystal)
COMMIT: http://review.gluster.org/14785 committed in master by Jeff Darcy (jdarcy)
------
commit b2c73cbf423de6201f956f522b7429615c88869d
Author: Zhou Zhengping <johnzzpcrystal>
Date: Fri Jun 24 06:33:16 2016 +0800

    rpc: invalid argument when function setsockopt sets option TCP_USER_TIMEOUT

    If option "transport.tcp-user-timeout" hasn't been setted, glusterd's priv->timeout will be -1, which will cause invalid argument when set TCP_USER_TIMEOUT.

    Change-Id: Ibc16264ceac0e69ab4a217ffa27c549b9fa21df9
    BUG: 1349657
    Signed-off-by: Zhou Zhengping <johnzzpcrystal>
    Reviewed-on: http://review.gluster.org/14785
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Jeff Darcy <jdarcy>

I believe this should be backported to the 3.7 branch, as we observe the same warnings with 3.7.11/12/13. |