Bug 1409189
| Summary: | Failed to set TCP_USER_TIMEOUT msgs seen in logs | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Raghavendra G <rgowdapp> |
| Component: | glusterd | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | mainline | CC: | amukherj, bugs, johnzzpcrystal, rgowdapp, sasundar |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | Flags: | ykaul: needinfo+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | glusterfs-4.1.3 (or higher) | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-08-29 03:18:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Raghavendra G
2016-12-30 05:19:38 UTC
I had a chat regarding this issue with Niels, and I feel bad that I haven't followed up. I am pasting that mail conversation here in case it helps.

On Wed, Jul 22, 2015 at 11:49:56AM +0530, SATHEESARAN wrote:
> Hi Niels,
>
> I have observed warning messages in glusterd related to TCP_USER_TIMEOUT:
>
> <snip>
> [2015-07-22 11:38:41.979511] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
> [2015-07-22 11:38:41.987058] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 8, Invalid argument
> [2015-07-22 11:38:41.987099] E [socket.c:3018:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
> </snip>
>
> Do you know what these messages mean?

That means that __socket_keepalive() is called with a timeout (last parameter) of -1. This value normally comes from a socket_private_t structure and is set through "transport.tcp-user-timeout". Most functions expect the timeout to be an unsigned int; how it gets to a -1 value is unclear to me.

Does this happen always, or only in some configuration/environment?

Thanks,
Niels

Raghavendra,

Could you please provide the steps/commands to reproduce the issue, so that I can debug it and find the root cause?

Maybe it's the same problem as the following:

https://review.gluster.org/#/c/14785/

(In reply to Zhou Zhengping from comment #4)
> Maybe it's the same problem as the following:
>
> https://review.gluster.org/#/c/14785/

Yes, that seems to be the case. Given that the patch is merged now, I am moving this bug to MODIFIED. |
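The "Invalid argument" in the log can be reproduced outside GlusterFS. The logged value of -1000 is consistent with a timeout of -1 second being converted to milliseconds before the setsockopt call, and Linux rejects a negative TCP_USER_TIMEOUT with EINVAL. The following is a minimal Python sketch of that behaviour, not the GlusterFS code. It assumes a Linux host and Python 3.6 or later (where `socket.TCP_USER_TIMEOUT` is defined). The helper names `set_user_timeout` and `set_user_timeout_guarded` are hypothetical, and the guard is only one possible way to avoid the warning; it is not claimed to be what change 14785 does.

```python
import errno
import socket


def set_user_timeout(sock, timeout_s):
    """Convert seconds to milliseconds and set TCP_USER_TIMEOUT.

    Returns 0 on success, or the errno value if setsockopt fails.
    (Hypothetical helper, loosely modelled on the keepalive path
    described in the mail above.)
    """
    timeout_ms = timeout_s * 1000
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, timeout_ms)
    except OSError as exc:
        return exc.errno
    return 0


def set_user_timeout_guarded(sock, timeout_s):
    """Same as above, but treat a negative timeout as "not configured".

    This guard is an assumption made for illustration only.
    """
    if timeout_s < 0:
        return 0  # skip the setsockopt call entirely
    return set_user_timeout(sock, timeout_s)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        rc = set_user_timeout(s, -1)
        # errno.errorcode maps the number back to its symbolic name.
        print("unguarded, timeout=-1:", errno.errorcode.get(rc, rc))
        print("guarded,   timeout=-1:", set_user_timeout_guarded(s, -1))
        print("unguarded, timeout=42:", set_user_timeout(s, 42))
    finally:
        s.close()
```

A check like the guarded variant keeps an unset option from reaching the kernel, which is why tracing where the -1 default comes from matters more than the warning itself.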