On Tue, Mar 23, 2010 at 11:28 PM, Anand Avati <avati> wrote:
>
>> http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/#preventingdisconnection
>> >
>> > Fixing this problem in glusterfs is very simple, just call on socket fd:
>> > optval = 1;
>> > optlen = sizeof(optval);
>> > if(setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
>> >     /* ERROR */
>> > }
>> >
>> > This is very light weight on the network - by default it sends a
>> > keepalive packet every 2 hours (configurable in /proc).
>
> Is keepalive interval tunable per-socket from the systemcall level?

Setting the TCP keepalive is simple and easy, except that I'm wondering if 2 hours is sufficient. Can we make it something like 10 minutes?

http://www.linux.org/docs/ldp/howto/TCP-Keepalive-HOWTO/programming.html

Yes, it looks like that can be configured too (per socket fd) with this call:

setsockopt(s, SOL_TCP, TCP_KEEPIDLE, &optval, optlen)

You can choose the default value for this at your discretion and make it configurable. Shall I confirm to Humedica that we will do this in our code?

Krishna
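For reference, the per-socket tuning discussed above could look roughly like the following. This is only a sketch, not the code from the GlusterFS patch: the helper name set_keepalive_idle is hypothetical, and it assumes a platform that exposes TCP_KEEPIDLE (Linux) or TCP_KEEPALIVE (OS X) in <netinet/tcp.h>. IPPROTO_TCP is used as the option level; on Linux it has the same value as SOL_TCP.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not taken from the GlusterFS sources):
 * enable keepalive on fd and ask the kernel to start probing after
 * idle_secs seconds without traffic.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
int set_keepalive_idle(int fd, int idle_secs)
{
    int on = 1;

    /* Keepalive must be switched on before the idle time has any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#if defined(TCP_KEEPIDLE)
    /* Linux: idle time in seconds before the first keepalive probe. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#elif defined(TCP_KEEPALIVE)
    /* OS X: the equivalent option has a different name. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#endif

    return 0;
}
```

Calling set_keepalive_idle(s, 600) on a TCP socket would request the 10 minute idle time suggested above. The probe interval and probe count (TCP_KEEPINTVL, TCP_KEEPCNT on Linux) are left at the system defaults in this sketch.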
Customer has a firewall between clients and servers. This firewall drops idle TCP connections, which causes the first access on the client mount point to return an error. Subsequent accesses work fine. Hence automated scripts that run first after an idle connection has been dropped will fail.

Apparently this is a common problem:
http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/#preventingdisconnection

Fixing this problem in glusterfs is very simple, just call on socket fd:

optval = 1;
optlen = sizeof(optval);
if(setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
    /* ERROR */
}

This is very light weight on the network - by default it sends a keepalive packet every 2 hours (configurable in /proc).

This will be useful for other customers who have firewalls that terminate idle connections.

Krishna
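To check which system-wide default applies on a given Linux machine, the /proc value mentioned above can be read directly. The sketch below is illustrative only: the helper name read_sysctl_seconds and the macro KEEPALIVE_TIME_PATH are made up for this example, and the path assumes a Linux kernel with /proc mounted.

```c
#include <stdio.h>

/* Linux sysctl file holding the idle time, in seconds, before the
 * kernel sends the first keepalive probe on a keepalive-enabled socket. */
#define KEEPALIVE_TIME_PATH "/proc/sys/net/ipv4/tcp_keepalive_time"

/*
 * Hypothetical helper: read one integer value from a sysctl-style file.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
long read_sysctl_seconds(const char *path)
{
    FILE *f = fopen(path, "r");
    long secs = -1;

    if (f == NULL)
        return -1;

    /* The file contains a single decimal number followed by a newline. */
    if (fscanf(f, "%ld", &secs) != 1)
        secs = -1;

    fclose(f);
    return secs;
}
```

On a machine that still uses the kernel default, read_sysctl_seconds(KEEPALIVE_TIME_PATH) should report 7200, i.e. the 2 hours quoted above. A firewall that drops idle connections sooner than that will not be helped by the default, which is why a shorter, configurable value is worth having.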
Setting p1 to show up in my list of prio bugs
Created attachment 204
Proposed patch (as in full description) as an attachment

Trace of a connection sending keep-alives every ten seconds. View using wireshark.

Option for enabling keep-alive:
option transport.socket.keepalive-interval 10
PATCH: http://patches.gluster.com/patch/3287 in master (socket: Support TCP-KEEPALIVE)
PATCH: http://patches.gluster.com/patch/3288 in release-3.0 (socket: Support TCP-KEEPALIVE)
PATCH: http://patches.gluster.com/patch/3303 in master (socket: make tcp keepalive work on OS X)
PATCH: http://patches.gluster.com/patch/3322 in release-3.0 (socket: make tcp keepalive work on OS X)