Bug 1048188

Summary: socket doesn't notify disconnect due to packet drop, simulated using iptables, to higher layers
Product: [Community] GlusterFS
Component: transport
Version: mainline
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: krishnan parthasarathi <kparthas>
Assignee: krishnan parthasarathi <kparthas>
CC: bugs, gluster-bugs, nsathyan
Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Type: Bug
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: ---
Regression: ---
Mount Type: ---
Last Closed: 2014-11-11 08:26:30 UTC

Description krishnan parthasarathi 2014-01-03 11:02:49 UTC
Description of problem:
rpc-clnt sockets go into a connect retry loop when outgoing packets to the remote endpoint are dropped using iptables(8) rules. As a result, higher layers built on top of the transport layer fail to perceive a disconnect when a non-blocking connect(2) has failed.

For example:
 iptables -I OUTPUT -p tcp --dport 24007 -j DROP

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Set the following iptables rules in one of the nodes in the cluster,
iptables -I OUTPUT -p tcp --dport 24007 -j DROP

2. Issue any gluster CLI command.
3. The command hangs.
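The hang stems from the socket layer swallowing the result of a failed non-blocking connect. As an illustration (not GlusterFS code), here is a minimal Python sketch of how the outcome of a non-blocking connect(2) is retrieved with SO_ERROR after polling; `nonblocking_connect_status` is a hypothetical helper name. With the DROP rule above in place, a connect to port 24007 would eventually surface ETIMEDOUT here instead of succeeding.

```python
import errno
import select
import socket

def nonblocking_connect_status(host, port, timeout=5.0):
    """Start a non-blocking connect(2) and return its eventual outcome:
    0 on success, otherwise the errno of the failed connect."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        rc = s.connect_ex((host, port))
        if rc not in (0, errno.EINPROGRESS):
            return rc                      # failed immediately
        # Wait until the socket becomes writable, i.e. the connect finished.
        _, writable, _ = select.select([], [s], [], timeout)
        if not writable:
            return errno.ETIMEDOUT        # still pending after the timeout
        # The pending connect's result must be read with SO_ERROR and
        # propagated -- exactly the step the buggy path skipped.
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        s.close()

# Demo against a loopback port known to have no listener:
probe = socket.socket()
probe.bind(("127.0.0.1", 0))
dead_port = probe.getsockname()[1]
probe.close()
print(nonblocking_connect_status("127.0.0.1", dead_port))  # non-zero errno
```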

Actual results:
The logs show that the socket layer perceived a connection failure but failed to propagate it as a disconnect to higher layers.

Expected results:
Higher layers, such as glusterd/rpc, must be notified of the disconnect perceived by the transport/socket layer.


Additional info:

Comment 1 Anand Avati 2014-01-03 11:05:17 UTC
REVIEW: http://review.gluster.org/6627 (socket: propogate connect failure in socket_event_handler) posted (#3) for review on master by Krishnan Parthasarathi (kparthas)

Comment 2 Anand Avati 2014-01-06 18:56:36 UTC
REVIEW: http://review.gluster.org/6627 (socket: propogate connect failure in socket_event_handler) posted (#4) for review on master by Krishnan Parthasarathi (kparthas)

Comment 3 Anand Avati 2014-01-07 17:35:37 UTC
REVIEW: http://review.gluster.org/6661 (protocol/client: reset ping_started on transport disconnect) posted (#1) for review on master by Krishnan Parthasarathi (kparthas)

Comment 4 Anand Avati 2014-01-09 06:53:58 UTC
REVIEW: http://review.gluster.org/6661 (protocol/client: reset ping_started on transport disconnect) posted (#2) for review on master by Krishnan Parthasarathi (kparthas)

Comment 5 Anand Avati 2014-01-14 09:04:42 UTC
REVIEW: http://review.gluster.org/6627 (socket: propogate connect failure in socket_event_handler) posted (#5) for review on master by Anand Avati (avati)

Comment 6 Anand Avati 2014-01-14 09:04:53 UTC
COMMIT: http://review.gluster.org/6627 committed in master by Anand Avati (avati) 
------
commit 7d89ec77763dc5076379753c736f7fce2bedd9ec
Author: Krishnan Parthasarathi <kparthas>
Date:   Thu Jan 2 20:11:19 2014 +0530

    socket: propogate connect failure in socket_event_handler
    
    This patch prevents spurious handling of pollin/pollout events on an
    'un-connected' socket, when outgoing packets to its remote endpoint are
    'dropped' using iptables(8) rules.
    
    For eg,
     iptables -I OUTPUT -p tcp --dport 24007 -j DROP
    
    
    Change-Id: I1d3f3259dc536adca32330bfb7566e0b9a521e3c
    BUG: 1048188
    Signed-off-by: Krishnan Parthasarathi <kparthas>
    Reviewed-on: http://review.gluster.org/6627
    Reviewed-by: Anand Avati <avati>
    Tested-by: Anand Avati <avati>
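The committed change makes the event handler check a still-connecting socket's status before dispatching poll events. A rough Python analogue of that pattern (hypothetical names, not the actual GlusterFS C code): the handler reads SO_ERROR first and, on failure, reports a disconnect to the higher layer via a callback instead of spinning on spurious pollin/pollout.

```python
import selectors
import socket

def watch_connect(sel, sock, on_event, on_disconnect):
    """Register a connecting socket; before dispatching any poll event,
    read the pending connect's result via SO_ERROR and, on failure,
    report a disconnect to the higher layer (the gist of the patch)."""
    def handler(key, mask):
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            sel.unregister(sock)
            on_disconnect(err)            # propagate instead of retry-looping
        else:
            on_event(mask)
    sel.register(sock, selectors.EVENT_READ | selectors.EVENT_WRITE, handler)

# Demo: a connect to a loopback port with no listener lands in the
# disconnect callback rather than being treated as pollin/pollout traffic.
probe = socket.socket()
probe.bind(("127.0.0.1", 0))
dead_port = probe.getsockname()[1]
probe.close()

sel = selectors.DefaultSelector()
s = socket.socket()
s.setblocking(False)
s.connect_ex(("127.0.0.1", dead_port))
errors = []
watch_connect(sel, s, on_event=lambda mask: None, on_disconnect=errors.append)
for key, mask in sel.select(timeout=5):
    key.data(key, mask)
s.close()
sel.close()
print(errors)  # e.g. a single ECONNREFUSED errno value
```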

Comment 7 Niels de Vos 2014-09-22 12:34:32 UTC
A beta release for GlusterFS 3.6.0 has been made available [1]. Please verify whether this release resolves the bug report for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 8 Niels de Vos 2014-11-11 08:26:30 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users