Bug 161898

Summary: network connections stalled with kernel 2.6 and standard iptables
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED CANTFIX
Severity: medium
Priority: medium
Reporter: Jose Traver <traverj>
Assignee: Thomas Graf <tgraf>
QA Contact: Brian Brock <bbrock>
CC: caronc, davej, jbaron, rkhan, ssnodgra, villapla
Doc Type: Bug Fix
Last Closed: 2008-11-03 12:55:10 UTC
Attachments: ethereal capture of a stalled connection
Links: IT:105585

Description Jose Traver 2005-06-28 10:39:24 UTC
Description of problem:

A problem has been detected when using iptables and transferring large files
over the network. Any file transfer filtered by iptables is affected
(SSH, FTP, SMTP, etc.).


Version-Release number of selected component (if applicable):

The problem has been detected on RHEL4 systems that initiate connections with
iptables enabled, but it also happens on FC3 and FC4. Destination systems
tested are RHEL3, RHEL4, FC3, and FC4, with and without a firewall
installed. SELinux is not enabled.

Systems initiating connections with kernel 2.4 (FC1, RHEL3) do not present this
problem. Some systems ran RHEL3 with the same iptables rules without a
failure, were then re-installed as RHEL4/FC3, and began to fail.

Currently testing RHEL4 running kernel 2.6.9-11.EL with iptables 1.2.11-3.1.

How reproducible:
Always


Steps to Reproduce:
1. scp a large file from a server with the default iptables configuration for SSH enabled.
2. The connection stalls at X % of the transfer.
3. service iptables stop
4. scp the same file to the same system; everything is OK (a shell sketch follows below).
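
A minimal shell sketch of the reproduction (hostname and file name are placeholders):

    # on the client; server.example.com runs RHEL4 with the default firewall
    scp user@server.example.com:/var/tmp/bigfile.iso .    # stalls partway through
    # on the server:
    service iptables stop
    # repeating the copy now completes
    scp user@server.example.com:/var/tmp/bigfile.iso .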
  
Actual results:
The connection stalls at X % of the transfer.

Expected results:
File transferred.

Additional info:

The bandwidth and the number of connections on the receiving system determine
how much data is transferred before the connection stalls. The stall happens
sooner when a system on a high-bandwidth network (e.g. a university or data
center) transfers to a system on a low-bandwidth network (ADSL/cable). With
high bandwidth on both ends it may take 1-3 GB of transfer, while on a
low-bandwidth link it happens within just 1-30 MB.

Some bugs in bugzilla have been found that could be related to this problem
(stalled connections), but none of the workarounds solved the problem. Those
bugs are:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=129204
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=126626
http://lwn.net/Articles/91976/

None of the workarounds described there solved the problem (tcp_win_scale,
tcp_moderate_rcvbuf, tcp_ecn, etc.).
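
For reference, these toggles map to sysctls roughly as follows (a sketch;
"tcp_win_scale" presumably refers to net.ipv4.tcp_window_scaling):

    sysctl -w net.ipv4.tcp_window_scaling=0
    sysctl -w net.ipv4.tcp_moderate_rcvbuf=0
    sysctl -w net.ipv4.tcp_ecn=0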

I found a similar case in
http://www.derkeiler.com/Mailing-Lists/securityfocus/Secure_Shell/2005-04/0053.html

The tests were made with different iptables configurations. Starting from the
default config file that system-config-securitylevel creates to allow incoming
SSH, I have narrowed the problem down. The default file is:

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT

This configuration shows the problem, but I've managed to make it work with an additional rule:

*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state INVALID -m tcp -p tcp --sport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
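
The same rule can also be inserted into the running firewall without editing
/etc/sysconfig/iptables; position 9 assumes the default rule order shown
above, so the rule lands just before the final REJECT:

    iptables -I RH-Firewall-1-INPUT 9 -m state --state INVALID \
        -m tcp -p tcp --sport 22 -j ACCEPT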

The new rule was added after studying some ethereal capture traces (one of them
is included as an attachment). It seems that some type of reply from the
destination system is not coming back properly through the firewall, even
though ICMP and RELATED connections are allowed.
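
One way to confirm that the firewall is what discards these replies is to log
packets in the INVALID state just before the final REJECT (a diagnostic
sketch, not part of the fix):

    iptables -I RH-Firewall-1-INPUT 9 -m state --state INVALID \
        -j LOG --log-prefix "INVALID-STATE: "
    tail -f /var/log/messages    # the discarded replies should show up here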

Although some packets keep flowing once the connection has stalled, no data
reaches the other side, no matter how long the stalled connection keeps
trying (an hour or more with no visible progress).

Comment 1 Jose Traver 2005-06-28 10:40:52 UTC
Created attachment 116051 [details]
ethereal capture of a stalled connection

Comment 2 Thomas Woerner 2005-06-28 10:58:46 UTC
Iptables is used to set up, maintain, and inspect the tables of IP packet filter
rules in the Linux kernel.

Assigning to kernel.


Comment 4 Thomas Graf 2008-06-13 20:49:38 UTC
Is this bug still occurring?

Comment 5 Thomas Graf 2008-11-03 12:55:10 UTC
I'm closing this bugzilla as there was no answer to my ping. Feel free to reopen the bug if the problem still occurs.

Comment 6 Chris 2014-02-20 17:17:53 UTC
For about 2 weeks now, we've been trying to figure out why we can't deliver files larger than 600 KB across our network before the transfer stalls. We experience the same symptoms testing with both FTP and SFTP. Using Wireshark, we can see that just prior to the stall we receive 40 to 60 duplicate ACKs. According to http://ssfnet.org/Exchange/tcp/tcpTutorialNotes.html (specifically the Congestion Control section), this implies that a packet was lost due to congestion, leaving the remaining packets delivered out of sequence. The multiple duplicate ACKs (>= 3; in our case 50 to 60) indicate that the packet was lost.
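
If it helps anyone reproduce the observation, the duplicate ACKs can be counted from a capture with a reasonably recent tshark (capture.pcap is a placeholder; older versions use -R instead of -Y):

    tshark -r capture.pcap -Y "tcp.analysis.duplicate_ack" | wc -l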

But here is where Linux stalls: nothing is retransmitted. After 1 to 2 minutes the connection sometimes resumes, but otherwise it stays dormant. We have firewall configurations that kill stale connections with no data flow (I realize I can eliminate this in SSH with keep-alives, but that doesn't solve the FTP case, which suffers the same fate).

The transfer takes place from our internal LAN across a small 1MB pipe (WAN) that connects us across the country to another end point. So according to the same article I posted above, this is exactly where congestion will occur. The duplicate ACKs tell the server to adjust its sliding window (sometimes by almost half) to accommodate the traffic.

When I disable iptables on the sending end (source), I'm able to transfer the file without a problem. In fact, the same 40 to 60 duplicate ACKs are received, but with iptables disabled Linux acknowledges them, immediately recovers, and delivers the file in sequence with virtually no interruptions (obviously some performance cost, but the transfer continues uninterrupted).

Then I stumbled across this bugzilla report, which was closed back in 2008 and which is EXACTLY the issue I'm experiencing. Effectively, adding this entry to my iptables rules (and restarting the firewall) resolves the problem:
-A INPUT -m state --state INVALID -m tcp -p tcp --dport 22 -j ACCEPT

The bug appears to have been closed because no one could solve the issue and things appeared to work for everyone else.

To further my testing, I delivered a file via SFTP to a server I have at home (with iptables enabled and without the above workaround mentioned in this bugzilla report). It delivered perfectly, without any issues at all.

I then stumbled across this:
http://www.experts-exchange.com/Software/Server_Software/Web_Servers/Apache/Q_27997038.html

The above article illustrates the EXACT same problem, but their solution was to disable TCP SACK instead of accepting the 'INVALID' duplicate ACKs.

So I tried this next, and it also worked perfectly for us in our production environment. The duplicate ACKs received were now acknowledged properly (without the firewall workaround).

As it turned out, the reason I was able to deliver files successfully to an SFTP server on the internet (as opposed to our WAN) was that our company's outside firewall (the last one before traffic leaves for the internet) disables Selective Acknowledgments (SACK) to mitigate a DoS exploit (http://www.iss.net/security_center/reference/vuln/TCP_Malformed_SACK_DoS.htm). It therefore strips the SACK options off the packets, which in turn keeps iptables from categorizing the duplicate ACKs as 'INVALID' packets.
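
If my reading of the connection-tracking behavior is right, there is also a conntrack knob that tolerates such out-of-window packets instead of flagging them INVALID. I have not tested it myself, so treat it as an assumption (the path depends on kernel version):

    # newer kernels (nf_conntrack):
    sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1
    # older 2.6 kernels (ip_conntrack):
    echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal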

This is currently happening to all of our Red Hat v5.8 and v6.4 servers deployed today. Since the ticket focuses on Red Hat 4, I have to assume the issue is still present in all 2.6 kernels in general (I'm not sure whether it spans other distributions or not).

The issue clearly has 4 workarounds when using iptables:
1. Disable SACK at the router level so the packets are never considered 'INVALID' and remain 'ESTABLISHED'. Then there is no reason why the following shouldn't work (SSH as an example below, /etc/sysconfig/iptables):
    -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
    -A INPUT -j DROP
2. Disable SACK at the kernel level on each source (sending) machine so it processes the duplicate SACKs as plain duplicate ACKs instead (see the persistence sketch after this list):
    echo 0 > /proc/sys/net/ipv4/tcp_sack
    # or:
    sysctl -w net.ipv4.tcp_sack=0
3. Using the workaround from this bugzilla, add the following to iptables to accept the SACK responses:
    -A INPUT -m state --state INVALID -m tcp -p tcp --dport 22 -j ACCEPT
    # or, even more dangerous, but covering FTP and all other transport protocols:
    -A INPUT -m state --state INVALID -m tcp -p tcp -j ACCEPT
4. Eliminate any possible source of network congestion on your network so duplicate ACKs are never issued.
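
To make workaround #2 survive reboots, the standard sysctl.conf mechanics apply (a sketch):

    # append to /etc/sysctl.conf:
    #   net.ipv4.tcp_sack = 0
    # then load it without rebooting:
    sysctl -p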

But I think the real answer is:
Update the kernel (specifically the iptables connection-tracking module) to handle TCP duplicate selective ACK packets. These should still flow in the 'ESTABLISHED' state, not 'INVALID'.

In our environment we will temporarily go with workaround #2 I defined above until this bug is resolved. I REALLY need to push for this to be backported to Red Hat v5 (as well as v6), since it affects many distributed systems across Canada right now running both versions. But in the meantime I'm content with my workaround.

Do you guys see any problems with workaround #2 for now? Perhaps you can offer a better solution I haven't stumbled across yet?