Bug 642388

Summary: ip_nat_ftp not working if ack for "227 Enter Passive mode" packet is lost
Product: Red Hat Enterprise Linux 5 Reporter: Veaceslav Falico <vfalico>
Component: kernelAssignee: Thomas Graf <tgraf>
Status: CLOSED ERRATA QA Contact: Network QE <network-qe>
Severity: high Docs Contact:
Priority: medium    
Version: 5.5CC: awestbro, bturner, cye, dennisml, dtian, fleite, herrold, hjia, peterm, qcai, rkhan, rom
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 676900 (view as bug list) Environment:
Last Closed: 2011-07-21 10:22:55 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
potential fix
none
Wireshark Packet Capture Screenshots (censored)
none
proposed patch none

Description Veaceslav Falico 2010-10-12 18:46:57 UTC
Description of problem:
When using ip_nat_ftp and ip_conntrack_ftp for handling passive ftp from under the NAT, we can hit a problem when the ACK for the "227 Entered passive mode" is lost. The server then retransmits the 227 response, whilst the NAT (i.e. ip_nat_ftp & ip_conntrack_ftp) server also waits for the ACK to activate the passive port, but the client thinks that it's already ok (the ack is sent) and opens the passive connection, which is dropped because the NAT server is still avaiting the ACK on 227 to open it.

Version-Release number of selected component (if applicable):
2.6.18-194.el5


How reproducible:
Set up a classic client <-> NAT <-> server scheme with ip_nat_ftp and conntrack, initiate a passive connection and artificially drop the ACK packet for 227 response going from client to server.
  
Actual results:
passive connection gets dropped

Expected results:
connects ok

Additional info:
A sample tcpdump from the problem happening

10:10:05.627403 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [S], seq 1841846563, win 65535, options [mss 1396,nop,nop,sackOK], length 0
E..0H.@.y..>.Z.....';[..m.Y#....p...`p.....t....
10:10:05.628201 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [S.], seq 100264103, ack 1841846564, win 5840, options [mss 1460,nop,nop,sackOK], length 0
E..0..@.>..5...'.Z....;[....m.Y$p...Z...........
10:10:05.651358 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [.], ack 100264104, win 65535, length 0
E..(H.@.y..E.Z.....';[..m.Y$....P....B........
10:10:05.654678 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264104:100264180, ack 1841846564, win 5840, length 76
E..t..@.>..d...'.Z....;[....m.Y$P...`...220 Interchange Services Batch FTP server IDA4673 built Jun 10 2003 ready.^M

10:10:05.684963 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [P.], seq 1841846564:1841846579, ack 100264180, win 65459, length 15
E..7H.@.y..5.Z.....';[..m.Y$....P....x..USER somebody^M

10:10:05.687278 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [.], ack 1841846579, win 5840, length 0
E..(..@.>......'.Z....;[....m.Y3P.......
10:10:05.687718 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264180:100264216, ack 1841846579, win 5840, length 36
E..L..@.>......'.Z....;[....m.Y3P.......331 Password required for somebody^M

10:10:05.714154 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [P.], seq 1841846579:1841846594, ack 100264216, win 65423, length 15
E..7H.@.y..4.Z.....';[..m.Y3....P....u..PASS somepass^M

10:10:05.735699 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264216:100264246, ack 1841846594, win 5840, length 30
E..F..@.>......'.Z....;[....m.YBP....)..230 User logged in, proceed.^M

10:10:05.778650 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [P.], seq 1841846594:1841846607, ack 100264246, win 65393, length 13
E..5H.@.y..5.Z.....';[..m.YB...6P..qhI..CWD receive^M

10:10:05.781739 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264246:100264284, ack 1841846607, win 5840, length 38
E..N..@.>......'.Z....;[...6m.YOP.......250 "receive" is current directory. ^M

10:10:05.810278 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [P.], seq 1841846607:1841846612, ack 100264284, win 65355, length 5
E..-H.@.y..<.Z.....';[..m.YO...\P..K....PWD^M
.
10:10:05.812883 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264284:100264321, ack 1841846612, win 5840, length 37
E..M..@.>......'.Z....;[...\m.YTP.......257 "receive" is current directory.^M

10:10:05.850630 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [P.], seq 1841846612:1841846618, ack 100264321, win 65318, length 6
E...H.@.y..:.Z.....';[..m.YT....P..&.b..PASV^M

10:10:05.852183 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264321:100264372, ack 1841846618, win 5840, length 51
E..[..@.>..q...'.Z....;[....m.YZP.......227 Entered Passive Mode (198,133,250,39,149,253)^M

10:10:06.086458 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [P.], seq 100264321:100264371, ack 1841846618, win 5840, length 50
E..Z..@.>..p...'.Z....;[....m.YZP.......227 Entered Passive Mode (172,16,86,104,149,253)^M

10:10:06.086992 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [R], seq 1841846618, win 0, length 0
E..(..@......Z.....';[..m.YZ....P....?........
10:10:06.099616 IP 204.90.147.139.15195 > 198.133.250.39.21: Flags [.], ack 100264372, win 65267, length 0
E..(H.@.y..>.Z.....';[..m.YZ....P.............
10:10:06.101233 IP 198.133.250.39.21 > 204.90.147.139.15195: Flags [R], seq 100264372, win 0, length 0                                         
E..(..@.>..=...'.Z....;[........P...d/..


As you can see, the server can't receive ack for pachet 100264372, and tries to re-send it, whilst the client alredy tries to send data to the server.

Comment 1 Fabio Olive Leite 2010-10-13 17:13:10 UTC
I spent some hours looking at the code in the ip_nat_ftp and ip_conntrack_ftp modules, and I don't think it was designed with the idea of retransmissions of mangled packets in mind.

I don't really get why is it that the second PASV packet goes by un-mangled, is it because earlier in the call chain the generic connection tracking code detects it has gone over that SEQ number already?

Can we use that information to somehow replay some cached contents? or re-mangle the packet contents but being careful to NOT re-do the other side-effects, such as affecting the SEQ number offset or the related connection expectation.

Comment 3 Thomas Graf 2010-10-13 21:24:15 UTC
The problem is that once the 227 packet has been processed by ftp conntrack. The seq number is updated in update_nl_seq(). When the same sequence number is processed again it will fail the "are we directly after a '\n' check" and the packet will be accepted without mangling.

At least that's what I think is happening.

Comment 4 Thomas Graf 2010-10-13 21:27:58 UTC
This bug affects any packets being retransmitted, e.g. a retransmitted "PORT" packet would also cause problems.

Comment 6 Fabio Olive Leite 2010-10-13 21:54:32 UTC
Actually, I think I finally understand that find_nl/update_nl code, and the call to update_nl_seq() at the end of help(). It is supposed to cache the SKB that was mangled, so that the conntrack code can eventually just replay it if needed, by calling ip_conntrack_event_cache(). Perhaps this is what is failing.

So apparently the module mangles the SKB and caches it for future replay, but this cache fails to be used later on?

I checked and we do build our kernels with CONFIG_IP_NF_CONNTRACK_EVENTS, so the code should be there.

Comment 7 Thomas Graf 2010-10-13 22:07:07 UTC
Created attachment 453327 [details]
potential fix

This patch may fix the problem. update_nl_seq() contains a bug in RHEL5 which leads to not always updating the older of the two remembered sequence numbers.

This patch fixes this and ensures that the "smallest" of the two sequence numbers being remembered is overwritten by the latest sequence to be expected. This should allow for a retransmitted packet to be accepted because the old sequence number is still around. Every packet matching either of the two remebered sequence numbers will be accepted and mangled.

Comment 9 Fabio Olive Leite 2010-10-14 07:24:38 UTC
I just finished setting up a network of 3 KVM guests, where a client guest connects to a router guest which NATs the FTP port to a server guest. Then on the NAT router I apply 5% of packet drops with the netem qdisc. The NATing is subtle, from 192.168.122.101 to .102, but at least it is easy to follow on a capture at the virtual bridge.

After a few minutes manually testing, I have a failed PASV response, although it does not quite seem to be the same situation:

569.147952       client -> router       FTP Request: PASV
569.148427       router -> server       FTP Request: PASV
569.148847       server -> router       FTP Response: 227 Entering Passive Mode (192,168,122,101,250,120)
569.149032       router -> client       FTP Response: 227 Entering Passive Mode (192,168,122,102,250,120)
569.151381       client -> router       FTP Request: LIST
569.151706       router -> server       FTP Request: LIST
569.151881       server -> router       FTP Response: 150 Here comes the directory listing.
569.152055       server -> router       FTP Response: 226 Directory send OK.
569.152218       router -> client       FTP [TCP Previous segment lost] Response: 226 Directory send OK.
569.355627       server -> router       FTP [TCP Retransmission] Response: 150 Here comes the directory listing.
569.355805       router -> client       FTP [TCP Retransmission] Response: 150 Here comes the directory listing.

579.692039       client -> router       FTP Request: PASV
579.692281       router -> server       FTP Request: PASV
579.692774       server -> router       FTP Response: 227 Entering Passive Mode (192,168,122,101,176,156)
579.693025       router -> client       FTP Response: 227 Entering Passive Mode (192,168,122,101,176,156)

[ Notice that on the packets above there was no NAT of the 227 contents ]

579.696357       client -> router       FTP Request: LIST
579.696513       router -> server       FTP Request: LIST
579.696841       server -> router       FTP Response: 425 Security: Bad IP connecting.
579.697238       router -> client       FTP Response: 425 Security: Bad IP connecting.

On this capture I was ignoring pure-ack packets, just to trim the packet flow and see if I could actually reproduce something similar. Now that I know we're almost there I'll start looking at all packets, so that we're sure to spot the drops and retransmissions and see about reproducing the exact situation.

Cheers,
Fábio Olivé

Comment 10 Fabio Olive Leite 2010-10-14 07:43:12 UTC
370.826272  client -> router  FTP Request: PASV
370.826429  router -> server  FTP Request: PASV
370.826796  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,132,156)

371.027328  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,132,156)

371.027499  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,101,132,156)

I believe this was a valid reproduction. The router dropped the first mangled 227 response, so 200ms later the server retransmitted it, and it went to the client as .101.

I'll see about testing your patched kernel ASAP.

Comment 11 Thomas Graf 2010-10-14 09:45:27 UTC
That is a valid reproduction.

Comment 12 Thomas Graf 2010-10-15 08:35:50 UTC
My patch does not fix the original problem completely. A retransmitted 227 is still not translated. It does solve the connection abortion issues in my 50% packet loss testing setup though.

Comment 14 Thomas Graf 2010-10-15 13:12:49 UTC
It seems the patch fixes the issue after all. I can no longer reproduce the problem. I've run hundreds of ftp session with 50% of simulated packet loss and even if the same packet is lost multiple times the session recovers and all transfers completed successfully.

Can you confirm Fabio?

With an ftpd running on 192.168.101.197 behind 192.168.100.130:

60	35.916735	192.168.100.130	192.168.100.212	FTP
Response: 227 Entering Passive Mode (192,168,100,130,139,218)

65	36.119986	192.168.100.130	192.168.100.212	FTP
[TCP Retransmission]
Response: 227 Entering Passive Mode (192,168,100,130,139,218)

Comment 15 Fabio Olive Leite 2010-10-15 20:25:13 UTC
Hi Thomas!

Indeed, I have been unable to cause the issue even at 10% loss. One case was particularly nasty as both PASV and 227 and their ACKs were lost and retransmitted, and I never got any errors.

 54.914910  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,87,125)
 55.116488  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,87,125)
 55.116728  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,102,87,125)

213.049225  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,161,242)
213.255426  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,161,242)
213.255699  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,102,161,242)

261.787753  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,76,222)
261.992034  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,76,222)
261.992223  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,102,76,222)

283.440550  client -> router  FTP Request: PASV
283.440698  router -> server  FTP Request: PASV
283.441184  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,145,169)
283.645299  client -> router  FTP [TCP Retransmission] Request: PASV
283.645472  router -> server  FTP [TCP Retransmission] Request: PASV
283.645943  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,145,169)
283.646160  router -> client  FTP [TCP Out-Of-Order] Response: 227 Entering Passive Mode (192,168,122,102,145,169)

414.987642  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,45,176)
415.192745  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,45,176)
415.192983  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,102,45,176)

698.482338  server -> router  FTP Response: 227 Entering Passive Mode (192,168,122,101,29,230)
698.687647  server -> router  FTP [TCP Retransmission] Response: 227 Entering Passive Mode (192,168,122,101,29,230)
698.687877  router -> client  FTP Response: 227 Entering Passive Mode (192,168,122,102,29,230)

Thanks!
Fábio Olivé

Comment 24 Fred Wittekind IV 2011-01-28 21:52:09 UTC
Not sure if this belongs in it's own bug report or not, but I think it's definitely related, as I was able to reproduce the bug without packet loss.

5 Centos 5 machines.
xenhost1 running vfw, and www2
xenhost2 running www3

www2 and www3 both have private IP addresses, and use vfw as a firewall.
vfw has two virtual NICs bridged to two different VLANS, one used as internal network, one used as a external/public network.
All Xen networking is bridging to VLANs on a physical NIC

If I run wireshark on vfw, and do a FTP from a outside host to www3, all works as expected, no retransmissions, no CRC errors, no packet loss. FTP client get's passed the external IP address of www2, and all works as it should.

If I run wireshark on vfw, and do a FTP from a outside host to www2, I get CRC errors on every single packet that originates from www2 (packets from outside host have correct CRC), and I get retransmissions with the PASV handshake, tripping the above noted bug where retransmitted packet doesn't get mangled like it should be.  No packet loss (verified with a second wireshark on the host running the ftp client).  FTP client get's passed the internal IP address of www2, and fails.

Despite the CRC errors between vfw & www2, any traffic that does not need to be mangled by ip_nat_ftp works just fine.  I am guessing that ip_nat_ftp is causing the retransmission of the PASV handshake due to the CRC errors, in effect self generating the case that triggers the bug discussed above (previous notes).

So...  Should the CRC error trigger ip_nat_ftp to cause a retransmission be a separate bug report, and just consider this as documentation of this bug being triggered by another bug and not packet loss? Or is it enough related to handle it in the bug report?

From the research I've done looks like I do have a work around available to me by running the following command on www2 (turning off tcp checksum offload):
ethtool -K eth0 tx off

I have not yet tried the workaround though.

Comment 25 RHEL Program Management 2011-02-04 10:50:28 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 26 Fabio Olive Leite 2011-02-09 21:44:18 UTC
Hi Fred,

(In reply to comment #24)
> 
> If I run wireshark on vfw, and do a FTP from a outside host to www2, I get CRC
> errors on every single packet that originates from www2 (packets from outside
> host have correct CRC),

This is because you are capturing on the same physical host and those packets will only have their checksums calculated when they leave the box through the NIC, as "your ethtool -K eth0 tx off" workaround also implies. It is normal for wireshark to complain about those, since it cannot know if those are incoming or outgoing packets. You can disable TCP checksum verification if you want to silence those warnings.

> and I get retransmissions with the PASV handshake,
> tripping the above noted bug where retransmitted packet doesn't get mangled
> like it should be.  No packet loss (verified with a second wireshark on the
> host running the ftp client).

Hmm I believe you are indeed having packet loss, or at least some form of delay that is large enough to trigger a retransmission.

How many copies of the PASV packets do you see on the client?

If you only see one, and you see two on the host/firewall capture, then it means one of them got dropped somewhere on the network. If you see two, then it likely means the client ACK got dropped or delayed, thus also causing a retransmission from the server.

Hope this helps!

Fábio Olivé

Comment 27 Fred Wittekind IV 2011-02-10 18:03:25 UTC
Created attachment 478095 [details]
Wireshark Packet Capture Screenshots (censored)

Comment 28 Fred Wittekind IV 2011-02-10 18:18:36 UTC
(In reply to comment #26)
-- cut --
> This is because you are capturing on the same physical host and those packets
> will only have their checksums calculated when they leave the box through the
> NIC, as "your ethtool -K eth0 tx off" workaround also implies. It is normal for
> wireshark to complain about those, since it cannot know if those are incoming
> or outgoing packets. You can disable TCP checksum verification if you want to
> silence those warnings.

Understood.  The incorrect checksums where the only difference I could find that may have caused the retransmission.
 
-- cut --
> 
> Hmm I believe you are indeed having packet loss, or at least some form of delay
> that is large enough to trigger a retransmission.
> 
> How many copies of the PASV packets do you see on the client?
> 
> If you only see one, and you see two on the host/firewall capture, then it
> means one of them got dropped somewhere on the network. If you see two, then it
> likely means the client ACK got dropped or delayed, thus also causing a
> retransmission from the server.

I think the packet capture I attached will show that there was no packet loss, packet count on vfw1 and the ftp client match exactly.  As for delayed, I don't really think so either, unless it is a delay caused by ip_nat_ftp.  This packet capture is 100% repeatable with ftp connections to www2, and the retransmission only happens on the PASV handshake (every time).  Packet loss / delay should make where the retransmission happens random, not repeatable.

This ftp issue also did not start until vfw1 and www2 where on the same physical host, and the change of physical host was the only thing changed.

Somehow I think ip_nat_ftp is being tripped up by the CRC issue, and this would explain why the retransmission only happens on the PASV handshake.

Comment 29 Fabio Olive Leite 2011-02-10 22:51:20 UTC
Olá!

(In reply to comment #28)
> 
> I think the packet capture I attached will show that there was no packet loss,
> packet count on vfw1 and the ftp client match exactly.  As for delayed, I don't
> really think so either, unless it is a delay caused by ip_nat_ftp.  This packet
> capture is 100% repeatable with ftp connections to www2, and the retransmission
> only happens on the PASV handshake (every time).  Packet loss / delay should
> make where the retransmission happens random, not repeatable.

You are correct. This is a fairly interesting new bug to investigate.

> Somehow I think ip_nat_ftp is being tripped up by the CRC issue, and this would
> explain why the retransmission only happens on the PASV handshake.

There must be some bit of interaction with the virtualization setup.

Perhaps the packet is mangled and set for checksum recalculation, and this is delayed to be done by the NIC, but since it is targeted to a VM on the same host it does not leave through the NIC and is sent to the VM directly without having it recalculated? I guess it would be dropped in this case? Wild guess, of course... :) It could be confirmed with disabling TX checksum offload, as in this case the VMs would [re]calculate the cksums themselves.

I believe it may be interesting to open a new bug for this investigation, or clone this one to preserve the history and context.

Fábio Olivé

Comment 32 Thomas Graf 2011-03-15 10:32:39 UTC
I can't reproduce this with kvm so far. Can you please report this issue via its own bugzilla? This is definitely a separate issue.

Comment 33 Thomas Graf 2011-03-15 11:42:59 UTC
Created attachment 484442 [details]
proposed patch

Comment 34 Fred Wittekind IV 2011-03-15 12:57:06 UTC
(In reply to comment #32)
> I can't reproduce this with kvm so far. Can you please report this issue via
> its own bugzilla? This is definitely a separate issue.

Already did.
https://bugzilla.redhat.com/show_bug.cgi?id=676900

Comment 36 Jarod Wilson 2011-03-23 21:43:43 UTC
Patch(es) available in kernel-2.6.18-250.el5
Detailed testing feedback is always welcomed.

Comment 40 errata-xmlrpc 2011-07-21 10:22:55 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

Comment 41 Dennis Jacobfeuerborn 2012-03-05 16:53:43 UTC
After this patch ip_nat_ftp seems to break FTP connections.
See: https://bugzilla.redhat.com/show_bug.cgi?id=740399

Removing ip_nat_ftp from the iptables-config files makes the problem go away.