Bug 1016739

Summary: ipt_MASQUERADE failing to maintain TCP-connection
Product: [Fedora] Fedora
Reporter: Jari Turkia <redhat-bugzilla>
Component: kernel
Assignee: fedora-kernel-networking
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 20
CC: gansalmon, itamar, jerry, jonathan, kernel-maint, madhu.chinakonda, marcelo.barbosa, michele, psabata, redhat-bugzilla, twoerner
Target Milestone: ---
Flags: jforbes: needinfo?
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-12-10 15:01:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jari Turkia 2013-10-08 15:23:20 UTC
Description of problem:
The rule
iptables -t nat -A POSTROUTING -o em1 -j MASQUERADE
does not work. Replacing it with the following rule fixes the problem:
iptables -t nat -A POSTROUTING -o em1 -j SNAT --to-source <the-outside-IP>

With the MASQUERADE target, NAT almost works, but it fails to maintain TCP connections.


Version-Release number of selected component (if applicable):
kernel-3.11.3-201.fc19.x86_64
iptables-1.4.18-1.fc19.x86_64


How reproducible:
Easily, always.


Steps to Reproduce:
1. Configure iptables to NAT with the MASQUERADE target.
2. On an outside computer, create a file of at least 1 MiB.
3. On an inside (NATed) computer, try to transfer the file from the outside computer.
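
A minimal sketch of step 2, for anyone trying to reproduce this. The helper name make_test_file and the 4 MiB size are my own choices, not part of the report; any file over 1 MiB should do. The transfer in step 3 is left as a comment because the report does not say which tool or host was used.

```shell
#!/bin/sh
# Hypothetical helper: create a file of N MiB filled with random bytes.
#   $1 = output path
#   $2 = size in MiB
make_test_file() {
    # head -c reads exactly the requested number of bytes.
    head -c "$(( $2 * 1024 * 1024 ))" /dev/urandom > "$1"
}

# Step 2, on the outside computer (path and size are assumptions):
#   make_test_file /tmp/masq-test.bin 4
#
# Step 3, on the inside (NATed) computer, fetch the file with any
# TCP-based tool and watch whether the transfer stalls.
```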

Actual results:
Transfer fails after a short period.

Expected results:
Functioning transfer.

Additional info:
The fix is to use SNAT-target instead of MASQUERADE. It works fully.

Comment 1 Jari Turkia 2013-10-09 05:53:53 UTC
(In reply to Jari Turkia from comment #0)
> Additional info:
> The fix is to use SNAT-target instead of MASQUERADE. It works fully.

I wrote about this in detail on my blog. See http://blog.hqcodeshop.fi/archives/119-Bug-in-Linux-3.11-Netfilter-MASQUERADE-target-does-not-work-anymore.html

Comment 2 Thomas Woerner 2013-10-09 10:21:28 UTC
This is not an iptables problem. Reassigning to kernel.

Comment 3 Justin M. Forbes 2014-01-03 22:10:21 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 4 Jari Turkia 2014-01-04 09:22:00 UTC
Yes, I still experience this issue on FC19 kernel versions 3.12.5-200.fc19.x86_64 and 3.12.6-200.fc19.x86_64. Since nothing has changed, SNAT still works properly.

Comment 5 Michele Baldessari 2014-01-04 14:47:03 UTC
Hi Jari,

interesting one. I'll be back home next week where I can look into this a bit
more. Likely a regression with 3.11.X. Do you know which version of the
kernel this used to work with?

There were a few potentially impacting changes since 3.7:
* 1eb4f75 - (2013-07-10 19:45:39 -0700)  ipv6: in case of link failure remove route directly instead of letting it expire <Hannes Fre
* 75a493e - (2013-07-02 12:44:18 -0700)  ipv6: ip6_append_data_mtu did not care about pmtudisc and frag_size <Hannes Frederic Sowa>
* c65ef8d - (2012-12-16 23:28:30 +0100)  netfilter: nf_nat: Also handle non-ESTABLISHED routing changes in MASQUERADE <Andrew Collins
* a0ecb85 - (2012-12-03 15:14:20 +0100)  netfilter: nf_nat: Handle routing changes in MASQUERADE target <Jozsef Kadlecsik>

Can't yet say they are related though.

thanks,
Michele

Comment 6 Jari Turkia 2014-01-04 15:48:40 UTC
The issue is not obvious enough that one would notice it immediately. My environment also has IPv6 (no NAT) and an HTTP proxy, and on top of that any short NATed connection works as expected.

My best guess is that it never worked on Fedora 19. My previous install was Fedora 17 and I had no problems there. That's also where I "inherited" the iptables settings which had the MASQUERADE in them.

Comment 7 Justin M. Forbes 2014-03-10 14:49:31 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.13.5-100.fc19.  Please test this kernel update and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 8 Jari Turkia 2014-03-11 07:15:27 UTC
On my 3.13.5-103.fc19.x86_64 the situation is even worse. On 3.11.3-201.fc19.x86_64 the transfer would last ~1 MiB and then hang, on a good day even 3 MiB. Now I'm struggling to get to 100 KiB before the hang occurs.

Still, using SNAT solves the issue. For testing I transferred 100 MiB without problems. However, SNAT requires an IP address in the rule to function. My Internet connection uses DHCP, so any address change requires manual reconfiguration. A MASQUERADE rule has no such address requirement and is better suited to dynamic IP addresses.
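
As a rough illustration of what that manual reconfiguration involves, the sketch below reads the current IPv4 address of the outside interface and prints the matching SNAT rule. The function names first_ipv4 and snat_rule are made up for this example, and the parsing assumes the one-line output format of iproute2's "ip -4 -o addr show". The rule is only printed, not applied.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -4 -o addr show dev IFACE` output on stdin
# and print the first IPv4 address without its prefix length.
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Hypothetical helper: print (do not execute) the SNAT rule.
#   $1 = outbound interface, $2 = source address
snat_rule() {
    printf 'iptables -t nat -A POSTROUTING -o %s -j SNAT --to-source %s\n' "$1" "$2"
}

# Usage on the router (requires iproute2; em1 is the interface from this report):
#   addr=$(ip -4 -o addr show dev em1 | first_ipv4)
#   snat_rule em1 "$addr"
```

This would have to be rerun whenever the DHCP lease brings a new address, which is exactly the maintenance burden MASQUERADE is meant to avoid.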

Comment 9 Jari Turkia 2014-03-27 07:41:15 UTC
This same bug exists in Fedora 20 kernel 3.13.6-200.fc20.x86_64.

Comment 10 Justin M. Forbes 2014-05-21 19:29:52 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.14.4-100.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 11 Jari Turkia 2014-05-22 14:47:31 UTC
Updated this to Fedora 20

Comment 12 Jerry C 2014-06-09 20:17:36 UTC
I had a chance to troubleshoot this scenario with someone else.

They were running a really odd MTU and blocking all ICMP.

In their case the reason for the odd MTU was unknown. After everything was set to 1500, the bug could no longer be reproduced.

I suspect it's somehow related to the RELATED state with MASQUERADE, but I did not troubleshoot further since the user's MTU was the bigger issue.
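
For anyone wanting to rule out the MTU angle, one possible check is a ping with the Don't Fragment bit set and a payload sized to the expected MTU. This is only a sketch: the function names are invented here, the "-M do" option is iputils ping syntax, and 192.0.2.1 is a placeholder address. The command is printed rather than run.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print an iputils ping command that forbids fragmentation (-M do).
#   $1 = MTU to test, $2 = target host (placeholder in the example below)
pmtu_probe_cmd() {
    printf 'ping -M do -c 3 -s %s %s\n' "$(icmp_payload_for_mtu "$1")" "$2"
}

pmtu_probe_cmd 1500 192.0.2.1
```

If a probe of this size fails while smaller ones succeed, the path MTU is below the value tested. With all ICMP blocked, as in the setup described above, the sender would not learn that from the network.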

Comment 13 Justin M. Forbes 2014-11-13 16:02:45 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.17.2-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 14 Justin M. Forbes 2014-12-10 15:01:57 UTC
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.