Bug 143374
Summary: | netfilter NAT/masquerade/SNAT with 2.6 IPSEC broken
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 3
Hardware: | i686
OS: | Linux
Status: | CLOSED DEFERRED
Severity: | medium
Priority: | medium
Reporter: | Trevor Cordes <trevor>
Assignee: | David Miller <davem>
CC: | alex, davej, intrep, tcarter, wtogami
Doc Type: | Bug Fix
Last Closed: | 2005-06-02 03:56:23 UTC
Description
Trevor Cordes
2004-12-20 02:22:16 UTC
The patches in netfilter patch-o-matic are not up to date and don't work with the 2.6.9 kernel. The author(s) already have new 2.6.9-compatible patches which they will email on request. I suppose someone could attach them here if desired. I'm not sure of the etiquette so I won't do it yet, but I do have copies of the patches if someone wants them. I can report that the new patches go into the FC3 2.6.9-1.681_FC3 kernel source RPMs without a hitch. Everything compiles perfectly and I have tested the new (fixed) ipsec/nat functionality on 3 boxes and it works exactly as it should. It's going on 2 production boxes tomorrow and I will report back if there are any problems in a heavily loaded production environment. Since the tests were all flawless I'm not expecting any problems.

Status update: It's been 1 week with the patched kernel on 3 production boxes doing constant ipsec traffic and zero problems. If the patches need posting here then just let me know and I'll put them up after confirming it's ok with the author.

Please post the patch. Or a link to the patch. I'd like to get IPSEC+NAT working and have been unable to do so. Thanks for doing the work to track down the problem!

I appear to have found the patches on the netfilter-devel list. Since they are publicly available at this point through a little searching I'll just post a link to them:
https://lists.netfilter.org/pipermail/netfilter-devel/attachments/20041025/6b59a066/patches.tar.bin
They are actually .tar.gz, but the list software renamed them. They appear to apply properly to the stock kernel-2.6.9-1.681_FC3 .src.rpm. Enjoy.

I had the same problem with IPSec+NAT and have been running 2.6.9-1.681_FC + the 2.6.9 IPSec+NAT patches from the netfilter-devel list. I just tried to upgrade to 2.6.10-1.737_FC3 to fix the uselib vulnerability, but the patches don't seem to apply cleanly to the new kernel srpm.

Darn, I was afraid of that. So we're stuck at .681 for now.
We must press the netfilter guys to get this in the mainstream kernel asap!

An updated patch (for 2.6.10) has been posted to the netfilter-devel list:
http://lists.netfilter.org/pipermail/netfilter-devel/2005-January/017961.html
http://lists.netfilter.org/pipermail/netfilter-devel/attachments/20050104/db17e25f/ipsec-nat-2.6.10-0001.obj
Ignore the .obj extension; it's a plain text diff. It applies cleanly to 2.6.10-1.737, but building on FC3 (with gcc version 3.4.2 20041017) fails with the following errors:

    net/ipv4/ip_output.c: In function `ip_build_and_send_pkt':
    net/ipv4/ip_output.c:130: sorry, unimplemented: inlining failed in call to 'ip_dst_output': function body not available
    net/ipv4/ip_output.c:253: sorry, unimplemented: called from here
    make[2]: *** [net/ipv4/ip_output.o] Error 1
    make[1]: *** [net/ipv4] Error 2
    make: *** [net] Error 2

It seems to build fine on FC2 (gcc version 3.3.3 20040412) though. I haven't tested the rebuilt kernel yet.

Inlining failures are trivial to fix: find the function it's complaining about and move it above the function that calls it.

Created attachment 109808: Fix inline function compile errors for IPSec+NAT patch

The attached patch fixes the inlining error with the 2.6.10 IPSec+NAT patch from netfilter-devel; apply it after the netfilter-devel patch.

I've rebuilt the 2.6.10-1.741_FC3 kernel rpm with the netfilter-devel patch and the attached inlining fix. My modified RPM & SRPM are available at http://www.noggin.com.au/rpms/kernel/ if anyone wants to use them. I've tested that it boots, but I can't reboot my firewall just now, so I haven't tested the actual IPSec+NAT functionality yet.

After testing I found that the 2.6.10 patch from netfilter-devel doesn't seem to work for me, but this set (posted to netfilter-devel by Robert Dahlem) does. The inline patch is still needed to compile cleanly on FC3, and applies cleanly after this set of patches.
http://lists.netfilter.org/pipermail/netfilter-devel/2005-January/017963.html
http://lists.netfilter.org/pipermail/netfilter-devel/attachments/20050105/b7588a58/NEED_REVIEW_netfilter-ipsec-patches-linux-2.6.10.tar.obj
The .tar.obj is actually a .tgz. I have also updated my RPMS (http://www.noggin.com.au/rpms/kernel) with the new patches and they seem to be working fine on my firewall so far.

Has the Dahlem patch been stable on your system since your last report? I'm about to recompile the latest 766 kernel with this patch to have a go and would like some confirmation of suitability before I move it to semi-production. To update my own record from before, my patched 681 (see previous notes) has worked 100% on 3 production boxes for 2 months now. We'll see how the latest 2.6.10-compatible patches work out. I'm still crossing my fingers for "real" kernel integration of these patches and thus into FC3. Does anyone know HOW I'd know when this occurs? Any chance of it getting in FC4?

I haven't had any problems with 741 + the patch Robert Dahlem posted to netfilter-devel (see comment #10). I've been running it on my primary firewall for around 45 days now. I seem to recall reading somewhere in the netfilter archives that the fix was expected to be merged for the 2.6.11 kernel release, but I don't see anything in the 2.6.11 changelog.

Yes, it looks ok. I compiled it and am running it on 4 production boxes with good success so far. Anyone dealing with this issue is likely to face bug 145507, 145773 and friends as well when they switch to 2.6.10+.

Does anyone know if this bug is fixed in the recent 2.6.11 FC3 kernel release? How would I find out?

Doesn't look like it.
It doesn't seem like the fix has made it into the mainline kernel yet, because a patch for 2.6.11 has been posted to the netfilter-devel list:
http://lists.netfilter.org/pipermail/netfilter-devel/2005-March/018672.html
There also don't seem to be any references to IPSec, NAT or netfilter patches in the 2.6.11-1.14 kernel rpm spec file:
http://cvs.fedora.redhat.com/viewcvs/rpms/kernel/FC-3/kernel-2.6.spec?only_with_tag=kernel-2_6_11-1_14_FC3&view=markup

The fix isn't going into the mainline kernel because, as discussed on netdev.com, there are many problems with the approach taken by those patches. It is very unwise for us to put these patches into the tree, as the upstream version of this fix will be very different, and the patch being discussed is not even being maintained actively by its original author any longer. I know this is a huge pain for people, but there simply isn't a good netfilter solution for IPSEC in the 2.6.x kernel yet. People just need to be patient while a correct solution is worked out. Thanks.

For those that are stuck, you may want to try using GRE tunnels within IPSEC transport. It isn't a perfect fix for this issue, but has allowed me to work around the issue I was having. It also allows NAT to work within the tunnel. I hope this helps!

Comment #16: Do you have specific links or message-ids from netdev.com or search terms I could use? I tried to find the discussions but all I found was a handful of messages from Jan 2004 about this problem specifically. So the stumbling block is an architectural issue, not a practical or conceptual issue? It makes 100% sense to me that you can slap ipsec on a connection (or take it out) and have that connection behave 100% the same as before except with encryption. It makes no sense to me that it should be any other way. What version of the kernel will we have to wait for to see a fix for this? Hopefully not 5 years from now.
Comment #17: Would your suggestion entail rewriting all the iptables NAT rules and ipsec forwarding rules that I have set up, or would it be a transparent drop-in to make NAT over ipsec work? I am loath to rewrite my insanely complex rule set again -- I once had a working free-swan setup and spent days converting 1000 rules to the new 2.6 native ipsec. The interface way of doing it was nice and simple but free-swan was so finicky and prone to network lockout upon any mistake. Native ipsec holds so much promise -- the idea is so clean! -- if only NAT would work with it!!! Looks like in the meantime I'll be hacking in whatever patches I can find and maintaining my own kernel (ugh). If anyone else sees this and is in my boat, please email me and let's pool resources to try to keep our kernels up to date with the FC errata releases and this patch.

Update: I have successfully patched the latest kernel-2.6.12-1.1372_FC3 kernel with the netfilter PoM patches related to ipsec+nat. It actually isn't that hard; send email if you need help. I have deployed the patched kernel to 2 test boxes that are ipsec+nat'd and so far everything works 100% perfectly! Initial ipsec+nat tests with VNC & ssh are working great. So this is a viable option for us ipsec+nat'ers that want to use the latest kernel. Hopefully some long-term mainstream-kernel solution will be found soon.

Are those patches "the old patches" or "the new patches"? "The old patches" were abandoned some time ago; they are not maintained anymore and will never become part of the official upstream kernel (they were rejected by the kernel maintainers a long time ago). There are some folks attempting to keep them alive, but nothing official. "The new patches" are being actively developed, but they are not simple and require major kernel surgery. I've also got a reply on the Netfilter developer mailing list that they are working night and day on a new set of patches, but the solution is very complex and hard to implement.
See also bug #165359. According to David Miller, the bug (most likely) will never be fixed in RHEL4, since the changes might simply be too big (breaking compatibility with 3rd party binary-only device drivers). Fedora Core, with its more volatile nature, is probably going to have better luck, and will probably get patched kernels much sooner (if not as a patch for FC4, then probably in the first major release after the problem is fixed in the upstream kernel). It might be possible to run patched kernels on RHEL4 (from Fedora Core for example), unless you need 3rd party device drivers (that work with RHEL4 kernels only).

Also, as some people suggested, GRE tunnels work just fine as a workaround. I've made several setups like that. Setting up a GRE tunnel is extremely simple (much simpler than IPSec tunneling). Then you simply set up IPSec only between the tunnel endpoints. Theoretically, all you need is IPSec transport mode (since GRE is taking care of tunneling). However, the bug discussed here also affects transport mode and makes it unusable in combination with Netfilter. So you need workaround number two, and "emulate" transport mode using tunnel mode.

For example, you want to tunnel between 10.1.0.0/16 and 10.2.0.0/16 over a public network. I'll give the configuration on one side; the other side is mirrored. Replace the string "some_name" with whatever you want the tunnel device to be called (say the other end is in Moose Jaw, Saskatchewan, you might choose to call it mjaw0).

    # modprobe ip_gre
    # ip tunnel add some_name mode gre local 1.2.3.4 remote 4.3.2.1 ttl 255
    # ip link set some_name up
    # ip addr add 10.255.255.1 peer 10.255.255.2 dev some_name
    # ip route add 10.2.0.0/16 via 10.255.255.2

This gives you a cleartext tunnel. You use standard routing to get things into the tunnel (as shown in the last command).
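
Since only one side of the GRE configuration is spelled out above, here is a minimal dry-run sketch of how both ends might be generated. It only prints the commands and runs nothing. The helper name `gre_commands` is hypothetical (it is not part of iproute2), and the side B values are an assumption based on reading "the other side is mirrored" as: swap the public addresses, swap the tunnel addresses, and route the opposite /16.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the GRE setup commands for one end.
# gre_commands is a hypothetical helper, not an iproute2 tool.
# Arguments: DEV LOCAL_PUBLIC REMOTE_PUBLIC LOCAL_TUN_ADDR PEER_TUN_ADDR REMOTE_NET
gre_commands() {
    dev=$1 local_pub=$2 remote_pub=$3 local_tun=$4 peer_tun=$5 remote_net=$6
    echo "modprobe ip_gre"
    echo "ip tunnel add $dev mode gre local $local_pub remote $remote_pub ttl 255"
    echo "ip link set $dev up"
    echo "ip addr add $local_tun peer $peer_tun dev $dev"
    echo "ip route add $remote_net via $peer_tun"
}

# Side A: the configuration given in the comment above.
gre_commands some_name 1.2.3.4 4.3.2.1 10.255.255.1 10.255.255.2 10.2.0.0/16

# Side B (assumed mirror): addresses swapped, route points back at 10.1.0.0/16.
gre_commands some_name 4.3.2.1 1.2.3.4 10.255.255.2 10.255.255.1 10.1.0.0/16
```

Printing instead of executing lets you review the commands for each box before running them as root.
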
Theoretically, we could add encryption by defining an IPSec policy like this (don't try it; it works on its own, but not in combination with Netfilter):

    # setkey -c <<EOF
    spdadd 1.2.3.4 4.3.2.1 any -P out ipsec esp/transport//require;
    spdadd 4.3.2.1 1.2.3.4 any -P in ipsec esp/transport//require;
    EOF

However, because of the bug discussed here (and in 165359), this isn't going to play nicely with Netfilter. So, instead of the above, one would use workaround #2 and define the IPSec policy like this:

    # setkey -c <<EOF
    spdadd 1.2.3.4 4.3.2.1 any -P out ipsec esp/tunnel/1.2.3.4-4.3.2.1/require;
    spdadd 4.3.2.1 1.2.3.4 any -P in ipsec esp/tunnel/4.3.2.1-1.2.3.4/require;
    EOF

Tunnel mode defined that way gives exactly the same functionality as transport mode (with a small encapsulation overhead). Note that the above defines a policy that enforces encryption only between 1.2.3.4 and 4.3.2.1. Encryption is not enforced between the local networks by IPSec policy! But since tunneled packets are encapsulated, they will get encrypted. Also, to get things into the tunnel, you don't use IPSec policy anymore. You simply route into the tunnel. You even have a network device (called "some_name" in the above examples) that you can use for both routing and writing Netfilter firewall rules (for an outgoing packet, you'll first see it on device "some_name" and then you'll see the GRE packet on eth0, or ppp0, or whatever your external interface is). Much like userland VPN solutions that use tun* devices. I kind of like this approach more, since by simply typing "ip route show" I know exactly where the packets are going to end up (there's no hidden, behind-the-scenes implied IPSec routing that doesn't show anywhere in the routing tables, and which is sometimes a pain in the butt to get to play nicely with existing real routes).

Now, the problem is, the above workarounds work nicely between two Linux boxes. They should also work between a Linux box and anything else that supports both GRE and IPSec (or they might not).
It isn't going to work between a Linux box and those small Linksys VPN boxes (since they do only IPSec). It might work between a Linux box and a real Cisco router (I had no means of testing that).

Interesting GRE setup ideas there, Alex. Unfortunately, the pain for me to convert all my iptables rules (1000+) back to a virtual interface paradigm is much greater than simply maintaining my own patched kernel for a while until a fix is in the mainstream kernel. I'm only using FC (not RHEL) so I can easily migrate to new kernels as they appear. The patches I'm using successfully (now with 1376) are almost certainly the "old" ones, as the code looks much like it did when I used the older 2.6.10 patches. I got them straight from the PoM set on the netfilter sites. I am heartened to hear that the teams are actively working on this to get it in the mainstream. I'm posting these updates mostly for the people who need this functionality NOW and want some options, so they know it is doable.
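
For anyone trying the GRE route, the tunnel-mode policy from Alex's comment (workaround #2) has to be loaded on both ends with the addresses swapped. The following is a minimal dry-run sketch that prints the two `spdadd` lines for a given end. The helper name `spd_tunnel_policy` is hypothetical, and the side B output is an assumption that the policy mirrors exactly as the thread implies. Keying (manual SAs or an IKE daemon) is not covered here because the thread doesn't describe it.

```shell
#!/bin/sh
# Dry-run sketch: print the tunnel-mode SPD entries for one end of the link.
# spd_tunnel_policy is a hypothetical helper; it only emits text in the
# setkey syntax quoted in this thread and does not touch the kernel SPD.
# Arguments: LOCAL_PUBLIC REMOTE_PUBLIC
spd_tunnel_policy() {
    local_pub=$1 remote_pub=$2
    echo "spdadd $local_pub $remote_pub any -P out ipsec esp/tunnel/$local_pub-$remote_pub/require;"
    echo "spdadd $remote_pub $local_pub any -P in ipsec esp/tunnel/$remote_pub-$local_pub/require;"
}

# Side A (1.2.3.4), matching the policy shown in the thread.
spd_tunnel_policy 1.2.3.4 4.3.2.1

# Side B (4.3.2.1), assumed to be the same policy with the addresses swapped.
spd_tunnel_policy 4.3.2.1 1.2.3.4
```

After reviewing the output, one end's lines could be fed to `setkey -c` on standard input as root, the same way the heredoc examples in the thread do.
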