Bug 182012
| Summary: | iptables --state ESTABLISHED rule occasionally misses packets, leads to spurious rejects | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Jeff Brown <jbrown> |
| Component: | kernel | Assignee: | Thomas Graf <tgraf> |
| Status: | CLOSED CANTFIX | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.0 | CC: | davem, jbaron, rkhan, smithj4, ssnodgra |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | i386 | ||
| OS: | Linux | ||
| URL: | http://www-cse.ucsd.edu/~jbrown/reset/ | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2008-11-03 12:57:49 UTC | Type: | --- |
Description
Jeff Brown
2006-02-19 01:53:39 UTC
(Possibly related to bug #112709)

This is a netfilter kernel problem, not an iptables userland problem. Assigning to kernel.

I am seeing the exact same problem here. Since we use stateful iptables firewall rules on a lot of our servers and this is causing a lot of hung connection problems, I have asked our Red Hat Network representative to open a formal support ticket.

Jeff, if you're still fighting this, take a look at bug #191336 and see if it sounds like it might explain your problem. The only issue with that bug is that it requires a 5-minute idle period at some point. I'd also be interested to know what your conntrack entry looks like after one of the random drops. In particular, does the number of packets match what you've seen in the session, or does the conntrack entry appear to have been destroyed at some point?

It looks like there may be a few separate bugs affecting different people. In our case, after some more testing, we discovered that the problem we are having is the tcp_sack-related connection tracking bug mentioned in this netfilter mailing list post, affecting kernels <= 2.6.11: https://lists.netfilter.org/pipermail/netfilter/2005-June/061101.html

Disabling tcp_sack fixes the problem for us, although this is not a desirable solution for servers that handle a lot of network traffic or suffer from a lot of loss, since it increases the retransmits necessary to recover from any packet loss.

Are you still experiencing this problem? I don't know if recent RHEL4 kernels still exhibit this bug.
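For readers who want to try the tcp_sack workaround described above, here is a minimal sketch of checking and clearing the setting from Python. It assumes the standard Linux procfs entry `/proc/sys/net/ipv4/tcp_sack` (the file behind the `net.ipv4.tcp_sack` sysctl); the helper names `tcp_sack_enabled` and `disable_tcp_sack` are hypothetical and not taken from this report.

```python
from pathlib import Path

# Assumption: standard procfs location of the net.ipv4.tcp_sack sysctl.
TCP_SACK_PATH = Path("/proc/sys/net/ipv4/tcp_sack")


def tcp_sack_enabled(path: Path = TCP_SACK_PATH) -> bool:
    """Return True unless the sysctl file contains "0"."""
    return path.read_text().strip() != "0"


def disable_tcp_sack(path: Path = TCP_SACK_PATH) -> None:
    """Write "0" to the sysctl file. Needs root on a real system,
    and the change does not survive a reboot."""
    path.write_text("0\n")
```

On a live machine, `disable_tcp_sack()` only changes the running kernel's setting. Making it persistent would normally mean adding `net.ipv4.tcp_sack = 0` to `/etc/sysctl.conf`, with the retransmit cost noted above.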
We long ago worked around this bug by restructuring our firewall rules on production machines, changing them from the form:

- accept ESTABLISHED
- accept inbound NEW to ports x,y,z
- reject others

...to the form:

- accept ESTABLISHED
- accept inbound to ports x,y,z
- reject others

With the "NEW" qualifier removed, when the "ESTABLISHED" test misses a packet for an existing connection, the per-port accept rule still allows it, so we no longer encounter spurious rejects. It's tricky to test conclusively whether the bug has been fixed, since it was sporadic and we never had a particular workload that would reliably reproduce it. Sorry.

I'm closing this bugzilla because I can't reproduce it myself and there is no one left to reproduce it either.
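The effect of dropping the "NEW" qualifier in the workaround above can be illustrated with a small first-match model of the two rule sets. This is only a sketch, not netfilter's actual logic. The port set `{22, 80, 443}` is a hypothetical stand-in for "x,y,z", and it assumes the mistracked packet is given some state other than ESTABLISHED or NEW (labelled `INVALID` here), which the report does not confirm.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Packet:
    dport: int
    state: str  # state assigned by connection tracking


@dataclass(frozen=True)
class Rule:
    target: str                              # "ACCEPT" or "REJECT"
    state: Optional[str] = None              # None matches any state
    dports: Optional[FrozenSet[int]] = None  # None matches any port

    def matches(self, pkt: Packet) -> bool:
        if self.state is not None and pkt.state != self.state:
            return False
        if self.dports is not None and pkt.dport not in self.dports:
            return False
        return True


def verdict(rules: List[Rule], pkt: Packet) -> str:
    """Return the target of the first matching rule (first match wins)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.target
    return "ACCEPT"  # hypothetical default policy; unreachable with the rule sets below


PORTS = frozenset({22, 80, 443})  # hypothetical stand-in for "x,y,z"

ORIGINAL = [
    Rule("ACCEPT", state="ESTABLISHED"),
    Rule("ACCEPT", state="NEW", dports=PORTS),
    Rule("REJECT"),
]

WORKAROUND = [
    Rule("ACCEPT", state="ESTABLISHED"),
    Rule("ACCEPT", dports=PORTS),  # no state qualifier
    Rule("REJECT"),
]

if __name__ == "__main__":
    # A mid-connection packet that the ESTABLISHED test failed to recognise.
    missed = Packet(dport=80, state="INVALID")
    print("original:  ", verdict(ORIGINAL, missed))
    print("workaround:", verdict(WORKAROUND, missed))
```

In this model the original rule set rejects the missed packet while the workaround accepts it. The trade-off is that the workaround accepts any packet addressed to the listed ports regardless of its tracked state, so those ports are effectively no longer filtered statefully.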