From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc3 Firefox/1.0.6

Description of problem:
when attempting to start or restart iptables it hangs on modprobe -r ipt_state. This appears similar to bug 112630 for FC3.

# ps aux |grep modprobe
root     22389 99.9  0.0  2048  412 pts/2    R    12:07  18:51 modprobe -r ipt_state
root     22646  0.0  0.0  2608  404 ?        S<   12:10   0:00 /sbin/modprobe -q -- ipt_state
root     22700  0.0  0.0  1760  412 pts/3    D    12:13   0:00 modprobe -r iptable_filter
root     22754  0.0  0.0  2408  396 pts/2    S    12:13   0:00 modprobe -r ip_tables
root     22808  0.0  0.0  2368  396 pts/2    S+   12:24   0:00 modprobe -r ip_tables

Version-Release number of selected component (if applicable):
iptables-1.2.11-3.1.RHEL4
kernel-smp-2.6.9-5.EL

How reproducible:
Didn't try

Steps to Reproduce:
1. service iptables restart

Actual Results: Hangs

Expected Results: program restarts

Additional info:
This is a kernel problem, not a userland problem. As an interim solution you could disable module unload in /etc/sysconfig/iptables-config. But please remember that you have to unload the modules to get to a sane state for a restart or stop.
This is the "nf_reset()" problem that was fixed recently upstream. We ended up iterating through 3 different versions of the fix because the first two variants introduced regressions of various kinds, in bridging netfilter and elsewhere, so we have to be careful to apply the correct final fix. I'm going to attach the two relevant patches to this bugzilla report. The first is the incorrect fix, and the second is the later patch which fixes things up correctly. James, you should be able to take these two diffs and consolidate them into the RHEL4 tree quite readily.
Created attachment 117685
First part of fix
Created attachment 117686
Second part of fix
(In reply to comment #3)
> This is the "nf_reset()" problem that was fixed recently upstream.
> We ended up iterating through 3 different versions of the fix
> because the first two variants introduced regressions of various
> kinds, in bridging netfilter and elsewhere, so we have to be careful
> to apply the correct final fix.
>
> I'm going to attach the two relevant patches to this bugzilla
> report. The first is the incorrect fix, and the second is the
> later patch which fixes things up correctly.

Am I right in thinking that we don't need the second patch, as nf_reset() is not called in ip_output_finish2() in the 2.6.9 kernel? i.e. there should be no reference holding bug for bridging there.
I think I have a handle on things now. The final change is to add nf_reset() calls to net/packet/af_packet.c; I'll attach that shortly. But so much other stuff has changed, as exemplified by this conflict you discovered, that I am still not certain that this fixes the reported bug. James, please help out by doing some more research in this area, thanks.
Created attachment 117776
nf_reset() additions to af_packet.c
(In reply to comment #8)
> Created an attachment (id=117776)
> nf_reset() additions to af_packet.c
>

Dave, do you know if there was ever a reproducible test case for this?

I can't reproduce it with the following:
+ hitting the web server with apachebench from another system
+ running tcpdump on the target
+ running a loop of adding/flush/modprobe -r on the target

The script I'm running on the target is:

while true; do
    iptables -A INPUT -m state --state NEW,ESTABLISHED -j ACCEPT
    iptables -A OUTPUT -m state --state NEW,ESTABLISHED -j ACCEPT
    sleep 1
    iptables -L -v
    iptables -F
    modprobe -r ipt_state
done

That should trigger the bug, eventually, right?
I don't know what a test case would look like, sorry. Patrick McHardy is the one who worked on fixing the bug, perhaps you can ask him.
I was able to reproduce the problem by opening a packet socket and calling select() on it (but not recvmsg()) to cause packets to be queued to it (thanks to Patrick for help there). Your patch in #8 fixes the problem.
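For anyone who wants to try the same approach, the reproduction described above might be sketched roughly as follows. This is not the program that was actually used; the choice of SOCK_RAW and ETH_P_ALL, the function names, and the polling interval are all assumptions made for illustration. The only idea taken from the comment is to open a packet socket and call select() on it without ever reading, so that received packets stay queued on the socket.

```python
import select
import socket
import time

ETH_P_ALL = 0x0003  # assumption: capture every protocol (value from <linux/if_ether.h>)


def open_packet_socket():
    """Open a raw packet socket (Linux only, needs root or CAP_NET_RAW)."""
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))


def hold_without_reading(sock, seconds, interval=0.1):
    """Poll sock with select() for `seconds` but never read from it.

    Anything that arrives stays queued on the socket. Returns the number
    of polls that reported the socket as readable.
    """
    deadline = time.monotonic() + seconds
    readable_polls = 0
    while time.monotonic() < deadline:
        ready, _, _ = select.select([sock], [], [], interval)
        if ready:
            readable_polls += 1
            # The socket stays readable because nothing drains it,
            # so sleep to avoid a busy loop.
            time.sleep(interval)
    return readable_polls
```

A possible way to use it: as root, call hold_without_reading(open_packet_socket(), 60) in one terminal while there is some network traffic, then run "service iptables restart" in another and see whether modprobe -r hangs.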
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0132.html
I have this problem under RHEL4.3 with kernel-2.6.9-34ELsmp, which is the kernel package supplied by the above errata. The machine is a dual Xeon installed as RHEL4.3. I have several other identical (hardware and OS) machines which do not exhibit the behaviour. It is reproducible every time on the affected system: after a reboot, the problem occurs on the first iptables restart, and I must reboot again to get rid of it.
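When comparing an affected machine with the unaffected ones, a snapshot of the netfilter module states may be worth collecting. The sketch below is only a suggestion and was not part of the original investigation: it parses the standard /proc/modules format (name, size, use count, users, state, address), and the function names are hypothetical. It may not pinpoint the cause, since a hang during unload does not have to show up as a module use count.

```python
def parse_proc_modules(text):
    """Map module name -> (use_count, users, state) from /proc/modules text."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip lines that do not match the expected layout
        name, _size, refcount, users, state = fields[:5]
        # The use count is "-" when the kernel cannot unload modules.
        count = int(refcount) if refcount.isdigit() else None
        # Users are comma-separated with a trailing comma, or "-" if none.
        holders = [u for u in users.split(",") if u and u != "-"]
        modules[name] = (count, holders, state)
    return modules


def module_snapshot(names, path="/proc/modules"):
    """Return the parsed entries for the given module names, if loaded."""
    with open(path) as fh:
        modules = parse_proc_modules(fh.read())
    return {n: modules[n] for n in names if n in modules}
```

For example, module_snapshot(["ipt_state", "ip_conntrack", "iptable_filter", "ip_tables"]) could be run before and during a hung restart and the two results attached to a report.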
Just in case nobody got it - that means the problem isn't fixed.
(In reply to comment #17)
> Just in case nobody got it - that means the problem isn't fixed.
>

Please open a new bugzilla with full details on how to reproduce the problem you're seeing.