485903 – [RHEL5] Netfilter modules unloading hangs

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Jiri Pirko
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	485904 533192 600215

Reported:	2009-02-17 11:34 UTC by Tomas Smetana
Modified:	2018-11-14 18:30 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Calling the "service iptables stop" command causes the iptables init script to unload the netfilter modules. Because a clean-up code path was not taken, an endless loop occurred, which resulted in the init script becoming unresponsive. This update ensures that the clean-up code path is correctly taken, with the result that stopping the iptables service now works as expected.
Clone Of:
Clones:	485904
Environment:
Last Closed:	2011-01-13 20:46:00 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
reproducer script (276 bytes, text/plain) 2009-02-20 16:20 UTC, Mike Gahagan

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:0017	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update	2011-01-13 10:37:42 UTC

Description Tomas Smetana 2009-02-17 11:34:30 UTC

Description of problem:
The unloading of netfilter modules (triggered by e.g. service iptables stop) may hang under certain circumstances.  Please see the reproducer.

Version-Release number of selected component (if applicable):
kernel-2.6.18-128.el5

How reproducible:
always

Steps to Reproduce:
1. set up iptables:

iptables -F
iptables -X
iptables -A OUTPUT -d 192.168.122.254/255.255.255.0 -o eth0 -p tcp -m state --state NEW -m tcp --dport 7365 -j ACCEPT

The 192.168.122.254 host should not exist.

2. run the following script (note that timing matters -- running the commands by hand may not reproduce the problem)

#!/bin/sh
ping 192.168.122.254 -c1 -w1
arp -d 192.168.122.254
/etc/init.d/iptables stop

3. observe the results
  
Actual results:
the initscript never finishes:

PING 192.168.122.254 (192.168.122.254) 56(84) bytes of data.

--- 192.168.122.254 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms

Flushing firewall rules:                                   [  OK  ]
Setting chains to policy ACCEPT: filter                    [  OK  ]
Unloading iptables modules:

... and nothing else. The output of ps -ef looks like this:

root     16344 16272 99 12:13 pts/1    00:06:18 modprobe -r xt_state

Expected results:
clean module unload

Additional info:
Adding a sleep before the arp command in the script prevents the problem.  Also, there are several patches in the upstream kernel that look related:
http://www.mail-archive.com/git-commits-head@vger.kernel.org/msg14393.html
http://www.mail-archive.com/git-commits-head@vger.kernel.org/msg07687.html

Comment 1 Tomas Smetana 2009-02-17 11:36:25 UTC

The kernel is spinning in the ip_conntrack_cleanup() function:

i_see_dead_people:
       ip_conntrack_flush();
       if (atomic_read(&ip_conntrack_count) != 0) {
               schedule();
               goto i_see_dead_people;
       }

where ip_conntrack_count never reaches zero.
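
The spin can be pictured with a small user-space model. This is only a sketch, not kernel code: `model_ct`, `model_add`, `model_put`, `model_flush` and `model_cleanup` are hypothetical names, and the `max_passes` bound is an assumption added so that the model terminates where the kernel loop would not. It also assumes that a flush drops only the reference held by the conntrack table, so a reference held anywhere else keeps the count above zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked connection; not the kernel struct. */
struct model_ct {
    int  use;       /* reference count, playing the role of nfct->use */
    bool in_table;  /* true while the conntrack table holds a reference */
};

/* Plays the role of ip_conntrack_count. */
int model_count;

/* Register an entry: one reference for the table plus extra_refs others. */
void model_add(struct model_ct *ct, int extra_refs)
{
    ct->use = 1 + extra_refs;
    ct->in_table = true;
    model_count++;
}

/* Drop one reference; the entry is destroyed (and counted out) at zero. */
void model_put(struct model_ct *ct)
{
    assert(ct->use > 0);
    if (--ct->use == 0)
        model_count--;
}

/* Flush: take every entry out of the table and drop the table's reference. */
void model_flush(struct model_ct *cts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cts[i].in_table) {
            cts[i].in_table = false;
            model_put(&cts[i]);
        }
    }
}

/*
 * Bounded version of the cleanup loop. Returns the number of passes needed
 * to reach a zero count, or -1 if the count is still non-zero after
 * max_passes. The kernel loop has no such bound and keeps retrying.
 */
int model_cleanup(struct model_ct *cts, size_t n, int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        model_flush(cts, n);
        if (model_count == 0)
            return pass;
        /* the kernel calls schedule() here and jumps back to the flush */
    }
    return -1;
}
```

With an entry added via `model_add(&ct, 0)` the cleanup finishes on the first pass. With `model_add(&ct, 1)` the extra reference is never dropped by the flush, so `model_cleanup` gives up with -1, which corresponds to the endless `goto` in the real code.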

Comment 2 Mike Gahagan 2009-02-20 16:20:22 UTC

Created attachment 332728 [details]
reproducer script

Updated reproducer script; adds a route to the non-existent host.

Comment 3 Tomas Smetana 2009-04-16 07:46:18 UTC

Just a note: I have tried to backport the patches that concern the RCU usage in netfilter and looked "suspicious" to me.  The problem is that the netfilter code has changed quite a lot in recent upstream releases, so any backport I made was either risky or incomplete. I was shooting more or less blindly, and the patches I tried simply didn't work for me.

Please let me know if you made any progress on this.

Comment 4 Jiri Pirko 2009-04-17 15:05:17 UTC

Digging into this, the problem seems to be a missing (or unreached) nf_conntrack_put somewhere: the reference count for ct never counts down to 1, and therefore nfct->destroy() (where ip_conntrack_count is decremented) is never called. That's the reason for the looping at "goto i_see_dead_people;" (atomic_read(&ip_conntrack_count) == 1 all the time).

when I do this:
 ping 192.168.122.254 -c1 -w1
+sleep 1
 arp -d 192.168.122.254

it does not hang. I'll dig into this more...

Comment 8 Jason D. Clinton 2009-08-03 18:42:39 UTC

Has anyone come up with a work-around? As it stands, system-config-securitylevel cannot complete. Normal shutdown is also problematic.

Comment 14 Jiri Pirko 2010-02-09 14:06:11 UTC

I found out the following. With eth0 uninitialized, the reproducer script does not hang. After bringing it up with "ifconfig eth0 up" and running the reproducer again, it still does not hang. Once I assign an IP address with "ifconfig eth0 10.0.0.1 netmask 255.255.255.0" and run the reproducer, the hang occurs. Tested with kernel 2.6.18-187.el5.

Neil, would you please look at this? Thanks

Comment 15 Jiri Pirko 2010-02-18 15:34:36 UTC

The issue that we are seeing (looping in ip_conntrack_cleanup) happens because ip_conntrack_count never reaches 0. That's because one instance of ip_conntrack is never freed by ip_conntrack_free() (ip_conntrack_count is decremented there).

ip_conntrack_free() is called from destroy_conntrack() and it is called from nf_conntrack_put() once refcount (&nfct->use) reaches zero.

Looking at this with printks in the appropriate places, when reproducing with "ping 192.168.122.254 -c1 -w1 || sleep x.y && arp -d 192.168.122.254", the mentioned refcount goes up and down (1-4) for ~1 sec and then stays still for ~10 secs. When "sleep x.y" is long enough, the refcount makes it to 1 before the "arp part" is called. If the "arp part" is called earlier (sleep < ~1 s), the refcount stays >1 and then (after ~10 secs) the corresponding ip_conntrack is not freed. In other words, the "arp part" stops the refcount from changing.

The problem in "arp part" happens somewhere in neigh_update() function called from arp_req_delete(). Still not sure where exactly or why...

Comment 16 Neil Horman 2010-02-18 15:59:50 UTC

I think my comments from bz 485904 are still valid. I made a mistake in the names of the proc files though; it's nf_conntrack and nf_conntrack_expect you want to examine before and after the hang.  My expectation is that we're seeing something get on the expect list, holding a reference, but never transition to the nf_conntrack list, so it never gets cleaned up.  That's likely what we need to look at.

Comment 17 Jiri Pirko 2010-02-19 14:08:25 UTC

Hm, I do not see these files there:

# ls /proc/net/netfilter/
nf_log  nf_queue

Searching manually in other likely directories, I cannot find them either.

Comment 18 Neil Horman 2010-02-19 14:36:48 UTC

/proc/net/nf_conntrack and /proc/net/nf_conntrack_expect

Comment 19 Jiri Pirko 2010-02-24 14:04:18 UTC

OK, I found a fix. The problem was indeed in neigh_update(): the timer was deleted but the references were not put. The following upstream commit fixes this:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=5ef12d98a19254ee5dc851bd83e214b43ec1f725
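
One way to picture a leak of this kind: a pending timer is what would eventually release the reference that the neighbour entry holds, so code that cancels the timer has to take over that duty. The user-space sketch below is a simplified model and not the upstream patch. `model_obj`, `model_neigh`, `obj_put` and the `neigh_model_*` functions are hypothetical names, and the model assumes that the timer's expiry handler is the only other place where the held reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted object (think: a conntrack entry). */
struct model_obj {
    int refs;
};

/* Hypothetical neighbour entry whose timer guards one held reference. */
struct model_neigh {
    bool timer_pending;
    struct model_obj *held;   /* released when the timer expires */
};

void obj_put(struct model_obj *o)
{
    assert(o->refs > 0);
    o->refs--;
}

/* Start resolution: take a reference on the object and arm the timer. */
void neigh_model_start(struct model_neigh *n, struct model_obj *o)
{
    o->refs++;
    n->held = o;
    n->timer_pending = true;
}

/* Normal path: the timer fires and releases what the entry holds. */
void neigh_model_timer_expire(struct model_neigh *n)
{
    if (!n->timer_pending)
        return;               /* a cancelled timer never runs */
    n->timer_pending = false;
    if (n->held) {
        obj_put(n->held);
        n->held = NULL;
    }
}

/* Buggy delete: cancels the timer, so the release above never happens. */
void neigh_model_delete_buggy(struct model_neigh *n)
{
    n->timer_pending = false;
}

/* Fixed delete: cancelling the timer also releases the held reference. */
void neigh_model_delete_fixed(struct model_neigh *n)
{
    n->timer_pending = false;
    if (n->held) {
        obj_put(n->held);
        n->held = NULL;
    }
}
```

In this model, calling `neigh_model_delete_buggy` before the timer expires leaves the object's reference count one higher than it started, which matches the observation that deleting the ARP entry early freezes the refcount, while waiting for the timer (the `sleep` in the reproducer) does not.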

Comment 23 RHEL Program Management 2010-05-20 12:46:11 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 25 Jenett Tillotson 2010-05-24 19:23:34 UTC

We are having issues with this bug as well. We are using RHEL 5 on a distributed cluster spread across 5 states. The remoteness of several of the cluster pieces makes reliable reboots a priority. Can this patch be incorporated ASAP?

Comment 27 Jarod Wilson 2010-05-25 21:10:20 UTC

in kernel-2.6.18-200.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 30 chaot_s 2010-06-17 13:55:44 UTC

kernel 203 from http://people.redhat.com/jwilson/el5/203.el5/i686/kernel-2.6.18-203.el5.i686.rpm works for me. Please note that when installing http://people.redhat.com/jwilson/el5/203.el5/i386/kernel-headers-2.6.18-203.el5.i386.rpm I get the following error:

[root@hostname ~]# uname -a
Linux hostname.domain.tld 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:09:10 EDT 2010 i686 athlon i386 GNU/Linux
[root@hostname ~]# rpm -ihv kernel-headers-2.6.18-203.el5.i386.rpm
Preparing...                ########################################### [100%]
        file /usr/include/linux/gfs2_ondisk.h from install of kernel-headers-2.6.18-203.el5.i386 conflicts with file from package kernel-headers-2.6.18-194.3.1.el5.i386
        file /usr/include/linux/taskstats.h from install of kernel-headers-2.6.18-203.el5.i386 conflicts with file from package kernel-headers-2.6.18-194.3.1.el5.i386
[root@hostname ~]#

Before the 203 kernel, reloading iptables hung while unloading the netfilter modules; with the 203 kernel it works just fine.

Comment 32 Douglas Silas 2010-06-28 20:48:29 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Calling the "service iptables stop" command causes the iptables init script to unload the netfilter modules. Because a clean-up code path was not taken, an endless loop occurred, which resulted in the init script becoming unresponsive. This update ensures that the clean-up code path is correctly taken, with the result that stopping the iptables service now works as expected.

Comment 36 masanari iida 2010-08-13 05:07:36 UTC

If I am not mistaken, this one was fixed in 2.6.18-194.6.1.
* Mon Jun 07 2010 Jiri Pirko  [2.6.18-194.6.1.el5]
- [net] neigh: fix state transitions via Netlink request (Jiri Pirko) [600215 485903]

And 2.6.18-194.11.1 was released on 10 August.

I can see BZ#600215 in the list at the following URL.
http://www.redhat.com/docs/en-US/errata/RHSA-2010-0504/Kernel_Security_Update/index.html

So if someone from RH confirm the release, set this BZ status to CLOSED.
Thanks

Comment 37 Issue Tracker 2010-08-13 05:14:26 UTC

Event posted on 08-13-2010 02:14pm JST by tumeya

> So if someone from RH confirm the release, set this BZ status to CLOSED.
BZ600215 addressed EUS delivery for this bug. It got pushed out on July
1st btw. 
This BZ, bz485903, however must stay open until its delivery on 5.6.0. 


This event sent from IssueTracker by tumeya 
 issue 261512

Comment 38 Eryu Guan 2010-11-02 03:03:09 UTC

verified by job https://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=178158
see case /kernel/errata/5.5.z/600215-netfilter

Comment 40 errata-xmlrpc 2011-01-13 20:46:00 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

Comment 41 Shi jin 2011-01-28 14:43:00 UTC

I am still having exactly the same problem after upgrading to RHEL-5.6 and the 2.6.18-238.1.1.el5 kernel.

Comment 42 Jiri Pirko 2011-01-28 15:33:03 UTC

(In reply to comment #41)
> I am still having exactly the same problem after upgrading to RHEL-5.6 and the
> 2.6.18-238.1.1.el5 kernel.

That's most probably a different issue which looks alike. Would you please file a new bug with reproducing steps? Thanks.
