Bug 116982 - ip address flush deadlocks doing netlink communication over and over on 2.6 kernel
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: iproute
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Phil Knirsch
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks: FC2Target
 
Reported: 2004-02-27 00:39 UTC by Arkadiusz Miskiewicz
Modified: 2015-03-05 01:13 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-03-17 14:34:56 UTC
Type: ---
Embargoed:



Description Arkadiusz Miskiewicz 2004-02-27 00:39:29 UTC
On a 2.6 kernel with iproute-2.4.7-11:

ip a a 192.168.0.1/24 dev eth0
ip link set eth0 down
ip a flush dev eth0

Here on my vanilla 2.6.2 kernel the last command locks up and eats CPU:
it repeats the netlink communication over and over. This "hang" doesn't
happen when the interface is in the UP state. It also doesn't happen on
2.4 kernels.

This could also be a kernel bug...

Comment 1 Bill Nottingham 2004-03-02 04:19:38 UTC
What adapter?

Comment 2 Arkadiusz Miskiewicz 2004-03-02 10:19:19 UTC
Doesn't really matter (but it's a 3c905C-TX/TX-M using 3c59x and some 
RTL-8139/8139C/8139C+ using 8139too). It even happens on the lo device, 
so it should be easy to reproduce. Doesn't the "how to reproduce" recipe 
work for you?

Comment 3 Arkadiusz Miskiewicz 2004-03-04 01:57:16 UTC
It seems that in the for(;;) loop in ipaddr_list_or_flush(), the 
rtnl_dump_filter() function executes filter++ on each pass, so it 
never leaves that for(;;) loop.

One thing happens differently on 2.6 than on 2.4 kernels: 
rtnl_wilddump_request() sends a request with rth->dump == 1078365203 
but gets an answer with h->nlmsg_seq == 1078365202, and only on the 
next while (NLMSG_OK(h, status)) pass does it get the right one, 
h->nlmsg_seq == 1078365203.

On 2.4 it always gets the right reply. Maybe this has nothing to do 
with the problem, or maybe it does.
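
To make the sequence-number observation above concrete, here is a small toy model in C. It is only a sketch under stated assumptions: struct toy_msg and count_current_replies() are hypothetical names invented for illustration, not iproute2 or kernel code, and the model only shows one way a receiver could ignore a reply whose sequence number does not match the request it just sent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a netlink message header; only the
 * sequence number matters for this illustration. */
struct toy_msg {
    uint32_t seq;      /* plays the role of h->nlmsg_seq */
    int      payload;  /* placeholder for the message body */
};

/* Count the messages in a batch that answer the current request.
 * expected_seq plays the role of rth->dump from the comment above.
 * Messages carrying any other sequence number are skipped. */
size_t count_current_replies(const struct toy_msg *msgs, size_t n,
                             uint32_t expected_seq)
{
    size_t accepted = 0;

    for (size_t i = 0; i < n; i++) {
        if (msgs[i].seq != expected_seq)
            continue;  /* stale or foreign reply: ignore it */
        accepted++;
    }
    return accepted;
}
```

With the two sequence numbers reported above (a reply carrying 1078365202 followed by one carrying 1078365203, against an expected value of 1078365203), this model accepts only the second message.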

Comment 4 Arkadiusz Miskiewicz 2004-03-04 22:26:33 UTC
For now I've just limited the number of loop passes to 10k (a hack).

diff -urN iproute2.org/ip/ipaddress.c iproute2/ip/ipaddress.c
--- iproute2.org/ip/ipaddress.c 2004-03-04 23:00:41.050515248 +0100
+++ iproute2/ip/ipaddress.c     2004-03-04 23:08:08.810575433 +0100
@@ -603,7 +603,7 @@
                                fprintf(stderr, "Flush terminated\n");
                                exit(1);
                        }
-                       if (filter.flushed == 0) {
+                       if (filter.flushed == 0 || round > 10000) {
                                if (round == 0) {
                                        fprintf(stderr, "Nothing to flush.\n");
                                } else if (show_stats)


Comment 5 Arkadiusz Miskiewicz 2004-03-17 14:34:56 UTC
Fixed in kernel
http://oss.sgi.com/projects/netdev/archive/2004-03/msg00190.html

