Bug 116982 - ip address flush deadlocks doing netlink communication over and over on 2.6 kernel
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: iproute
Version: rawhide
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Phil Knirsch
QA Contact: Brock Organ
Depends On:
Blocks: FC2Target
Reported: 2004-02-26 19:39 EST by Arkadiusz Miskiewicz
Modified: 2015-03-04 20:13 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-03-17 09:34:56 EST


Description Arkadiusz Miskiewicz 2004-02-26 19:39:29 EST
With iproute-2.4.7-11 on a 2.6 kernel:

ip a a 192.168.0.1/24 dev eth0
ip link set eth0 down
ip a flush dev eth0

Here on my vanilla 2.6.2 the last command locks up, eating CPU: it
does the netlink communication over and over. This "hang" doesn't
happen when the interface is in the UP state. It also doesn't happen
on 2.4 kernels.

This could also be a kernel bug...
Comment 1 Bill Nottingham 2004-03-01 23:19:38 EST
What adapter?
Comment 2 Arkadiusz Miskiewicz 2004-03-02 05:19:19 EST
It doesn't really matter (but it's a 3c905C-TX/TX-M using 3c59x and
some RTL-8139/8139C/8139C+ cards using 8139too). It even happens on
the lo device, so it should be easy to reproduce. Doesn't the "how to
reproduce" recipe work for you?
Comment 3 Arkadiusz Miskiewicz 2004-03-03 20:57:16 EST
It seems that inside the for(;;) loop in ipaddr_list_or_flush(), the
rtnl_dump_filter() function executes filter++ on every pass, so it
never leaves that for(;;) loop.
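
The loop behaviour described above can be modelled in isolation. This is only a sketch, not the iproute2 source: struct flush_filter, dump_draining(), dump_stuck(), flush_loop() and the max_rounds bound are all hypothetical names made up for the illustration, and the bound exists only so that the model terminates.

```c
#include <assert.h>

/* Hypothetical stand-in for the filter state consulted by the flush loop. */
struct flush_filter {
    int flushed;    /* addresses found (and queued for deletion) in this pass */
};

/* Hypothetical dump callback: one pass over the address list. */
typedef void (*dump_fn)(struct flush_filter *f, void *state);

/* Well-behaved dump: the addresses are reported once, then they are gone. */
void dump_draining(struct flush_filter *f, void *state)
{
    int *remaining = state;

    f->flushed += *remaining;
    *remaining = 0;
}

/* Misbehaving dump: the same address shows up again on every pass. */
void dump_stuck(struct flush_filter *f, void *state)
{
    (void)state;
    f->flushed++;
}

/*
 * Model of the for(;;) loop. The only normal exit is a pass in which the
 * counter stays at zero. Returns the number of rounds that flushed
 * something, or -1 once max_rounds such rounds have gone by (in the real
 * tool, without a bound, that case would spin forever).
 */
int flush_loop(dump_fn dump, void *state, int max_rounds)
{
    struct flush_filter filter;
    int round = 0;

    assert(max_rounds > 0);
    for (;;) {
        filter.flushed = 0;         /* reset before each dump pass */
        dump(&filter, state);
        if (filter.flushed == 0)    /* nothing left: leave the loop */
            return round;
        if (++round >= max_rounds)  /* demonstration bound only */
            return -1;
    }
}
```

Calling flush_loop(dump_draining, &n, 100) with n == 3 leaves after one flushing round, while flush_loop(dump_stuck, 0, 100) never sees a clean pass and hits the bound.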

On 2.6 one thing happens differently from 2.4 kernels.
rtnl_wilddump_request() sends a request with rth->dump == 1078365203
but first gets an answer with h->nlmsg_seq == 1078365202; only on the
next pass of the while (NLMSG_OK(h, status)) loop does it get the
right one, h->nlmsg_seq == 1078365203.

On 2.4 it always gets the right reply. Maybe this has nothing to do
with the problem, or maybe it does.
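
To make the sequence-number observation concrete, here is a minimal sketch of walking a receive buffer and ignoring replies whose nlmsg_seq does not match the request. It assumes stale replies are simply skipped; it is not the iproute2 code. make_hdr() and count_replies_for() are hypothetical helpers, and only struct nlmsghdr, NLMSG_OK, NLMSG_NEXT, NLMSG_LENGTH and NLMSG_DONE come from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: build a header-only netlink message (no payload). */
struct nlmsghdr make_hdr(unsigned short type, unsigned int seq)
{
    struct nlmsghdr h;

    memset(&h, 0, sizeof(h));
    h.nlmsg_len = NLMSG_LENGTH(0);
    h.nlmsg_type = type;
    h.nlmsg_seq = seq;
    return h;
}

/*
 * Hypothetical helper: count the messages in a receive buffer that answer
 * the request numbered expected_seq. Messages carrying any other sequence
 * number are treated as stale and skipped. *done is set to 1 when the
 * NLMSG_DONE that ends our dump is seen.
 */
int count_replies_for(struct nlmsghdr *h, int status,
                      unsigned int expected_seq, int *done)
{
    int matched = 0;

    assert(h != NULL && done != NULL);
    *done = 0;
    while (NLMSG_OK(h, status)) {
        if (h->nlmsg_seq != expected_seq) {
            h = NLMSG_NEXT(h, status);  /* stale reply: skip it */
            continue;
        }
        if (h->nlmsg_type == NLMSG_DONE) {
            *done = 1;                  /* end of the dump we asked for */
            break;
        }
        matched++;
        h = NLMSG_NEXT(h, status);
    }
    return matched;
}
```

With a buffer holding one message numbered 1078365202 followed by an RTM_NEWADDR and an NLMSG_DONE numbered 1078365203, asking for 1078365203 counts one reply and reports the dump as finished.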
Comment 4 Arkadiusz Miskiewicz 2004-03-04 17:26:33 EST
For now I've just limited the number of loop passes to 10k (a hack).

diff -urN iproute2.org/ip/ipaddress.c iproute2/ip/ipaddress.c
--- iproute2.org/ip/ipaddress.c 2004-03-04 23:00:41.050515248 +0100
+++ iproute2/ip/ipaddress.c     2004-03-04 23:08:08.810575433 +0100
@@ -603,7 +603,7 @@
                                fprintf(stderr, "Flush terminated\n");
                                exit(1);
                        }
-                       if (filter.flushed == 0) {
+                       if (filter.flushed == 0 || round > 10000) {
                                if (round == 0) {
                                        fprintf(stderr, "Nothing to flush.\n");
                                } else if (show_stats)
Comment 5 Arkadiusz Miskiewicz 2004-03-17 09:34:56 EST
Fixed in the kernel:
http://oss.sgi.com/projects/netdev/archive/2004-03/msg00190.html
