Bug 487346 - ifdown bond0 causes a deadlock [NEEDINFO]
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Hardware: i386 Linux
Priority: low Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Jiri Pirko
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 533192 526775
Reported: 2009-02-25 10:33 EST by Jiri Pirko
Modified: 2015-05-04 21:16 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2010-03-30 03:43:56 EDT
cward: needinfo? (yasuhiro.ozone)

Attachments: None

Description Jiri Pirko 2009-02-25 10:33:21 EST
Description of problem:
I have a machine with three NICs. eth0 is connected to the network and I run ssh over it; eth1 and eth2 are slaves of the bonding interface bond0. When I run ifdown bond0 (or when the system does so during reboot), the system goes into some kind of deadlock.

Version-Release number of selected component (if applicable):
2.6.18-128.1.1.el5 on i686; I get the same results with 2.6.18-131. With the upstream kernel (2.6.29-rc6 in my case) the issue does not occur and everything works.

How reproducible:
Always on my machine. I could not reproduce it on dell-pe2850-01.rhts.bos.redhat.com, for example.

Steps to Reproduce:
I run the following on my system:
[root@localhost ~]# ifdown bond0

Actual results:
After this I never get the command line back. I cannot ssh to the machine or type on the console, but it still replies to pings.

dmesg says:
bonding: bond0: Removing slave eth1
bonding: bond0: Warning: the permanent HWaddr of eth1 - 00:1F:1F:01:2F:22 - is still in use by bond0. Set the HWaddr of eth1 to a different address to avoid conflicts.
bonding: bond0: releasing active interface eth1
bonding: bond0: Removing slave eth2
bonding: bond0: releasing active interface eth2
The same messages appear with the upstream kernel, where ifdown works.
ps uax says:
root      2814  0.3  0.6   4612  1300 pts/0    S+   16:13   0:00 /bin/bash /etc/sysconfig/network-scripts/ifdown-eth ifcfg-bond0
root      2907  0.3  0.6   4616  1300 pts/0    D+   16:13   0:00 /bin/bash /etc/sysconfig/network-scripts/ifdown-eth ifcfg-eth2
pid 2907 cannot be killed even with -9
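
For anyone checking for the same symptom, here is a minimal sketch of a filter that picks out tasks in uninterruptible sleep (the "D" in the STAT column above) from ps output. The function name filter_d_state is hypothetical, and the sketch assumes the state is in the second column, as with the procps format shown in the usage comment. The wchan column may hint at where in the kernel the task is blocked.

```shell
#!/bin/sh
# Keep the header line plus every row whose second column (process state)
# starts with "D", i.e. uninterruptible sleep. Reads ps output on stdin,
# so it also works on saved output.
# filter_d_state is a hypothetical helper name, not part of any package.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Usage on a live system (assumes procps ps):
#   ps -eo pid,stat,wchan:32,args | filter_d_state
```

Reading from stdin rather than calling ps inside the function keeps the filter usable on output captured from a console before the machine stops responding.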

Expected results:
The command line comes back and the system keeps running normally.

Additional info:
[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0 
[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
[root@localhost ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth2
# Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+
[root@localhost ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1f:1f:01:2f:22

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:1f:1f:01:17:69

It doesn't matter which mode bond0 is in.
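
To record the bond state before running ifdown, the status file shown above can be parsed with a short script. This is only a sketch: the function names bond_slaves and bond_slave_hwaddrs are hypothetical, and it assumes the "Slave Interface:" and "Permanent HW addr:" line format printed by the v3.2.4 driver above.

```shell
#!/bin/sh
# List the slave interfaces recorded in a bonding status file.
# $1: path to the status file (defaults to /proc/net/bonding/bond0).
# Both function names are hypothetical helpers for illustration.
bond_slaves() {
    awk -F': ' '/^Slave Interface:/ { print $2 }' "${1:-/proc/net/bonding/bond0}"
}

# Print each slave followed by its permanent hardware address.
bond_slave_hwaddrs() {
    awk -F': ' '
        /^Slave Interface:/   { slave = $2 }
        /^Permanent HW addr:/ { print slave, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}

# Usage:
#   bond_slaves                 # reads /proc/net/bonding/bond0
#   bond_slave_hwaddrs saved.txt
```

The field separator is a colon followed by a space, so the colons inside the MAC address do not split the value.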
Comment 1 Jiri Pirko 2009-06-09 10:05:05 EDT
I reproduced this issue on another machine, also with Realtek 8139 NICs.
Comment 2 Ivan Vecera 2009-06-09 10:41:11 EDT
It could be 8139 specific, but I will try the same steps on my machine with one tg3 and two r8169 based cards.
Comment 3 Jiri Pirko 2009-06-15 06:41:30 EDT
Indeed, this issue is 8139too specific. We dug into it and Michal Schmidt found the upstream patch that fixes the issue:


I've backported this to RHEL 5 and tested it with positive results.
Comment 5 Yasuhiro Ozone 2009-09-15 01:00:08 EDT
I use 2.6.18-164.el5 on i686, but it still does not work correctly.
I have read the changelog, but this issue has not been fixed there.

8139too NIC driver version 0.9.27 is used in 2.6.18-164.el5.
But 8139too NIC driver version 0.9.27 4D0198C0EF38F3D25A3DCF7 is used in 2.6.9-89.0.7.EL.

The bonding interface bond0 always works correctly in 2.6.9-89.0.7.EL.

Maybe this issue is specific to 8139too on RHEL 5.

I hope this issue will be fixed in the next kernel.
Comment 6 RHEL Product and Program Management 2009-09-25 13:36:44 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 7 Don Zickus 2009-10-06 15:36:33 EDT
in kernel-2.6.18-168.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
Comment 9 Chris Ward 2010-03-19 08:54:29 EDT

Could you confirm whether or not the latest kernel available resolves this issue?


Thank you!
Comment 11 errata-xmlrpc 2010-03-30 03:43:56 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

