Description of problem:
The system crashes after changing slaves with master interface down.

Kernel 2.6.18-239.el5 on an x86_64

# lsmod | grep bonding
#
# modprobe bonding mode=0 miimon=500
# ip addr list dev bond0
7: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
# echo +eth0 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding bond0: master_dev is not up in bond_enslave
# echo +eth1 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding bond0: master_dev is not up in bond_enslave
# echo -eth0 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:1A:A0:A4:85:E7 - is .
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/workqueue.c:191
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/bond0/bonding/slaves
CPU 1
Modules linked in: bonding(U) nls_utf8(U) ipt_MASQUERADE(U) iptable_nat(U) ip_n)
Pid: 4997, comm: bash Tainted: G 2.6.18-239.el5 #42
RIP: 0010:[<ffffffff8009fd19>] [<ffffffff8009fd19>] queue_delayed_work+0x36/0xb
RSP: 0018:ffff8101298bddb8 EFLAGS: 00010207
RAX: ffff810134993a00 RBX: ffff8101349939f8 RCX: 0000000000011029
RDX: 0000000000000000 RSI: ffff8101349939f8 RDI: ffff81013d6bfba0
RBP: ffff810134993a28 R08: ffffffff80319f28 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000180 R12: 0000000000000000
R13: 0000000000000000 R14: ffff81013e3c2000 R15: ffff81013c60c800
FS: 00002b058b233f50(0000) GS:ffff81010470f7c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff3dc2fd18 CR3: 000000013f7ec000 CR4: 00000000000006e0
Process bash (pid: 4997, threadinfo ffff8101298bc000, task ffff81013fdcc820)
Stack: 00000000000005dc ffff810134993500 ffff81013c60c800 ffffffff889237cb
 00000000000005dc ffff81013e7b9a00 ffff810134993000 ffffffff88921dce
 00000000000005dc ffff810134993500 ffff810134993000 ffff81013c60c800
Call Trace:
 [<ffffffff889237cb>] :bonding:bond_change_active_slave+0x487/0x494
 [<ffffffff88921dce>] :bonding:bond_compute_features+0x9a/0xb1
 [<ffffffff88923d49>] :bonding:bond_release+0x1a0/0x4f1
 [<ffffffff8006456b>] __down_write_nested+0x12/0x92
 [<ffffffff8892cb54>] :bonding:bonding_store_slaves+0x25c/0x2f7
 [<ffffffff8010fee2>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016aa3>] vfs_write+0xce/0x174
 [<ffffffff8001735b>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: 0f 0b 68 ec 5b 2b 80 c2 bf 00 48 89 fe 48 89 df e8 c5 f6 ff
RIP [<ffffffff8009fd19>] queue_delayed_work+0x36/0x8b
 RSP <ffff8101298bddb8>
<0>Kernel panic - not syncing: Fatal exception

The problem is discussed here:
http://www.spinics.net/lists/netdev/msg149884.html

Upstream patch fixing it:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ffa95ed50f9fb2d8faaa6bd73086a7056ea46a06

I've backported to RHEL-5 and verified that it fixes the issue.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-246.el5

You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
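For anyone wanting to check whether a host is still running a build older than the test kernel named above, the following is a minimal sketch and not an official tool. The function name `has_bonding_fix` is hypothetical, and the check assumes that comparing the first number after `2.6.18-` against 246 is good enough. It does not account for other kernel streams that might carry the same backport under a lower build number.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (0) if the given RHEL-5 kernel release
# string is build 246 or later, fails (1) if it is older, and returns 2
# if the string does not look like a 2.6.18 el5 kernel at all.
has_bonding_fix() {
    case "$1" in
        2.6.18-*.el5*) ;;          # e.g. 2.6.18-239.el5
        *) return 2 ;;             # not a RHEL-5 kernel release string
    esac

    hbf_build=${1#2.6.18-}         # "239.el5"
    hbf_build=${hbf_build%%.*}     # "239"

    case "$hbf_build" in
        ''|*[!0-9]*) return 2 ;;   # build number is not purely numeric
    esac

    [ "$hbf_build" -ge 246 ]
}
```

It could be called as `has_bonding_fix "$(uname -r)"`, with a status of 1 meaning the running kernel predates kernel-2.6.18-246.el5.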
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html