671238 – [bonding] crash when adding/removing slaves with master interface down

Summary: [bonding] crash when adding/removing slaves with master interface down

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.6
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Flavio Leitner
QA Contact:	Boris Ranto
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-01-20 20:15 UTC by Flavio Leitner
Modified:	2018-11-30 22:36 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-07-21 10:18:35 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:1065	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update	2011-07-21 09:21:37 UTC

Description Flavio Leitner 2011-01-20 20:15:05 UTC

Description of problem:

The system crashes when slaves are added to or removed from a bond while the master interface is down.

Kernel 2.6.18-239.el5 on an x86_64

# lsmod | grep bonding
#
# modprobe bonding mode=0 miimon=500
# ip addr list dev bond0
7: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
# echo +eth0 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding bond0: master_dev is not up in bond_enslave
# echo +eth1 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding bond0: master_dev is not up in bond_enslave
# echo -eth0 > /sys/class/net/bond0/bonding/slaves
bonding: bond0: doing slave updates when interface is down.
bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:1A:A0:A4:85:E7 - is .
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at kernel/workqueue.c:191
invalid opcode: 0000 [1] SMP
last sysfs file: /class/net/bond0/bonding/slaves
CPU 1
Modules linked in: bonding(U) nls_utf8(U) ipt_MASQUERADE(U) iptable_nat(U) ip_n)
Pid: 4997, comm: bash Tainted: G      2.6.18-239.el5 #42
RIP: 0010:[<ffffffff8009fd19>]  [<ffffffff8009fd19>] queue_delayed_work+0x36/0xb
RSP: 0018:ffff8101298bddb8  EFLAGS: 00010207
RAX: ffff810134993a00 RBX: ffff8101349939f8 RCX: 0000000000011029
RDX: 0000000000000000 RSI: ffff8101349939f8 RDI: ffff81013d6bfba0
RBP: ffff810134993a28 R08: ffffffff80319f28 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000180 R12: 0000000000000000
R13: 0000000000000000 R14: ffff81013e3c2000 R15: ffff81013c60c800
FS:  00002b058b233f50(0000) GS:ffff81010470f7c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff3dc2fd18 CR3: 000000013f7ec000 CR4: 00000000000006e0
Process bash (pid: 4997, threadinfo ffff8101298bc000, task ffff81013fdcc820)
Stack:  00000000000005dc ffff810134993500 ffff81013c60c800 ffffffff889237cb
 00000000000005dc ffff81013e7b9a00 ffff810134993000 ffffffff88921dce
 00000000000005dc ffff810134993500 ffff810134993000 ffff81013c60c800
Call Trace:
 [<ffffffff889237cb>] :bonding:bond_change_active_slave+0x487/0x494
 [<ffffffff88921dce>] :bonding:bond_compute_features+0x9a/0xb1
 [<ffffffff88923d49>] :bonding:bond_release+0x1a0/0x4f1
 [<ffffffff8006456b>] __down_write_nested+0x12/0x92
 [<ffffffff8892cb54>] :bonding:bonding_store_slaves+0x25c/0x2f7
 [<ffffffff8010fee2>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016aa3>] vfs_write+0xce/0x174
 [<ffffffff8001735b>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 ec 5b 2b 80 c2 bf 00 48 89 fe 48 89 df e8 c5 f6 ff
RIP  [<ffffffff8009fd19>] queue_delayed_work+0x36/0x8b
 RSP <ffff8101298bddb8>
 <0>Kernel panic - not syncing: Fatal exception

The problem is discussed here:
http://www.spinics.net/lists/netdev/msg149884.html

Upstream patch fixing it:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ffa95ed50f9fb2d8faaa6bd73086a7056ea46a06

I've backported it to RHEL-5 and verified that it fixes the issue.
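
On a kernel that does not yet carry the fix, the trigger shown in the steps above (writing to the slaves file while bond0 is down) can be avoided by checking the master's IFF_UP flag first. The following is only a minimal sketch of that idea, not part of the bonding driver or of the backported patch. The function names and the SYSFS_NET variable are hypothetical; SYSFS_NET exists only so the check can be pointed at a test directory instead of the real sysfs tree.

```shell
#!/bin/bash
# Hypothetical helper, not part of the bonding driver or the RHEL fix.
# SYSFS_NET is an assumption added so the check can be run against a test tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Return 0 if the interface has IFF_UP (bit 0x1) set in its sysfs flags file.
iface_is_up() {
    local flags
    flags=$(cat "$SYSFS_NET/$1/flags" 2>/dev/null) || return 1
    [ $(( flags & 0x1 )) -ne 0 ]
}

# Write a slave change (e.g. +eth0 or -eth0) only when the master is up.
safe_slave_change() {
    local bond="$1" change="$2"
    if ! iface_is_up "$bond"; then
        echo "refusing to change slaves: $bond is down" >&2
        return 1
    fi
    echo "$change" > "$SYSFS_NET/$bond/bonding/slaves"
}
```

For example, after "ip link set bond0 up", "safe_slave_change bond0 +eth0" performs the same write as the echo commands in the transcript; with bond0 down it prints a message and returns non-zero without touching the slaves file.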

Comment 5 RHEL Program Management 2011-02-01 17:02:34 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 18 Jarod Wilson 2011-03-03 20:34:11 UTC

The fix is in kernel-2.6.18-246.el5.
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcome.

Comment 21 errata-xmlrpc 2011-07-21 10:18:35 UTC

An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html
