Bug 621280

Summary: [5u5] bonding: fix a race condition in calls to slave MII ioctls
Product: Red Hat Enterprise Linux 5
Reporter: Flavio Leitner <fleitner>
Component: kernel
Assignee: Flavio Leitner <fleitner>
Status: CLOSED ERRATA
QA Contact: Network QE <network-qe>
Severity: medium
Priority: medium
Version: 5.5
CC: hjia
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of: 621209
Last Closed: 2011-01-13 21:09:04 UTC
Bug Depends On: 621209

Description Flavio Leitner 2010-08-04 16:36:09 UTC
+++ This bug was initially created as a clone of Bug #621209 +++

Description of problem:
In MII monitor mode, bond_check_dev_link() calls the ioctl
handler of slave devices. It stores the ndo_do_ioctl function
pointer in a static (!) ioctl variable and later uses it to call the
handler through the IOCTL macro.

If another thread executes bond_check_dev_link() at the same time
(even with a different bond, which none of the locks prevent), a
race condition occurs. If the two racing slaves have different
drivers, this may result in one driver's ioctl handler being
called with a pointer to a net_device controlled by a different
driver, resulting in unpredictable breakage.
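
The interleaving described above can be replayed in user space. The
following is only a minimal sketch, not the bonding driver code: struct
fake_dev, the *_like_ioctl handlers, check_link_fixed(), replay_race() and
replay_fixed() are hypothetical names invented for illustration, and the
race is simulated in a single thread by ordering the stores by hand. It
shows one caller's stored pointer being overwritten before use, and how a
local variable avoids that.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; only what the sketch needs. */
struct fake_dev {
    const char *driver;                     /* name of the owning driver */
    int (*do_ioctl)(struct fake_dev *dev);  /* plays the role of ndo_do_ioctl */
};

/* Each handler returns 0 only when it is handed a device it owns. */
static int e1000_like_ioctl(struct fake_dev *dev)
{
    assert(dev != NULL);
    return strcmp(dev->driver, "e1000") == 0 ? 0 : -1;
}

static int bnx2_like_ioctl(struct fake_dev *dev)
{
    assert(dev != NULL);
    return strcmp(dev->driver, "bnx2") == 0 ? 0 : -1;
}

/* Buggy pattern: one function pointer shared by every caller. */
static int (*shared_ioctl)(struct fake_dev *dev);

/* Fixed pattern: the pointer lives on the caller's own stack. */
int check_link_fixed(struct fake_dev *dev)
{
    int (*ioctl)(struct fake_dev *) = dev->do_ioctl;

    return ioctl ? ioctl(dev) : -1;
}

/* Replays the bad interleaving by hand: A stores, B stores, A calls. */
int replay_race(void)
{
    struct fake_dev bnx2 = { "bnx2", bnx2_like_ioctl };
    struct fake_dev e1000 = { "e1000", e1000_like_ioctl };

    shared_ioctl = bnx2.do_ioctl;   /* caller A, slave of one bond */
    shared_ioctl = e1000.do_ioctl;  /* caller B, slave of another bond */
    return shared_ioctl(&bnx2);     /* A resumes: e1000 handler, bnx2 device */
}

/* Same two slaves, but each check uses its own local pointer. */
int replay_fixed(void)
{
    struct fake_dev bnx2 = { "bnx2", bnx2_like_ioctl };
    struct fake_dev e1000 = { "e1000", e1000_like_ioctl };

    return (check_link_fixed(&bnx2) == 0 &&
            check_link_fixed(&e1000) == 0) ? 0 : -1;
}
```

In this sketch replay_race() reports a mismatch because the e1000-style
handler receives the bnx2-style device, which mirrors the trace below
where e1000_mii_ioctl runs against a bnx2 interface. In the kernel the
handler does not detect the mismatch; it dereferences the wrong private
data instead.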

------------[ cut here ]------------
kernel BUG at include/asm/spinlock.h:146!
invalid operand: 0000 [#1]
SMP 
Modules linked in: md5 ipv6 netconsole netdump i2c_dev i2c_core sunrpc sr_mod usb_storage joydev dm_mirror dm_mod button battery ac ohci_hcd ehci_hcd shpchp bnx2 e1000 bonding(U) ext3 jbd megaraid_sas sd_mod scsi_mod
CPU:    3
EIP:    0060:[<c02d333e>]    Not tainted VLI
EFLAGS: 00010016   (2.6.9-42.ELsmp) 
EIP is at _spin_lock_irqsave+0x20/0x45
eax: f88fc596   ebx: 00000202   ecx: c02e6fa1   edx: c02e6fa1
esi: f7e2c994   edi: c03d1f64   ebp: f7e2c6c0   esp: c03d1f20
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c03d1000 task=f7f305b0)
Stack: f7e2c994 f7e2c900 f88fc596 00000000 c03d1f74 00000000 f7e2c6c5 c03d1000 
       c03d1f64 f88c5734 f7e2c6c0 00000001 00000001 00000003 00000000 00000000 
       00000001 35687465 00000000 00000000 00000000 00010001 1000b4d3 f69b88f0 
Call Trace:
 [<f88fc596>] e1000_mii_ioctl+0x7e/0x227 [e1000]
 [<f88c5734>] bond_check_dev_link+0x9a/0x143 [bonding]
 [<c012bcf8>] __group_send_sig_info+0x8f/0x98
 [<f88c6d93>] bond_mii_monitor+0x89/0x3cc [bonding]
 [<f88c6d0a>] bond_mii_monitor+0x0/0x3cc [bonding]
 [<c012a541>] run_timer_softirq+0x123/0x145
 [<c01269b8>] __do_softirq+0x4c/0xb1
 [<c010819f>] do_softirq+0x4f/0x56
 =======================
 [<c011749e>] smp_apic_timer_interrupt+0x9a/0x9c
 [<c02d5142>] apic_timer_interrupt+0x1a/0x20
 [<c0104018>] default_idle+0x0/0x2f
 [<c0104041>] default_idle+0x29/0x2f
 [<c01040a0>] cpu_idle+0x26/0x3b
Code: 81 00 00 00 00 01 c3 f0 ff 00 c3 56 89 c6 53 9c 5b fa 81 78 04 ad 4e ad de 74 18 ff 74 24 08 68 a1 6f 2e c0 e8 62 f5 e4 ff 59 58 <0f> 0b 92 00 0c 60 2e c0 f0 fe 0e 79 13 f7 c3 00 02 00 00 74 01 

In the vmcore, the interface is actually a bnx2 and not an e1000 interface,
so calling the e1000 ioctl handler on it is incorrect. There are two other
bonding devices with e1000 devices as slaves.

Version-Release number of selected component (if applicable):
RELEASE: 2.6.9-42.ELsmp

How reproducible:
Unknown

Steps to Reproduce:
Unknown. In theory, it requires more than one bonding device in MII monitor
mode, with slaves that use different device drivers.
  
Actual results:
The wrong ioctl function is called with unexpected data. This can cause a crash or memory corruption.

Expected results:
Each slave's own ioctl handler is always called, regardless of concurrent link checks on other bonds.

Additional info:
This issue is fixed by the upstream commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=patch;h=d9d52832

Comment 3 RHEL Program Management 2010-08-27 18:30:20 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 5 Jarod Wilson 2010-09-03 19:06:52 UTC
in kernel-2.6.18-215.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 9 errata-xmlrpc 2011-01-13 21:09:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html