Bug 689759

Summary: Panic after multiple bonding failovers
Product: Red Hat Enterprise Linux 5
Reporter: Martin Wilck <martin.wilck>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified
Priority: unspecified
Version: 5.6
CC: gasmith
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-03-22 12:48:35 UTC
Bug Depends On: 659594

Description Martin Wilck 2011-03-22 11:44:23 UTC
Description of problem:
With one or two load-balancing bonds (mode=6, balance-alb) configured on a PRIMERGY TX150S7 running RHEL5.6 x64, the system crashes and writes a dump after multiple failovers of the bond.

The problem occurs with one bond of four members as well as with two bonds of two members each.
It also happens with the latest kernel version (2.6.18-238.5.1.el5).


Version-Release number of selected component (if applicable):
2.6.18-238.5.1.el5

How reproducible:
always

Steps to Reproduce:
1. Configure a bonding device with mode 6.
2. Pull and re-plug the cables several times to simulate failover.
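
To check that each pull and re-plug was actually registered as a link change before the crash, the per-slave state that the bonding driver exposes under /proc/net/bonding/ can be polled. The following is only a minimal sketch: the SAMPLE text is hand-written for illustration (it was not captured on the affected machine), and the helper name slave_status is hypothetical. It relies only on the "Slave Interface", "MII Status" and "Link Failure Count" lines.

```python
# SAMPLE is a hand-written illustration of the /proc/net/bonding/<bond> layout,
# NOT output captured from the machine in this report.
SAMPLE = """\
Bonding Mode: adaptive load balancing
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 2

Slave Interface: eth1
MII Status: down
Link Failure Count: 3
"""


def slave_status(text):
    """Return {slave name: {"MII Status": ..., "Link Failure Count": ...}}."""
    slaves = {}
    current = None  # stays None until the first "Slave Interface" line
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without "key: value" form
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            current[key] = value
    return slaves


if __name__ == "__main__":
    for name, info in slave_status(SAMPLE).items():
        print(name, info["MII Status"], info["Link Failure Count"])
```

On a live system the text would instead come from reading /proc/net/bonding/bond0 (bond0 being the device named in the oops); the bond-level "MII Status" line that precedes the first slave is deliberately ignored.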

  
Actual results:
Kernel BUG at drivers/net/bonding/bonding.h:135
invalid opcode: 0000 [1] SMP
last sysfs file: /block/sr0/size
CPU 0
Modules linked in: nfsd exportfs nfs_acl auth_rpcgss mptctl mptbase smbus(U) ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bonding be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi nls_utf8 loop dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sr_mod cdrom tpm_tis ixgbe tpm i2c_i801 i2c_core shpchp e1000e sg tpm_bios 8021q dca pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache usb_storage ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 3360, comm: bond0 Tainted: G      2.6.18-238.5.1.el5 #1
RIP: 0010:[<ffffffff88597e95>]  [<ffffffff88597e95>] :bonding:bond_mii_monitor+0x41e/0x4c0
RSP: 0018:ffff810672921e10  EFLAGS: 00010286
RAX: 00000000ffffffff RBX: ffff81067417a530 RCX: ffff810672c76060
RDX: 0000000000000000 RSI: ffff810679ea8c00 RDI: ffffffff80358ac0
RBP: ffff81067417a500 R08: ffffffff80319f28 R09: 000000000000003b
R10: ffff810672921ab0 R11: ffffffff88597a77 R12: ffff810679ea8c00
R13: 0000000000000000 R14: 0000000000000002 R15: ffffffff88597a77
FS:  0000000000000000(0000) GS:ffffffff80425000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaae17108 CR3: 000000063d547000 CR4: 00000000000006e0
Process bond0 (pid: 3360, threadinfo ffff810672920000, task ffff8106796037e0)
Stack:  ffff81067417a878 ffff81067417a880 ffff810677595540 0000000000000282
 ffff81067417a500 ffffffff8004d966 ffff810672921e80 ffff810677595540
 ffffffff8004a175 ffff810672821d68 0000000000000282 ffff810672821d58
Call Trace:
 [<ffffffff8004d966>] run_workqueue+0x99/0xf6
 [<ffffffff8004a175>] worker_thread+0x0/0x122
 [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a265>] worker_thread+0xf0/0x122
 [<ffffffff8008e40a>] default_wake_function+0x0/0xe
 [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032af3>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800329f5>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11


Code: 0f 0b 68 aa 04 5a 88 c2 87 00 48 8d 5d 34 48 89 df e8 c9 cc
RIP  [<ffffffff88597e95>] :bonding:bond_mii_monitor+0x41e/0x4c0
 RSP <ffff810672921e10>
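
The "Code:" bytes can be cross-checked against the "Kernel BUG at drivers/net/bonding/bonding.h:135" line. The sketch below assumes this kernel uses the x86_64 BUG() encoding found in 2.6.18-era sources: a ud2 instruction (0f 0b), followed by a push of the 32-bit file-name pointer (opcode 68) and a ret carrying the 16-bit line number (opcode c2). That layout is an assumption, not something stated in this report, and the function name decode_bug_frame is hypothetical.

```python
import struct

# Bytes copied verbatim from the "Code:" line of the oops.
CODE = "0f 0b 68 aa 04 5a 88 c2 87 00 48 8d 5d 34 48 89 df e8 c9 cc"


def decode_bug_frame(code_hex):
    """Decode an assumed x86_64 BUG() frame: ud2; push imm32; ret imm16.

    Returns (file-name pointer sign-extended to 64 bits, line number).
    """
    raw = bytes(int(b, 16) for b in code_hex.split())
    # Little-endian, packed: 2-byte ud2, push opcode, int32, ret opcode, uint16.
    ud2, push, filename, ret, line = struct.unpack_from("<2sBiBH", raw, 0)
    if ud2 != b"\x0f\x0b" or push != 0x68 or ret != 0xC2:
        raise ValueError("bytes do not match the assumed BUG() frame layout")
    # The signed 32-bit immediate is sign-extended to a 64-bit kernel address.
    return filename & 0xFFFFFFFFFFFFFFFF, line


if __name__ == "__main__":
    addr, line = decode_bug_frame(CODE)
    print(hex(addr), line)  # prints: 0xffffffff885a04aa 135
```

Under that assumption the decoded line number is 135, which matches the reported location in bonding.h, and the file-name pointer lies in the same address range as the bonding module's RIP.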

 
Expected results:
no panic

Additional info:
This problem is solved by the fix proposed for bug #659594 (tested with http://people.redhat.com/jwilson/el5/247.el5/).

Comment 1 Gary Smith 2011-03-22 12:48:35 UTC

*** This bug has been marked as a duplicate of bug 659594 ***