Description of problem:
System soft lockup in bonding driver (soft lockup - CPU#0 stuck for 10s!).

Version-Release number of selected component (if applicable):
RHEL 5.2 (2.6.18-92.1.18.el5)

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Aug 11 12:37:17 snikhlp202 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [bond0:14417]
Aug 11 12:37:17 snikhlp202 kernel: CPU 0:
Aug 11 12:37:17 snikhlp202 kernel: Modules linked in: nfsd exportfs auth_rpcgss xt_tcpudp autofs4 iptable_filter ip_tables x_tables ipv6 xfrm_nalgo crypto_api ibrix(PU) ipfs1(U) nfs lockd fscache nfs_acl dm_snapshot sunrpc bonding cpufreq_ondemand dm_mirror dm_round_robin dm_multipath dm_mod video sbs backlight i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev qla2400(U) e1000e sr_mod ide_cd qla2xxx(FU) i2c_nforce2 cdrom forcedeth shpchp k8temp hwmon i2c_core e1000 serio_raw sg k8_edac edac_mc pcspkr usb_storage mptsas mptscsih mptbase scsi_transport_sas ext3 jbd uhci_hcd ohci_hcd ehci_hcd sd_mod scsi_mod qla2xxx_conf(FU) intermodule(U)
Aug 11 12:37:17 snikhlp202 kernel: Pid: 14417, comm: bond0 Tainted: PF 2.6.18-92.1.18.el5 #1
Aug 11 12:37:17 snikhlp202 kernel: RIP: 0010:[<ffffffff80064ba4>] [<ffffffff80064ba4>] .text.lock.spinlock+0x2/0x30
Aug 11 12:37:17 snikhlp202 kernel: RSP: 0018:ffffffff80418cb8 EFLAGS: 00000286
Aug 11 12:37:17 snikhlp202 kernel: RAX: 0000000000000001 RBX: ffff810218050080 RCX: 0000000000000002
Aug 11 12:37:17 snikhlp202 kernel: RDX: ffff810218050000 RSI: ffff810218050080 RDI: ffff810218050168
Aug 11 12:37:17 snikhlp202 kernel: RBP: ffffffff80418c30 R08: 0000000000000000 R09: ffff8100c06bc080
Aug 11 12:37:17 snikhlp202 kernel: R10: 0000000000000282 R11: ffff81020a8a9c90 R12: ffffffff8005dc8e
Aug 11 12:37:17 snikhlp202 kernel: R13: ffff8101b94b8820 R14: ffffffff8007707d R15: ffffffff80418c30
Aug 11 12:37:17 snikhlp202 kernel: FS: 00002ba150404470(0000) GS:ffffffff803a0000(0000) knlGS:00000000f740b6d0
Aug 11 12:37:17 snikhlp202 kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Aug 11 12:37:20 snikhlp202 kernel: CR2: 00000034d448dbc0 CR3: 0000000000201000 CR4: 00000000000006e0
Aug 11 12:37:20 snikhlp202 kernel:
Aug 11 12:37:20 snikhlp202 kernel: Call Trace:
Aug 11 12:37:20 snikhlp202 kernel: <IRQ> [<ffffffff8842c477>] :bonding:ad_rx_machine+0x20/0x502
Aug 11 12:37:20 snikhlp202 kernel: [<ffffffff8842caa2>] :bonding:bond_3ad_lacpdu_recv+0xc1/0x1fc
Aug 11 12:37:20 snikhlp202 kernel: [<ffffffff800201cd>] netif_receive_skb+0x330/0x3ae
Aug 11 12:37:20 snikhlp202 kernel: [<ffffffff882db72d>] :e1000e:e1000_receive_skb+0x143/0x164
Aug 11 12:37:20 snikhlp202 kernel: [<ffffffff882deb2c>] :e1000e:e1000_clean_rx_irq+0x24d/0x2ee
Aug 11 12:37:20 snikhlp202 kernel: [<ffffffff882dcfad>] :e1000e:e1000_clean+0x77/0x246
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8842ce76>] :bonding:bond_3ad_state_machine_handler+0x0/0x84a
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8000c54c>] net_rx_action+0xa4/0x1a4
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff882dd2ff>] :e1000e:e1000_intr_msi+0xd4/0xe7
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff80011efc>] __do_softirq+0x5e/0xd6
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff80154f3b>] end_msi_irq_wo_maskbit+0x9/0x16
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8006c6f2>] do_softirq+0x2c/0x85
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8006c57a>] do_IRQ+0xec/0xf5
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8005d615>] ret_from_intr+0x0/0xa
Aug 11 12:37:21 snikhlp202 kernel: <EOI> [<ffffffff80064a30>] _spin_lock+0x3/0xa
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8842c477>] :bonding:ad_rx_machine+0x20/0x502
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8842cf4a>] :bonding:bond_3ad_state_machine_handler+0xd4/0x84a
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8004cf47>] run_workqueue+0x94/0xe4
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff80049826>] worker_thread+0x0/0x122
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8009dc53>] keventd_create_kthread+0x0/0xc4
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff80049916>] worker_thread+0xf0/0x122
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8008ac68>] default_wake_function+0x0/0xe
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8009dc53>] keventd_create_kthread+0x0/0xc4
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8003253d>] kthread+0xfe/0x132
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8009dc53>] keventd_create_kthread+0x0/0xc4
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8003243f>] kthread+0x0/0x132
Aug 11 12:37:21 snikhlp202 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
Aug 11 12:37:22 snikhlp202 kernel:

Expected results:
Bonding driver should not crash the system.

Additional info:
This is fixed by a patch added in 2.6.18-128. Please upgrade to the RHEL 5.3 kernel and this issue will no longer occur.

*** This bug has been marked as a duplicate of bug 457300 ***
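
For anyone checking several hosts against the fixed build mentioned above, here is a rough sketch of a release comparison. It assumes the only threshold that matters is 2.6.18-128 as stated in this report, and it compares the leading numeric components only; it is not a full RPM version comparison. The names `release_key` and `has_fix` are made up for illustration.

```python
import platform
import re

# First kernel build that this report says carries the fix.
FIXED_RELEASE = "2.6.18-128"


def release_key(release):
    """Return the leading numeric components of a kernel release as a tuple.

    Parsing stops at the first non-numeric component (for example 'el5').
    """
    key = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break
        key.append(int(part))
    return tuple(key)


def has_fix(release, fixed=FIXED_RELEASE):
    """True if `release` is numerically at or beyond the fixed build."""
    return release_key(release) >= release_key(fixed)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    status = "at or beyond" if has_fix(running) else "older than"
    print("%s is %s %s" % (running, status, FIXED_RELEASE))
```

With the release from this report, `has_fix("2.6.18-92.1.18.el5")` evaluates to false, so that host would still need the upgrade.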