Running an "isic" stresstest on and against a testbox [which, amongst other things, generates random incoming and outgoing packets] on 2.6.17-rc3 (and 2.6.17-rc3-mm1) over gigabit results in a reproducible lockup, after 5-10 minutes of runtime:

BUG: soft lockup detected on CPU#0!
 [<c0104e7f>] show_trace+0xd/0xf
 [<c0104e96>] dump_stack+0x15/0x17
 [<c015ad02>] softlockup_tick+0xc5/0xd9
 [<c0134c02>] run_local_timers+0x22/0x24
 [<c0134fb7>] update_process_times+0x40/0x65
 [<c011aa56>] smp_apic_timer_interrupt+0x58/0x60
 [<c010492b>] apic_timer_interrupt+0x27/0x2c
 [<c0f00df9>] sctp_new+0x8b/0x235
 [<c0ef9666>] ip_conntrack_in+0x175/0x4ca
 [<c0eb6dd7>] nf_iterate+0x31/0x94
 [<c0eb6e83>] nf_hook_slow+0x49/0xda
 [<c0ec2f55>] ip_rcv+0x24c/0x567
 [<c0e7dec4>] netif_receive_skb+0x34b/0x397
 [<c07870cb>] rtl8139_poll+0x3d8/0x5db
 [<c0e7c7ad>] net_rx_action+0x9b/0x1ba
 [<c0131955>] __do_softirq+0x6e/0xec
 [<c0106187>] do_softirq+0x59/0xcd
 =======================
 [<c0131427>] local_bh_enable+0x111/0x15d
 [<c0e7d8fb>] dev_queue_xmit+0x218/0x222
 [<c0ec7e2a>] ip_output+0x20b/0x249
 [<c0ec5711>] ip_push_pending_frames+0x331/0x3fe
 [<c0ede6e3>] raw_sendmsg+0x5cf/0x678
 [<c0ee62f8>] inet_sendmsg+0x39/0x46
 [<c0e7406f>] sock_sendmsg+0xf2/0x10d
 [<c0e741e5>] sys_sendmsg+0x15b/0x1c9
 [<c0e748b5>] sys_socketcall+0x16f/0x18a
 [<c1048c1b>] syscall_call+0x7/0xb

This is with FRAME_POINTERS enabled, so it's an exact stacktrace.

Original report (including test case) can be found here: http://lkml.org/lkml/2006/5/2/104
Hmmm, we only support Red Hat kernels; I suggest following up on lkml. If you can reproduce this on a Red Hat supported kernel, please re-open.
I wasn't able to reproduce this on my systems (I don't have a gigabit switch for testing), but that doesn't mean the issue doesn't exist in Red Hat kernels. Looking at the source for the SCTP conntrack module, it looks similar to the code in the Red Hat kernels based on 2.6.9.
ok. sorry.
The upstream fix can be found here: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e17df688f7064dae1417ce425dd1e4b71d24d63b
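For readers following along: the trace above shows the CPU stuck in sctp_new while the test feeds it random packets. One common way for a packet parser to spin like that is a chunk-walking loop that advances by a length field read from the packet, so a length of zero never moves the cursor. The sketch below only illustrates that failure mode and the usual guard against it. It is not the kernel code and not the upstream patch; count_chunks and the simplified 4-byte header layout (type, flags, 16-bit big-endian length) are assumptions made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel function: walk a buffer of
 * length-prefixed chunks and count them.  Each chunk is assumed to
 * start with a 4-byte header: type, flags, 16-bit big-endian length
 * (header included), and to be padded to a multiple of 4 bytes.
 * Returns the chunk count, or -1 if a chunk is malformed. */
#define CHUNK_HDR_LEN 4u

int count_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    while (off + CHUNK_HDR_LEN <= len) {
        size_t chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];

        /* The guard: a length smaller than the header (including 0)
         * would leave 'off' unchanged or barely moved, so a loop
         * without this check could spin forever on crafted or
         * random input. */
        if (chunk_len < CHUNK_HDR_LEN)
            return -1;

        /* Reject chunks that claim more bytes than remain. */
        if (chunk_len > len - off)
            return -1;

        /* Advance by the padded length; this is always >= 4, so
         * every iteration makes progress. */
        off += (chunk_len + 3u) & ~(size_t)3u;
        n++;
    }
    return n;
}
```

With this guard, a buffer whose first chunk claims a length of 0 is rejected immediately instead of being re-read on every pass through the loop.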
Committed in stream U4 build 36.1. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0493.html