Bug 82416
Summary: | OOPS - Frequently system lockup/crash under some load | ||
---|---|---|---|
Product: | [Retired] Red Hat Raw Hide | Reporter: | Daniel Khan <dk> |
Component: | kernel | Assignee: | Nalin Dahyabhai <nalin> |
Status: | CLOSED WONTFIX | QA Contact: | Jay Turner <jturner> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 1.0 | CC: | davej, dk, srevivo |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2004-10-30 04:01:29 UTC | Type: | --- |
Description
Daniel Khan
2003-01-21 22:59:36 UTC
I was able to reproduce the crash by running rsyncs over the 1000 Mbit NIC. I now have an oops, which I have also posted to the kernel list. I tried the Rawhide 2.4.20 kernel and the problem is the same.

Scenario:
2.4.20-2.25smp from RawHide.
Doing an rsync from the crashing host _to_ another host over a 1000 Mbit 3com (TG3).
The rsynced files include larger files of about 1.5 GB.
Heartbeat is running.

Below is the oops.

<------------------------CUT---------------------------->
NMI Watchdog detected LOCKUP on CPU0, eip c02499ac, registers:
via686a eeprom lm80 i2c-proc i2c-isa i2c-viapro i2c-core tg3 eepro100 mii ipt_LOG ipt_limit ipt_state ipt_REJECT iptable_nat ip_cona
CPU: 0
EIP: 0060:[<c02499ac>] Not tainted
EFLAGS: 00000086
EIP is at .text.lock.tcp_ipv4 [kernel] 0x182 (2.4.20-2.25smp)
eax: 00000001 ebx: d400010a ecx: 00000000 edx: f78837d8
esi: f6f22ae0 edi: c3d3ad40 ebp: f74939f4 esp: f1335d8c
ds: 0068 es: 0068 ss: 0068
Process rsync (pid: 3151, stackpage=f1335000)
Stack: c3d3ad40 f3121f38 00000001 f1335e28 00000000 03ff0202 00000004 000003ff
       00000000 00000006 c3d3ad40 f74939e0 c022d67e c3d3ad40 f1335e28 c3d5a000
       00000000 00000006 00000000 00000001 00000000 c022d530 c021ce67 c3d3ad40
Call Trace:
[<c022d67e>] ip_local_deliver_finish [kernel] 0x14e (0xf1335dbc))
[<c022d530>] ip_local_deliver_finish [kernel] 0x0 (0xf1335de0))
[<c021ce67>] nf_hook_slow [kernel] 0x107 (0xf1335de4))
[<c022d530>] ip_local_deliver_finish [kernel] 0x0 (0xf1335e00))
[<c022d2b3>] ip_local_deliver [kernel] 0x53 (0xf1335e1c))
[<c022d530>] ip_local_deliver_finish [kernel] 0x0 (0xf1335e34))
[<c022d8b9>] ip_rcv_finish [kernel] 0x219 (0xf1335e38))
[<c022d6a0>] ip_rcv_finish [kernel] 0x0 (0xf1335e5c))
[<c022d6a0>] ip_rcv_finish [kernel] 0x0 (0xf1335e6c))
[<c021ce67>] nf_hook_slow [kernel] 0x107 (0xf1335e70))
[<c022d6a0>] ip_rcv_finish [kernel] 0x0 (0xf1335e8c))
[<c022d480>] ip_rcv [kernel] 0x1a0 (0xf1335ea8))
[<c022d6a0>] ip_rcv_finish [kernel] 0x0 (0xf1335ec0))
[<c021566e>] netif_receive_skb [kernel] 0x14e (0xf1335ed8))
[<f89d2c7c>] tg3_rx [tg3] 0x27c (0xf1335ef8))
[<f89d2e71>] tg3_poll [tg3] 0x81 (0xf1335f38))
[<c0215917>] net_rx_action [kernel] 0xa7 (0xf1335f58))
[<c01289f9>] do_softirq [kernel] 0xd9 (0xf1335f80))
[<c010b81b>] do_IRQ [kernel] 0xfb (0xf1335f9c))
[<c010e7c8>] call_do_IRQ [kernel] 0x5 (0xf1335fc0))
Code: 7e f8 e9 68 e5 ff ff e8 2c ed eb ff e9 c3 ee ff ff e8 22 ed

console shuts up ...

NMI Watchdog detected LOCKUP on CPU1, eip f89d9f3b, registers:
<------------------------CUT---------------------------->

I have now got rid of the failures by replacing the tg3 driver with the latest bcm5700 driver from 3com. It seems there is a bug in tg3 with the BCM5701 Gigabit Ethernet card.

regards
Daniel Khan
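For anyone wanting to try the same workaround, the driver swap on a 2.4 kernel might look roughly like the fragment below. This is only a sketch under assumptions: the interface name eth1 is a placeholder and not taken from this system, and the bcm5700 module is assumed to be already built and installed for the running kernel.

```
# /etc/modules.conf (2.4 kernels)
# eth1 is an assumed interface name; use whichever interface the BCM5701 card is bound to.
# Previous binding, commented out:
# alias eth1 tg3
# Bind the interface to the bcm5700 module instead:
alias eth1 bcm5700
```

The tg3 module would have to be unloaded (or the machine rebooted) before the new binding takes effect.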