Bug 557380 - Kernel panic due to recursive lock in 3c59x driver.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.8
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Neil Horman
QA Contact: Network QE
Keywords: ZStream
Depends On:
Blocks: 648407
 
Reported: 2010-01-21 04:06 EST by Vitaly Mayatskikh
Modified: 2011-02-16 11:05 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-02-16 11:05:44 EST


Attachments
Console log (150.88 KB, text/plain)
2010-01-21 04:06 EST, Vitaly Mayatskikh
patch to prevent tx recursion (922 bytes, patch)
2010-01-26 09:12 EST, Neil Horman
Panic on January 12 (3.29 KB, text/plain)
2011-01-12 05:13 EST, Dayong Tian
panic on January 13th (7.17 KB, text/plain)
2011-01-12 22:02 EST, Dayong Tian


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:0263
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 4.9 kernel security and bug fix update
Last Updated: 2011-02-16 10:14:55 EST

Description Vitaly Mayatskikh 2010-01-21 04:06:50 EST
Created attachment 385880 [details]
Console log

During regular testing, the machine failed in the network driver for the 3Com NIC:

eth0: Too much work in interrupt, status 8401.
<0>Kernel panic - not syncing: drivers/net/3c59x.c:2265: spin_lock(drivers/net/3c59x.c:dfe2b1f4) already locked by drivers/net/3c59x.c/2419

See attachment for full console log.
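
For anyone triaging similar reports, here is a minimal Python sketch that pulls the two source locations out of a panic line like the one above. It assumes the message has the shape "<file>:<line>: spin_lock(<file>:<lock address>) already locked by <file>/<line>", where the first location is the call that tried to take the lock and the last is the call that already held it. That reading is an assumption based on the wording of the message, and the function name parse_recursive_lock is made up for this example.

```python
import re

# Panic line copied from the report above.
PANIC_LINE = (
    "<0>Kernel panic - not syncing: drivers/net/3c59x.c:2265: "
    "spin_lock(drivers/net/3c59x.c:dfe2b1f4) already locked by "
    "drivers/net/3c59x.c/2419"
)

# Assumed layout: "<file>:<line>: spin_lock(<file>:<addr>) already locked by <file>/<line>"
PATTERN = re.compile(
    r"(?P<acq_file>\S+):(?P<acq_line>\d+): "
    r"spin_lock\((?P<lock_file>[^:)]+):(?P<lock_addr>[0-9a-f]+)\) "
    r"already locked by (?P<own_file>\S+)/(?P<own_line>\d+)"
)


def parse_recursive_lock(line):
    """Return the acquiring site, lock address and owning site, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "acquire_site": (match.group("acq_file"), int(match.group("acq_line"))),
        "lock_address": match.group("lock_addr"),
        "owner_site": (match.group("own_file"), int(match.group("own_line"))),
    }


if __name__ == "__main__":
    print(parse_recursive_lock(PANIC_LINE))
```

Under that reading, both locations are in drivers/net/3c59x.c (lines 2265 and 2419), which is what makes this look like the driver re-taking a lock it already holds.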
Comment 1 Neil Horman 2010-01-26 09:12:45 EST
Created attachment 386835 [details]
patch to prevent tx recursion

please give this patch a try, and let me know the results.  Thanks!
Comment 2 Vitaly Mayatskikh 2010-02-09 07:40:46 EST
I have no idea how to test it :( Any thoughts?
Comment 7 Neil Horman 2010-08-09 10:42:57 EDT
sigh, Vitaly is no longer with us.  I'll test this myself
Comment 8 Neil Horman 2010-08-09 13:29:49 EDT
I've sent this upstream for review.
Comment 9 Neil Horman 2010-08-11 09:12:02 EDT
Davem didn't want this patch for upstream, so it's back to the drawing board here.
Comment 10 Neil Horman 2010-08-11 11:07:33 EDT
Sent a new patch attempt upstream.
Comment 11 Don Howard 2010-09-15 13:59:17 EDT
Another instance of this bug in testing:
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=16977668

Neil: 
Has your latest patch been accepted upstream?
Comment 12 Neil Horman 2010-09-15 15:44:16 EDT
sure has:
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=aa25ab7d943a5e1e6bcc2a65ff6669144f5b5d60

I've also posted it internally for this bug.
Comment 14 Vivek Goyal 2010-10-05 11:49:11 EDT
Committed in 89.39.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 24 Dayong Tian 2011-01-12 05:13:42 EST
Created attachment 472983 [details]
Panic on January 12
Comment 26 Neil Horman 2011-01-12 07:25:21 EST
No, that's a completely different crash. If it's reproducible, I'd open a new bug.
Comment 27 Dayong Tian 2011-01-12 22:02:58 EST
Created attachment 473187 [details]
panic on January 13th
Comment 28 Dayong Tian 2011-01-12 22:06:51 EST
(In reply to comment #26)
> No, that's a completely different crash. If it's reproducible, I'd open a new
> bug.

Hi Neil, it's reproducible on machine cpq-dl380-01.rhts.eng.bos.redhat.com with kernel 2.6.9-89.ELsmp:
--------------------
Code:  Bad EIP value.
Unable to handle kernel paging request at virtual address e09074e2
 printing eip:
e09074e2
*pde = 00000000
Recursive die() failure, output suppressed
 <0>Fatal exception: panic in 5 seconds

Kernel panic - not syncing: Fatal exception
------------[ cut here ]------------
kernel BUG at kernel/panic.c:77!
invalid operand: 0000 [#3]
SMP 
Modules linked in: netconsole netdump md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc cpufreq_powersave loop button battery ac e100 3c59x mii floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod cpqarray sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c0122d0a>]    Not tainted VLI
EFLAGS: 00010286   (2.6.9-89.ELsmp) 
EIP is at panic+0x47/0x166
eax: 0000002f   ebx: d87df000   ecx: d87dfb40   edx: c02ef568
esi: c02e7c13   edi: c02e7bc7   ebp: c02ef1e7   esp: d87dfb48
ds: 007b   es: 007b   ss: 0068
Process bash (pid: 5305, threadinfo=d87df000 task=deb6c790)
Stack: d87df000 c01060d0 c02e7c3b 00004890 c0123745 c0425cf3 00000006 00000013 
       c012365d c02ef17e 00000000 c02ef17e 00000000 c0007820 00007820 c011bad9 
       c02ef1d6 00000000 c02f396f e09074e2 c02ef1c3 c02ef1a8 e09074e2 00000000 
Call Trace:
 [<c01060d0>] die+0x164/0x16b
 [<c0123745>] release_console_sem+0x75/0xa9
 [<c012365d>] vprintk+0x136/0x14a
 [<c011bad9>] do_page_fault+0x3f0/0x5c6
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<c01d4859>] vgacon_scroll+0x182/0x199
 [<c020e0d5>] scrup+0x63/0xce
 [<c020e6e3>] complement_pos+0x12/0x144
 [<c020eb8a>] set_cursor+0x62/0x6e
 [<c0211bc4>] vt_console_print+0x286/0x2a5
 [<c021193e>] vt_console_print+0x0/0x2a5
 [<c0123307>] __call_console_drivers+0x36/0x40
 [<c012341f>] call_console_drivers+0xb6/0xd8
 [<c011b6e9>] do_page_fault+0x0/0x5c6
 [<c02de3db>] error_code+0x2f/0x38
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<c0135a0a>] try_crashdump+0x31/0x33
 [<c010604e>] die+0xe2/0x16b
 [<c012365d>] vprintk+0x136/0x14a
 [<c011bad9>] do_page_fault+0x3f0/0x5c6
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<c01d4830>] vgacon_scroll+0x159/0x199
 [<c01d4859>] vgacon_scroll+0x182/0x199
 [<c020e0d5>] scrup+0x63/0xce
 [<c020e6e3>] complement_pos+0x12/0x144
 [<c020eb8a>] set_cursor+0x62/0x6e
 [<c0211bc4>] vt_console_print+0x286/0x2a5
 [<c021193e>] vt_console_print+0x0/0x2a5
 [<c0123307>] __call_console_drivers+0x36/0x40
 [<c012341f>] call_console_drivers+0xb6/0xd8
 [<c011b6e9>] do_page_fault+0x0/0x5c6
 [<c02de3db>] error_code+0x2f/0x38
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<e09074e2>] netpoll_start_netdump+0x0/0xf8 [netdump]
 [<c0135a0a>] try_crashdump+0x31/0x33
 [<c010604e>] die+0xe2/0x16b
 [<c012365d>] vprintk+0x136/0x14a
 [<c011bad9>] do_page_fault+0x3f0/0x5c6
 [<c0213238>] sysrq_handle_crash+0x0/0x8
 [<c011dc49>] try_to_wake_up+0x288/0x293
 [<c012ac01>] __mod_timer+0x101/0x10b
 [<c021290b>] poke_blanked_console+0xa1/0xac
 [<c0211bd2>] vt_console_print+0x294/0x2a5
 [<c021193e>] vt_console_print+0x0/0x2a5
 [<c0123307>] __call_console_drivers+0x36/0x40
 [<c011b6e9>] do_page_fault+0x0/0x5c6
 [<c02de3db>] error_code+0x2f/0x38
 [<c0213238>] sysrq_handle_crash+0x0/0x8
 [<c02133d0>] __handle_sysrq+0x62/0xd9
 [<c018f50c>] write_sysrq_trigger+0x23/0x29
 [<c015d5a7>] vfs_write+0xb6/0xe2
 [<c015d671>] sys_write+0x3c/0x62
 [<c02dd8e3>] syscall_call+0x7/0xb
 [<c02d007b>] xfrm_sk_policy_lookup+0xc1/0x3ca
--------------------
Please refer to attachment 473187 [details] for detailed info.
Comment 29 Dayong Tian 2011-01-13 04:12:28 EST
Hi Neil, I reproduced the panic with kernel 2.6.9-96.ELsmp and filed bug 669302. Whenever I try to reproduce the panic described in this bug, I get the panic described in bug 669302 instead. How should I verify this bug? Thanks!
Comment 30 Neil Horman 2011-01-13 07:13:32 EST
This bug has nothing to do with netdump (which appears to be how you are trying to verify it). If you want to verify it, set up netconsole, stream messages across it by generating printk events in the kernel, and observe that the system doesn't lock up.
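
To make the scenario behind this verification concrete, here is a toy Python model of how logging over a network console can re-enter a NIC's transmit path while the interrupt handler still holds the device lock. Everything in it (DebugSpinLock, ToyNic, netconsole_sink, the log_under_lock switch) is hypothetical and written only for illustration. It is not the 3c59x code, and the "defer the message until the lock is dropped" variant is just one way to avoid the recursion, not necessarily what the accepted patch does.

```python
class RecursiveLockError(RuntimeError):
    """Raised when a lock that is already held is taken again."""


class DebugSpinLock:
    """Non-recursive lock that remembers its owner (toy stand-in for a debug spinlock)."""

    def __init__(self, name):
        self.name = name
        self.owner = None

    def acquire(self, site):
        if self.owner is not None:
            raise RecursiveLockError(
                f"{self.name} requested by {site} but already locked by {self.owner}"
            )
        self.owner = site

    def release(self):
        self.owner = None


def netconsole_sink(nic, message):
    """Hypothetical network console: every log message is sent through the NIC."""
    nic.start_xmit(message)


class ToyNic:
    def __init__(self, log_under_lock):
        self.lock = DebugSpinLock("nic.lock")
        self.log_under_lock = log_under_lock
        self.sent = []

    def start_xmit(self, payload):
        # The transmit path takes the same device lock as the interrupt handler.
        self.lock.acquire("start_xmit")
        try:
            self.sent.append(payload)
        finally:
            self.lock.release()

    def interrupt(self):
        deferred = []
        self.lock.acquire("interrupt")
        try:
            message = "Too much work in interrupt"
            if self.log_under_lock:
                # Logging here re-enters start_xmit while the lock is held.
                netconsole_sink(self, message)
            else:
                deferred.append(message)
        finally:
            self.lock.release()
        # Deferred messages are logged only after the lock is released.
        for message in deferred:
            netconsole_sink(self, message)


def deadlocks(log_under_lock):
    """Return True if the interrupt path re-takes the lock it already holds."""
    nic = ToyNic(log_under_lock)
    try:
        nic.interrupt()
    except RecursiveLockError:
        return True
    return False


if __name__ == "__main__":
    print("log while holding lock:", deadlocks(True))
    print("log after releasing lock:", deadlocks(False))
```

On real hardware the recursion shows up as a lockup or a spinlock-debug panic rather than an exception, which is why the check is simply that the machine keeps running while messages stream over netconsole.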
Comment 32 errata-xmlrpc 2011-02-16 11:05:44 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0263.html
