Bug 249098

Summary: kernel BUG at net/core/skbuff.c:91
Product: Red Hat Enterprise Linux 4
Reporter: Alberto Reyes <betoreyez>
Component: kernel
Assignee: Neil Horman <nhorman>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: low    
Version: 4.4   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.9-55.el4
Doc Type: Bug Fix
Last Closed: 2007-09-03 18:07:06 UTC
Attachments:
Description          Flags
sysreport            none
sysreport updated    none

Description Alberto Reyes 2007-07-20 21:21:13 UTC
Description of problem:
I got a network-related kernel panic in an Oracle RAC environment.

Version-Release number of selected component (if applicable):
I'm using kernel 2.6.9-42.ELhugemem

How reproducible: This is the first time it has happened.

Additional info:
This is the netdump output; I also have a diskdump if needed.
Jul 20 11:40:09 10.100.1.115  kernel BUG at net/core/skbuff.c:91!
Jul 20 11:40:09 10.100.1.115  invalid operand: 0000 [#1]
Jul 20 11:40:09 10.100.1.115  SMP
Jul 20 11:40:09 10.100.1.115  Modules linked in: IBMtape(U) nfsd exportfs lp
netconsole ocfs2(U) debugfs(U) nfs lockd nfs_acl ocfs2_dlmfs(U) ocfs2_dlm(U)
ocfs2_nodemanager(U) configfs(U) sdd_mod(U) sunrpc ide_dump scsi_dump diskdump
zlib_deflate dm_mirror button battery ac uhci_hcd parport_pc parport e1000
floppy st sg ext3 jbd dm_mod qla2300 mptscsih mptsas mptspi mptfc mptscsi
mptbase qla2xxx scsi_transport_fc sd_mod scsi_mod
Jul 20 11:40:09 10.100.1.115  CPU:    2
Jul 20 11:40:09 10.100.1.115  EIP:    0060:[<0227530f>]    Tainted: PF     VLI
Jul 20 11:40:09 10.100.1.115  EFLAGS: 00210296   (2.6.9-42.ELhugemem)
Jul 20 11:40:09 10.100.1.115  EIP is at skb_over_panic+0x1f/0x2d
Jul 20 11:40:09 10.100.1.115  eax: 0000002f   ebx: 022e7e33   ecx: 023c7f44  
edx: 022ffd3a
Jul 20 11:40:09 10.100.1.115  esi: 022e7e33   edi: f89edeb0   ebp: 8e512980  
esp: 023c7f40
Jul 20 11:40:09 10.100.1.115  ds: 007b   es: 007b   ss: 0068
Jul 20 11:40:09 10.100.1.115  Process oracle (pid: 18223, threadinfo=023c7000
task=ba7fe1b0)
Jul 20 11:40:10 10.100.1.115  Stack: 022ffd3a f898b6de 0000089e 000002d6
022e7e33 000002d6 becf5bb0 f898b6e5
Jul 20 11:40:10 10.100.1.115         000000e4 23f35700 00000001 00000001
000000bc 000002d6 f89edeb0 f89ede9c
Jul 20 11:40:10 10.100.1.115         becf5bc0 becf5bb0 bfdf7000 023c7fbc
15f356c0 bfdf7240 bfdf7240 bf075800
Jul 20 11:40:10 10.100.1.115  Call Trace:
Jul 20 11:40:10 10.100.1.115   [<f898b6de>] e1000_clean_rx_irq+0x302/0x568 [e1000]
Jul 20 11:40:10 10.100.1.115   [<f898b6e5>] e1000_clean_rx_irq+0x309/0x568 [e1000]
Jul 20 11:40:10 10.100.1.115   [<f898b0a9>] e1000_clean+0xd6/0x176 [e1000]
Jul 20 11:40:10 10.100.1.115   [<0227a8e0>] net_rx_action+0xae/0x160
Jul 20 11:40:10 10.100.1.115   [<02126424>] __do_softirq+0x4c/0xb1
Jul 20 11:40:10 10.100.1.115   [<021084af>] do_softirq+0x4f/0x56
Jul 20 11:40:10 10.100.1.115   =======================
Jul 20 11:40:10 10.100.1.115   [<0211745a>] smp_apic_timer_interrupt+0x9a/0x9c
Jul 20 11:40:10 10.100.1.115  Code:  Bad EIP value.

Comment 1 Alberto Reyes 2007-07-20 21:21:21 UTC
Created attachment 159702 [details]
sysreport

Comment 2 Neil Horman 2007-08-08 15:29:57 UTC
Hmm, it looks like we tried to copy data beyond the end of the allocated
buffer.  First question: have you tried the 4.6 kernel?  e1000 has had
several complete updates.  Also, can you try adding the copybreak=0 option
to the module and see if that clears the problem up?  Thanks!
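
For readers unfamiliar with this class of BUG, here is a minimal userspace sketch of the bounds check that skb_over_panic reports. It is a simplified model, not the kernel code: struct mini_skb and the mini_skb_* helpers are hypothetical names invented for illustration, and the model returns NULL where the kernel would call skb_over_panic() and BUG.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct mini_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *tail;  /* end of the data written so far */
    unsigned char *end;   /* end of the allocated buffer */
    unsigned int len;     /* bytes of data currently in the buffer */
};

/* Allocate a buffer of `size` bytes; returns 0 on success, -1 on failure. */
static int mini_skb_init(struct mini_skb *skb, size_t size)
{
    skb->head = malloc(size);
    if (skb->head == NULL)
        return -1;
    skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    assert(skb->tail <= skb->end);
    return 0;
}

static void mini_skb_free(struct mini_skb *skb)
{
    free(skb->head);
    skb->head = skb->tail = skb->end = NULL;
    skb->len = 0;
}

/*
 * Model of an skb_put()-style append: reserve `len` more bytes at the tail.
 * The kernel advances the tail first and panics if it passed `end`; this
 * model checks before advancing and returns NULL instead of panicking.
 */
static unsigned char *mini_skb_put(struct mini_skb *skb, unsigned int len)
{
    unsigned char *old_tail = skb->tail;
    size_t room = (size_t)(skb->end - skb->tail);

    if (len > room) {
        fprintf(stderr, "mini_skb_put: over: len=%u put=%u room=%zu\n",
                skb->len, len, room);
        return NULL;
    }
    skb->tail += len;
    skb->len += len;
    return old_tail;
}
```

With a 128-byte buffer, mini_skb_put(&skb, 100) succeeds and a following mini_skb_put(&skb, 64) is rejected, which is the situation the driver's receive path appears to have hit here.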

Comment 3 Ra P. 2007-08-09 18:29:40 UTC
I had another crash with kernel 2.6.9-55; it seems to be related.

  SYSTEM MAP: /boot/System.map-2.6.9-55.ELhugemem
DEBUG KERNEL: /usr/lib/debug/lib/modules/2.6.9-55.ELhugemem/vmlinux (2.6.9-55.ELhugemem)
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 16
        DATE: Thu Aug  9 11:20:58 2007
      UPTIME: 14 days, 09:52:17
LOAD AVERAGE: 10.87, 10.62, 10.21
       TASKS: 1019
    NODENAME: erpdb2.opttima.com.mx
     RELEASE: 2.6.9-55.ELhugemem
     VERSION: #1 SMP Fri Apr 20 17:20:11 EDT 2007
     MACHINE: i686  (2994 Mhz)
      MEMORY: 27.5 GB
       PANIC: "kernel BUG at include/linux/netdevice.h:890!"
         PID: 0
     COMMAND: "swapper"
        TASK: 231ba80  (1 of 16)  [THREAD_INFO: 238d000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

PID: 0      TASK: 231ba80   CPU: 0   COMMAND: "swapper"
 #0 [ 23c9e58] disk_dump at f8cdb1a2
 #1 [ 23c9e5c] printk at 212292e
 #2 [ 23c9e68] freeze_other_cpus at f8cdaef5
 #3 [ 23c9e78] start_disk_dump at f8cdafa0
 #4 [ 23c9e88] try_crashdump at 213386e
 #5 [ 23c9e90] die at 2106335
 #6 [ 23c9ec4] do_invalid_op at 2106710
 #7 [ 23c9f74] error_code (via invalid_op) at fffecede
    EAX: 00000006  EBX: e05e8000  ECX: 112fbd00  EDX: 00000040  EBP: 023c9fd4
    DS:  007b      ESI: e05e8240  ES:  007b      EDI: 00000246
    CS:  0060      EIP: f8cb75ce  ERR: ffffffff  EFLAGS: 00010046
 #8 [ 23c9fb0] e1000_clean at f8cb75ce
 #9 [ 23c9fd0] net_rx_action at 227bdb1
#10 [ 23c9fe8] __do_softirq at 2126486
--- <soft IRQ> ---
 #0 [ 238df8c] do_softirq at 2108440
 #1 [ 238df94] do_IRQ at 2107d9f
 #2 [ 238dfb0] common_interrupt at fffecc9c
    EAX: 00000000  EBX: 0238d000  ECX: 02104018  EDX: 0238d000  EBP: 00492007
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 023c6120
    CS:  0060      EIP: 02104041  ERR: ffffffd1  EFLAGS: 00000246
 #3 [ 238dfe4] default_idle at 2104041
 #4 [ 238dfe8] cpu_idle at 210409e


Comment 4 Alberto Reyes 2007-08-09 21:27:21 UTC
Created attachment 161025 [details]
sysreport updated

This is the updated sysreport from the last crash.

Comment 5 Neil Horman 2007-08-29 14:37:08 UTC
It looks like you didn't turn off copybreak in your last test.  Please do so and
report results.  Thanks!
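
A minimal sketch of one way to set the option, assuming the stock e1000 module is in use and that module options on this RHEL 4 system are kept in /etc/modprobe.conf (the file path is an assumption about this installation, not something taken from the sysreport):

```
# /etc/modprobe.conf
# Disable the e1000 receive copybreak path for all e1000 interfaces.
options e1000 copybreak=0
```

The option is only read when the module loads, so it takes effect after the e1000 module is reloaded or the machine is rebooted.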

Comment 6 Alberto Reyes 2007-09-02 03:26:06 UTC
No more crashes; you can close it.

Comment 7 Neil Horman 2007-09-03 18:07:06 UTC
closing as current release then.  Thanks!