Bug 584746 - [ixgbe] kernel NULL pointer dereference at [<ffffffff800273b8>] eth_type_trans+0x3d/0xf0
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
Reported: 2010-04-22 06:28 EDT by Bernd Schubert
Modified: 2014-06-29 19:02 EDT

Doc Type: Bug Fix
Last Closed: 2010-04-26 11:03:23 EDT
Attachments: None
Description Bernd Schubert 2010-04-22 06:28:50 EDT
A DDN Lustre customer has network problems, which seem to trigger an ixgbe bug. A NULL pointer dereference then occurs and the system locks up entirely with a kernel panic.

LustreError: 8094:0:(ost_handler.c:1094:ost_brw_write()) client csum 4a41e0df, server csum 4739e073
LustreError: 168-f: datafs-OST0009: BAD WRITE CHECKSUM: changed in transit before arrival at OST from 12345-10.128.130.174@tcp inum 2107295/28010499 object 3891970/0 extent [47185920-48234495]
LustreError: 8094:0:(ost_handler.c:1169:ost_brw_write()) client csum 4a41e0df, original server csum 4739e073, server csum now 4739e073
LustreError: 10973:0:(ost_handler.c:1094:ost_brw_write()) client csum 3bd08cb7, server csum c7e08c7a
LustreError: 168-f: datafs-OST000d: BAD WRITE CHECKSUM: changed in transit before arrival at OST from 12345-10.128.128.202@tcp inum 7963256/28017359 object 3892458/0 extent [0-1048575]
LustreError: 10973:0:(ost_handler.c:1169:ost_brw_write()) client csum 3bd08cb7, original server csum c7e08c7a, server csum now c7e08c7a
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
 [<ffffffff800273b8>] eth_type_trans+0x3d/0xf0
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /class/infiniband_mad/umad0/port
CPU 11
Modules linked in: hidp(U) l2cap(U) bluetooth(U) obdfilter(U) lquota(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ptlrpc(U) ib_srp(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) bnx2i(U) libiscsi2(U) cnic(U) uio(U) scsi_transport_iscsi2(U) scsi_transport_iscsi(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U) mlx4_en(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) ldiskfs(U) crc16(U) ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dm_round_robin(U) dm_multipath(U) scsi_dh(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) joydev(U) sr_mod(U) cdrom(U) ixgbe(U) sg(U) mlx4_core(U) 8021q(U) dca(U) serio_raw(U) bnx2(U) pcspkr(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) ata_piix(U) libata(U) shpchp(U) megaraid_sas(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Pid: 0, comm: swapper Tainted: G      2.6.18-164.11.1.el5_lustre.1.8.2 #1
RIP: 0010:[<ffffffff800273b8>]  [<ffffffff800273b8>] eth_type_trans+0x3d/0xf0
RSP: 0018:ffff8101b5703dc8  EFLAGS: 00010202
RAX: 00000000000005dc RBX: ffff8101920cdbc0 RCX: 0000000000000000
RDX: ffff810313ff07c0 RSI: ffff810328baa000 RDI: ffff8101920cdbc0
RBP: 0000000000000000 R08: ffff81011ed87000 R09: 00000000313ddbbc
R10: 0000000000000000 R11: 0000000000000000 R12: ffff810313ff07d0
R13: ffff81032879b120 R14: 0000000000000063 R15: ffffc200109d8360
FS:  0000000000000000(0000) GS:ffff8101afe3e540(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff8101b56fe000, task ffff8101afe3f100)
Stack:  ffffffff88250043 000000002879b120 ffff8101b5703f08 0000000b109dfd68
 ffff81000902e780 0000004000000000 ffff8101b5703eac ffff81032fda3800
 ffff810328baa500 ffff8101af9be000 ffff810313ff07c0 0000007d00000000
Call Trace:
 <IRQ>  [<ffffffff88250043>] :ixgbe:ixgbe_clean_rx_irq+0x523/0xbe0
 [<ffffffff88251fd3>] :ixgbe:ixgbe_clean_rxonly+0x83/0x1a0
 [<ffffffff8825ac22>] :ixgbe:__kc_adapter_clean+0x32/0x60
 [<ffffffff8000c845>] net_rx_action+0xac/0x1e0
 [<ffffffff8001231d>] __do_softirq+0x89/0x133
 [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
 [<ffffffff8006cb3c>] do_softirq+0x2c/0x85
 [<ffffffff8006c9c4>] do_IRQ+0xec/0xf5
 [<ffffffff8005d615>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff80198732>] acpi_processor_idle_simple+0x17d/0x30e
 [<ffffffff80197e78>] acpi_safe_halt+0x25/0x36
 [<ffffffff80198695>] acpi_processor_idle_simple+0xe0/0x30e
 [<ffffffff801985b5>] acpi_processor_idle_simple+0x0/0x30e
 [<ffffffff8004947e>] cpu_idle+0x95/0xb8
 [<ffffffff80077474>] start_secondary+0x498/0x4a7


Code: f6 01 01 74 3e 48 8d 86 c8 01 00 00 66 8b 50 02 66 8b 40 04
RIP  [<ffffffff800273b8>] eth_type_trans+0x3d/0xf0
 RSP <ffff8101b5703dc8>
CR2: 0000000000000000
 <0>Kernel panic - not syncing: Fatal exception
Comment 1 Andy Gospodarek 2010-04-26 11:03:23 EDT
This appears to be a bug in the Intel ixgbe driver downloaded from SourceForge. The function '__kc_adapter_clean' comes from that driver and is not included in the ixgbe driver shipped in the Red Hat kernel RPM.
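
As a rough illustration of that check, the Python sketch below pulls the module-qualified frames out of an oops call trace and reports whether a known out-of-tree symbol appears among them. It assumes the RHEL 5 trace format shown in the description (`[<address>] :module:symbol+0xoff/0xsize`). The helper names `module_frames` and `vendor_markers_found` are hypothetical and not part of any Red Hat or Intel tooling, and the marker list is something the caller supplies by hand.

```python
import re

# Matches module-qualified frames in a RHEL 5 style call trace, e.g.
#   [<ffffffff8825ac22>] :ixgbe:__kc_adapter_clean+0x32/0x60
# Frames from the core kernel have no ":module:" part and are skipped.
MODULE_FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+:(?P<module>\w+):(?P<symbol>\w+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def module_frames(trace, module):
    """Return the symbols in `trace` that belong to `module`, in trace order."""
    return [
        m.group("symbol")
        for m in MODULE_FRAME.finditer(trace)
        if m.group("module") == module
    ]


def vendor_markers_found(trace, module, markers):
    """Return the caller-supplied marker symbols that show up in `module` frames.

    `markers` is an assumption made by the caller: symbols known to exist
    only in an out-of-tree build of the driver.
    """
    frames = set(module_frames(trace, module))
    return [name for name in markers if name in frames]


# Excerpt of the call trace from the description above.
SAMPLE_TRACE = """\
 <IRQ>  [<ffffffff88250043>] :ixgbe:ixgbe_clean_rx_irq+0x523/0xbe0
 [<ffffffff88251fd3>] :ixgbe:ixgbe_clean_rxonly+0x83/0x1a0
 [<ffffffff8825ac22>] :ixgbe:__kc_adapter_clean+0x32/0x60
 [<ffffffff8000c845>] net_rx_action+0xac/0x1e0
"""

if __name__ == "__main__":
    print(module_frames(SAMPLE_TRACE, "ixgbe"))
    print(vendor_markers_found(SAMPLE_TRACE, "ixgbe", ["__kc_adapter_clean"]))
```

A non-empty result from `vendor_markers_found` only suggests that the loaded module is not the distribution build; confirming it still means inspecting the installed module itself.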

I will be happy to look at this if the ixgbe driver provided by Red Hat also has this problem.  Please re-open this bug if it does.

Thanks!
