Bug 149636 - Kernel panic (EIP is at find_inode)
Summary: Kernel panic (EIP is at find_inode)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Stephen Tweedie
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 156320
 
Reported: 2005-02-24 18:57 UTC by Juanjo Villaplana
Modified: 2007-11-30 22:07 UTC

Fixed In Version: RHSA-2005-663
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-09-28 14:48:03 UTC
Target Upstream Version:
Embargoed:


Attachments
kernel panic +SysRq full console output (31.89 KB, application/x-gzip)
2005-02-24 18:57 UTC, Juanjo Villaplana


Links
System: Red Hat Product Errata
ID: RHSA-2005:663
Private: 0
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 6
Last Updated: 2005-09-28 04:00:00 UTC

Description Juanjo Villaplana 2005-02-24 18:57:38 UTC
Description of problem:

We got a kernel panic (2.4.21-27.0.2.ELsmp):

----------8<--------------------
Unable to handle kernel NULL pointer dereference at virtual address 00000028
 printing eip:
c0180518
*pde = 29aa5001
*pte = 431cc067
Oops: 0000
cpqci soundcore usbserial lp parport autofs4 tg3 ipt_state ip_conntrack
ipt_REJECT iptable_filter ip_tables floppy sg microcode loop keybdev mousedev
hid inpu
CPU:    3
EIP:    0060:[<c0180518>]    Tainted: P
EFLAGS: 00010203

EIP is at find_inode [kernel] 0x28 (2.4.21-27.0.2.ELsmp/i686)
eax: 00000000   ebx: 00000000   ecx: 0007ffff   edx: c9aa2480
esi: 00000000   edi: c9bdcec0   ebp: 000d4397   esp: d00f3e90
ds: 0068   es: 0068   ss: 0068
Process smbd (pid: 8817, stackpage=d00f3000)
Stack: 00000000 e79ef970 de98bc00 000d4397 c9bdcec0 000d4397 de98bc00 c01808b1
       de98bc00 000d4397 c9bdcec0 00000000 00000000 000d4397 d03def00 de98bc00
       da7cfd80 f887b76d de98bc00 000d4397 00000000 00000000 f2775018 fffffff4
Call Trace:   [<c01808b1>] iget4_locked [kernel] 0x61 (0xd00f3eac)
[<f887b76d>] ext3_lookup [ext3] 0x7d (0xd00f3ed4)
[<c01729cc>] real_lookup [kernel] 0xec (0xd00f3ef8)
[<c0172ffc>] link_path_walk [kernel] 0x45c (0xd00f3f18)
[<c0173549>] path_lookup [kernel] 0x39 (0xd00f3f58)
[<c0173899>] __user_walk [kernel] 0x49 (0xd00f3f68)
[<c016e6de>] sys_lstat64 [kernel] 0x2e (0xd00f3f84)

Code: 39 6b 28 89 de 75 f1 8b 44 24 20 39 83 ac 00 00 00 75 e5 8b

Kernel panic: Fatal exception
----------8<--------------------

on a ProLiant DL580 G2 (4 x Intel Xeon MP 2.0GHz, 8GB RAM, Smart Array RAID)
running RHEL 3 U4 with the latest updates.
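
For readers following the oops above, here is a minimal sketch of how the first bytes of the "Code:" line relate to the reported fault address. It assumes the dumped bytes begin at the faulting EIP (usual for 2.4 oops output, but not confirmed in this report) and handles only the single instruction form seen here. The helper name decode_cmp_disp8 and the regs dictionary are illustrative, not part of any kernel tool.

```python
# 32-bit x86 general-purpose registers, indexed by their 3-bit encoding.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_cmp_disp8(code: bytes):
    """Decode opcode 0x39 (CMP r/m32, r32) in its [reg + disp8] form.

    Returns (base_register, displacement, source_register).
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x39:
        raise ValueError("only opcode 0x39 (CMP r/m32, r32) is handled")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG32[rm], disp, REG32[reg]


# Register values copied from the oops in this report.
regs = {
    "eax": 0x00000000, "ebx": 0x00000000, "ecx": 0x0007FFFF,
    "edx": 0xC9AA2480, "esi": 0x00000000, "edi": 0xC9BDCEC0,
    "ebp": 0x000D4397,
}

# First three bytes of the "Code:" line: 39 6b 28
base, disp, src = decode_cmp_disp8(bytes.fromhex("396b28"))
fault_addr = (regs[base] + disp) & 0xFFFFFFFF
print(f"cmp %{src},{disp:#x}(%{base}) reads address {fault_addr:08x}")
```

Under those assumptions the instruction compares %ebp with the word at offset 0x28 from %ebx. With ebx = 00000000 that read lands on 00000028, which matches the "NULL pointer dereference at virtual address 00000028" line. The value in ebp (000d4397) also appears several times on the stack, so it is plausibly the value find_inode was searching for, but that is a reading of the dump and not something stated in the report.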

Version-Release number of selected component (if applicable):

kernel-smp-2.4.21-27.0.2.EL

How reproducible:

don't know

Steps to Reproduce:

n/a
  
Actual results:

n/a

Expected results:

n/a

Additional info:

See the attachment for the full console output and SysRq M+T+P+W+U+B.

Comment 1 Juanjo Villaplana 2005-02-24 18:57:38 UTC
Created attachment 111390
kernel panic +SysRq full console output

Comment 2 Ernie Petrides 2005-03-02 15:54:08 UTC
Hello, Juanjo.  Can this problem be reproduced with an untainted kernel?

Comment 3 Juanjo Villaplana 2005-03-09 09:13:52 UTC
Hi Ernie,

I guess the kernel is tainted because of the loading of HP's management modules (we
are using "hprsm" and "hprsm" rpms provided by HP).

Since we don't know how to reproduce this problem, we will disable the HP
management services and see if the server panics again.

Regards.
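
As background to the taint question, the "Tainted: P" marker in the oops corresponds to a bit in the kernel's taint mask. The sketch below decodes such a mask, assuming the conventional assignment of the three lowest bits (bit 0 = P, proprietary module; bit 1 = F, forced module load; bit 2 = S, unsafe SMP). The function name decode_taint is illustrative, and the mask value 1 is an assumption chosen to match the "P" shown in the report, not a value taken from this system.

```python
# Assumed taint bit assignments for the three lowest bits of the mask.
TAINT_FLAGS = [
    (0, "P", "proprietary module loaded"),
    (1, "F", "module force-loaded"),
    (2, "S", "unsafe SMP configuration"),
]


def decode_taint(mask: int):
    """Return (letter, description) pairs for each known bit set in mask."""
    return [(letter, desc) for bit, letter, desc in TAINT_FLAGS
            if mask & (1 << bit)]


# Hypothetical mask with only bit 0 set, matching "Tainted: P" in the oops.
for letter, desc in decode_taint(1):
    print(f"{letter}: {desc}")
```

On a running system the mask can usually be read from /proc/sys/kernel/tainted. A value of 0 means the kernel is untainted, which is the state Ernie asks about in comment 2.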

Comment 4 Stephen Tweedie 2005-03-09 13:14:11 UTC
Have you ever seen this problem before?  How long was it running before the panic?

Comment 5 Juanjo Villaplana 2005-03-10 11:19:21 UTC
This is the first time we have seen this problem. The server had an uptime of 34 days.
Note that we have been using HP management services on this server since 01/05/2005.

Comment 6 Stephen Tweedie 2005-03-10 16:29:39 UTC
OK, I don't think there's much we can do here without more information.  It
would be useful if you could enable netdump and capture a core file if the crash
recurs.


Comment 7 Juanjo Villaplana 2005-03-11 09:17:27 UTC
OK. I will try to configure netdump.

Comment 9 Stephen Tweedie 2005-05-23 20:40:18 UTC
We have recently found a problem in the RHEL-3 kernels which is likely to be the
cause of this bug.  Details are in bug 155289.

We have a test kernel built which should resolve this problem; x86 rpms can be
downloaded at 

http://people.redhat.com/~petrides/.pte_race/kernel-hugemem-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-hugemem-unsupported-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-smp-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-smp-unsupported-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-source-2.4.21-32.4.EL.i386.rpm

and these patches will be in RHEL-3 U6.

Given that you don't seem able to reproduce this, please let us know if you want
to try these kernels or simply wait for U6.

Comment 10 Juanjo Villaplana 2005-05-25 11:23:44 UTC
We are unable to reproduce this problem, so we will continue using the stable
kernels and wait for U6, thanks for your interest.

Regards.

Comment 11 Stephen Tweedie 2005-05-25 11:41:09 UTC
OK, I'll close the bug for now on the basis that we believe this problem will be
fixed in the next release.  Please reopen if you need to pursue this further
before then.

Comment 14 Ernie Petrides 2005-07-22 02:13:35 UTC

*** This bug has been marked as a duplicate of 155289 ***

Comment 17 Red Hat Bugzilla 2005-09-28 14:48:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html


