Description of problem:
We got a Kernel panic (2.4.21-27.0.2.ELsmp):
Unable to handle kernel NULL pointer dereference at virtual address 00000028
*pde = 29aa5001
*pte = 431cc067
cpqci soundcore usbserial lp parport autofs4 tg3 ipt_state ip_conntrack
ipt_REJECT iptable_filter ip_tables floppy sg microcode loop keybdev mousedev
EIP: 0060:[<c0180518>] Tainted: P
EIP is at find_inode [kernel] 0x28 (2.4.21-27.0.2.ELsmp/i686)
eax: 00000000 ebx: 00000000 ecx: 0007ffff edx: c9aa2480
esi: 00000000 edi: c9bdcec0 ebp: 000d4397 esp: d00f3e90
ds: 0068 es: 0068 ss: 0068
Process smbd (pid: 8817, stackpage=d00f3000)
Stack: 00000000 e79ef970 de98bc00 000d4397 c9bdcec0 000d4397 de98bc00 c01808b1
de98bc00 000d4397 c9bdcec0 00000000 00000000 000d4397 d03def00 de98bc00
da7cfd80 f887b76d de98bc00 000d4397 00000000 00000000 f2775018 fffffff4
Call Trace: [<c01808b1>] iget4_locked [kernel] 0x61 (0xd00f3eac)
[<f887b76d>] ext3_lookup [ext3] 0x7d (0xd00f3ed4)
[<c01729cc>] real_lookup [kernel] 0xec (0xd00f3ef8)
[<c0172ffc>] link_path_walk [kernel] 0x45c (0xd00f3f18)
[<c0173549>] path_lookup [kernel] 0x39 (0xd00f3f58)
[<c0173899>] __user_walk [kernel] 0x49 (0xd00f3f68)
[<c016e6de>] sys_lstat64 [kernel] 0x2e (0xd00f3f84)
Code: 39 6b 28 89 de 75 f1 8b 44 24 20 39 83 ac 00 00 00 75 e5 8b
Kernel panic: Fatal exception
on a Proliant DL580 G2 (4 x Intel Xeon MP 2.0GHz, 8GB RAM, Smart Array RAID)
running RHEL 3U4 with latest updates.
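As a reading aid for the oops above, here is a minimal sketch that decodes the first instruction on the "Code:" line. It assumes the bytes on that line start at the faulting EIP, which is how 2.4 i386 kernels normally print them. The helper name `decode_cmp_rm32_r32` is hypothetical, and it handles only the single encoding seen here (opcode 0x39 with a register base and 8-bit displacement).

```python
# ModRM register numbering for 32-bit operands.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_cmp_rm32_r32(code, regs):
    """Decode `39 /r` (CMP r/m32, r32) in its [base + disp8] form.

    Hypothetical helper for illustration only; returns the instruction
    text and the effective address it would read.
    """
    opcode, modrm, disp8 = code[0], code[1], code[2]
    if opcode != 0x39:
        raise ValueError("only opcode 0x39 is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [reg + disp8] without a SIB byte is handled")
    # Sign-extend the 8-bit displacement.
    disp = disp8 - 0x100 if disp8 & 0x80 else disp8
    base = REGS[rm]
    addr = (regs[base] + disp) & 0xFFFFFFFF
    return f"cmp [{base}{disp:+#x}], {REGS[reg]}", addr


# First three bytes of the "Code:" line and the register values from the oops.
text, addr = decode_cmp_rm32_r32(
    bytes.fromhex("396b28"), {"ebx": 0x00000000, "ebp": 0x000D4397}
)
print(text, hex(addr))
```

With ebx at zero, the computed address is 0x28, which matches the reported faulting virtual address 00000028. That is consistent with find_inode following a NULL entry pointer while walking a list, although the oops alone does not show how the pointer became NULL.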
Version-Release number of selected component (if applicable):
Steps to Reproduce:
See attachment with full console output and SysRq M+T+P+W+U+B
Created attachment 111390
kernel panic +SysRq full console output
Hello, Juanjo. Can this problem be reproduced with an untainted kernel?
I guess the kernel is tainted because of the loading of HP's management modules (we
are using "hprsm" and "hprsm" rpms provided by HP).
Since we don't know how to reproduce this problem, we will disable the HP
management services and see if the server panics again.
Have you ever seen this problem before? How long was it running before the panic?
This is the first time we have seen this problem. The server had an uptime of 34 days.
Note that we are using HP management services on this server since 01/05/2005.
OK, I don't think there's much we can do here without more information. It
would be useful if you could enable netdump and capture a core file if the crash
recurs, as otherwise there's not much to go on here.
OK. I will try to configure netdump.
We have recently found a problem in the RHEL-3 kernels which is likely to be the
cause of this bug. Details are in bug 155289.
We have a test kernel built which should resolve this problem; x86 rpms can be
and these patches will be in RHEL-3 U6.
Given that you don't seem able to reproduce this, please let us know if you want
to try these kernels or simply wait for U6.
We are unable to reproduce this problem, so we will continue using the stable
kernels and wait for U6, thanks for your interest.
OK, I'll close the bug for now on the basis that we believe this problem will be
fixed in the next release. Please reopen if you need to pursue this further.
*** This bug has been marked as a duplicate of 155289 ***
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.