Description of problem:
We got a Kernel panic (2.4.21-27.0.2.ELsmp):

----------8<--------------------
Unable to handle kernel NULL pointer dereference at virtual address 00000028
printing eip:
c0180518
*pde = 29aa5001
*pte = 431cc067
Oops: 0000
cpqci soundcore usbserial lp parport autofs4 tg3 ipt_state ip_conntrack ipt_REJECT iptable_filter ip_tables floppy sg microcode loop keybdev mousedev hid inpu
CPU: 3
EIP: 0060:[<c0180518>] Tainted: P
EFLAGS: 00010203
EIP is at find_inode [kernel] 0x28 (2.4.21-27.0.2.ELsmp/i686)
eax: 00000000 ebx: 00000000 ecx: 0007ffff edx: c9aa2480
esi: 00000000 edi: c9bdcec0 ebp: 000d4397 esp: d00f3e90
ds: 0068 es: 0068 ss: 0068
Process smbd (pid: 8817, stackpage=d00f3000)
Stack: 00000000 e79ef970 de98bc00 000d4397 c9bdcec0 000d4397 de98bc00 c01808b1
       de98bc00 000d4397 c9bdcec0 00000000 00000000 000d4397 d03def00 de98bc00
       da7cfd80 f887b76d de98bc00 000d4397 00000000 00000000 f2775018 fffffff4
Call Trace:
[<c01808b1>] iget4_locked [kernel] 0x61 (0xd00f3eac)
[<f887b76d>] ext3_lookup [ext3] 0x7d (0xd00f3ed4)
[<c01729cc>] real_lookup [kernel] 0xec (0xd00f3ef8)
[<c0172ffc>] link_path_walk [kernel] 0x45c (0xd00f3f18)
[<c0173549>] path_lookup [kernel] 0x39 (0xd00f3f58)
[<c0173899>] __user_walk [kernel] 0x49 (0xd00f3f68)
[<c016e6de>] sys_lstat64 [kernel] 0x2e (0xd00f3f84)
Code: 39 6b 28 89 de 75 f1 8b 44 24 20 39 83 ac 00 00 00 75 e5 8b
Kernel panic: Fatal exception
----------8<--------------------

on a Proliant DL580 G2 (4 x Intel Xeon MP 2.0GHz, 8GB RAM, Smart Array RAID) running RHEL 3U4 with latest updates.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.21-27.0.2.EL

How reproducible:
don't know

Steps to Reproduce:
n/a

Actual results:
n/a

Expected results:
n/a

Additional info:
See attachment with full console output and SysRq M+T+P+W+U+B
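
For anyone triaging a similar report, the oops above can be summarised mechanically. The following is only a minimal sketch, assuming the 2.4-style trace format shown in this report; the helper names `parse_frames` and `fault_address` are hypothetical and not part of any Red Hat tool, and the sample text is an excerpt of the oops in this bug.

```python
import re

# Excerpt of the oops from this report (not the full console output).
OOPS = """\
Unable to handle kernel NULL pointer dereference at virtual address 00000028
EIP is at find_inode [kernel] 0x28 (2.4.21-27.0.2.ELsmp/i686)
eax: 00000000 ebx: 00000000 ecx: 0007ffff edx: c9aa2480
Call Trace: [<c01808b1>] iget4_locked [kernel] 0x61 (0xd00f3eac)
[<f887b76d>] ext3_lookup [ext3] 0x7d (0xd00f3ed4)
[<c01729cc>] real_lookup [kernel] 0xec (0xd00f3ef8)
"""

# One call-trace entry: "[<address>] symbol [module] 0xoffset".
FRAME = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)")


def parse_frames(text):
    """Return (symbol, module, offset) for each call-trace entry, in order."""
    return [(sym, mod, int(off, 16)) for _addr, sym, mod, off in FRAME.findall(text)]


def fault_address(text):
    """Return the faulting virtual address as an int, or None if absent."""
    m = re.search(r"virtual address ([0-9a-f]{8})", text)
    return int(m.group(1), 16) if m else None


if __name__ == "__main__":
    print(hex(fault_address(OOPS)))
    for sym, mod, off in parse_frames(OOPS):
        print(f"{sym} [{mod}] +{off:#x}")
```

Run against the full console output, this gives a compact list of the functions on the stack, which makes it easier to compare this panic with other reports.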
Created attachment 111390 [details] kernel panic +SysRq full console output
Hello, Juanjo. Can this problem be reproduced with an untainted kernel?
Hi Ernie, I guess the kernel is tainted because of the loading of HP's management modules (we are using the "hprsm" and "hprsm" rpms provided by HP). Since we don't know how to reproduce this problem, we will disable the HP management services and see if the server panics again. Regards.
Have you ever seen this problem before? How long was it running before the panic?
This is the first time we have seen this problem. The server had an uptime of 34 days. Note that we have been using HP management services on this server since 01/05/2005.
OK, I don't think there's much we can do here without more information. It would be useful if you could enable netdump and capture a core file if the crash recurs, as otherwise there's not much to go on here.
OK. I will try to configure netdump.
We have recently found a problem in the RHEL-3 kernels which is likely to be the cause of this bug. Details are in bug 155289. We have a test kernel built which should resolve this problem; x86 rpms can be downloaded at

http://people.redhat.com/~petrides/.pte_race/kernel-hugemem-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-hugemem-unsupported-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-smp-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-smp-unsupported-2.4.21-32.4.EL.i686.rpm
http://people.redhat.com/~petrides/.pte_race/kernel-source-2.4.21-32.4.EL.i386.rpm

and these patches will be in RHEL-3 U6. Given that you don't seem able to reproduce this, please let us know if you want to try these kernels or simply wait for U6.
We are unable to reproduce this problem, so we will continue using the stable kernels and wait for U6. Thanks for your interest. Regards.
OK, I'll close the bug for now on the basis that we believe this problem will be fixed in the next release. Please reopen if you need to pursue this further before then.
*** This bug has been marked as a duplicate of 155289 ***
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html