From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050719 Red Hat/1.7.10-1.1.3.1

Description of problem:
Kernel crashes with the following Oops info...

Sep 4 11:15:18 VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
Sep 4 11:15:18
Sep 4 11:21:49 Unable to handle kernel paging request at virtual address a16cc79a
Sep 4 11:21:49 printing eip:
Sep 4 11:21:49 c0181257
Sep 4 11:21:49 *pde = 0804e000
Sep 4 11:21:49 Oops: 0000
Sep 4 11:21:49 ide-cd cdrom nfs nfsd lockd sunrpc usbserial lp parport autofs4 e1000 floppy sg
Sep 4 11:21:49 microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd raid1 qla2300 q
Sep 4 11:21:49 CPU: 1
Sep 4 11:21:49 EIP: 0060:[<c0181257>] Not tainted
Sep 4 11:21:49 EFLAGS: 00010286
Sep 4 11:21:49
Sep 4 11:21:49
Sep 4 11:21:49 EIP is at iput [kernel] 0x37 (2.4.21-32.0.1.ELsmp/i686)
Sep 4 11:21:49 eax: a16cc782 ebx: e7428a80 ecx: e7428a90 edx: f3dce400
Sep 4 11:21:50 esi: a16cc782 edi: ea56dc00 ebp: 0000c9ba esp: c4cbdf6c
Sep 4 11:21:50 ds: 0068 es: 0068 ss: 0068
Sep 4 11:21:50 Process kswapd (pid: 11, stackpage=c4cbd000)
Sep 4 11:21:50 Stack: caa77300 c017df70 f8cd4ae7 f3dce418 f3dce400 e7428a80 c017e47a e7428a80
Sep 4 11:21:50
Sep 4 11:21:50 e7428a80 c03a7b00 00000cfb 00000000 00000040 c017e848 000185a4 00000000
Sep 4 11:21:50
Sep 4 11:21:50 c0157000 00000006 000001d0 00000014 00000000 00000000 00001a61 00000000
Sep 4 11:21:50
Sep 4 11:21:50 Call Trace: [<c017df70>] dput [kernel] 0x30 (0xc4cbdf70)
Sep 4 11:21:50 [<f8cd4ae7>] nfs_dentry_iput [nfs] 0x57 (0xc4cbdf74)
Sep 4 11:21:50 [<c017e47a>] prune_dcache [kernel] 0x18a (0xc4cbdf84)
Sep 4 11:21:50 [<c017e848>] shrink_dcache_memory [kernel] 0x68 (0xc4cbdfa0)
Sep 4 11:21:50 [<c0157000>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xc4cbdfac)
Sep 4 11:21:50 [<c01571c8>] kswapd [kernel] 0x68 (0xc4cbdfd0)
Sep 4 11:21:50 [<c0157160>] kswapd [kernel] 0x0 (0xc4cbdfe4)
Sep 4 11:21:50 [<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xc4cbdff0)
Sep 4 11:21:50
Sep 4 11:21:51 Code: 8b 46 18 85 c0 0f 85 d1 02 00 00 c7 44 24 04 1c c5 3a c0 8d
Sep 4 11:21:51
Sep 4 11:21:51 Kernel panic: Fatal exception
Sep 4 11:21:51
Sep 4 11:22:51 Rebooting in 60 seconds..

Version-Release number of selected component (if applicable):
2.4.21-32.0.1.ELsmp

How reproducible:
Couldn't Reproduce

Additional info:
Problem seems similar to bug 167385, but that is with a 2.6 kernel. No responses noted for that bug.
Created attachment 118605 [details] sysreport info
This appears to be corruption of the inode cache. Is this reproducible and, if so, is the customer willing to run a debug kernel with slab debugging enabled? Larry Woodman
No, I can't intentionally reproduce it. We are willing to assist. Let me know what needs to be done, and what the impact might be. Keep in mind this is a production system, and that we may have to run it in debug for a while before another crash. I don't know what "slab" debugging is.
Sev, can you try to reproduce this problem with the RHEL3-U6 kernel? We have multiple fixes in that kernel that could prevent inode cache corruption. Larry Woodman
Well, I can't reproduce it even now. But I guess this means we should upgrade.
A fix for this problem was committed to the RHEL3 U6 patch pool on 13-May-2005 (in kernel version 2.4.21-32.4.EL). An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2005-663.html *** This bug has been marked as a duplicate of 155289 ***
We have had a similar crash on a different server even after going to U6. Please see bug# 177451.
*** This bug has been marked as a duplicate of 177451 ***
A fix for this problem was committed to the RHEL3 U8 patch pool on 17-Feb-2006 (in kernel version 2.4.21-40.2.EL). *** This bug has been marked as a duplicate of 175216 ***
Adding a couple dozen bugs to CanFix list so I can complete the stupid advisory.
Seems bug is still around even with hot fix kernel 2.4.21-40.2.ELsmp

VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
Unable to handle kernel paging request at virtual address 5069c79a
printing eip:
c0182097
*pde = 00000000
Oops: 0000
soundcore ide-cd cdrom nfs nfsd lockd usbserial lp parport netconsole mvfs vnode sunrpc autofs4 e1000 floppy sg microcode keybdev mousedev hid input usb-uhci
CPU: 0
EIP: 0060:[<c0182097>] Tainted: PF
EFLAGS: 00013206
EIP is at iput [kernel] 0x37 (2.4.21-40.2.ELsmp/i686)
eax: 5069c782 ebx: dd7de900 ecx: dd7de910 edx: cb7d8c00
esi: 5069c782 edi: cd7dd800 ebp: cd7dd800 esp: f7f0ff6c
ds: 0068 es: 0068 ss: 0068
Process kswapd (pid: 11, stackpage=f7f0f000)
Stack: 00000003 f7e25f98 f8e7aae7 cb7d8c18 cb7d8c00 dd7de900 c017f05a dd7de900
dd7de900 c03aac00 00003281 00000000 00000040 c017f568 0000eb19 00000000
c01577f0 00000006 000001d0 00000014 00000000 00000000 0000652d 00000000
Call Trace: [<f8e7aae7>] nfs_dentry_iput [nfs] 0x57 (0xf7f0ff74)
[<c017f05a>] prune_dcache [kernel] 0x1ca (0xf7f0ff84)
[<c017f568>] shrink_dcache_memory [kernel] 0x68 (0xf7f0ffa0)
[<c01577f0>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xf7f0ffac)
[<c01579b8>] kswapd [kernel] 0x68 (0xf7f0ffd0)
[<c0157950>] kswapd [kernel] 0x0 (0xf7f0ffe4)
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf7f0fff0)
Code: 8b 46 18 85 c0 0f 85 d1 02 00 00 c7 44 24 04 1c f6 3a c0 8d
CPU#0 is executing netdump. CPU#1 is frozen. CPU#2 is frozen. CPU#3 is frozen.
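
A quick cross-check of the two oopses, offered as a sketch and not as a diagnosis: both crash in iput at offset 0x37, and in both the faulting virtual address appears to be esi plus a small constant. The snippet below takes the register values and the first three bytes of the "Code:" line from the reports in this thread, decodes those bytes as the x86 instruction `mov disp8(%esi),%eax`, and checks that the displacement matches the gap between esi and the faulting address. The helper names (`decode_mov_disp8`, `fault_offsets`) are made up for illustration. Which structure field lives at that offset is not stated anywhere in this thread, so the sketch makes no claim about it.

```python
# Values copied from the two oops reports in this thread.
OOPSES = {
    "2.4.21-32.0.1.ELsmp": {"esi": 0xA16CC782, "fault": 0xA16CC79A},
    "2.4.21-40.2.ELsmp": {"esi": 0x5069C782, "fault": 0x5069C79A},
}

# First three bytes of the "Code:" line, identical in both reports.
CODE = bytes.fromhex("8b4618")


def decode_mov_disp8(code):
    """Return the signed 8-bit displacement of `mov disp8(%esi),%eax`.

    Only this one encoding is handled: opcode 0x8b (mov r32, r/m32) with
    ModRM byte 0x46 (mod=01, reg=eax, rm=esi). Anything else is rejected.
    """
    if len(code) < 3 or code[0] != 0x8B or code[1] != 0x46:
        raise ValueError("not mov disp8(%esi),%eax")
    disp = code[2]
    # disp8 is sign-extended by the CPU.
    return disp - 0x100 if disp >= 0x80 else disp


def fault_offsets(oopses):
    """Map each kernel version to (faulting address - esi)."""
    return {k: v["fault"] - v["esi"] for k, v in oopses.items()}


disp = decode_mov_disp8(CODE)
offsets = fault_offsets(OOPSES)
for kernel, off in offsets.items():
    print(f"{kernel}: fault = esi + {off:#x}, instruction displacement = {disp:#x}")
```

In both reports the difference is 0x18, the same as the displacement in the instruction, and the esi value is not 4-byte aligned. That is consistent with iput reading through a bad pointer in both crashes, which fits the earlier suggestion of inode cache corruption but does not by itself identify the cause.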
What's tainting the kernel?
We have an IBM (Rational) ClearCase module installed.
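
For reference, a small sketch of how the taint field in the two oops headers can be read. The first report says "Not tainted" and the second says "Tainted: PF". The letter meanings below come from general knowledge of 2.4-series kernels, not from this thread, so treat them as an assumption; `explain_taint` is a hypothetical helper name.

```python
# Taint letters printed in 2.4-series oops headers. Meanings are given from
# general kernel knowledge, not from this thread (assumption).
TAINT_MEANINGS = {
    "P": "a proprietary (non-GPL) module has been loaded",
    "F": "a module was force-loaded",
}


def explain_taint(eip_line):
    """Return [(letter, meaning), ...] for the taint field of an oops EIP line."""
    if "Not tainted" in eip_line:
        return []
    marker = "Tainted:"
    if marker not in eip_line:
        raise ValueError("no taint field found")
    fields = eip_line.split(marker, 1)[1].split()
    letters = fields[0] if fields else ""
    return [(c, TAINT_MEANINGS.get(c, "unknown flag")) for c in letters]


first = explain_taint("EIP: 0060:[<c0181257>] Not tainted")
second = explain_taint("EIP: 0060:[<c0182097>] Tainted: PF")
print(first)
for letter, meaning in second:
    print(letter, "=", meaning)
```

A proprietary module such as the ClearCase one would account for the P flag. The thread does not say which module was force-loaded.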