Bug 167839 - kernel crashes with an Oops
Status: CLOSED DUPLICATE of bug 175216
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium Severity: high
Assigned To: Dave Anderson
QA Contact: Brian Brock
Depends On:
Blocks: RHEL3U8CanFix
Reported: 2005-09-08 15:04 EDT by Sev Binello
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2006-01-20 15:17:05 EST

Attachments
sysreport info (2.13 MB, application/x-bzip2)
2005-09-08 15:21 EDT, Sev Binello

Description Sev Binello 2005-09-08 15:04:37 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050719 Red Hat/1.7.10-1.1.3.1

Description of problem:
Kernel crashes with the following Oops info...

Sep  4 11:15:18 VFS: Busy inodes after unmount. Self-destruct in 5 seconds.  Have a nice day...
Sep  4 11:21:49 Unable to handle kernel paging request at virtual address a16cc79a
Sep  4 11:21:49  printing eip:
Sep  4 11:21:49 c0181257
Sep  4 11:21:49 *pde = 0804e000
Sep  4 11:21:49 Oops: 0000
Sep  4 11:21:49 ide-cd cdrom nfs nfsd lockd sunrpc usbserial lp parport autofs4 e1000 floppy sg 
Sep  4 11:21:49 microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd raid1 qla2300 q
Sep  4 11:21:49 CPU:    1
Sep  4 11:21:49 EIP:    0060:[<c0181257>]    Not tainted
Sep  4 11:21:49 EFLAGS: 00010286
Sep  4 11:21:49 EIP is at iput [kernel] 0x37 (2.4.21-32.0.1.ELsmp/i686)
Sep  4 11:21:49 eax: a16cc782   ebx: e7428a80   ecx: e7428a90   edx: f3dce400
Sep  4 11:21:50 esi: a16cc782   edi: ea56dc00   ebp: 0000c9ba   esp: c4cbdf6c
Sep  4 11:21:50 ds: 0068   es: 0068   ss: 0068
Sep  4 11:21:50 Process kswapd (pid: 11, stackpage=c4cbd000)
Sep  4 11:21:50 Stack: caa77300 c017df70 f8cd4ae7 f3dce418 f3dce400 e7428a80 c017e47a e7428a80 
Sep  4 11:21:50        e7428a80 c03a7b00 00000cfb 00000000 00000040 c017e848 000185a4 00000000 
Sep  4 11:21:50        c0157000 00000006 000001d0 00000014 00000000 00000000 00001a61 00000000 
Sep  4 11:21:50 Call Trace:   [<c017df70>] dput [kernel] 0x30 (0xc4cbdf70)
Sep  4 11:21:50 [<f8cd4ae7>] nfs_dentry_iput [nfs] 0x57 (0xc4cbdf74)
Sep  4 11:21:50 [<c017e47a>] prune_dcache [kernel] 0x18a (0xc4cbdf84)
Sep  4 11:21:50 [<c017e848>] shrink_dcache_memory [kernel] 0x68 (0xc4cbdfa0)
Sep  4 11:21:50 [<c0157000>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xc4cbdfac)
Sep  4 11:21:50 [<c01571c8>] kswapd [kernel] 0x68 (0xc4cbdfd0)
Sep  4 11:21:50 [<c0157160>] kswapd [kernel] 0x0 (0xc4cbdfe4)
Sep  4 11:21:50 [<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xc4cbdff0)
Sep  4 11:21:51 Code: 8b 46 18 85 c0 0f 85 d1 02 00 00 c7 44 24 04 1c c5 3a c0 8d
Sep  4 11:21:51 Kernel panic: Fatal exception
Sep  4 11:22:51 Rebooting in 60 seconds..

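As a rough cross-check of the Oops above (a sketch added for illustration, not part of the original report): the leading bytes of the Code: line, 8b 46 18, are the 32-bit x86 encoding of mov 0x18(%esi),%eax, so the faulting address would be expected to equal esi + 0x18. The Python snippet below checks that against the logged values. KERNEL_BASE assumes the common 3G/1G address split; the report does not state the split, so treat that part as an assumption.

```python
# Values copied from the Oops in this report (kernel 2.4.21-32.0.1.ELsmp).
ESI = 0xA16CC782         # "esi: a16cc782"
FAULT_ADDR = 0xA16CC79A  # "virtual address a16cc79a"

# "8b 46 18" decodes to: mov 0x18(%esi),%eax  (a load from esi + 0x18)
DISPLACEMENT = 0x18

# Assumption: 3G/1G split, so kernel virtual addresses start at 0xC0000000.
KERNEL_BASE = 0xC0000000

fault_matches_load = (ESI + DISPLACEMENT) == FAULT_ADDR
esi_in_kernel_range = ESI >= KERNEL_BASE

print(f"fault address == esi + 0x18: {fault_matches_load}")
print(f"esi looks like a kernel pointer: {esi_in_kernel_range}")
```

Under those assumptions, the load that faulted used a pointer value in esi that lies outside the kernel address range, which is at least consistent with iput following a bad pointer rather than a valid one.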


Version-Release number of selected component (if applicable):
2.4.21-32.0.1.ELsmp

How reproducible:
Couldn't Reproduce


Additional info:

The problem seems similar to bug 167385, but that report is against a 2.6 kernel
and has received no responses.
Comment 1 Sev Binello 2005-09-08 15:21:42 EDT
Created attachment 118605
sysreport info
Comment 2 Larry Woodman 2005-09-09 09:08:26 EDT
This appears to be corruption of the inode cache.  Is this reproducible and if
so, is the customer willing to run a debug kernel with slab debugging enabled?

Larry Woodman
Comment 3 Sev Binello 2005-09-09 09:46:10 EDT
No, I can't intentionally reproduce it.

We are willing to assist.
Let me know what needs to be done,
and what the impact might be.
Keep in mind this is a production system,
and that we may have to run it in debug for a while
before another crash.
I don't know what "slab" debugging is.
Comment 4 Larry Woodman 2005-09-30 15:04:46 EDT
Sev, can you try to reproduce this problem with the RHEL3-U6 kernel?
We have multiple fixes in that kernel that could prevent inode cache 
corruption.

Larry Woodman
Comment 5 Sev Binello 2005-09-30 16:53:01 EDT
Well, I can't reproduce it even now.
But I guess this means we should upgrade.
Comment 8 Ernie Petrides 2005-10-10 17:52:48 EDT
A fix for this problem was committed to the RHEL3 U6 patch pool
on 13-May-2005 (in kernel version 2.4.21-32.4.EL).

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-663.html


*** This bug has been marked as a duplicate of 155289 ***
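To make the version relationship in this comment concrete: the kernel in the original report (2.4.21-32.0.1.ELsmp) is older than the 2.4.21-32.4.EL build named as carrying the fix. The Python sketch below compares only the numeric fields of the version strings. It is a simplification and not RPM's actual comparison algorithm, and release_key is a hypothetical helper name introduced here for illustration.

```python
import re


def release_key(kernel: str) -> tuple:
    """Split 'VERSION-RELEASE' and keep only the numeric fields of each part.

    Example: '2.4.21-32.0.1.ELsmp' -> ((2, 4, 21), (32, 0, 1)).
    Simplified on purpose: suffixes such as 'EL' or 'ELsmp' are ignored.
    """
    version, _, release = kernel.partition("-")

    def to_ints(text: str) -> tuple:
        return tuple(int(n) for n in re.findall(r"\d+", text))

    return to_ints(version), to_ints(release)


reported = "2.4.21-32.0.1.ELsmp"  # kernel in the original report
fix_build = "2.4.21-32.4.EL"      # build named in this comment

has_fix = release_key(reported) >= release_key(fix_build)
print(f"{reported} includes the {fix_build} fix: {has_fix}")
```

With these inputs the comparison reports that the reported kernel predates the fix build, which matches the advice to move to U6.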
Comment 9 Sev Binello 2006-01-19 17:31:16 EST
We have had a similar crash on a different server even after
going to U6. Please see bug 177451.
Comment 10 Dave Anderson 2006-01-20 15:17:05 EST

*** This bug has been marked as a duplicate of 177451 ***
Comment 11 Ernie Petrides 2006-02-23 16:18:32 EST
A fix for this problem was committed to the RHEL3 U8 patch pool
on 17-Feb-2006 (in kernel version 2.4.21-40.2.EL).


*** This bug has been marked as a duplicate of 175216 ***
Comment 12 Ernie Petrides 2006-04-28 17:50:41 EDT
Adding a couple dozen bugs to CanFix list so I can complete the stupid advisory.
Comment 13 Sev Binello 2006-05-09 10:42:47 EDT
The bug still seems to be present, even with the hotfix kernel 2.4.21-40.2.ELsmp.

VFS: Busy inodes after unmount. Self-destruct in 5 seconds.  Have a nice day...

Unable to handle kernel paging request at virtual address 5069c79a
 printing eip:
c0182097
*pde = 00000000
Oops: 0000
soundcore ide-cd cdrom nfs nfsd lockd usbserial lp parport netconsole mvfs vnode
sunrpc autofs4 e1000 floppy sg microcode keybdev mousedev hid input usb-uhci
CPU:    0
EIP:    0060:[<c0182097>]    Tainted: PF
EFLAGS: 00013206
 
EIP is at iput [kernel] 0x37 (2.4.21-40.2.ELsmp/i686)
eax: 5069c782   ebx: dd7de900   ecx: dd7de910   edx: cb7d8c00
esi: 5069c782   edi: cd7dd800   ebp: cd7dd800   esp: f7f0ff6c
ds: 0068   es: 0068   ss: 0068
Process kswapd (pid: 11, stackpage=f7f0f000)
Stack: 00000003 f7e25f98 f8e7aae7 cb7d8c18 cb7d8c00 dd7de900 c017f05a dd7de900
       dd7de900 c03aac00 00003281 00000000 00000040 c017f568 0000eb19 00000000
       c01577f0 00000006 000001d0 00000014 00000000 00000000 0000652d 00000000
Call Trace:   [<f8e7aae7>] nfs_dentry_iput [nfs] 0x57 (0xf7f0ff74)
[<c017f05a>] prune_dcache [kernel] 0x1ca (0xf7f0ff84)
[<c017f568>] shrink_dcache_memory [kernel] 0x68 (0xf7f0ffa0)
[<c01577f0>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xf7f0ffac)
[<c01579b8>] kswapd [kernel] 0x68 (0xf7f0ffd0)
[<c0157950>] kswapd [kernel] 0x0 (0xf7f0ffe4)
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf7f0fff0)
 
Code: 8b 46 18 85 c0 0f 85 d1 02 00 00 c7 44 24 04 1c f6 3a c0 8d
 
CPU#0 is executing netdump.
CPU#1 is frozen.
CPU#2 is frozen.
CPU#3 is frozen.
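For comparison with the Oops in the original description, the call traces from both crashes can be parsed and lined up. The Python sketch below is illustrative only: the regular expression is written for the 2.4-style trace lines quoted in this bug, and the trace text is copied from the two reports with the syslog timestamps removed.

```python
import re

# Matches lines such as: [<c017e47a>] prune_dcache [kernel] 0x18a (0xc4cbdf84)
TRACE_RE = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)")

FIRST = """\
Call Trace:   [<c017df70>] dput [kernel] 0x30 (0xc4cbdf70)
[<f8cd4ae7>] nfs_dentry_iput [nfs] 0x57 (0xc4cbdf74)
[<c017e47a>] prune_dcache [kernel] 0x18a (0xc4cbdf84)
[<c017e848>] shrink_dcache_memory [kernel] 0x68 (0xc4cbdfa0)
[<c0157000>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xc4cbdfac)
[<c01571c8>] kswapd [kernel] 0x68 (0xc4cbdfd0)
[<c0157160>] kswapd [kernel] 0x0 (0xc4cbdfe4)
[<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xc4cbdff0)
"""

SECOND = """\
Call Trace:   [<f8e7aae7>] nfs_dentry_iput [nfs] 0x57 (0xf7f0ff74)
[<c017f05a>] prune_dcache [kernel] 0x1ca (0xf7f0ff84)
[<c017f568>] shrink_dcache_memory [kernel] 0x68 (0xf7f0ffa0)
[<c01577f0>] do_try_to_free_pages_kswapd [kernel] 0x150 (0xf7f0ffac)
[<c01579b8>] kswapd [kernel] 0x68 (0xf7f0ffd0)
[<c0157950>] kswapd [kernel] 0x0 (0xf7f0ffe4)
[<c01095cd>] kernel_thread_helper [kernel] 0x5 (0xf7f0fff0)
"""


def symbols(trace: str) -> list:
    """Return the symbol names from a call trace, in the order printed."""
    return [symbol for _addr, symbol, _module, _offset in TRACE_RE.findall(trace)]


first_syms = symbols(FIRST)
second_syms = symbols(SECOND)

# Apart from the leading dput entry, the two traces list the same functions.
same_path = first_syms[1:] == second_syms
print("same path from nfs_dentry_iput down to kswapd:", same_path)
```

Both crashes reach iput from nfs_dentry_iput while kswapd is pruning the dcache. The extra dput entry at the top of the first trace may simply be a leftover stack word rather than a live frame, but that is a guess and not something the logs confirm.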
Comment 14 Dave Anderson 2006-05-09 13:38:48 EDT
What's tainting the kernel?
Comment 15 Sev Binello 2006-05-09 14:17:53 EDT
We have an IBM (Rational) ClearCase module installed.