Bug 147022 - oops - Unable to handle kernel paging request at virtual address
Summary: oops - Unable to handle kernel paging request at virtual address
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Peter Martuccelli
QA Contact: Brian Brock
URL: acnlin80.pbn.bnl.gov
Whiteboard:
Depends On:
Blocks:
Reported: 2005-02-03 19:31 UTC by Sev Binello
Modified: 2007-11-30 22:07 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-02-07 21:59:06 UTC
Target Upstream Version:
Embargoed:


Description Sev Binello 2005-02-03 19:31:05 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.2)
Gecko/20040301

Description of problem:
Kernel 2.4.21-4.EL produced similar crash.
No known way to reproduce.
Will attach sample oops.
We have also experienced system hangs
that may or may not be related.

ACNLIN80 - CRASH INFO 2/1/03 (Newer Kernel)

nfs_statfs: statfs error = 116
Unable to handle kernel paging request at virtual address 00080006
printing eip:
f8cde44f
*pde = 307a9001
*pte = 3f398067
Oops: 0000
mvfs vnode nfsd nfs lockd sunrpc lp parport autofs4 e1000 floppy sg microcode
keybdev mousedev hid input usb-uhci usbcore ext3 jbd raid1 qla2300 qla2300_conf
CPU:    1
EFLAGS: 00010246

EIP is at mdki_memcmp [vnode] 0x11 (2.4.21-20.0.1.ELsmp/i686)
eax: e994190c   ebx: c9f1d834   ecx: 00000034   edx: 00000cb4
esi: e994190c   edi: 00080006   ebp: d8087e44   esp: d8087e3c
ds: 0068   es: 0068   ss: 0068
Process cp (pid: 12441, stackpage=d8087000)
Stack: e9941908 000aadcc d8087e60 f8cf73c2 e994190c 00080006 00000034
00000000
    e82c3c00 d8087e84 f8cf746f c9f1d834 e9941908 f648122c e9941908
00000000
    e82c3c00 00000000 d8087ebc f8cf6d13 e82c3c00 e9941908 d8087ea8
00000001

Code: f3 a6 0f 97 c0 0f 92 c2 5e 28 d0 0f be c0 5f c9 c3 55 89 e5



ACNLIN80 - CRASH INFO 12/3/04 (Old Kernel)

Unable to handle kernel paging request at virtual address 01000017
 printing eip:
0217d137
*pde = 00005001
*pte = 7e0000e3
Oops: 0000
ide-cd cdrom sg mvfs vnode nfsd nfs lockd sunrpc lp parport autofs
e1000 floppy
microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd raid1
qla2300 q
CPU:    2
EIP:    0060:[<0217d137>]    Tainted: PF
EFLAGS: 00010206

EIP is at iput [kernel] 0x37 (2.4.21-4.ELcustom)
eax: 00ffffff   ebx: ea0d8b00   ecx: ea0d8b10   edx: 159f1900
esi: 00ffffff   edi: e5f08800   ebp: 00000146   esp: b8393f24
ds: 0068   es: 0068   ss: 0068
Process umount (pid: 9707, stackpage=b8393000)
Stack: 00000000 0217a010 f8c9bac7 159f1918 159f1900 ea0d8b00 0217a4fa
ea0d8b00
       ea0d8b00 dd73c180 dd73c180 f8cb21a0 f8cb2390 0217a854 000001e3
c74bec00
       02168df4 dd73c180 0239ff68 00000000 b8393f8c 0804def8 feffb858
0217fe3f
Call Trace:   [<0217a010>] dput [kernel] 0x30 (0xb8393f28)
[<f8c9bac7>] nfs_dentry_iput [nfs] 0x57 (0xb8393f2c)
[<0217a4fa>] prune_dcache [kernel] 0x17a (0xb8393f3c)
[<f8cb21a0>] nfs_sops [nfs] 0x0 (0xb8393f50)
[<f8cb2390>] nfs_fs_type [nfs] 0x0 (0xb8393f54)
[<0217a854>] shrink_dcache_parent [kernel] 0x24 (0xb8393f58)
[<02168df4>] kill_super [kernel] 0x94 (0xb8393f64)
[<0217fe3f>] sys_umount [kernel] 0x3f (0xb8393f80)
[<021606ae>] filp_close [kernel] 0x8e (0xb8393f94)
[<0217feb7>] sys_oldumount [kernel] 0x17 (0xb8393fb4)

Code: Bad EIP value.

Kernel panic: Fatal exception


Version-Release number of selected component (if applicable):
kernel-2.4.21-20.0.1.EL

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. Will crash about once a month, but can't reproduce at will.

Additional info:

Comment 1 Ernie Petrides 2005-02-07 21:59:06 UTC
This crash occurred in mdki_memcmp(), which is part of a non-RHEL3
module (presumably one that tainted your kernel).  Thus, we expect
the problem to be within that module, and so you'd need to file a
bug with whoever provides/supports it.

Please feel free to reopen this bugzilla report if you can reproduce
the problem with an untainted kernel.
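
The triage step above (checking which module owns the faulting function) can be sketched in Python. This is only an illustration, assuming the 2.4-style oops layout seen in this report ("EIP is at <symbol> [<owner>] <offset> (<release>)" plus an optional "Tainted:" field). The function name summarize_oops and the dictionary keys are hypothetical, not part of any Red Hat tool.

```python
import re

# Assumed layout: "EIP is at <symbol> [<owner>] <offset> (<release>)".
EIP_LINE = re.compile(
    r"EIP is at (\S+) \[([^\]]+)\] (0x[0-9a-fA-F]+) \(([^)]+)\)"
)
# Taint flags appear on the "EIP: 0060:[<...>]    Tainted: PF" line.
TAINT = re.compile(r"Tainted:\s*(\S+)")


def summarize_oops(text):
    """Return the faulting symbol, its owner, and any taint flags.

    Hypothetical helper: returns None if no "EIP is at" line is found.
    """
    m = EIP_LINE.search(text)
    if m is None:
        return None
    symbol, owner, offset, release = m.groups()
    t = TAINT.search(text)
    return {
        "symbol": symbol,
        "owner": owner,
        "offset": int(offset, 16),
        "release": release,
        # Core kernel code is reported with the owner "kernel".
        "in_core_kernel": owner == "kernel",
        # None when the "Tainted:" line is absent from the capture.
        "taint": t.group(1) if t else None,
    }
```

Applied to the newer-kernel oops in the description, this reports the owner as "vnode" rather than "kernel", which is the basis for pointing at the third-party module.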


Comment 2 Sev Binello 2005-02-07 22:31:40 UTC
(In reply to comment #1)
> This crash occurred in mdki_memcmp(), which is part of a non-RHEL3
> module (presumably one that tainted your kernel).  Thus, we expect
> the problem to be within that module, and so you'd need to file a
> bug with whoever provides/supports it.
> 
> Please feel free to reopen this bugzilla report if you can reproduce
> the problem with an untainted kernel.
> 

That doesn't appear to be the case in the earlier crash.
What is the policy?
Does Red Hat only investigate problems when no other 3rd-party modules
are installed?

Comment 3 Ernie Petrides 2005-02-07 23:47:41 UTC
Sev, I don't understand why the line between "CPU:" and "EFLAGS:" is
missing in the output from your first crash.  But basically, if we
don't even have access to the source code that crashed, we aren't
able to debug the problem.  If the 3rd-party vendor can point us to
a bug in the core RHEL3 code, then we'd certainly be happy to fix it.


Comment 4 Sev Binello 2005-02-08 14:49:16 UTC
For completeness I include the output again.
For some reason the output going to syslogd from our Digi box
did not contain all the output present on the console.
I will contact IBM/Rational about their ClearCase module.

Putting aside this crash,
can you tell me anything at all about the other crash I also included?

Unable to handle kernel paging request at virtual address 00080006
printing eip:
f8cde44f
*pde = 307a9001
*pte = 3f398067
Oops: 0000
mvfs vnode nfsd nfs lockd sunrpc lp parport autofs4 e1000 floppy sg microcode
keybdev mousedev hid input usb-uhci usbcore ext3 jbd raid1 qla2300 qla2300_conf

CPU:    1
EIP:    0060:[<f8cde44f>]    Tainted: PF
EFLAGS: 00010246

EIP is at mdki_memcmp [vnode] 0x11 (2.4.21-20.0.1.ELsmp/i686)
eax: e994190c   ebx: c9f1d834   ecx: 00000034   edx: 00000cb4
esi: e994190c   edi: 00080006   ebp: d8087e44   esp: d8087e3c
ds: 0068   es: 0068   ss: 0068
Process cp (pid: 12441, stackpage=d8087000)
Stack: e9941908 000aadcc d8087e60 f8cf73c2 e994190c 00080006 00000034 00000000
      e82c3c00 d8087e84 f8cf746f c9f1d834 e9941908 f648122c e9941908 00000000
      e82c3c00 00000000 d8087ebc f8cf6d13 e82c3c00 e9941908 d8087ea8 00000001

Call Trace:   [<f8cf73c2>] mvfs_find_cred [mvfs] 0x32 (0xd8087e48)
[<f8cf746f>] mvfs_record_cred [mvfs] 0x8f (0xd8087e64)
[<f8cf6d13>] mfs_getcleartext [mvfs] 0x573 (0xd8087e88)
[<f8ceeacc>] mvfs_openv_ctx [mvfs] 0x2ec (0xd8087ec0)
[<f8d30060>] mvfs_vnodeops [mvfs] 0x0 (0xd8087f00)
[<f8d16dd6>] mvfs_linux_open_wrapper [mvfs] 0x16 (0xd8087f10)
[<f8cd6a3f>] vnode_fop_open [vnode] 0xb3 (0xd8087f2c)
[<c0162490>] dentry_open [kernel] 0x110 (0xd8087f54)
[<c0162378>] filp_open [kernel] 0x68 (0xd8087f70)
[<c0162783>] sys_open [kernel] 0x53 (0xd8087fa8)

Code: f3 a6 0f 97 c0 0f 92 c2 5e 28 d0 0f be c0 5f c9 c3 55 89 e5

Kernel panic: Fatal exception




Comment 5 Ernie Petrides 2005-02-09 01:02:59 UTC
> can you tell me anything at all about the other crash I also included ?

It looks like the "s_op" field of the (struct super_block) or the "i_sb"
field of the (struct inode) was bad while executing inside iput() under
NFS unmount handling.  This might be related to inode handling problems
within MVFS or an invalid attempt to unmount an NFS f/s being used by MVFS.

I haven't seen any reports similar to that, nor have there been any related
changes to fs/inode.c or fs/dcache.c since 2.4.21-4.EL that would address
such an issue.  But that release is well over a year old (there have been
4 updates released since then), so I would advise using a recent kernel.
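
The register dumps are consistent with the "bad pointer" reading above. In the old-kernel oops, the fault address 01000017 minus the value in eax/esi (00ffffff) is 0x18, which is what a read of a field 0x18 bytes into a structure reached through a bogus pointer would produce. In the newer oops, edi equals the fault address 00080006 exactly, and the code bytes begin with f3 a6 (repe cmpsb), which reads through esi and edi. A minimal Python sketch of that arithmetic follows. The function name candidate_bases and the max_offset cutoff are assumptions for illustration, and a match is only a hint, not proof of which structure was involved.

```python
import re

# Fault address as printed on the "Unable to handle kernel paging request" line.
FAULT = re.compile(r"virtual address ([0-9a-f]{8})")
# General-purpose registers as printed in the i386 oops register dump.
REGS = re.compile(r"\b(eax|ebx|ecx|edx|esi|edi|ebp|esp):\s*([0-9a-f]{8})")


def candidate_bases(text, max_offset=0x100):
    """List (register, offset) pairs where register + offset == fault address.

    Hypothetical heuristic: a small non-negative offset suggests the
    register held the pointer that was dereferenced.
    """
    m = FAULT.search(text)
    if m is None:
        return []
    fault = int(m.group(1), 16)
    hits = []
    for name, value in REGS.findall(text):
        delta = fault - int(value, 16)
        if 0 <= delta <= max_offset:
            hits.append((name, delta))
    return hits
```

Run over the old-kernel oops text, this lists eax and esi with offset 0x18. Run over the newer one, it lists edi with offset 0.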

Comment 6 Sev Binello 2005-02-09 14:44:36 UTC
Thanks for the info.
Will upgrade just in case.

