From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (X11; U; Linux 2.4.7-10 i686)

Description of problem:
The error is less frequent than on kernel 2.4.2-2smp, but it still occurs once in a while. Occasionally the system stops responding to anything and needs a hard reset. Current kernel: 2.4.9-13smp.

Version-Release number of selected component (if applicable):

How reproducible:
Couldn't Reproduce

Steps to Reproduce:
1. Unable to reproduce on demand, but it occurs quite frequently
2.
3.

Additional info:

A. uname -a
Linux linhost010 2.4.9-13smp #1 SMP Fri May 24 13:53:20 IST 2002 i686 unknown

B. Services (autofs, NIS)

C. Other software:
1. Rational's Clearcase
#cleartool -ver
ClearCase version 4.2 (2001A.04.00) (Mon Jul 02 16:33:34 EDT 2001)
clearcase_p4.2-17 (Fri May 24 11:04:02 EDT 2002)
clearcase_p4.2-18 (Fri May 24 10:59:27 EDT 2002)
@(#) MVFS version 4.2+ (Sun Mar 17 14:38:59 EST 2002)
cleartool V4.2 (Mon Jun 25 11:57:13 EDT 2001)
db_server V4.2+ (Sun Mar 31 12:39:18 EST 2002)
VOB database schema version: 53
2. LSF (load sharing tool)

D. MESSAGES (/var/log/messages)
Sep 24 12:44:27 linhost010 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000008
Sep 24 12:44:27 linhost010 kernel: printing eip:
Sep 24 12:44:27 linhost010 kernel: c01486a9
Sep 24 12:44:27 linhost010 kernel: *pde = 00000000
Sep 24 12:44:27 linhost010 kernel: Oops: 0000
Sep 24 12:44:27 linhost010 kernel: CPU: 1
Sep 24 12:44:27 linhost010 kernel: EIP: 0010:[path_walk+2089/2352] Not tainted
Sep 24 12:44:27 linhost010 kernel: EIP: 0010:[<c01486a9>] Not tainted
Sep 24 12:44:27 linhost010 kernel: EFLAGS: 00010246
Sep 24 12:44:27 linhost010 kernel: eax: 00000000 ebx: 00000000 ecx: c02dac38 edx: f4a4df9c
Sep 24 12:44:27 linhost010 kernel: esi: cfc969c0 edi: f4b253c0 ebp: f4a4df9c esp: f4a4df28
Sep 24 12:44:27 linhost010 kernel: ds: 0018 es: 0018 ss: 0018
Sep 24 12:44:27 linhost010 kernel: Process fuser (pid: 9849, stackpage=f4a4d000)
Sep 24 12:44:27 linhost010 kernel: Stack: 00000009 00000000 cfc969c0 000041ed 0000001f 00000282 00000000 00000000
Sep 24 12:44:27 linhost010 kernel: 00001000 fffffff4 c3038000 c3038000 00000003 00241703 0804ae84 00000000
Sep 24 12:44:27 linhost010 kernel: c3038000 f4a4df9c bfffd7a8 c0148d9a c3038000 f4a4df9c f4a4c000 00000295
Sep 24 12:44:27 linhost010 kernel: Call Trace: [__user_walk+58/96] __user_walk [kernel] 0x3a
Sep 24 12:44:27 linhost010 kernel: Call Trace: [<c0148d9a>] __user_walk [kernel] 0x3a
Sep 24 12:44:27 linhost010 kernel: [sys_stat64+19/112] sys_stat64 [kernel] 0x13
Sep 24 12:44:27 linhost010 kernel: [<c01454a3>] sys_stat64 [kernel] 0x13
Sep 24 12:44:27 linhost010 kernel: [system_call+51/56] system_call [kernel] 0x33
Sep 24 12:44:27 linhost010 kernel: [<c010719b>] system_call [kernel] 0x33
Sep 24 12:44:27 linhost010 kernel:
Sep 24 13:22:01 linhost010 syslogd 1.4.1: restart.
Sep 24 13:22:01 linhost010 syslog: syslogd startup succeeded
Sep 24 13:22:01 linhost010 kernel: klogd 1.4.1, log source = /proc/kmsg started.

There was no option but a hard reset. Please help me out.
Clearcase includes a binary-only kernel module and is, as a result, unsupported. I also recommend that you upgrade to the latest erratum kernel we released for 7.2 (2.4.9-34 right now), since quite a few bugs have been fixed.
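To check whether a host is still behind the recommended erratum kernel, the release string reported by `uname -r` can be compared against 2.4.9-34. The following is only a minimal sketch: it assumes release strings of the form seen in this report (for example `2.4.9-13smp`), the function names are hypothetical, and it does not implement full RPM version comparison.

```python
import re

# Assumption: release strings look like "2.4.9-13smp" or "2.4.9-34",
# i.e. three dotted numbers, a dash, and a numeric build, optionally
# followed by a flavour suffix such as "smp".
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)")


def parse_release(release):
    """Turn a release string into a tuple of integers for comparison."""
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def is_older_than(running, recommended):
    """Return True if the running kernel predates the recommended one."""
    return parse_release(running) < parse_release(recommended)


if __name__ == "__main__":
    # Values taken from this report: the reporter's kernel and the
    # erratum kernel named in the reply.
    print(is_older_than("2.4.9-13smp", "2.4.9-34"))
```

Comparing integer tuples rather than raw strings avoids the pitfall where "2.4.9-9" would sort after "2.4.9-34" lexically.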
I have this same exact problem.

Hardware: IBM x330 PIII 1.2GHz
RedHat 7.2
Rational Clearcase v4.2
LSF 4.10

Error message:
Unable to handle kernel NULL pointer dereference at virtual address 00000098
printing eip:
f8d0c094
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<f8d0c094>]
EFLAGS: 00010286
eax: f01478b4 ebx: 00000000 ecx: 00000000 edx: f01478b4
esi: f01478b4 edi: f0147800 ebp: d5b39cb8 esp: d5b39ca0
ds: 0018 es: 0018 ss: 0018
Process vlib (pid: 2227, stackpage=d5b39000)
Stack: 00000000 e6e017a0 d5b39d2c c99a1be0 c013f830 ed62c5e0 d5b39d48 f8cec010
00000000 e6e017a0 f01478b4 00000000 effa8da0 000000f0 d5b39cf8 f8d105ac
00000000 f0147800 00000000 f8d0bb32 c99a1be0 00000000 00000000 00000000
Call Trace: [path_release+16/48] [<f8cec010>] [<f8d105ac>] [<f8d0bb32>] [<f8d264fc>]
Call Trace: [<c013f830>] [<f8cec010>] [<f8d105ac>] [<f8d0bb32>] [<f8d264fc>]
[<f8d01ad4>] [<f8d25f84>] [<f8cec658>] [<f8cec7b1>] [<f8cf04b8>] [<f8ce6fae>]
[<f8ce6f88>] [<f8cf02da>] [<f8d0b1c1>] [<f8d0b1b7>] [<f8ce6bed>] [<f8d08988>]
[notify_change+94/288] [<f8d25fc8>] [do_truncate+107/160] [<f8d0a897>]
[lookup_hash+106/144] [open_namei+1109/1456]
[<c014a20e>] [<f8d25fc8>] [<c0133e0b>] [<f8d0a897>] [<c014063a>] [<c0140c75>]
[<f8cce04d>] [filp_open+54/96] [sys_open+54/176] [system_call+51/56]
[<f8cce04d>] [<c0134d16>] [<c0135006>] [<c0106f0b>]
Code: 83 bb 98 00 00 00 00 74 1a 8b 93 98 00 00 00 83 7a 34 00 74

I would like to add that ksymoops reports that this is indeed an issue with the mvfs.o kernel module.

Upgrading to the latest RedHat errata kernel is not an option for some sites, because the binary-only kernel module does checksumming before loading. So unless they carefully craft their kernel, it will not work. There have been reports of custom kernels being built and used with the mvfs.o module (after rebuilding it).
Specifically, one of our remote sites has these combinations working:

2.4.9-13 with Clearcase Patch Clearcase_p4.2-17 and Clearcase_p4.2-18
2.4.9-31 with Clearcase Patch Clearcase_p4.2-17 and Clearcase_p4.2-18

These kernel/mvfs module combinations work, and the bug does not seem to be present in them.

regards,
Ladd Hebert
mailto: lhebert.ti.com
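As a rough cross-check of the ksymoops result mentioned in this comment, the raw addresses in an oops can be split into those that fall inside the core kernel text and those that do not. The sketch below is illustrative only. It assumes a 32-bit x86 2.4 kernel whose text starts at 0xc0100000; the upper bound and the function name are hypothetical and are not taken from the report. An address outside that range is only a hint that a module may be involved; ksymoops with the matching System.map and module list remains the proper tool.

```python
import re

# Assumption: core kernel text on 32-bit x86 starts at 0xc0100000.
# The end value is a rough illustrative guess, not a measured figure.
KERNEL_TEXT_START = 0xC0100000
KERNEL_TEXT_END = 0xC0400000  # hypothetical upper bound

# Matches raw oops addresses of the form [<f8d0c094>].
ADDR_RE = re.compile(r"\[<([0-9a-fA-F]{8})>\]")


def split_trace_addresses(oops_text):
    """Return (core, other): addresses inside and outside core kernel text."""
    core, other = [], []
    for match in ADDR_RE.finditer(oops_text):
        addr = int(match.group(1), 16)
        if KERNEL_TEXT_START <= addr < KERNEL_TEXT_END:
            core.append(addr)
        else:
            other.append(addr)
    return core, other


if __name__ == "__main__":
    # A few entries copied from the oops in this comment.
    sample = "EIP: 0010:[<f8d0c094>] Call Trace: [<c013f830>] [<f8cec010>] [<f8d105ac>]"
    core, other = split_trace_addresses(sample)
    print("core kernel text:", [hex(a) for a in core])
    print("outside core text (possibly a module):", [hex(a) for a in other])
```

Here the faulting EIP itself lies outside the assumed core text range, which is consistent with the fault being inside a loaded module rather than the base kernel.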
the_end: your problem is that you are running an unsupported configuration. Configurations that use binary-only kernel modules are not supported for several reasons; one of them is that you can't move to a fixed kernel.