Bug 97467

Summary: kernel 2.4.20-13.7 crash
Product: [Retired] Red Hat Linux
Reporter: Malcolm Amir Hussain-Gambles <malcolm>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 7.2
CC: sct
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2003-06-18 08:45:03 UTC

Description Malcolm Amir Hussain-Gambles 2003-06-16 12:43:38 UTC
Description of problem:
The system hangs and has to be power-cycled to recover.

Version-Release number of selected component (if applicable):
kernel 2.4.20-13.7

How reproducible:
Unsure. The system seems to hang only when live, possibly under network load.
It runs live for about a week and then hangs.

Steps to Reproduce:
1.
2.
3.
    
Actual results:
System crash

Expected results:
No crash!

Additional info:
Jun 13 04:02:04 hfxcoplo1 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000034
Jun 13 04:02:04 hfxcoplo1 kernel:  printing eip:
Jun 13 04:02:04 hfxcoplo1 kernel: c0149fa4
Jun 13 04:02:04 hfxcoplo1 kernel: *pde = 00000000
Jun 13 04:02:04 hfxcoplo1 kernel: Oops: 0000
Jun 13 04:02:04 hfxcoplo1 kernel: w83781d i2c-proc i2c-piix4 dmi_scan i2c-core sd_mod scsi_mod ide-cd cdrom 3c59x ext3 jbd
Jun 13 04:02:04 hfxcoplo1 kernel: CPU:    0
Jun 13 04:02:05 hfxcoplo1 kernel: EIP:    0010:[d_lookup+96/252]    Not tainted
Jun 13 04:02:05 hfxcoplo1 kernel: EIP:    0010:[<c0149fa4>]    Not tainted
Jun 13 04:02:05 hfxcoplo1 kernel: EFLAGS: 00010207
Jun 13 04:02:05 hfxcoplo1 kernel: 
Jun 13 04:02:05 hfxcoplo1 kernel: EIP is at d_lookup [kernel] 0x60 (2.4.20-13.7)
Jun 13 04:02:05 hfxcoplo1 kernel: eax: cff80000   ebx: fffffff0   ecx: 0000000f   edx: b26fb5d3
Jun 13 04:02:05 hfxcoplo1 kernel: esi: b26fb5d3   edi: c3962110   ebp: 00000000   esp: cb9f9d70
Jun 13 04:02:05 hfxcoplo1 kernel: ds: 0018   es: 0018   ss: 0018
Jun 13 04:02:05 hfxcoplo1 kernel: Process updatedb (pid: 13831, stackpage=cb9f9000)
Jun 13 04:02:11 hfxcoplo1 kernel: Stack: cffa2540 cc50d000 b26fb5d3 00000018 cb9f9f48 b26fb5d3 c3962110 cb9f9f00
Jun 13 04:02:11 hfxcoplo1 kernel:        c01416d6 cbe85210 cb9f9f00 cb9f9f48 c0141ea8 cbe85210 cb9f9f00 00000000
Jun 13 04:02:11 hfxcoplo1 kernel:        00000008 00000000 cc50d018 00000000 00000000 cb9f9e0c 00080563 c6743cc0
Jun 13 04:02:11 hfxcoplo1 kernel: Call Trace:   [cached_lookup+14/72] cached_lookup [kernel] 0xe (0xcb9f9d90))
Jun 13 04:02:11 hfxcoplo1 kernel: Call Trace:   [<c01416d6>] cached_lookup [kernel] 0xe (0xcb9f9d90))
Jun 13 04:02:11 hfxcoplo1 kernel: [link_path_walk+1532/2208] link_path_walk [kernel] 0x5fc (0xcb9f9da0))
Jun 13 04:02:11 hfxcoplo1 kernel: [<c0141ea8>] link_path_walk [kernel] 0x5fc (0xcb9f9da0))
Jun 13 04:02:12 hfxcoplo1 kernel: [3c59x:__insmod_3c59x_O/lib/modules/2.4.20-13.7/kernel/drivers/net+-578533/96] ext3_mark_iloc_dirty [ext3] 0x23 (0xcb9f9dd4))
Jun 13 04:02:12 hfxcoplo1 kernel: [<d081fc1b>] ext3_mark_iloc_dirty [ext3] 0x23 (0xcb9f9dd4))
Jun 13 04:02:12 hfxcoplo1 kernel: [filldir64+505/608] filldir64 [kernel] 0x1f9 (0xcb9f9de4))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c0145e25>] filldir64 [kernel] 0x1f9 (0xcb9f9de4))
Jun 13 04:02:12 hfxcoplo1 kernel: [3c59x:__insmod_3c59x_O/lib/modules/2.4.20-13.7/kernel/drivers/net+-650007/96] journal_stop_R6d4da6dd [jbd] 0x1b1 (0xcb9f9e0c))
Jun 13 04:02:12 hfxcoplo1 kernel: [<d080e4e9>] journal_stop_R6d4da6dd [jbd] 0x1b1 (0xcb9f9e0c))
Jun 13 04:02:12 hfxcoplo1 kernel: [3c59x:__insmod_3c59x_O/lib/modules/2.4.20-13.7/kernel/drivers/net+-586691/96] ext3_bread [ext3] 0x31 (0xcb9f9e34))
Jun 13 04:02:12 hfxcoplo1 kernel: [<d081dc3d>] ext3_bread [ext3] 0x31 (0xcb9f9e34))
Jun 13 04:02:12 hfxcoplo1 kernel: [3c59x:__insmod_3c59x_O/lib/modules/2.4.20-13.7/kernel/drivers/net+-595650/96] ext3_readdir [ext3] 0x2ea (0xcb9f9e6c))
Jun 13 04:02:12 hfxcoplo1 kernel: [<d081b93e>] ext3_readdir [ext3] 0x2ea (0xcb9f9e6c))
Jun 13 04:02:12 hfxcoplo1 kernel: [3c59x:__insmod_3c59x_O/lib/modules/2.4.20-13.7/kernel/drivers/net+-595443/96] ext3_readdir [ext3] 0x3b9 (0xcb9f9e84))
Jun 13 04:02:12 hfxcoplo1 kernel: [<d081ba0d>] ext3_readdir [ext3] 0x3b9 (0xcb9f9e84))
Jun 13 04:02:12 hfxcoplo1 kernel: [open_namei+739/1380] open_namei [kernel] 0x2e3 (0xcb9f9ed8))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c01428ab>] open_namei [kernel] 0x2e3 (0xcb9f9ed8))
Jun 13 04:02:12 hfxcoplo1 kernel: [getname+96/156] getname [kernel] 0x60 (0xcb9f9f0c))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c01414d0>] getname [kernel] 0x60 (0xcb9f9f0c))
Jun 13 04:02:12 hfxcoplo1 kernel: [path_lookup+27/36] path_lookup [kernel] 0x1b (0xcb9f9f20))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c01422b7>] path_lookup [kernel] 0x1b (0xcb9f9f20))
Jun 13 04:02:12 hfxcoplo1 kernel: [__user_walk+36/60] __user_walk [kernel] 0x24 (0xcb9f9f30))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c01424d8>] __user_walk [kernel] 0x24 (0xcb9f9f30))
Jun 13 04:02:12 hfxcoplo1 kernel: [vfs_lstat+23/68] vfs_lstat [kernel] 0x17 (0xcb9f9f44))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c013ee13>] vfs_lstat [kernel] 0x17 (0xcb9f9f44))
Jun 13 04:02:12 hfxcoplo1 kernel: [sys_lstat64+16/40] sys_lstat64 [kernel] 0x10 (0xcb9f9f70))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c013f484>] sys_lstat64 [kernel] 0x10 (0xcb9f9f70))
Jun 13 04:02:12 hfxcoplo1 kernel: [system_call+51/56] system_call [kernel] 0x33 (0xcb9f9fc0))
Jun 13 04:02:12 hfxcoplo1 kernel: [<c0108583>] system_call [kernel] 0x33 (0xcb9f9fc0))
Jun 13 04:02:12 hfxcoplo1 kernel: 
Jun 13 04:02:12 hfxcoplo1 kernel: 
Jun 13 04:02:12 hfxcoplo1 kernel: Code: 39 53 44 8b 6d 00 75 7c 8b 44 24 24 39 43 0c 75 73 8b 40 4c
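
For triage, the key fields of an oops like this one (fault address, faulting function and offset, running process) can be pulled out of the syslog text mechanically. The following is only a minimal sketch: the function name summarize_oops and the regular expressions are illustrative assumptions, not part of any kernel tool, and the sample lines are copied from the log in this report with wrapped lines joined by hand.

```python
import re

# Three lines copied from the oops in this report (wrapped lines joined).
SAMPLE = (
    "Jun 13 04:02:04 hfxcoplo1 kernel: Unable to handle kernel NULL pointer "
    "dereference at virtual address 00000034\n"
    "Jun 13 04:02:05 hfxcoplo1 kernel: EIP is at d_lookup [kernel] 0x60 "
    "(2.4.20-13.7)\n"
    "Jun 13 04:02:05 hfxcoplo1 kernel: Process updatedb (pid: 13831, "
    "stackpage=cb9f9000)\n"
)


def summarize_oops(text):
    """Extract a few headline fields from 2.4-style oops text (hypothetical helper)."""
    summary = {}

    # Faulting virtual address, printed as 8 hex digits.
    m = re.search(r"at virtual address ([0-9a-f]{8})", text)
    if m:
        summary["fault_address"] = int(m.group(1), 16)

    # Function containing EIP and the offset into it.
    m = re.search(r"EIP is at (\S+) \[\S+\] (0x[0-9a-f]+)", text)
    if m:
        summary["function"] = m.group(1)
        summary["offset"] = int(m.group(2), 16)

    # Process that was running when the oops occurred.
    m = re.search(r"Process (\S+) \(pid: (\d+)", text)
    if m:
        summary["process"] = m.group(1)
        summary["pid"] = int(m.group(2))

    return summary


print(summarize_oops(SAMPLE))
```

Fields missing from the input are simply left out of the result, so the same sketch tolerates truncated logs.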

Comment 1 Stephen Tweedie 2003-06-16 20:23:36 UTC
An oops in "d_lookup" is one of the classic signs of bad memory.  I'd try
memtest86 on this system before anything else.
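
As a rough cross-check of the register dump in the description, the fault address can be reproduced from the reported values. The first Code bytes, 39 53 44, decode under standard IA-32 encoding as cmp %edx,0x44(%ebx), and ebx is fffffff0. The sketch below only does the 32-bit arithmetic; reading ebx as "a NULL list pointer minus a 0x10-byte member offset" is an assumption about the dentry layout in this kernel, not something the log states, and the result is consistent with a corrupted pointer rather than proof of bad memory.

```python
MASK32 = 0xFFFFFFFF  # i386 addresses wrap at 32 bits


def effective_address(base, disp):
    """Return base + disp with 32-bit wraparound."""
    return (base + disp) & MASK32


ebx = 0xFFFFFFF0   # from the register dump in the oops
disp8 = 0x44       # displacement byte in "39 53 44" (cmp %edx,0x44(%ebx))

fault = effective_address(ebx, disp8)
print(f"{fault:08x}")     # 00000034, the reported fault address

# Assumption: ebx was derived from a list pointer by subtracting a
# 0x10-byte member offset; adding it back gives the original pointer.
list_ptr = effective_address(ebx, 0x10)
print(f"{list_ptr:08x}")  # 00000000, i.e. a NULL pointer in the chain
```

Under that assumption, d_lookup walked a hash chain whose next pointer was NULL, which should not happen in an intact chain and fits the suggestion to test the memory first.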

Comment 2 Malcolm Amir Hussain-Gambles 2003-06-18 08:45:03 UTC
Ran memtest86; it was indeed a memory fault.
Thank you very much for your help!
memtest86 is a very useful tool indeed.