Bug 171645

Summary: Oops kernel NULL pointer
Product: Red Hat Enterprise Linux 4
Reporter: BEA Boulder <eca-labadmin>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.0
CC: jbaron
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Whiteboard:
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 21:26:35 UTC Type: ---
Bug Blocks: 181409    

Description BEA Boulder 2005-10-24 17:50:52 UTC
Description of problem:

cat /etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 1)

rpm -q kernel
kernel-2.6.9-11.EL


Oct 19 09:03:54 usbohp380-7 kernel: Unable to handle kernel NULL pointer 
dereference at virtual address 00000030
Oct 19 09:03:54 usbohp380-7 kernel:  printing eip:
Oct 19 09:03:54 usbohp380-7 kernel: c02c5ee4
Oct 19 09:03:54 usbohp380-7 kernel: *pde = 1a784001
Oct 19 09:03:54 usbohp380-7 kernel: Oops: 0000 [#1]
Oct 19 09:03:54 usbohp380-7 kernel: SMP 
Oct 19 09:03:54 usbohp380-7 kernel: Modules linked in: nfsd exportfs nfs 
lockd parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc md5 ipv6 dm_mod 
button battery ac joydev
 uhci_hcd ehci_hcd tg3 floppy ext3 jbd cciss sd_mod scsi_mod
Oct 19 09:03:54 usbohp380-7 kernel: CPU:    3
Oct 19 09:03:54 usbohp380-7 kernel: EIP:    0060:[<c02c5ee4>]    Not tainted 
VLI
Oct 19 09:03:54 usbohp380-7 kernel: EFLAGS: 00010286   (2.6.9-11.ELsmp) 
Oct 19 09:03:54 usbohp380-7 kernel: EIP is at _spin_lock+0x3/0x34
Oct 19 09:03:54 usbohp380-7 kernel: eax: 0000002c   ebx: 0000002c   ecx: 
00000001   edx: 000000d0
Oct 19 09:03:54 usbohp380-7 kernel: esi: 00000000   edi: 00000000   ebp: 
c143adc0   esp: c7cccd1c
Oct 19 09:03:54 usbohp380-7 kernel: ds: 007b   es: 007b   ss: 0068
Oct 19 09:03:54 usbohp380-7 kernel: Process htsearch (pid: 18567, 
threadinfo=c7ccc000 task=f3115930)
Oct 19 09:03:54 usbohp380-7 kernel: Stack: 00093c5c c014e2b3 00000000 
00000000 00000001 c33c9d60 c33d1d60 c33c9d60 
Oct 19 09:03:54 usbohp380-7 kernel:        c33ca6c0 c7cccd6c c7cccd5c 
c16a5000 c16a5000 00000246 c143adc0 c143adc0 
Oct 19 09:03:54 usbohp380-7 kernel:        c031eee0 c7ccce78 c014e4d5 
c031eee0 c01451e2 00000001 00000000 00000000 
Oct 19 09:03:54 usbohp380-7 kernel: Call Trace:
Oct 19 09:03:54 usbohp380-7 kernel:  [<c014e2b3>] 
try_to_unmap_file+0x30/0x21c
Oct 19 09:03:54 usbohp380-7 kernel:  [<c014e4d5>] try_to_unmap+0x36/0x49
Oct 19 09:03:55 usbohp380-7 kernel:  [<c01451e2>] shrink_list+0x1ba/0x3ed
Oct 19 09:03:55 usbohp380-7 kernel:  [<c01444b0>] __pagevec_release+0x15/0x1d
Oct 19 09:03:55 usbohp380-7 kernel:  [<c01455f2>] shrink_cache+0x1dd/0x34d
Oct 19 09:03:55 usbohp380-7 kernel:  [<c0145cb0>] shrink_zone+0xa7/0xb6
Oct 19 09:03:55 usbohp380-7 kernel:  [<c0145d0b>] shrink_caches+0x4c/0x57
Oct 19 09:03:55 usbohp380-7 kernel:  [<c0145e02>] 
try_to_free_pages+0xc3/0x1a7
Oct 19 09:03:55 usbohp380-7 kernel:  [<c013f9a1>] __alloc_pages+0x1b5/0x29d
Oct 19 09:03:56 usbohp380-7 kernel:  [<c013faa1>] __get_free_pages+0x18/0x24
Oct 19 09:03:56 usbohp380-7 kernel:  [<c01423f8>] kmem_getpages+0x1c/0xbb
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0142f46>] cache_grow+0xab/0x138
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0143138>] 
cache_alloc_refill+0x165/0x19d
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0143333>] kmem_cache_alloc+0x51/0x57
Oct 19 09:03:56 usbohp380-7 kernel:  [<c01206c4>] copy_process+0x4df/0xa7c
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0120d4d>] do_fork+0x8e/0x173
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0155a94>] 
generic_file_llseek+0x0/0xcb
Oct 19 09:03:56 usbohp380-7 kernel:  [<c0104966>] sys_clone+0x22/0x26
Oct 19 09:03:56 usbohp380-7 kernel:  [<c02c7377>] syscall_call+0x7/0xb
Oct 19 09:03:56 usbohp380-7 kernel: Code: c0 84 d2 0f 9f c0 c3 89 c2 f0 81 28 
00 00 00 01 0f 94 c0 84 c0 b9 01 00 00 00 75 09 f0 81 02 00 00 00 01 30 c9 89 
c8 c3 53 89 c3 <81> 78 04 ad 4e ad de 74 18 ff 74 24 04 68 2a 97 2d c0 e8 db ba 
Oct 19 09:03:57 usbohp380-7 kernel:  <0>Fatal exception: panic in 5 seconds

Comment 1 Larry Woodman 2005-11-01 21:24:26 UTC
Can someone tell me whether this is reproducible?  From what I can tell,
try_to_unmap() called try_to_unmap_file(), which passed the NULL
page->mapping->i_mmap_lock to spin_lock(), and that caused the oops.  Without
being able to reproduce this I can't really debug the problem.

Thanks, Larry Woodman
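
As a cross-check of the reading above, the faulting address can be recomputed
from the register dump and the "Code:" bytes in the oops. The sketch below is a
minimal Python decoder for only the one instruction form found at the marked
byte of the faulting EIP (opcode 81 /7, a compare of a 32-bit immediate against
a [register + 8-bit displacement] operand). The helper and constant names are
made up for illustration, and this is not a general disassembler. Reading the
immediate 0xdead4ead as the spinlock debug magic, and 0x2c as the offset of
i_mmap_lock inside a NULL page->mapping, are assumptions that have not been
checked against the RHEL 4 kernel source.

```python
# Values copied from the oops report in the description.
FAULT_ADDRESS = 0x00000030          # "NULL pointer dereference at virtual address 00000030"
EAX = 0x0000002C                    # "eax: 0000002c"
# Bytes starting at the marked (faulting) byte of the "Code:" line.
FAULTING_INSN = bytes.fromhex("81 78 04 ad 4e ad de")

REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_cmp_imm32_disp8(insn):
    """Decode `cmp $imm32, disp8(%reg)` (opcode 81 /7, ModRM mod=01).

    Returns (register index, signed displacement, immediate).
    Any other instruction form is rejected instead of being guessed at.
    """
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # rm == 4 would mean a SIB byte follows, which this sketch does not handle.
    if opcode != 0x81 or reg != 7 or mod != 1 or rm == 4:
        raise ValueError("not a cmp imm32 with a [reg+disp8] operand")
    disp = int.from_bytes(insn[2:3], "little", signed=True)
    imm32 = int.from_bytes(insn[3:7], "little")
    return rm, disp, imm32


def accessed_address(base_value, disp):
    """Effective address of the memory operand, wrapped to 32 bits."""
    return (base_value + disp) & 0xFFFFFFFF


rm, disp, imm32 = decode_cmp_imm32_disp8(FAULTING_INSN)
addr = accessed_address(EAX, disp)
print(f"cmp ${imm32:#x}, {disp:#x}(%{REG_NAMES[rm]}) touches {addr:#010x}")
print("matches reported fault address:", addr == FAULT_ADDRESS)
```

If the assumptions hold, the pointer handed to _spin_lock was 0x2c rather than
0, which is what taking the address of a member of a NULL structure would
produce; the fault at 0x30 is then the read of the field 4 bytes into the lock.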



Comment 2 BEA Boulder 2005-11-02 18:01:51 UTC
The fault occurred during a web crawler's search of the server. I am attempting 
to trace the source of the crawler session so that it can be replayed.

Excerpt of the Apache access logs with a valid URL of the form 
"http://usbohp380-7/cgi-bin/htsearch":

grep /cgi-bin/htsearch /etc/httpd/logs/access_log*

access_log.2:206.189.193.220 - - [19/Oct/2005:08:59:53 -0600] "GET 
/cgi-bin/htsearch?exclude=%60/etc/passwd%60 HTTP/1.0" 200 391 "-" "-"
access_log.2:206.189.193.220 - - [19/Oct/2005:09:00:04 -0600] "GET 
/cgi-bin/htsearch?-c/nonexistent HTTP/1.0" 200 391 "-" "-"

Only these entries returned a status other than 404 and ran htsearch. Neither 
one reproduces the system crash, but both fail with the following:

ht://Dig error
htsearch detected an error. Please report this to the webmaster of this site 
by sending an e-mail to: root@localhost The error message is:

Unable to read word database file '/var/lib/htdig/db.words.db'
Did you run htdig?
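
The log filtering described above (requests that ran htsearch and did not
return 404) can be sketched in Python as follows. This is a minimal
illustration, not the procedure actually used here: the regular expression
assumes Apache's combined log format with an optional "file:" prefix as printed
by grep, the function name is hypothetical, and the third sample line is
invented purely to show a rejected entry.

```python
import re

# Combined log format, optionally prefixed with "filename:" by grep.
LOG_LINE = re.compile(
    r'^(?:(?P<file>[^:\s]+):)?'
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def htsearch_hits(lines, script="/cgi-bin/htsearch"):
    """Return (time, request target, status) for non-404 requests to `script`."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # not a recognisable access-log line
        path = m.group("target").split("?", 1)[0]  # drop the query string
        if path == script and m.group("status") != "404":
            hits.append((m.group("time"), m.group("target"), int(m.group("status"))))
    return hits


SAMPLE = [
    # The two entries quoted in this comment, with the line wrapping undone.
    'access_log.2:206.189.193.220 - - [19/Oct/2005:08:59:53 -0600] '
    '"GET /cgi-bin/htsearch?exclude=%60/etc/passwd%60 HTTP/1.0" 200 391 "-" "-"',
    'access_log.2:206.189.193.220 - - [19/Oct/2005:09:00:04 -0600] '
    '"GET /cgi-bin/htsearch?-c/nonexistent HTTP/1.0" 200 391 "-" "-"',
    # Invented line (not from the real logs) showing an entry that is filtered out.
    '192.0.2.1 - - [19/Oct/2005:09:00:10 -0600] '
    '"GET /cgi-bin/htsearch/ HTTP/1.0" 404 300 "-" "-"',
]

for when, target, status in htsearch_hits(SAMPLE):
    print(when, status, target)
```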


Comment 10 Bob Johnson 2006-04-11 17:12:23 UTC
This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.

Comment 11 Jason Baron 2006-04-19 15:07:42 UTC
Committed in stream U4 build 34.19. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/ but there is a *serious* slab
corruption issue with this kernel, so it should not be released to
customers under any circumstances. I'll update this bug when the kernel is
stable again.


Comment 12 Jason Baron 2006-04-19 20:06:02 UTC
We've identified the corruption as specific to x86-64 smp kernel builds 34.16 and
34.17. All other builds are safe for consumption.


Comment 15 Red Hat Bugzilla 2006-08-10 21:26:35 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html