Bug 142091 - (IT_57995) CAN-2004-1237 kernel oops captured, system hangs
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: Peter Martuccelli
Reported: 2004-12-07 03:42 EST by Bernd Bartmann
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-12-07 16:33:37 EST


Description Bernd Bartmann 2004-12-07 03:42:52 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5)
Gecko/20041107 Firefox/1.0

Description of problem:
I'm getting random oopses on one of my machines. The system is RHES3 with
the current errata kernel-2.4.21-20.0.1.EL.i686. After the oops has
occurred I can still ping the machine, but no network service (ssh) or
local service (console login) responds. The screen just stays black, and
Num Lock and Scroll Lock blink together.

Tonight I managed to capture the oops message via serial console:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
e09fb556
*pde = 00000000
Oops: 0000
parport_pc lp parport audit e1000 floppy sg scsi_mod microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0060:[<e09fb556>]    Not tainted
EFLAGS: 00010246

EIP is at __audit_get_target [audit] 0x1f6 (2.4.21-20.0.1.EL/i686)
eax: 00000000   ebx: c0551e70   ecx: 00000003   edx: c0551f6c
esi: c5a0ae9c   edi: 00000203   ebp: c0551e30   esp: c0551da8
ds: 0068   es: 0068   ss: 0068
Process schedule_au (pid: 8266, stackpage=c0551000)
Stack: c5a0ae3c 00000000 00000000 df10bb00 c0551e00 c0551e70 dfa43a00 dfa43a00
       e09fbbe8 00000203 c0551f6c c0551e70 c5a0ae9c 00000000 00000000 00000000
       00000005 c5a0ae9c 00000000 dee86980 c5a0aeb4 dee869a4 c0551df0 dee86100
Call Trace:   [<e09fbbe8>] audit_filter_eval [audit] 0x2a8 (0xc0551dc8)
[<c014a0d0>] __alloc_pages_limit [kernel] 0x60 (0xc0551e58)
[<c014a1f0>] __alloc_pages [kernel] 0xe0 (0xc0551e6c)
[<c0150274>] __pte_chain_free [kernel] 0x24 (0xc0551ea8)
[<c0137097>] do_wp_page [kernel] 0x1f7 (0xc0551eb4)
[<c01381f8>] handle_mm_fault [kernel] 0x188 (0xc0551edc)
[<c0130793>] in_group_p [kernel] 0x23 (0xc0551ee4)
[<c015c951>] cp_new_stat64 [kernel] 0x101 (0xc0551f0c)
[<e09fe244>] audit_lock [audit] 0x0 (0xc0551f30)
[<e09f6f30>] __audit_policy_check [audit] 0x40 (0xc0551f40)
[<e09f6f6a>] audit_policy_check [audit] 0x2a (0xc0551f4c)
[<e09f683b>] __audit_syscall_return [audit] 0x7b (0xc0551f5c)
[<c015ffb7>] getname [kernel] 0x87 (0xc0551f88)
[<e09f6c8a>] __audit_result [audit] 0x3a (0xc0551f98)
[<c01e2b01>] audit_result [kernel] 0x41 (0xc0551fa8)
[<c0110040>] syscall_trace_leave [kernel] 0x40 (0xc0551fb4)

Code: 8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06

Kernel panic: Fatal exception
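
The register dump and the Code: line above are enough to cross-check the reported fault address. The following is a minimal sketch, not part of the original report, and it assumes that the Code: bytes begin at the faulting EIP. It decodes only the first instruction (opcode 8b with an 8-bit displacement); the helper name decode_mov_load is hypothetical.

```python
# Register values and code bytes copied from the first oops in this report.
REGS = {
    "eax": 0x00000000, "ebx": 0xC0551E70, "ecx": 0x00000003, "edx": 0xC0551F6C,
    "esi": 0xC5A0AE9C, "edi": 0x00000203, "ebp": 0xC0551E30, "esp": 0xC0551DA8,
}
CODE = bytes.fromhex("8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06")

# i386 register numbering as used in the ModRM byte.
REG32 = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")


def decode_mov_load(code):
    """Decode 'mov r32, [r32 + disp8]' (opcode 0x8b, ModRM mod=01, no SIB).

    Hypothetical helper for illustration; it handles this one encoding only.
    """
    if code[0] != 0x8B:
        raise ValueError("not a 'mov r32, r/m32' instruction")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the [reg + disp8] form without SIB is handled")
    disp = int.from_bytes(code[2:3], "little", signed=True)
    return REG32[reg], REG32[rm], disp


dest, base, disp = decode_mov_load(CODE)
fault_addr = (REGS[base] + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dest}  -> reads address {fault_addr:08x}")
# prints: mov 0x8(%eax),%ecx  -> reads address 00000008
```

Under that assumption, the load goes through eax, which is 00000000 in the dump, at offset 8. This matches the reported virtual address 00000008 and suggests a NULL structure pointer in __audit_get_target. Which structure is involved cannot be determined from the oops alone.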


Version-Release number of selected component (if applicable):
kernel-2.4.21-20.0.1.EL.i686

How reproducible:
Sometimes

Steps to Reproduce:
1. Just wait for the system to hang.

Additional info:
Comment 1 Bernd Bartmann 2004-12-07 11:07:53 EST
And yet another oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
e09fb556
*pde = 00000000
Oops: 0000
parport_pc lp parport audit e1000 floppy sg scsi_mod microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0060:[<e09fb556>]    Not tainted
EFLAGS: 00010246

EIP is at __audit_get_target [audit] 0x1f6 (2.4.21-20.0.1.EL/i686)
eax: 00000000   ebx: ce699e70   ecx: 00000003   edx: ce699f6c
esi: c085dc9c   edi: 00000203   ebp: ce699e30   esp: ce699da8
ds: 0068   es: 0068   ss: 0068
Process schedule_au (pid: 11050, stackpage=ce699000)
Stack: c085dc3c 00000000 00000000 ddee4600 ce699e00 ce699e70 df57e980 df57e980
       e09fbbe8 00000203 ce699f6c ce699e70 c085dc9c 00000000 00000000 00000000
       00000005 c085dc9c 00000000 de9c0980 c085dcb4 de9c09a4 ce699df0 de9c0100
Call Trace:   [<e09fbbe8>] audit_filter_eval [audit] 0x2a8 (0xce699dc8)
[<c014a0d0>] __alloc_pages_limit [kernel] 0x60 (0xce699e58)
[<c014a224>] __alloc_pages [kernel] 0x114 (0xce699e6c)
[<c0150274>] __pte_chain_free [kernel] 0x24 (0xce699ea8)
[<c0137097>] do_wp_page [kernel] 0x1f7 (0xce699eb4)
[<c01381f8>] handle_mm_fault [kernel] 0x188 (0xce699edc)
[<c0130793>] in_group_p [kernel] 0x23 (0xce699ee4)
[<c015c951>] cp_new_stat64 [kernel] 0x101 (0xce699f0c)
[<e09fe244>] audit_lock [audit] 0x0 (0xce699f30)
[<e09f6f30>] __audit_policy_check [audit] 0x40 (0xce699f40)
[<e09f6f6a>] audit_policy_check [audit] 0x2a (0xce699f4c)
[<e09f683b>] __audit_syscall_return [audit] 0x7b (0xce699f5c)
[<c015ffb7>] getname [kernel] 0x87 (0xce699f88)
[<e09f6c8a>] __audit_result [audit] 0x3a (0xce699f98)
[<c01e2b01>] audit_result [kernel] 0x41 (0xce699fa8)
[<c0110040>] syscall_trace_leave [kernel] 0x40 (0xce699fb4)

Code: 8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06

Kernel panic: Fatal exception
Comment 2 Bernd Bartmann 2004-12-07 15:26:26 EST
OK, I now seem to have found something that triggers the oops.
The machine is a test PC where I am evaluating Trend Micro InterScan
Messaging Suite (IMSS) 5.5 SP2 and InterScan Web Security Suite (IWSS) 2.0.

The problem always seems to occur at the start of a new hour, when a job
in root's crontab makes InterScan try to download the latest virus
pattern files. The crash seems to occur only when a new pattern file
is actually available.

You can download 30-day evaluation versions of IWSS and IMSS at:
http://www.trendmicro.com/download/trial/trial-us.asp?id=34
http://www.trendmicro.com/download/trial/trial-us.asp?id=12
Comment 3 Ernie Petrides 2004-12-07 16:33:37 EST
A fix for this problem has already been committed to the RHEL3 U5
patch pool on 15-Nov-2004 (in kernel version 2.4.21-25.1.EL).
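
To check whether a given kernel release predates that fix, a rough comparison against 2.4.21-25.1.EL can be scripted. This is a minimal sketch under stated assumptions: it compares only the leading numeric fields of the release string and is not a full RPM version comparison. The function names release_key and predates_fix are hypothetical.

```python
import re

FIXED_IN = "2.4.21-25.1.EL"  # kernel version named in this comment


def release_key(release):
    """Return the leading numeric fields of a kernel release as a tuple.

    Example: 'kernel-2.4.21-20.0.1.EL.i686' -> (2, 4, 21, 20, 0, 1).
    Parsing stops at the first non-numeric field (here 'EL').
    """
    if release.startswith("kernel-"):
        release = release[len("kernel-"):]
    key = []
    for field in re.split(r"[.-]", release):
        if not field.isdigit():
            break
        key.append(int(field))
    return tuple(key)


def predates_fix(release, fixed_in=FIXED_IN):
    """True if the release sorts before the version that carries the fix."""
    return release_key(release) < release_key(fixed_in)


print(predates_fix("kernel-2.4.21-20.0.1.EL.i686"))  # reporter's kernel
print(predates_fix("2.4.21-25.1.EL"))                # kernel with the fix
```

The reporter's kernel sorts before the fixed version, so the workaround applies to it.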

To work around this problem before U5 is released (a few months
from now), please disable auditing with the following commands:

    chkconfig audit off
    service audit stop


*** This bug has been marked as a duplicate of 132245 ***
Comment 4 Bernd Bartmann 2004-12-09 06:21:55 EST
After stopping the audit service, my /var/log/cron log file is being
flooded with lines like:

Dec  9 12:15:01 beverly CROND[20002]: LAuS error - do_command.c:226 - laus_attach: (19) laus_attach: No such device
Dec  9 12:16:00 beverly CROND[20136]: LAuS error - do_command.c:175 - laus_log: (19) laus_log: No such device
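
To gauge how much of the cron log these messages account for, the lines can be tallied per failing call. This is a minimal sketch: the regular expression is inferred from the two sample lines above rather than from any LAuS documentation, and count_laus_errors is a hypothetical helper. Error number 19 is ENODEV on Linux, which matches the "No such device" text.

```python
import re
from collections import Counter

# Pattern inferred from the two sample lines in this comment (an assumption).
LAUS_RE = re.compile(
    r"LAuS error - (?P<src>\S+):(?P<line>\d+) - (?P<call>\w+): \((?P<errno>\d+)\)"
)

SAMPLE = [
    "Dec  9 12:15:01 beverly CROND[20002]: LAuS error - do_command.c:226 - "
    "laus_attach: (19) laus_attach: No such device",
    "Dec  9 12:16:00 beverly CROND[20136]: LAuS error - do_command.c:175 - "
    "laus_log: (19) laus_log: No such device",
]


def count_laus_errors(lines):
    """Count LAuS error lines, keyed by (failing call, errno)."""
    counts = Counter()
    for line in lines:
        match = LAUS_RE.search(line)
        if match:
            counts[(match["call"], int(match["errno"]))] += 1
    return counts


counts = count_laus_errors(SAMPLE)
for (call, err), n in sorted(counts.items()):
    print(f"{call} errno={err}: {n}")
```

For a real log, pass the lines of /var/log/cron instead of SAMPLE.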
Comment 5 David Lawrence 2005-01-18 18:52:35 EST
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-043.html
