Red Hat Bugzilla – Bug 142091
CAN-2004-1237 kernel oops captured, system hangs
Last modified: 2007-11-30 17:07:05 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Description of problem:
I'm getting random oopses on one of my machines. The system is RHES3 with the current errata kernel-2.4.21-20.0.1.EL.i686. After the oops has occurred I can still ping the machine, but no network service (ssh) or local service (console login) responds. The screen remains black. Num Lock and Scroll Lock blink together. Tonight I managed to capture the oops message via serial console:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
e09fb556
*pde = 00000000
Oops: 0000
parport_pc lp parport audit e1000 floppy sg scsi_mod microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0060:[<e09fb556>]    Not tainted
EFLAGS: 00010246
EIP is at __audit_get_target [audit] 0x1f6 (2.4.21-20.0.1.EL/i686)
eax: 00000000   ebx: c0551e70   ecx: 00000003   edx: c0551f6c
esi: c5a0ae9c   edi: 00000203   ebp: c0551e30   esp: c0551da8
ds: 0068   es: 0068   ss: 0068
Process schedule_au (pid: 8266, stackpage=c0551000)
Stack: c5a0ae3c 00000000 00000000 df10bb00 c0551e00 c0551e70 dfa43a00 dfa43a00
       e09fbbe8 00000203 c0551f6c c0551e70 c5a0ae9c 00000000 00000000 00000000
       00000005 c5a0ae9c 00000000 dee86980 c5a0aeb4 dee869a4 c0551df0 dee86100
Call Trace:
[<e09fbbe8>] audit_filter_eval [audit] 0x2a8 (0xc0551dc8)
[<c014a0d0>] __alloc_pages_limit [kernel] 0x60 (0xc0551e58)
[<c014a1f0>] __alloc_pages [kernel] 0xe0 (0xc0551e6c)
[<c0150274>] __pte_chain_free [kernel] 0x24 (0xc0551ea8)
[<c0137097>] do_wp_page [kernel] 0x1f7 (0xc0551eb4)
[<c01381f8>] handle_mm_fault [kernel] 0x188 (0xc0551edc)
[<c0130793>] in_group_p [kernel] 0x23 (0xc0551ee4)
[<c015c951>] cp_new_stat64 [kernel] 0x101 (0xc0551f0c)
[<e09fe244>] audit_lock [audit] 0x0 (0xc0551f30)
[<e09f6f30>] __audit_policy_check [audit] 0x40 (0xc0551f40)
[<e09f6f6a>] audit_policy_check [audit] 0x2a (0xc0551f4c)
[<e09f683b>] __audit_syscall_return [audit] 0x7b (0xc0551f5c)
[<c015ffb7>] getname [kernel] 0x87 (0xc0551f88)
[<e09f6c8a>] __audit_result [audit] 0x3a (0xc0551f98)
[<c01e2b01>] audit_result [kernel] 0x41 (0xc0551fa8)
[<c0110040>] syscall_trace_leave [kernel] 0x40 (0xc0551fb4)
Code: 8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06
Kernel panic: Fatal exception

Version-Release number of selected component (if applicable):
kernel-2.4.21-20.0.1.EL.i686

How reproducible:
Sometimes

Steps to Reproduce:
1. just wait for the system to hang
And yet another oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
e09fb556
*pde = 00000000
Oops: 0000
parport_pc lp parport audit e1000 floppy sg scsi_mod microcode keybdev mousedev hid input usb-uhci usbcore ext3 jbd
CPU:    0
EIP:    0060:[<e09fb556>]    Not tainted
EFLAGS: 00010246
EIP is at __audit_get_target [audit] 0x1f6 (2.4.21-20.0.1.EL/i686)
eax: 00000000   ebx: ce699e70   ecx: 00000003   edx: ce699f6c
esi: c085dc9c   edi: 00000203   ebp: ce699e30   esp: ce699da8
ds: 0068   es: 0068   ss: 0068
Process schedule_au (pid: 11050, stackpage=ce699000)
Stack: c085dc3c 00000000 00000000 ddee4600 ce699e00 ce699e70 df57e980 df57e980
       e09fbbe8 00000203 ce699f6c ce699e70 c085dc9c 00000000 00000000 00000000
       00000005 c085dc9c 00000000 de9c0980 c085dcb4 de9c09a4 ce699df0 de9c0100
Call Trace:
[<e09fbbe8>] audit_filter_eval [audit] 0x2a8 (0xce699dc8)
[<c014a0d0>] __alloc_pages_limit [kernel] 0x60 (0xce699e58)
[<c014a224>] __alloc_pages [kernel] 0x114 (0xce699e6c)
[<c0150274>] __pte_chain_free [kernel] 0x24 (0xce699ea8)
[<c0137097>] do_wp_page [kernel] 0x1f7 (0xce699eb4)
[<c01381f8>] handle_mm_fault [kernel] 0x188 (0xce699edc)
[<c0130793>] in_group_p [kernel] 0x23 (0xce699ee4)
[<c015c951>] cp_new_stat64 [kernel] 0x101 (0xce699f0c)
[<e09fe244>] audit_lock [audit] 0x0 (0xce699f30)
[<e09f6f30>] __audit_policy_check [audit] 0x40 (0xce699f40)
[<e09f6f6a>] audit_policy_check [audit] 0x2a (0xce699f4c)
[<e09f683b>] __audit_syscall_return [audit] 0x7b (0xce699f5c)
[<c015ffb7>] getname [kernel] 0x87 (0xce699f88)
[<e09f6c8a>] __audit_result [audit] 0x3a (0xce699f98)
[<c01e2b01>] audit_result [kernel] 0x41 (0xce699fa8)
[<c0110040>] syscall_trace_leave [kernel] 0x40 (0xce699fb4)
Code: 8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06
Kernel panic: Fatal exception
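The captured oops output can be cross-checked by hand. The sketch below is not part of the original report. It assumes the "Code:" bytes start at the faulting EIP, which is how 2.4-era oops dumps usually print them, and decodes only the first instruction. The helper names (decode_mov_disp8, fault_address) are made up for illustration. Under that assumption, the bytes 8b 48 08 decode to a load from offset 8 of eax into ecx, and with eax = 00000000 this gives exactly the reported fault address 00000008, which is consistent with a NULL structure pointer being dereferenced in __audit_get_target.

```python
# Illustrative sketch: decode the first instruction of the oops "Code:" bytes.
# Assumption: the dump begins at the faulting EIP (typical for 2.4 kernels).
# Only the form "8b /r" with a base register plus 8-bit displacement is handled.

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(code: bytes):
    """Decode `mov r32, [base + disp8]` (opcode 0x8b, ModRM mod=01, no SIB)."""
    opcode, modrm, disp = code[0], code[1], code[2]
    if opcode != 0x8B:
        raise ValueError("not a 'mov r32, r/m32' instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the base + disp8 form without SIB is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG32[reg], REG32[rm], disp


def fault_address(code: bytes, regs: dict):
    """Return (destination register, base register, effective address)."""
    dest, base, disp = decode_mov_disp8(code)
    return dest, base, (regs[base] + disp) & 0xFFFFFFFF


# Values copied from the oops output above (first capture).
code = bytes.fromhex("8b 48 08 85 c9 0f 84 87 00 00 00 8d 87 00 fe ff ff 83 f8 06")
regs = {"eax": 0x00000000, "ebx": 0xC0551E70, "ecx": 0x00000003, "edx": 0xC0551F6C}

dest, base, addr = fault_address(code, regs)
print(f"mov {dest}, [{base}+8] reads address {addr:08x}")
```

Both captures show the same EIP, the same Code bytes and eax = 00000000, so the same decoding applies to each of them.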
OK, I now seem to have found something that triggers the oops. The machine is a test PC where I am evaluating Trendmicro InterScan Messaging Suite 5.5 SP2 and InterScan Web Security Suite 2.0. The problem always seems to occur at the start of a new hour, when InterScan tries to download the latest virus pattern files via a job in root's crontab. The crash seems to occur only when a new pattern file is actually available. You can download 30-day evaluation versions of IWSS and IMSS at:
http://www.trendmicro.com/download/trial/trial-us.asp?id=34
http://www.trendmicro.com/download/trial/trial-us.asp?id=12
A fix for this problem has already been committed to the RHEL3 U5 patch pool on 15-Nov-2004 (in kernel version 2.4.21-25.1.EL). To work around this problem before U5 is released (a few months from now), please disable auditing with the following commands:

chkconfig audit off
service audit stop

*** This bug has been marked as a duplicate of 132245 ***
After stopping the audit service, my /var/log/cron log file is being flooded with lines like:

Dec 9 12:15:01 beverly CROND[20002]: LAuS error - do_command.c:226 - laus_attach: (19) laus_attach: No such device
Dec 9 12:16:00 beverly CROND[20136]: LAuS error - do_command.c:175 - laus_log: (19) laus_log: No such device
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-043.html