From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030922

Description of problem:
On a 2x Xeon 2.4GHz, 1GB RAM, Megaraid disk controller (DELL PERC4) using RAID-1 (2 disks), I get an OOPS on shutdown about half the time. The exact position varies, but it seems to always be in the final un-mount/sync.

Version-Release number of selected component (if applicable):
2.6.3-1.91smp

How reproducible:
Sometimes

Steps to Reproduce:
1. Boot system
2. Exercise disks for 10 minutes
3. Reboot

Additional info:
Here are the console messages and oops for 3 events.

Sending all processes the KILL signal...
Syncing hardware clock to system time
Turning off swap:
Unmounting file systems:
Please stand by while rebooting the system...
md: stopping all md devices.
md: md0 switched to read-only mode.
Unable to handle kernel paging request at virtual address f885ff20
printing eip:
c012d8cc
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c012d8cc>] Not tainted
EFLAGS: 00010002
EIP is at internal_add_timer+0x84/0x8c
eax: f885ff20 ebx: c3974660 ecx: c39750f0 edx: f8a939a8
esi: 0003a161 edi: f8a939a8 ebp: 00000246 esp: c0377ef8
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0376000 task=c02f05c0)
Stack: 00000000 c3974660 c012daf1 00000000 00041690 00000000 c0145684 00000001
0000000f c03c40b0 f7fff8d8 00000003 00000000 f8a93920 00041690 f8a939a0
c025d512 c0377f00 f8a93aa8 c3974660 f8a93920 c025d3bd c0377f5c c012e39b
Call Trace:
[<c012daf1>] __mod_timer+0x21d/0x2a3
[<c0145684>] slab_destroy+0x121/0x154
[<c025d512>] neigh_periodic_timer+0x155/0x167
[<c025d3bd>] neigh_periodic_timer+0x0/0x167
[<c012e39b>] run_timer_softirq+0x16a/0x1dd
[<c0120127>] recalc_task_prio+0x141/0x14c
[<c012a12d>] do_softirq+0x5d/0xb5
[<c011b82f>] smp_apic_timer_interrupt+0x124/0x129
[<c0105000>] _stext+0x0/0x65
[<c010c0ca>] apic_timer_interrupt+0x1a/0x20
[<c0109018>] default_idle+0x0/0x2c
[<c0105000>] _stext+0x0/0x65
[<c0109041>] default_idle+0x29/0x2c
[<c010909d>] cpu_idle+0x26/0x3b
[<c037874b>] start_kernel+0x1b2/0x1b7
Code: 89 10 5b 89 42 04 5e c3 55 57 89 c7 56 53 83 ec 24 89 54 24

====

Sending all processes the KILL signal...
Syncing hardware clock to system time
Turning off swap:
Unmounting file systems:
Unable to handle kernel paging request at virtual address f885ff20
printing eip:
c012d8cc
*pde = 00003631
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c012d8cc>] Not tainted
EFLAGS: 00010002
EIP is at internal_add_timer+0x84/0x8c
eax: f885ff20 ebx: c3974660 ecx: c39750f0 edx: f205c38c
esi: 00038fe5 edi: f205c38c ebp: 00000246 esp: f554bd38
ds: 007b es: 007b ss: 0068
Process mount (pid: 8241, threadinfo=f554a000 task=f52932f0)
Stack: 00000000 c3974660 c012daf1 00000000 00040515 000000a0 ffffffff 00000000
3720fe00 00000000 3720fc00 00000000 f205c324 f205c324 f6ed8000 f7bc3290
f8837745 00000000 0217e000 00000000 f184fa98 c021568b f205c324 f6e616c0
Call Trace:
[<c012daf1>] __mod_timer+0x21d/0x2a3
[<f8837745>] scsi_dispatch_cmd+0xcb/0x280 [scsi_mod]
[<c021568b>] as_remove_request+0xa0/0xab
[<f883c833>] scsi_request_fn+0x29b/0x3dc [scsi_mod]
[<f883c558>] scsi_prep_fn+0x123/0x163 [scsi_mod]
[<c020fdda>] generic_unplug_device+0x6b/0x9b
[<c020ff84>] blk_run_queues+0xbf/0x12e
[<c0162123>] block_sync_page+0x5/0x8
[<c013ed60>] __lock_page+0x84/0xa7
[<c0124037>] autoremove_wake_function+0x0/0x28
[<c0124037>] autoremove_wake_function+0x0/0x28
[<c013edc1>] find_get_page+0x3e/0x83
[<c013f4fc>] do_generic_mapping_read+0x1e2/0x485
[<c01afad0>] avc_has_perm_noaudit+0x157/0x279
[<c013f79f>] file_read_actor+0x0/0xc9
[<c013fa14>] __generic_file_aio_read+0x1ac/0x1cc
[<c013f79f>] file_read_actor+0x0/0xc9
[<c013fae1>] generic_file_read+0x66/0x7d
[<c01b2ccf>] selinux_file_permission+0x127/0x131
[<c0164df4>] block_llseek+0x23/0xbd
[<c015d74c>] vfs_read+0xb8/0xe4
[<c015d925>] sys_read+0x2c/0x42
[<c010b663>] syscall_call+0x7/0xb
Code: 89 10 5b 89 42 04 5e c3 55 57 89 c7 56 53 83 ec 24 89 54 24

====

Sending all processes the KILL signal...
Syncing hardware clock to system time
Turning off swap:
Unmounting file systems:
Please stand by while rebooting the system...
md: stopping all md devices.
md: md0 switched to read-only mode.
Unable to handle kernel paging request at virtual address f885ff20
printing eip:
c012d8cc
*pde = 00303237
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c012d8cc>] Not tainted
EFLAGS: 00010002
EIP is at internal_add_timer+0x84/0x8c
eax: f885ff20 ebx: c3974660 ecx: c39750f0 edx: f8a939a8
esi: 0003a1a1 edi: f8a939a8 ebp: 00000246 esp: f5513d84
ds: 007b es: 007b ss: 0068
Process reboot (pid: 8073, threadinfo=f5512000 task=f5f26000)
Stack: 00000000 c3974660 c012daf1 00000000 000416d0 00000000 c0145684 00000000
0000000f c03c40b0 f7fff8d8 00000003 00000000 f8a93920 000416d0 f8a939a0
c025d512 f5513d00 f8a93aa8 c3974660 f8a93920 c025d3bd f5513de8 c012e39b
Call Trace:
[<c012daf1>] __mod_timer+0x21d/0x2a3
[<c0145684>] slab_destroy+0x121/0x154
[<c025d512>] neigh_periodic_timer+0x155/0x167
[<c025d3bd>] neigh_periodic_timer+0x0/0x167
[<c012e39b>] run_timer_softirq+0x16a/0x1dd
[<c012a12d>] do_softirq+0x5d/0xb5
[<c011b82f>] smp_apic_timer_interrupt+0x124/0x129
[<c010c0ca>] apic_timer_interrupt+0x1a/0x20
[<c0115a7a>] delay_tsc+0xb/0x13
[<c01bf9b9>] __delay+0x9/0xa
[<f8830f2a>] __megaraid_shutdown+0x7f/0x92 [megaraid]
[<c020d704>] device_shutdown+0x4f/0x7a
[<c0133707>] sys_reboot+0x125/0x368
[<c014f83e>] handle_mm_fault+0xdf/0x1cd
[<c01afea1>] inode_free_security+0xa5/0xac
[<c01760fb>] destroy_inode+0x36/0x45
[<c01760fb>] destroy_inode+0x36/0x45
[<c0177632>] generic_forget_inode+0x16c/0x171
[<c0173887>] dput+0x1b/0x287
[<c015e559>] __fput+0xc4/0xe3
[<c015ced8>] filp_close+0x59/0x5f
[<c015cf7e>] sys_close+0xa0/0xd3
[<c010b663>] syscall_call+0x7/0xb
Code: 89 10 5b 89 42 04 5e c3 55 57 89 c7 56 53 83 ec 24 89 54 24

Timer list corruption. Does this happen without the megaraid module loaded? With a 2.6.1-xx kernel?
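
For anyone reading the traces in the report, the "Oops: 0002" value is the x86 page-fault error code. Below is a minimal sketch for decoding it, assuming the standard i386 layout of the low three bits (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name is made up for illustration and is not part of any kernel tool.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: set = protection violation, clear = page not present
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: set = write access, clear = read access
        "access": "write" if code & 0x2 else "read",
        # bit 2: set = fault raised in user mode, clear = kernel mode
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # Value taken from the "Oops: 0002" lines in the report.
    print(decode_oops_error_code(0x0002))
```

Under that assumption, 0002 reads as a kernel-mode write to a page that is not mapped, which would fit internal_add_timer linking a timer through a stale list pointer (f885ff20 in all three events).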
I need the megaraid module to boot, so I can't try it without it. I never tried 2.6.1*. Are there still RPMs for it someplace?
Oh crap, I thought so. FC 2 test 1 has a 2.6.1-1.65 smp kernel.
I was running watchdog-5.2 (the software watchdog). If I disable the software watchdog, then I don't see the Oops anymore (after 10 tries). If I try to do process monitoring (pidfile = /var/run/crond.pid), then the system goes unstable within seconds. Should I file a separate bug against the softdog module?
*** Bug 130089 has been marked as a duplicate of this bug. ***
Is this fixed in the latest update?
Fedora Core 2 has now reached end of life, and no further updates will be provided by Red Hat. The Fedora Legacy project will be producing further kernel updates for security problems only. If this bug has not been fixed in the latest Fedora Core 2 update kernel, please try to reproduce it under Fedora Core 3, and reopen if necessary, changing the product version accordingly. Thank you.