Description of problem:
A process locked up in the kernel.

Version-Release number of selected component (if applicable):
2.4.20-20.9smp

How reproducible:
Rarely.

Steps to Reproduce:
1. I was using anytopnm. But I have used this many times before with no difficulties.

Actual results:
Kernel lock-up on one processor. The other processor continued to function.

Expected results:
No lock-up.

Additional info:
Found this in /var/log/messages (twice):

kernel BUG at page_alloc.c:139!
invalid operand: 0000
usb-uhci nls_iso8859-1 udf snd-pcm-oss ppp_synctty ppp_async ppp_generic slhc snd-mixer-oss radeon agpgart ipt_state ipt_MASQUERADE iptable_nat ip_conntrack w
CPU: 0
EIP: 0060:[<c01490fb>] Not tainted
EFLAGS: 00210282
EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-20.9smp)
eax: 02001010 ebx: c1d31f68 ecx: c1000030 edx: e190f680
esi: 00000000 edi: 00000000 ebp: 00000000 esp: f15e3e74
ds: 0068 es: 0068 ss: 0068
Process anytopnm (pid: 4818, stackpage=f15e3000)
Stack: c63eea80 fffee4cc 4213320c c72d8700 3f571067 c0136aeb c72d8700 3f571067
       4213320c 00000001 3e34d163 c1d31f68 00000025 c1d31f68 c0138089 c72d8700
       4213320c c1ddb0e8 c1ddb0e8 df3bdc00 efaeb420 ce1eb680 4213320c 00000001
Call Trace: [<c0136aeb>] vm_set_pte [kernel] 0x3b (0xf15e3e88))
[<c0138089>] do_wp_page [kernel] 0x159 (0xf15e3eac))
[<c0138fce>] handle_mm_fault [kernel] 0x11e (0xf15e3ed4))
[<c011c508>] do_page_fault [kernel] 0x188 (0xf15e3f04))
[<c0130bf9>] sys_rt_sigaction [kernel] 0xa9 (0xf15e3f60))
[<c0127565>] sys_wait4 [kernel] 0x1e5 (0xf15e3f74))
[<c0108d2d>] sys_sigreturn [kernel] 0xed (0xf15e3f94))
[<c011c380>] do_page_fault [kernel] 0x0 (0xf15e3fb0))
[<c01099c0>] error_code [kernel] 0x34 (0xf15e3fb8))
Code: 0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04

------------[ cut here ]------------
kernel BUG at page_alloc.c:139!
invalid operand: 0000
usb-uhci nls_iso8859-1 udf snd-pcm-oss ppp_synctty ppp_async ppp_generic slhc snd-mixer-oss radeon agpgart ipt_state ipt_MASQUERADE iptable_nat ip_conntrack w
CPU: 0
EIP: 0060:[<c01490fb>] Not tainted
EFLAGS: 00210282
EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-20.9smp)
eax: 02001010 ebx: c1d31f68 ecx: c1000030 edx: e190f680
esi: 00000000 edi: 00000000 ebp: 00000000 esp: f15e3bb0
ds: 0068 es: 0068 ss: 0068
Process anytopnm (pid: 4821, stackpage=f15e3000)
Stack: c0344680 00000002 c016a300 df3bd200 c0344680 c1c40030 c0345850 fffee4cc
       00000163 00000000 c1d31f68 00000163 00000000 fffee4cc c0136bec c1d31f68
       3c521045 c0139886 cb745780 42000000 fffee4cc c1df20d8 00000001 00000000
Call Trace: [<c016a300>] dput [kernel] 0x30 (0xf15e3bb8))
[<c0136bec>] __free_pte [kernel] 0x4c (0xf15e3be8))
[<c0139886>] zap_pte_range [kernel] 0x226 (0xf15e3bf4))
[<c01373cb>] zap_page_range [kernel] 0x10b (0xf15e3c20))
[<c013b050>] exit_mmap [kernel] 0xd0 (0xf15e3c64))
[<c015cf3d>] exec_mmap [kernel] 0x1fd (0xf15e3c88))
[<c0120e31>] unshare_files [kernel] 0x31 (0xf15e3c90))
[<c015cf9d>] flush_old_exec [kernel] 0x4d (0xf15e3ca4))
[<c0178bcc>] load_elf_binary [kernel] 0x2cc (0xf15e3cbc))
[<f8823237>] ext3_do_update_inode [ext3] 0x177 (0xf15e3ce4))
[<f880e02c>] journal_get_write_access_Rsmp_2b583cf6 [jbd] 0x5c (0xf15e3d04))
[<c014986d>] __alloc_pages [kernel] 0x7d (0xf15e3da8))
[<c0178900>] load_elf_binary [kernel] 0x0 (0xf15e3df0))
[<c015d674>] search_binary_handler [kernel] 0x124 (0xf15e3dfc))
[<c015d88b>] do_execve [kernel] 0x17b (0xf15e3e44))
[<c0107d90>] sys_execve [kernel] 0x50 (0xf15e3fa4))
[<c01098cf>] system_call [kernel] 0x33 (0xf15e3fc0))
Code: 0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04
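A note for anyone reading the "Code:" lines in these dumps: the leading bytes appear to match the way BUG() is emitted on i386 in 2.4 kernels, namely a ud2 instruction (0f 0b), then a 16-bit line number, then a 32-bit pointer to the file-name string. That layout is an assumption here, not something the log itself states, and the helper name decode_bug_bytes is made up for illustration. Under that assumption, a minimal Python sketch recovers the line number from the bytes, which can be checked against the "kernel BUG at page_alloc.c:139!" message:

```python
import struct

def decode_bug_bytes(code_line):
    """Decode the first 8 bytes of an oops 'Code:' line.

    Assumes the i386 2.4 BUG() layout: ud2 (0f 0b), a little-endian
    16-bit source line number, then a little-endian 32-bit pointer to
    the source file name string. Hypothetical helper, for illustration.
    """
    raw = bytes(int(tok, 16) for tok in code_line.split()[:8])
    if len(raw) < 8 or raw[:2] != b"\x0f\x0b":
        raise ValueError("line does not start with ud2 (0f 0b) plus 6 data bytes")
    line_no, file_ptr = struct.unpack("<HI", raw[2:8])
    return line_no, file_ptr

# Bytes copied from the Code: line in the report above.
line_no, file_ptr = decode_bug_bytes("0f 0b 8b 00 6d 62 28 c0 b8 02 00 00 00")
print(line_no, hex(file_ptr))
```

The decoded line number agrees with the BUG message, which is consistent with the "invalid operand" trap being the deliberate BUG() check in __free_pages_ok rather than a stray bad instruction.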
Does it happen without ALSA too?
I don't know. I use ALSA all the time, which could explain the occasional lock-ups. However, I recently upgraded from a Pentium-3 to a Pentium-4 and switched to an SMP kernel, and I have experienced a lot more lock-ups since upgrading, so I assumed the cause was miscoordination between the processors. For example, see Bug 68673; I doubt that had anything to do with ALSA.

On the other hand, when the lock-up described by this bug occurred, I was running a script that heavily loaded the filesystem for 20 minutes, and it has locked up several times in the past.

If you think it could still be ALSA (I am using 0.9.8), then I can try disabling it, though it could take a while for another lock-up as they are unpredictable.

Also, I sure would appreciate some tips on tracking down kernel lock-ups. I usually don't get anything in the log or an Oops screen. For all I know, most of them could be X lock-ups.
I have recently encountered this issue running the SMP kernel on a dual-Opteron system. In this case, the process running was crond.

[root@kazeon log]# uname -r
2.4.20-30.9smp

Excerpt from /var/log/messages:

Mar 21 04:30:02 kazeon kernel: ------------[ cut here ]------------
Mar 21 04:30:02 kazeon kernel: kernel BUG at page_alloc.c:139!
Mar 21 04:30:02 kazeon kernel: invalid operand: 0000
Mar 21 04:30:02 kazeon kernel: nfsd ide-cd cdrom parport_pc lp parport autofs nfs lockd sunrpc tg3 keybdev mousedev hid input usb-ohci usbcore ext3 jbd dpt_i2o sd_mod scsi_mod
Mar 21 04:30:02 kazeon kernel: CPU: 1
Mar 21 04:30:02 kazeon kernel: EIP: 0060:[<c014935b>] Not tainted
Mar 21 04:30:02 kazeon kernel: EFLAGS: 00010286
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel: EIP is at __free_pages_ok [kernel] 0xeb (2.4.20-30.9smp)
Mar 21 04:30:02 kazeon kernel: eax: 02001018 ebx: c1dc7818 ecx: c1000030 edx: f6ba8d00
Mar 21 04:30:02 kazeon kernel: esi: 00000000 edi: 00000000 ebp: 00000000 esp: f591be0c
Mar 21 04:30:02 kazeon kernel: ds: 0068 es: 0068 ss: 0068
Mar 21 04:30:02 kazeon kernel: Process crond (pid: 22479, stackpage=f591b000)
Mar 21 04:30:02 kazeon kernel: Stack: f07c9e80 ef550045 bfc02000 ffffffff 00000002 bfc02000 c01397e0 fffe4ffc
Mar 21 04:30:02 kazeon kernel: ffffffff 00053e2e c1dc7818 00000003 c0000000 00000003 c0136c0c c1dc7818
Mar 21 04:30:02 kazeon kernel: 00000002 c0137485 c80c2680 de8c2bfc bfc00000 00003000 c03fd000 f591a000
Mar 21 04:30:02 kazeon kernel: Call Trace: [<c01397e0>] zap_pte_range [kernel] 0x160 (0xf591be24))
Mar 21 04:30:02 kazeon kernel: [<c0136c0c>] __free_pte [kernel] 0x4c (0xf591be44))
Mar 21 04:30:02 kazeon kernel: [<c0137485>] zap_page_range [kernel] 0x1a5 (0xf591be50))
Mar 21 04:30:02 kazeon kernel: [<c013b080>] exit_mmap [kernel] 0xd0 (0xf591be94))
Mar 21 04:30:02 kazeon kernel: [<c0120812>] mmput [kernel] 0x62 (0xf591beb8))
Mar 21 04:30:02 kazeon kernel: [<c0126b86>] do_exit [kernel] 0x136 (0xf591bec8))
Mar 21 04:30:02 kazeon kernel: [<c0126f0b>] do_group_exit [kernel] 0x8b (0xf591bee4))
Mar 21 04:30:02 kazeon kernel: [<c012fd2f>] get_signal_to_deliver [kernel] 0x1df (0xf591bef8))
Mar 21 04:30:02 kazeon kernel: [<c0109634>] do_signal [kernel] 0x64 (0xf591bf20))
Mar 21 04:30:02 kazeon kernel: [<c010ee5a>] call_reschedule_interrupt [kernel] 0x5 (0xf591bf64))
Mar 21 04:30:02 kazeon kernel: [<c011e3ef>] schedule [kernel] 0x19f (0xf591bf8c))
Mar 21 04:30:02 kazeon kernel: [<c011c380>] do_page_fault [kernel] 0x0 (0xf591bfb0))
Mar 21 04:30:02 kazeon kernel: [<c011c380>] do_page_fault [kernel] 0x0 (0xf591bfbc))
Mar 21 04:30:02 kazeon kernel: [<c0109908>] signal_return [kernel] 0x14 (0xf591bfc0))
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel:
Mar 21 04:30:02 kazeon kernel: Code: 0f 0b 8b 00 1a 66 28 c0 b8 02 00 00 00 f0 0f b3 43 18 b8 04
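To check whether this trace overlaps with the earlier anytopnm one (pid 4821), the frames can be pulled out mechanically and compared. The following is only a rough Python sketch: the regular expression and the helper name frames are illustrative assumptions about the 2.4 oops text layout shown in this report, and the sample strings are shortened excerpts of the two traces above.

```python
import re

# One frame looks like "[<c0136c0c>] __free_pte [kernel] 0x4c (0x...))".
# The pattern is an assumption based on the dumps in this report.
FRAME = re.compile(r"\[<[0-9a-f]{8}>\]\s+(\S+)\s+\[(\w+)\]\s+0x[0-9a-f]+")

def frames(oops_text):
    """Return (function, module) pairs in the order they appear."""
    return FRAME.findall(oops_text)

# Excerpt of the anytopnm trace (pid 4821) from the original report.
first = """
Call Trace: [<c016a300>] dput [kernel] 0x30 (0xf15e3bb8))
[<c0136bec>] __free_pte [kernel] 0x4c (0xf15e3be8))
[<c0139886>] zap_pte_range [kernel] 0x226 (0xf15e3bf4))
[<c01373cb>] zap_page_range [kernel] 0x10b (0xf15e3c20))
[<c013b050>] exit_mmap [kernel] 0xd0 (0xf15e3c64))
"""

# Excerpt of the crond trace, syslog prefixes left in place.
second = """
Mar 21 04:30:02 kazeon kernel: EIP: 0060:[<c014935b>] Not tainted
Mar 21 04:30:02 kazeon kernel: Call Trace: [<c01397e0>] zap_pte_range [kernel] 0x160 (0xf591be24))
Mar 21 04:30:02 kazeon kernel: [<c0136c0c>] __free_pte [kernel] 0x4c (0xf591be44))
Mar 21 04:30:02 kazeon kernel: [<c0137485>] zap_page_range [kernel] 0x1a5 (0xf591be50))
Mar 21 04:30:02 kazeon kernel: [<c013b080>] exit_mmap [kernel] 0xd0 (0xf591be94))
Mar 21 04:30:02 kazeon kernel: [<c0120812>] mmput [kernel] 0x62 (0xf591beb8))
"""

second_names = {name for name, _module in frames(second)}
shared = [name for name, _module in frames(first) if name in second_names]
print(shared)
```

On these excerpts the shared frames are the page-table teardown functions under exit_mmap, which suggests both reports hit the same check in __free_pages_ok while an address space was being torn down.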
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/