Bug 141035
Summary: | kernel: kernel BUG at mm/rmap.c:477! | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Andrew Matthews <exstatica> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED ERRATA | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 3 | CC: | marius.andreiana, markku.kolkka, menscher, pfrields, riel, wtogami |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-12-07 07:00:33 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Embargoed: |
Description
Andrew Matthews
2004-11-28 18:20:03 UTC
Any chance you can dig the first oops out of the log?

It's a really hard crash, so nothing gets logged to the syslog. It just goes from crond messages to syslogd restart. Here is another screenshot of a crash that happened this morning:
http://www.exstatica.net/post/mailgate1_11-29-2004.gif

Oh, no scrollback in that terminal session you're using? I assumed it was a serial console. Hmm. Any chance you could hook one up? Without the top of the first oops, it's hard to tell where to start digging.

I have another server, same model, that is crashing exactly the same way as in the previous screenshot. I'm using KVM over IP; I might be able to enable console redirection, but I don't have scrollback as of now. I'll see what I can do later this afternoon. It's strange that both servers, with the same OS and the same hardware, are getting the same error.

I found some more log info when a server crashed just now:
Dec 1 01:40:53 mailgate1 kernel: ------------[ cut here ]------------
Dec 1 01:40:53 mailgate1 kernel: kernel BUG at mm/rmap.c:477!
Dec 1 01:40:53 mailgate1 kernel: invalid operand: 0000 [#1] Dec 1 01:40:53 mailgate1 kernel: SMP Dec 1 01:40:53 mailgate1 kernel: Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc button battery ac ohci_hcd e1000 floppy sg dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod mptscsih mptbase sd_mod scsi_mod Dec 1 01:40:53 mailgate1 kernel: CPU: 3 Dec 1 01:40:53 mailgate1 kernel: EIP: 0060:[<0214ba99>] Not tainted VLI Dec 1 01:40:53 mailgate1 kernel: EFLAGS: 00010286 (2.6.9-1.681_FC3smp) Dec 1 01:40:53 mailgate1 kernel: EIP is at page_remove_rmap+0x23/0x4a Dec 1 01:40:53 mailgate1 kernel: eax: ffffffff ebx: 0004f1be ecx: 04031100 edx: 039e37c0 Dec 1 01:40:53 mailgate1 kernel: esi: 039e37c0 edi: 48881ae8 ebp: 00000000 esp: 3ee6aec4 Dec 1 01:40:53 mailgate1 kernel: ds: 007b es: 007b ss: 0068 Dec 1 01:40:53 mailgate1 kernel: Process MailScanner (pid: 22485, threadinfo=3ee6a000 task=73b1cd90) Dec 1 01:40:53 mailgate1 kernel: Stack: 02145877 4f1be067 00000000 0001a000 08543000 04031100 3c566980 3c566980 Dec 1 01:40:53 mailgate1 kernel: 08543000 08566000 47e7a218 04031100 02145973 00023000 00000000 08543000 Dec 1 01:40:53 mailgate1 kernel: 7f5d3de8 08566000 04031100 021459d2 00023000 00000000 3ee6af78 08543000 Dec 1 01:40:53 mailgate1 kernel: Call Trace: Dec 1 01:40:54 mailgate1 kernel: [<02145877>] zap_pte_range+0x206/0x2a9 Dec 1 01:40:54 mailgate1 kernel: [<02145973>] zap_pmd_range+0x59/0x7c Dec 1 01:40:54 mailgate1 kernel: [<021459d2>] unmap_page_range+0x3c/0x5f Dec 1 01:40:54 mailgate1 kernel: [<02145ae6>] unmap_vmas+0xf1/0x205 Dec 1 01:40:54 mailgate1 kernel: [<02149e03>] exit_mmap+0x79/0x148 Dec 1 01:40:54 mailgate1 kernel: [<0211e4db>] mmput+0x4e/0x72 Dec 1 01:40:54 mailgate1 kernel: [<021222e0>] do_exit+0x1f1/0x3bd Dec 1 01:40:54 mailgate1 kernel: [<0212259a>] sys_exit_group+0x0/0xd Dec 1 01:40:54 mailgate1 kernel: Code: 3b 02 ff 42 10 51 9d c3 89 c2 8b 00 f6 c4 08 74 08 0f 0b da 01 d8 12 2d 02 f0 83 42 08 ff 0f 98 c0 84 c0 74 2c 8b 42 08 
40 79 08 <0f> 0b dd 01 d8 12 2d 02 9c 59 fa b8 00 f0 ff ff 21 e0 8b 40 10
Dec 1 01:40:54 mailgate1 kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
Dec 1 01:40:54 mailgate1 kernel: in_atomic():1[expected: 0], irqs_disabled():0
Dec 1 01:40:54 mailgate1 kernel: [<0211df82>] __might_sleep+0x7d/0x87
Dec 1 01:40:54 mailgate1 kernel: [<02120b56>] profile_task_exit+0x1a/0x4a
Dec 1 01:40:54 mailgate1 kernel: [<02122107>] do_exit+0x18/0x3bd
Dec 1 01:40:54 mailgate1 kernel: [<021063eb>] do_divide_error+0x0/0xea
Dec 1 01:40:54 mailgate1 kernel: [<021066cd>] do_invalid_op+0x0/0xd5
Dec 1 01:40:54 mailgate1 kernel: [<021066cd>] do_invalid_op+0x0/0xd5
Dec 1 01:40:54 mailgate1 kernel: [<02106799>] do_invalid_op+0xcc/0xd5
Dec 1 01:40:54 mailgate1 kernel: [<0214ba99>] page_remove_rmap+0x23/0x4a

"Me Too". This is painful because it is a remote server in another country that I have no physical access to.

The kernel 2.6.9-1.11_FC2
Jan 18 23:10:41 mog kernel: ------------[ cut here ]------------
Jan 18 23:10:41 mog kernel: kernel BUG at mm/rmap.c:479!
Jan 18 23:10:41 mog kernel: invalid operand: 0000 [#1] Jan 18 23:10:41 mog kernel: Modules linked in: loop autofs4 sunrpc via_rhine mii ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables sg scsi_mod dm_mod button battery ac ext3 jbd Jan 18 23:10:41 mog kernel: CPU: 0 Jan 18 23:10:41 mog kernel: EIP: 0060:[<c0155f70>] Not tainted VLI Jan 18 23:10:41 mog kernel: EFLAGS: 00010286 (2.6.9-1.11_FC2) Jan 18 23:10:41 mog kernel: EIP is at page_remove_rmap+0x22/0x36 Jan 18 23:10:41 mog kernel: eax: ffffffff ebx: c118f8e0 ecx: c11cf840 edx: c118f8e0 Jan 18 23:10:41 mog kernel: esi: 00000000 edi: 00026000 ebp: ce1213c8 esp: c052fdc4 Jan 18 23:10:41 mog kernel: ds: 007b es: 007b ss: 0068 Jan 18 23:10:41 mog kernel: Process yahgetposts (pid: 1862, threadinfo=c052f000 task=c5562760) Jan 18 23:10:41 mog kernel: Stack: c014eec8 0c7c7005 00069000 008cc000 c03de0b4 008cc000 00935000 cd27100c Jan 18 23:10:41 mog kernel: c03de0b4 c014ef63 00069000 00000000 008cc000 cd27100c 00935000 c03de0b4 Jan 18 23:10:41 mog kernel: c014efba 00069000 00000000 00069000 008cc000 caeea1c8 c052fe6c c014f0c8 Jan 18 23:10:41 mog kernel: Call Trace: Jan 18 23:10:41 mog kernel: [<c014eec8>] zap_pte_range+0x1bd/0x221 Jan 18 23:10:41 mog kernel: [<c014ef63>] zap_pmd_range+0x37/0x52 Jan 18 23:10:41 mog kernel: [<c014efba>] unmap_page_range+0x3c/0x57 Jan 18 23:10:41 mog kernel: [<c014f0c8>] unmap_vmas+0xf3/0x1e2 Jan 18 23:10:41 mog kernel: [<c0153e51>] exit_mmap+0xb8/0x1d1 Jan 18 23:10:41 mog kernel: [<c011cf90>] mmput+0xb3/0xd6 Jan 18 23:10:41 mog kernel: [<c016cedb>] exec_mmap+0x2b7/0x2d3 Jan 18 23:10:41 mog kernel: [<c016db84>] flush_old_exec+0xa67/0xda3 Jan 18 23:10:41 mog kernel: [<c016118f>] vfs_read+0xdc/0xe4 Jan 18 23:10:41 mog kernel: [<c016cc1a>] kernel_read+0x31/0x3b Jan 18 23:10:41 mog kernel: [<c0191197>] load_elf_binary+0x50d/0xba2 Jan 18 23:10:41 mog kernel: [<c016c761>] copy_strings+0x1e2/0x1ee Jan 18 23:10:41 mog kernel: [<c016e1d5>] search_binary_handler+0x72/0x1a4 Jan 18 23:10:41 mog 
kernel: [<c016e473>] do_execve+0x16c/0x1fa Jan 18 23:10:41 mog kernel: [<c01048fc>] sys_execve+0x2a/0x6f Jan 18 23:10:41 mog kernel: [<c01062fb>] syscall_call+0x7/0xb Jan 18 23:10:41 mog kernel: Code: ff 05 30 28 40 c0 50 9d c3 89 c2 8b 00 f6 c4 08 74 08 0f 0b dc 01 10 58 31 c0 83 42 08 ff 0f 98 c0 84 c0 74 19 8b 42 08 40 79 08 <0f> 0b df 01 10 58 31 c0 9c 58 fa ff 0d 30 28 40 c0 50 9d c3 55 Jan 18 23:10:41 mog kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 Jan 18 23:10:41 mog kernel: in_atomic():1[expected: 0], irqs_disabled():0 Jan 18 23:10:41 mog kernel: [<c011c662>] __might_sleep+0x82/0x8c Jan 18 23:10:41 mog kernel: [<c012065e>] profile_task_exit+0x1a/0x48 Jan 18 23:10:41 mog kernel: [<c012232c>] do_exit+0x18/0x59e Jan 18 23:10:41 mog kernel: [<c0106c73>] do_divide_error+0x0/0xea Jan 18 23:10:41 mog kernel: [<c0106f55>] do_invalid_op+0x0/0xd5 Jan 18 23:10:41 mog kernel: [<c013323f>] search_exception_tables+0x1f/0x21 Jan 18 23:10:41 mog kernel: [<c0106f55>] do_invalid_op+0x0/0xd5 Jan 18 23:10:41 mog kernel: [<c0107021>] do_invalid_op+0xcc/0xd5 Jan 18 23:10:41 mog kernel: [<c0147acc>] do_page_cache_readahead+0xe2/0x267 Jan 18 23:10:41 mog kernel: [<c0155f70>] page_remove_rmap+0x22/0x36 Jan 18 23:10:41 mog kernel: [<c017f982>] update_atime+0x60/0x9e Jan 18 23:10:41 mog kernel: [<c014125d>] do_generic_mapping_read+0x360/0x368 Jan 18 23:10:41 mog kernel: [<c01064a5>] error_code+0x2d/0x38 Jan 18 23:10:41 mog kernel: [<c0155f70>] page_remove_rmap+0x22/0x36 Jan 18 23:10:41 mog kernel: [<c014eec8>] zap_pte_range+0x1bd/0x221 Jan 18 23:10:41 mog kernel: [<c014ef63>] zap_pmd_range+0x37/0x52 Jan 18 23:10:41 mog kernel: [<c014efba>] unmap_page_range+0x3c/0x57 Jan 18 23:10:41 mog kernel: [<c014f0c8>] unmap_vmas+0xf3/0x1e2 Jan 18 23:10:41 mog kernel: [<c0153e51>] exit_mmap+0xb8/0x1d1 Jan 18 23:10:41 mog kernel: [<c011cf90>] mmput+0xb3/0xd6 Jan 18 23:10:41 mog kernel: [<c016cedb>] exec_mmap+0x2b7/0x2d3 Jan 18 23:10:41 mog kernel: 
[<c016db84>] flush_old_exec+0xa67/0xda3 Jan 18 23:10:41 mog kernel: [<c016118f>] vfs_read+0xdc/0xe4 Jan 18 23:10:41 mog kernel: [<c016cc1a>] kernel_read+0x31/0x3b Jan 18 23:10:41 mog kernel: [<c0191197>] load_elf_binary+0x50d/0xba2 Jan 18 23:10:41 mog kernel: [<c016c761>] copy_strings+0x1e2/0x1ee Jan 18 23:10:41 mog kernel: [<c016e1d5>] search_binary_handler+0x72/0x1a4 Jan 18 23:10:41 mog kernel: [<c016e473>] do_execve+0x16c/0x1fa Jan 18 23:10:41 mog kernel: [<c01048fc>] sys_execve+0x2a/0x6f Jan 18 23:10:41 mog kernel: [<c01062fb>] syscall_call+0x7/0xb Jan 18 23:10:41 mog kernel: bad: scheduling while atomic! Jan 18 23:10:41 mog kernel: [<c02fad85>] schedule+0x2d/0x58c Jan 18 23:10:41 mog kernel: [<c01068a9>] dump_stack+0x11/0x13 Jan 18 23:10:41 mog kernel: [<c011c662>] __might_sleep+0x82/0x8c Jan 18 23:10:41 mog kernel: [<c0120663>] profile_task_exit+0x1f/0x48 Jan 18 23:10:41 mog kernel: [<c012232c>] do_exit+0x18/0x59e Jan 18 23:10:41 mog kernel: [<c0106c73>] do_divide_error+0x0/0xea Jan 18 23:10:41 mog kernel: [<c0106f55>] do_invalid_op+0x0/0xd5 Jan 18 23:10:41 mog kernel: [<c013323f>] search_exception_tables+0x1f/0x21 Jan 18 23:10:41 mog kernel: [<c0106f55>] do_invalid_op+0x0/0xd5 Jan 18 23:10:41 mog kernel: [<c0107021>] do_invalid_op+0xcc/0xd5 Jan 18 23:10:41 mog kernel: [<c0147acc>] do_page_cache_readahead+0xe2/0x267 Jan 18 23:10:41 mog kernel: [<c0155f70>] page_remove_rmap+0x22/0x36 Jan 18 23:10:41 mog kernel: [<c017f982>] update_atime+0x60/0x9e Jan 18 23:10:41 mog kernel: [<c014125d>] do_generic_mapping_read+0x360/0x368 Jan 18 23:10:41 mog kernel: [<c01064a5>] error_code+0x2d/0x38 Jan 18 23:10:41 mog kernel: [<c0155f70>] page_remove_rmap+0x22/0x36 Jan 18 23:10:41 mog kernel: [<c014eec8>] zap_pte_range+0x1bd/0x221 Jan 18 23:10:41 mog kernel: [<c014ef63>] zap_pmd_range+0x37/0x52 Jan 18 23:10:41 mog kernel: [<c014efba>] unmap_page_range+0x3c/0x57 Jan 18 23:10:41 mog kernel: [<c014f0c8>] unmap_vmas+0xf3/0x1e2 Jan 18 23:10:41 mog kernel: [<c0153e51>] 
exit_mmap+0xb8/0x1d1
Jan 18 23:10:41 mog kernel: [<c011cf90>] mmput+0xb3/0xd6
Jan 18 23:10:41 mog kernel: [<c016cedb>] exec_mmap+0x2b7/0x2d3
Jan 18 23:10:41 mog kernel: [<c016db84>] flush_old_exec+0xa67/0xda3
Jan 18 23:10:41 mog kernel: [<c016118f>] vfs_read+0xdc/0xe4
Jan 18 23:10:41 mog kernel: [<c016cc1a>] kernel_read+0x31/0x3b
Jan 18 23:10:41 mog kernel: [<c0191197>] load_elf_binary+0x50d/0xba2
Jan 18 23:10:41 mog kernel: [<c016c761>] copy_strings+0x1e2/0x1ee
Jan 18 23:10:41 mog kernel: [<c016e1d5>] search_binary_handler+0x72/0x1a4
Jan 18 23:10:41 mog kernel: [<c016e473>] do_execve+0x16c/0x1fa
Jan 18 23:10:41 mog kernel: [<c01048fc>] sys_execve+0x2a/0x6f
Jan 18 23:10:41 mog kernel: [<c01062fb>] syscall_call+0x7/0xb
Jan 18 23:10:41 mog kernel: note: yahgetposts[1862] exited with preempt_count 1

There is a second Oops from a sh process 40 minutes later. yahgetposts is a Bash script. I rebooted into 2.6.10-1.9_FC2 and am crossing bodyparts we won't see it again...

Any improvement with the 2.6.10 update kernels?

I have been having this flavor of issue for a while. I was hoping a fresh install of FC3 would help; alas, it hasn't. I originally saw this starting in FC2.

uname -r
2.6.10-1.766_FC3

Feb 19 13:16:48 gimli kernel: ------------[ cut here ]------------
Feb 19 13:16:48 gimli kernel: kernel BUG at mm/rmap.c:483!
Feb 19 13:16:48 gimli kernel: invalid operand: 0000 [#1] Feb 19 13:16:48 gimli kernel: Modules linked in: radeon parport_pc lp parport autofs4 ipt_REJECT ipt_state ip_conntrack ip_tables dm_mod md5 ipv6 joydev uhci_hcd i2c_i801 i2c_core snd_ens1371 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc gameport tulip floppy ext3 jbd Feb 19 13:16:48 gimli kernel: CPU: 0 Feb 19 13:16:48 gimli kernel: EIP: 0060:[<c01555ef>] Not tainted VLI Feb 19 13:16:48 gimli kernel: EFLAGS: 00010296 (2.6.10-1.766_FC3) Feb 19 13:16:48 gimli kernel: EIP is at page_remove_rmap+0x22/0x36 Feb 19 13:16:48 gimli kernel: eax: ff800000 ebx: c13b25c0 ecx: c13b25c0 edx: c13b25c0 Feb 19 13:16:48 gimli kernel: esi: 00000000 edi: 00092000 ebp: c782e364 esp: cf544ed4 Feb 19 13:16:48 gimli kernel: ds: 007b es: 007b ss: 0068 Feb 19 13:16:48 gimli kernel: Process yum (pid: 16136, threadinfo=cf544000 task=c7f9b3a0) Feb 19 13:16:48 gimli kernel: Stack: c014dfc3 1d92e067 000cc000 b4c47000 c03d5e28 b4c47000 b4d13000 d8d41b50 Feb 19 13:16:48 gimli kernel: c03d5e28 c014e059 000cc000 00000000 b4c47000 d8d41b50 b4d13000 c03d5e28 Feb 19 13:16:48 gimli kernel: c014e0b8 000cc000 00000000 cf544f7c b4c47000 00400000 cf5aaf90 c014e1cd Feb 19 13:16:48 gimli kernel: Call Trace: Feb 19 13:16:48 gimli kernel: [<c014dfc3>] zap_pte_range+0x1bd/0x21c Feb 19 13:16:48 gimli kernel: [<c014e059>] zap_pmd_range+0x37/0x5a Feb 19 13:16:48 gimli kernel: [<c014e0b8>] unmap_page_range+0x3c/0x5f Feb 19 13:16:48 gimli kernel: [<c014e1cd>] unmap_vmas+0xf2/0x28f Feb 19 13:16:48 gimli kernel: [<c01519ce>] vma_adjust+0x32d/0x419 Feb 19 13:16:48 gimli kernel: [<c0152c25>] unmap_region+0x61/0xc6 Feb 19 13:16:48 gimli kernel: [<c0152f11>] do_munmap+0x16f/0x1e6 Feb 19 13:16:49 gimli kernel: [<c0152fd2>] sys_munmap+0x4a/0x61 Feb 19 13:16:49 gimli kernel: [<c0103443>] syscall_call+0x7/0xb Feb 19 13:16:49 gimli kernel: Code: ff 05 f0 b9 3f c0 50 9d c3 89 c2 8b 00 f6 c4 08 74 08 0f 0b 
e0 01 9d 06 31 c0 83 42 08 ff 0f 98 c0 84 c0 74 19 8b 42 08 40 79 08 <0f> 0b e3 01 9d 06 31 c0 9c 58 fa ff 0d f0 b9 3f c0 50 9d c3 55 Feb 19 13:16:49 gimli kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 Feb 19 13:16:49 gimli kernel: in_atomic():1, irqs_disabled():0 Feb 19 13:16:49 gimli kernel: [<c01188af>] __might_sleep+0x7b/0x85 Feb 19 13:16:49 gimli kernel: [<c011c5e0>] profile_task_exit+0x18/0x41 Feb 19 13:16:49 gimli kernel: [<c011e328>] do_exit+0x17/0x591 Feb 19 13:16:49 gimli kernel: [<c0103de4>] do_trap+0x0/0xa2 Feb 19 13:16:49 gimli kernel: [<c0103fd4>] do_invalid_op+0x0/0x8b Feb 19 13:16:49 gimli kernel: [<c0104053>] do_invalid_op+0x7f/0x8b Feb 19 13:16:49 gimli kernel: [<c013e981>] sync_page+0x0/0x38 Feb 19 13:16:49 gimli kernel: [<c01555ef>] page_remove_rmap+0x22/0x36 Feb 19 13:16:49 gimli kernel: [<c0133355>] wake_bit_function+0x0/0x3c Feb 19 13:16:49 gimli kernel: [<c01465c3>] do_page_cache_readahead+0x26b/0x28a Feb 19 13:16:49 gimli kernel: [<c0133355>] wake_bit_function+0x0/0x3c Feb 19 13:16:49 gimli kernel: [<c013f07d>] find_get_page+0x78/0xf6 Feb 19 13:16:49 gimli kernel: [<c01035eb>] error_code+0x2b/0x30 Feb 19 13:16:49 gimli kernel: [<c014007b>] filemap_nopage+0xd4/0x290 Feb 19 13:16:49 gimli kernel: [<c01555ef>] page_remove_rmap+0x22/0x36 Feb 19 13:16:49 gimli kernel: [<c014dfc3>] zap_pte_range+0x1bd/0x21c Feb 19 13:16:49 gimli kernel: [<c014e059>] zap_pmd_range+0x37/0x5a Feb 19 13:16:49 gimli kernel: [<c014e0b8>] unmap_page_range+0x3c/0x5f Feb 19 13:16:49 gimli kernel: [<c014e1cd>] unmap_vmas+0xf2/0x28f Feb 19 13:16:49 gimli kernel: [<c01519ce>] vma_adjust+0x32d/0x419 Feb 19 13:16:49 gimli kernel: [<c0152c25>] unmap_region+0x61/0xc6 Feb 19 13:16:49 gimli kernel: [<c0152f11>] do_munmap+0x16f/0x1e6 Feb 19 13:16:49 gimli kernel: [<c0152fd2>] sys_munmap+0x4a/0x61 Feb 19 13:16:49 gimli kernel: [<c0103443>] syscall_call+0x7/0xb Feb 19 13:16:49 gimli kernel: note: yum[16136] exited with preempt_count 1 Feb 
19 13:16:49 gimli kernel: scheduling while atomic: yum/0x00000001/16136
Feb 19 13:16:49 gimli kernel: [<c02fd29b>] schedule+0x3d/0x4ea
Feb 19 13:16:49 gimli kernel: [<c011b917>] __call_console_drivers+0x36/0x40
Feb 19 13:16:49 gimli kernel: [<c011ba2f>] call_console_drivers+0xb6/0xd8
Feb 19 13:16:49 gimli kernel: [<c02fe410>] rwsem_down_read_failed+0x1f8/0x216
Feb 19 13:16:49 gimli kernel: [<c011fedf>] .text.lock.exit+0x8b/0xe8
Feb 19 13:16:49 gimli kernel: [<c0103de4>] do_trap+0x0/0xa2
Feb 19 13:16:49 gimli kernel: [<c0103fd4>] do_invalid_op+0x0/0x8b
Feb 19 13:16:49 gimli kernel: [<c0104053>] do_invalid_op+0x7f/0x8b
Feb 19 13:16:49 gimli kernel: [<c013e981>] sync_page+0x0/0x38
Feb 19 13:16:49 gimli kernel: [<c01555ef>] page_remove_rmap+0x22/0x36
Feb 19 13:16:49 gimli kernel: [<c0133355>] wake_bit_function+0x0/0x3c
Feb 19 13:16:49 gimli kernel: [<c01465c3>] do_page_cache_readahead+0x26b/0x28a
Feb 19 13:16:49 gimli kernel: [<c0133355>] wake_bit_function+0x0/0x3c
Feb 19 13:16:49 gimli kernel: [<c013f07d>] find_get_page+0x78/0xf6
Feb 19 13:16:49 gimli kernel: [<c01035eb>] error_code+0x2b/0x30
Feb 19 13:16:49 gimli kernel: [<c014007b>] filemap_nopage+0xd4/0x290
Feb 19 13:16:49 gimli kernel: [<c01555ef>] page_remove_rmap+0x22/0x36
Feb 19 13:16:49 gimli kernel: [<c014dfc3>] zap_pte_range+0x1bd/0x21c
Feb 19 13:16:49 gimli kernel: [<c014e059>] zap_pmd_range+0x37/0x5a
Feb 19 13:16:49 gimli kernel: [<c014e0b8>] unmap_page_range+0x3c/0x5f
Feb 19 13:16:49 gimli kernel: [<c014e1cd>] unmap_vmas+0xf2/0x28f
Feb 19 13:16:49 gimli kernel: [<c01519ce>] vma_adjust+0x32d/0x419
Feb 19 13:16:49 gimli kernel: [<c0152c25>] unmap_region+0x61/0xc6
Feb 19 13:16:49 gimli kernel: [<c0152f11>] do_munmap+0x16f/0x1e6
Feb 19 13:16:49 gimli kernel: [<c0152fd2>] sys_munmap+0x4a/0x61
Feb 19 13:16:49 gimli kernel: [<c0103443>] syscall_call+0x7/0xb

Sometimes I catch that one of these events has occurred by running 'ps -elf': the terminal will scroll a bit, then become unresponsive.
If I am lucky, I can open a new terminal and run 'su -' followed by 'reboot -r now'. More often than not, that hangs somewhere during the shutdown of running processes. If sshd is still running, I can log in and run 'shutdown -n -r now', and that will fully reboot the system.

Partially unlucky: Xorg locks up, but I can log in through sshd and shut things down.

Unlucky day: total and complete hang. No X, no sshd, etc. That's also usually when there's no message in /var/log/messages.

My laptop with FC3 and the same kernel has never encountered this issue. I have seen someone post about changing memory modules; do they mean sticks of RAM or kernel modules? I have run a RAM tester utility for a few hours and it reported no errors.

An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'.

Thank you.

I was hit with a similar crash in FC4, kernel 2.6.12-1.1398_FC4:
Aug 18 12:13:04 nightshade kernel: kernel BUG at mm/rmap.c:493!
Aug 18 12:13:04 nightshade kernel: invalid operand: 0000 [#1] Aug 18 12:13:04 nightshade kernel: Modules linked in: nls_utf8 radeon drm parpor t_pc lp parport it87 eeprom i2c_sensor i2c_isa ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables pl2303 usbserial video button battery ac ohci1394 ieee 1394 ohci_hcd ehci_hcd budget_av saa7146_vv video_buf v4l1_compat v4l2_common vi deodev budget_core dvb_core saa7146 ttpci_eeprom stv0299 tda10021 tda1004x i2c_s is96x i2c_core snd_intel8x0 snd_ac97_codec snd_seq_dummy snd_seq_oss snd_seq_mid i_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd s oundcore snd_page_alloc 8139too mii floppy dm_snapshot dm_zero dm_mirror ext3 jb d dm_mod sata_sil libata sd_mod scsi_mod Aug 18 12:13:04 nightshade kernel: CPU: 0 Aug 18 12:13:04 nightshade kernel: EIP: 0060:[<c016da0b>] Not tainted VLI Aug 18 12:13:04 nightshade kernel: EFLAGS: 00210286 (2.6.12-1.1398_FC4) Aug 18 12:13:04 nightshade kernel: EIP is at page_remove_rmap+0x37/0x41 Aug 18 12:13:04 nightshade kernel: eax: ffffffff ebx: d9b17cbc ecx: c0457a28 edx: c12690c0 Aug 18 12:13:04 nightshade kernel: esi: c12690c0 edi: 00000020 ebp: 0032f000 esp: db60cebc Aug 18 12:13:04 nightshade kernel: ds: 007b es: 007b ss: 0068 Aug 18 12:13:04 nightshade kernel: Process einstein_4.81_i (pid: 2192, threadinfo= db60c000 task=df83a000) Aug 18 12:13:04 nightshade kernel: Stack: c01641d8 00000000 00400000 c0457a28 df88 7000 00400000 00437000 00436fff Aug 18 12:13:04 nightshade kernel: c016436b 00400000 00000000 c0457a28 00201000 0043 7000 cb15059c 00437000 Aug 18 12:13:04 nightshade kernel: c0164546 00437000 00000000 df8bc3e8 d5f9b6fc c016 8d13 09b4f000 09b4f000 Aug 18 12:13:04 nightshade kernel: Call Trace: Aug 18 12:13:04 nightshade kernel: [<c01641d8>] zap_pte_range+0xd6/0x1e6 Aug 18 12:13:04 nightshade kernel: [<c016436b>] unmap_page_range+0x83/0xb7 Aug 18 12:13:04 nightshade kernel: [<c0164546>] unmap_vmas+0x1a7/0x36a Aug 18 12:13:04 nightshade kernel: 
[<c0168d13>] vma_adjust+0x14c/0x6a1 Aug 18 12:13:04 nightshade kernel: [<c01637d3>] free_pgtables+0x7d/0xaa Aug 18 12:13:04 nightshade kernel: [<c016a66f>] unmap_region+0xb7/0x236 Aug 18 12:13:04 nightshade kernel: [<c016aa1a>] do_munmap+0xc1/0xf9 Aug 18 12:13:04 nightshade kernel: [<c016aa9d>] sys_munmap+0x4b/0x63 Aug 18 12:13:04 nightshade kernel: [<c0103a51>] syscall_call+0x7/0xb Aug 18 12:13:04 nightshade kernel: Code: ff 0f 98 c0 84 c0 75 01 c3 8b 42 08 83 c0 01 90 78 19 ba ff ff ff ff b8 10 00 00 00 e9 b5 a5 fe ff 0f 0b ea 01 8d 6d 38 c0 eb d2 <0f> 0b ed 0 1 8d 6d 38 c0 eb dd 55 57 56 53 83 ec 24 89 c7 89 d3 Aug 18 12:13:04 nightshade kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 Aug 18 12:13:04 nightshade kernel: in_atomic():1, irqs_disabled():0 Aug 18 12:13:04 nightshade kernel: [<c0122653>] profile_task_exit+0x13/0x43 Aug 18 12:13:04 nightshade kernel: [<c0124acb>] do_exit+0x19/0x500 Aug 18 12:13:04 nightshade kernel: [<c0269adc>] do_unblank_screen+0x55/0x13d Aug 18 12:13:04 nightshade kernel: [<c0121748>] printk+0x1b/0x1f Aug 18 12:13:04 nightshade kernel: [<c010461e>] die+0x22c/0x2c4 Aug 18 12:13:04 nightshade kernel: [<c011970b>] fixup_exception+0xb/0x20 Aug 18 12:13:04 nightshade kernel: [<c01048ea>] do_invalid_op+0x0/0xab Aug 18 12:13:04 nightshade kernel: [<c010498c>] do_invalid_op+0xa2/0xab Aug 18 12:13:04 nightshade kernel: [<c016da0b>] page_remove_rmap+0x37/0x41 Aug 18 12:13:04 nightshade kernel: [<c01abb98>] __mark_inode_dirty+0x28/0x2e8 Aug 18 12:13:04 nightshade kernel: [<c0105b24>] do_IRQ+0x51/0x82 Aug 18 12:13:04 nightshade kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Aug 18 12:13:04 nightshade kernel: [<c0103c6b>] error_code+0x4f/0x54 Aug 18 12:13:04 nightshade kernel: [<c016da0b>] page_remove_rmap+0x37/0x41 Aug 18 12:13:04 nightshade kernel: [<c01641d8>] zap_pte_range+0xd6/0x1e6 Aug 18 12:13:04 nightshade kernel: [<c016436b>] unmap_page_range+0x83/0xb7 Aug 18 12:13:04 nightshade kernel: 
[<c0164546>] unmap_vmas+0x1a7/0x36a Aug 18 12:13:04 nightshade kernel: [<c0168d13>] vma_adjust+0x14c/0x6a1 Aug 18 12:13:04 nightshade kernel: [<c01637d3>] free_pgtables+0x7d/0xaa Aug 18 12:13:04 nightshade kernel: [<c016a66f>] unmap_region+0xb7/0x236 Aug 18 12:13:04 nightshade kernel: [<c016aa1a>] do_munmap+0xc1/0xf9 Aug 18 12:13:04 nightshade kernel: [<c016aa9d>] sys_munmap+0x4b/0x63 Aug 18 12:13:04 nightshade kernel: [<c0103a51>] syscall_call+0x7/0xb Aug 18 12:13:04 nightshade kernel: note: einstein_4.81_i[2192] exited with preempt_count 1 Aug 18 12:13:04 nightshade kernel: scheduling while atomic: einstein_4.81_i/0x00000001/2192 Aug 18 12:13:04 nightshade kernel: [<c03707fd>] schedule+0x56d/0x7b3 Aug 18 12:13:04 nightshade kernel: [<c011ad20>] recalc_task_prio+0xe7/0x150 Aug 18 12:13:04 nightshade kernel: [<c037241f>] rwsem_down_read_failed+0xaf/0x2b8 Aug 18 12:13:04 nightshade kernel: [<c0268d9a>] vt_console_print+0x58/0x297 Aug 18 12:13:04 nightshade kernel: [<c0268d9a>] vt_console_print+0x58/0x297 Aug 18 12:13:04 nightshade kernel: [<c0145452>] .text.lock.futex+0x7/0xd9 Aug 18 12:13:04 nightshade kernel: [<c0121446>] __call_console_drivers+0x38/0x44 Aug 18 12:13:04 nightshade kernel: [<c012152f>] call_console_drivers+0x80/0x14c Aug 18 12:13:04 nightshade kernel: [<c0145328>] do_futex+0x75/0x7c Aug 18 12:13:04 nightshade kernel: [<c014537f>] sys_futex+0x50/0x108 Aug 18 12:13:04 nightshade kernel: [<c0212758>] vscnprintf+0x14/0x21 Aug 18 12:13:04 nightshade kernel: [<c011e0b4>] mm_release+0x7c/0x83 Aug 18 12:13:04 nightshade kernel: [<c0123e61>] exit_mm+0x12/0x27d Aug 18 12:13:04 nightshade kernel: [<c0104056>] show_trace+0x2a/0x78 Aug 18 12:13:04 nightshade kernel: [<c0103a51>] syscall_call+0x7/0xb Aug 18 12:13:04 nightshade kernel: [<c0121748>] printk+0x1b/0x1f Aug 18 12:13:04 nightshade kernel: [<c0124b75>] do_exit+0xc3/0x500 Aug 18 12:13:04 nightshade kernel: [<c0121748>] printk+0x1b/0x1f Aug 18 12:13:04 nightshade kernel: [<c010461e>] die+0x22c/0x2c4 Aug 
18 12:13:04 nightshade kernel: [<c011970b>] fixup_exception+0xb/0x20 Aug 18 12:13:04 nightshade kernel: [<c01048ea>] do_invalid_op+0x0/0xab Aug 18 12:13:04 nightshade kernel: [<c010498c>] do_invalid_op+0xa2/0xab Aug 18 12:13:04 nightshade kernel: [<c016da0b>] page_remove_rmap+0x37/0x41 Aug 18 12:13:04 nightshade kernel: [<c01abb98>] __mark_inode_dirty+0x28/0x2e8 Aug 18 12:13:04 nightshade kernel: [<c0105b24>] do_IRQ+0x51/0x82 Aug 18 12:13:04 nightshade kernel: [<c0103c0e>] common_interrupt+0x1a/0x20 Aug 18 12:13:04 nightshade kernel: [<c0103c6b>] error_code+0x4f/0x54 Aug 18 12:13:04 nightshade kernel: [<c016da0b>] page_remove_rmap+0x37/0x41 Aug 18 12:13:04 nightshade kernel: [<c01641d8>] zap_pte_range+0xd6/0x1e6 Aug 18 12:13:04 nightshade kernel: [<c016436b>] unmap_page_range+0x83/0xb7 Aug 18 12:13:04 nightshade kernel: [<c0164546>] unmap_vmas+0x1a7/0x36a Aug 18 12:13:04 nightshade kernel: [<c0168d13>] vma_adjust+0x14c/0x6a1 Aug 18 12:13:04 nightshade kernel: [<c01637d3>] free_pgtables+0x7d/0xaa Aug 18 12:13:04 nightshade kernel: [<c016a66f>] unmap_region+0xb7/0x236 Aug 18 12:13:04 nightshade kernel: [<c016aa1a>] do_munmap+0xc1/0xf9 Aug 18 12:13:04 nightshade kernel: [<c016aa9d>] sys_munmap+0x4b/0x63 Aug 18 12:13:04 nightshade kernel: [<c0103a51>] syscall_call+0x7/0xb I just had a similar crash on Fedora Core 4 64bit (2.6.13-1.1532_FC4smp). 
Logs show: Nov 17 00:23:38 hera kernel: swap_free: Bad swap file entry 7fffff802a Nov 17 00:23:38 hera kernel: swap_free: Bad swap file entry 800007fffff802a Nov 17 00:23:38 hera kernel: swap_free: Bad swap file entry 1000007fffff802a Nov 17 00:23:38 hera kernel: swap_free: Bad swap file entry 1800007fffff802a Nov 17 00:23:38 hera kernel: swap_free: Bad swap file entry 4000000000000000 Nov 17 00:23:38 hera kernel: ----------- [cut here ] --------- [please bite here ] --------- Nov 17 00:23:38 hera kernel: Kernel BUG at "mm/rmap.c":493 Nov 17 00:23:38 hera kernel: invalid operand: 0000 [1] SMP Nov 17 00:23:38 hera kernel: CPU 1 Nov 17 00:23:38 hera kernel: Modules linked in: loop nfsd exportfs parport_pc lp parport autofs4 nfs lockd nfs_acl rfcomm l2cap bluetooth sunrpc pcmcia yenta_socket rsrc_nonstatic pcmcia_core ipt_REJECT ipt_LOG ipt_ttl ipt_limit ipt_state ip_conntrack ipt_multiport iptable_filter ip_tables dm_mod video button battery ac ipv6 ohci_hcd i2c_amd8111 i2c_core hw_random shpchp e1000 floppy mptfc mptspi ext3 jbd 3w_xxxx mptscsih mptbase sd_mod scsi_mod Nov 17 00:23:38 hera kernel: Pid: 19066, comm: sh Tainted: G M 2.6.13-1.1532_FC4smp Nov 17 00:23:38 hera kernel: RIP: 0010:[<ffffffff8017475a>] <ffffffff8017475a>{page_remove_rmap+43} Nov 17 00:23:38 hera kernel: RSP: 0018:ffff8100a7de9dc0 EFLAGS: 00010286 Nov 17 00:23:38 hera kernel: RAX: 00000000ffffffff RBX: 0000000000000020 RCX: ffff810172ee94e8 Nov 17 00:23:38 hera kernel: RDX: ffff8101f03789f8 RSI: ffff81000c3e74e8 RDI: ffff8100011a69c8 Nov 17 00:23:38 hera kernel: RBP: ffff8100011a69c8 R08: ffff8100011a69c8 R09: 00000000fffffffa Nov 17 00:23:38 hera kernel: R10: ffff8100697903e0 R11: 0000000000000002 R12: ffff8100ce1ef2d0 Nov 17 00:23:38 hera kernel: R13: 000000000045a000 R14: 00000000004a8000 R15: ffff81010383e5a0 Nov 17 00:23:38 hera kernel: FS: 00002aaaaaad1000(0000) GS:ffffffff80502880(0000) knlGS:00000000f715bbb0 Nov 17 00:23:38 hera kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 
Nov 17 00:23:38 hera kernel: CR2: 0000003f17616178 CR3: 0000000097fc6000 CR4: 00000000000006e0 Nov 17 00:23:38 hera kernel: Process sh (pid: 19066, threadinfo ffff8100a7de8000, task ffff8101ae5bc130) Nov 17 00:23:38 hera kernel: Stack: ffffffff8016d588 ffff8100a7de9eb0 ffffffffffffffff 0000000000000000 Nov 17 00:23:38 hera kernel: ffff8100483ab788 ffff8100f9e01240 ffff8100a7de9eb8 0000000000400000 Nov 17 00:23:38 hera kernel: 00000000004a8000 0000000000000000 Nov 17 00:23:38 hera kernel: Call Trace:<ffffffff8016d588>{unmap_vmas+1228} <ffffffff80170c47>{exit_mmap+170} Nov 17 00:23:38 hera kernel: <ffffffff80132f7d>{mmput+37} <ffffffff80137a38>{do_exit+504} Nov 17 00:23:38 hera kernel: <ffffffff8013857d>{sys_exit_group+0} <ffffffff8010db02>{tracesys+209} Nov 17 00:23:38 hera kernel: Nov 17 00:23:38 hera kernel: Nov 17 00:23:38 hera kernel: Code: 0f 0b a3 f7 e4 35 80 ff ff ff ff c2 ed 01 48 c7 c6 ff ff ff Nov 17 00:23:38 hera kernel: RIP <ffffffff8017475a>{page_remove_rmap+43} RSP <ffff8100a7de9dc0> Nov 17 00:23:38 hera kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 Nov 17 00:23:38 hera kernel: in_atomic():0, irqs_disabled():1 Nov 17 00:23:38 hera kernel: Nov 17 00:23:38 hera kernel: Call Trace:<ffffffff80136667>{profile_task_exit+21} <ffffffff80137862>{do_exit+34} Nov 17 00:23:38 hera kernel: <ffffffff80252864>{do_unblank_screen+142} <ffffffff8010f4a5>{default_do_nmi+0} Nov 17 00:23:38 hera kernel: <ffffffff8010fee6>{do_invalid_op+163} <ffffffff8017475a>{page_remove_rmap+43} Nov 17 00:23:38 hera kernel: <ffffffff80135f72>{printk+78} <ffffffff8010e4bd>{error_exit+0} Nov 17 00:23:38 hera kernel: <ffffffff8017475a>{page_remove_rmap+43} <ffffffff8016d588>{unmap_vmas+1228} Nov 17 00:23:38 hera kernel: <ffffffff80170c47>{exit_mmap+170} <ffffffff80132f7d>{mmput+37} Nov 17 00:23:38 hera kernel: <ffffffff80137a38>{do_exit+504} <ffffffff8013857d>{sys_exit_group+0} Nov 17 00:23:38 hera kernel: <ffffffff8010db02>{tracesys+209} Nov 17 
00:23:38 hera kernel: Fixing recursive fault but reboot is needed!

And then later:
Nov 17 00:37:05 hera kernel: Bad page state at free_hot_cold_page (in process 'kswapd0', page ffff8100011a69c8)
Nov 17 00:37:05 hera kernel: flags:0x0100000000010008 mapping:0000000000000000 mapcount:-1 count:0 (Tainted: G M )
Nov 17 00:37:05 hera kernel: Backtrace:
Nov 17 00:37:05 hera kernel:
Nov 17 00:37:05 hera kernel: Call Trace:<ffffffff80161998>{bad_page+135} <ffffffff80161e01>{free_hot_cold_page+126}
Nov 17 00:37:05 hera kernel: <ffffffff80161e9d>{__pagevec_free+41} <ffffffff801673f7>{__pagevec_release_nonlru+154}
Nov 17 00:37:05 hera kernel: <ffffffff801692d0>{shrink_zone+3155} <ffffffff801afe51>{mb_cache_shrink_fn+123}
Nov 17 00:37:05 hera kernel: <ffffffff801697c3>{balance_pgdat+572} <ffffffff80169a62>{kswapd+296}
Nov 17 00:37:05 hera kernel: <ffffffff8014a390>{autoremove_wake_function+0} <ffffffff80131ab6>{schedule_tail+57}
Nov 17 00:37:05 hera kernel: <ffffffff8010e672>{child_rip+8} <ffffffff8011a954>{flat_send_IPI_mask+0}
Nov 17 00:37:05 hera kernel: <ffffffff8016993a>{kswapd+0} <ffffffff8010e66a>{child_rip+0}
Nov 17 00:37:05 hera kernel:
Nov 17 00:37:05 hera kernel: Trying to fix it up, but a reboot is needed

And then the machine crashed a couple of minutes later. Before the crash I up2date'd to the newest FC4_64 kernel, 2.6.14-1.1637_FC4smp, but I did not boot into that new kernel until this morning, after the crash.

As a side note, we have three SCSI devices on our system: a 3ware RAID mirror (/dev/sda, the system disk) and two external SCSI disks (/dev/sdb and /dev/sdc). After the crash, the system switched the device order, and the system device became /dev/sdc. I did NOT touch the hardware or swap any cards in the system at all. I had to boot a Knoppix CD to fix fstab so I could boot the system. I'm not sure if this problem is related to the kernel bug or not.
Alex

Alex, that one is very likely caused by the AMD TLB flush filter erratum, which should have been fixed in a BIOS update, though some vendors never seem to have shipped one. Later kernel updates apply the fix if they detect that the BIOS didn't. If anyone else is seeing this on AMD64 hardware, this was the cause.

As no one managed to reproduce the other issue originally reported here on a 2.6.12 kernel, I'm fairly confident that this is fixed.

Dave, see comment #11. That's a 2.6.12 kernel on a Pentium 4 (32-bit i686 kernel), so why do you say the issue hasn't been reproduced on a 2.6.12 kernel?

It's not necessarily the same bug, and as FC4 has moved on since then, if it is still occurring, I'd rather that get opened as a new FC4 bug. As far as I can see, for FC3, this is fixed.

(That was 64-bit btw; note the addresses/registers in the oops.)

The specific BUG_ON that's getting hit could be caused by any number of things: random memory corruption due to a driver bug, bad hardware, or CPU errata. So whilst many reports of it may look the same, they may well have different causes.
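
Since reports that trip the same BUG_ON can have different causes, one way to compare them is to tabulate each oops by source location, faulting symbol, kernel release, and word size. The Python sketch below does that for the two header styles quoted in this bug. The function name `summarize_oops` and the regular expressions are illustrative assumptions, not part of any kernel or Fedora tooling, and they have only been matched against the excerpts in this thread.

```python
import re

# Two header styles appear in the reports in this bug:
#   kernel BUG at mm/rmap.c:477!        (32-bit reports)
#   Kernel BUG at "mm/rmap.c":493       (64-bit report)
BUG_RE = re.compile(r'kernel BUG at "?([\w./-]+)"?:(\d+)', re.IGNORECASE)

# Faulting instruction pointer:
#   EIP is at page_remove_rmap+0x23/0x4a          (32-bit)
#   RIP: 0010:[<...>] <...>{page_remove_rmap+43}  (64-bit)
EIP_RE = re.compile(r'EIP is at (\w+)\+')
RIP_RE = re.compile(r'RIP[: ].*?\{(\w+)\+')

# Fedora kernel release string, e.g. 2.6.9-1.681_FC3smp
REL_RE = re.compile(r'\b(2\.6\.\d+-[\w.]+_FC\d+\w*)')


def summarize_oops(text):
    """Return a small dict describing one oops excerpt (hypothetical helper)."""
    bug = BUG_RE.search(text)
    eip = EIP_RE.search(text)
    rip = RIP_RE.search(text)
    rel = REL_RE.search(text)
    ip = eip or rip
    return {
        "file": bug.group(1) if bug else None,
        "line": int(bug.group(2)) if bug else None,
        "symbol": ip.group(1) if ip else None,
        # RIP/RAX-style registers indicate a 64-bit oops, EIP/eax a 32-bit one.
        "bits": 64 if rip else (32 if eip else None),
        "release": rel.group(1) if rel else None,
    }


if __name__ == "__main__":
    sample = (
        "kernel BUG at mm/rmap.c:477!\n"
        "EFLAGS: 00010286 (2.6.9-1.681_FC3smp)\n"
        "EIP is at page_remove_rmap+0x23/0x4a\n"
    )
    print(summarize_oops(sample))
```

Grouping the results by "release" and "bits" makes it easier to see that, for example, a 64-bit report on one kernel and a 32-bit report on another hit the same function but need not share a root cause.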