Description of problem:
While typing an email in X, I was logged out and had to log back in. I checked syslog and found this:

Jul 8 16:31:40 centaur kernel: BUG: unable to handle kernel paging request at ffffffff8d0c5930
Jul 8 16:31:40 centaur kernel: IP: [<ffffffff8100c02e>] tracesys+0xb1/0xda
Jul 8 16:31:40 centaur kernel: PGD 203067 PUD 207063 PMD 0
Jul 8 16:31:40 centaur kernel: Oops: 0000 [1] SMP
Jul 8 16:31:40 centaur kernel: CPU 0
Jul 8 16:31:40 centaur kernel: Modules linked in: tun autofs4 it87 hwmon_vid hwmon nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables cpufreq_ondemand powernow_k8 freq_table dm_mirror dm_mod snd_hda_intel snd_seq_dummy ppdev snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm parport_pc floppy snd_timer parport snd_page_alloc snd_hwdep pcspkr snd tulip soundcore sg i2c_nforce2 i2c_core button usb_storage pata_amd sata_sil24 sata_nv ata_generic pata_acpi libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Jul 8 16:31:40 centaur kernel: Pid: 4211, comm: Xorg Not tainted 2.6.25.9-76.fc9.x86_64 #1
Jul 8 16:31:40 centaur kernel: RIP: 0010:[<ffffffff8100c02e>] [<ffffffff8100c02e>] tracesys+0xb1/0xda
Jul 8 16:31:40 centaur kernel: RSP: 0018:ffff8100749a7f58 EFLAGS: 00010246
Jul 8 16:31:40 centaur kernel: RAX: 0000000000000000 RBX: 0000000000000004 RCX: ffffffffffffffff
Jul 8 16:31:40 centaur kernel: RDX: 0000000000001000 RSI: 00007f6c40d9dfb0 RDI: 0000000000000028
Jul 8 16:31:40 centaur kernel: RBP: 000000000202e470 R08: 0000000000000000 R09: 0000000000000000
Jul 8 16:31:40 centaur kernel: R10: 000000000202e120 R11: 0000000000003246 R12: 0000000000000000
Jul 8 16:31:40 centaur kernel: R13: 000000000202e120 R14: 0000000000000028 R15: 000000000202ed80
Jul 8 16:31:40 centaur kernel: FS: 00007f6c55014780(0000) GS:ffffffff813f2000(0000) knlGS:0000000000000000
Jul 8 16:31:40 centaur kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 8 16:31:40 centaur kernel: CR2: ffffffff8d0c5930 CR3: 0000000074d24000 CR4: 00000000000006e0
Jul 8 16:31:40 centaur kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 8 16:31:40 centaur kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 8 16:31:40 centaur kernel: Process Xorg (pid: 4211, threadinfo ffff8100749a6000, task ffff810059068000)
Jul 8 16:31:40 centaur kernel: Stack: 000000000202ed80 0000000000000028 000000000202e120 0000000000000000
Jul 8 16:31:40 centaur kernel: 000000000202e470 0000000000000004 0000000000003246 000000000202e120
Jul 8 16:31:40 centaur kernel: 0000000000000000 0000000000000000 ffffffffffffffda ffffffffffffffff
Jul 8 16:31:40 centaur kernel: Call Trace:
Jul 8 16:31:40 centaur kernel:
Jul 8 16:31:40 centaur kernel:
Jul 8 16:31:40 centaur kernel: Code: 8b 4c 24 58 48 8b 54 24 60 48 8b 74 24 68 48 8b 7c 24 70 48 8b 44 24 78 4c 8b 3c 24 4c 8b 74 24 08 4c 8b 6c 24 10 4c 8b 64 24 18 <48> 8b 6c 24 20 48 8b 5c 24 28 48 83 c4 30 48 3d 1f 01 00 00 0f
Jul 8 16:31:40 centaur kernel: RIP [<ffffffff8100c02e>] tracesys+0xb1/0xda
Jul 8 16:31:40 centaur kernel: RSP <ffff8100749a7f58>
Jul 8 16:31:40 centaur kernel: CR2: ffffffff8d0c5930
Jul 8 16:31:40 centaur kernel: ---[ end trace 2f77648f7d959a12 ]---
Jul 8 16:31:40 centaur kernel: BUG: sleeping function called from invalid context at kernel/rwsem.c:21
Jul 8 16:31:40 centaur kernel: in_atomic():0, irqs_disabled():1
Jul 8 16:31:40 centaur kernel: Pid: 4211, comm: Xorg Tainted: G D 2.6.25.9-76.fc9.x86_64 #1
Jul 8 16:31:40 centaur kernel:
Jul 8 16:31:40 centaur kernel: Call Trace:
Jul 8 16:31:40 centaur kernel: [<ffffffff81048d51>] ? __remove_hrtimer+0x7f/0x8c
Jul 8 16:31:40 centaur kernel: [<ffffffff8102a552>] __might_sleep+0xb4/0xb6
Jul 8 16:31:40 centaur kernel: [<ffffffff8128dd02>] down_read+0x1d/0x2e
Jul 8 16:31:40 centaur kernel: [<ffffffff8105c395>] acct_collect+0x42/0x19f
Jul 8 16:31:40 centaur kernel: [<ffffffff81036b13>] do_exit+0x21d/0x656
Jul 8 16:31:40 centaur kernel: [<ffffffff8128f5cd>] oops_begin+0x0/0x90
Jul 8 16:31:40 centaur kernel: [<ffffffff81291299>] do_page_fault+0x7be/0x89a
Jul 8 16:31:40 centaur kernel: [<ffffffff8110648f>] ? inode_has_perm+0x5b/0x61
Jul 8 16:31:40 centaur kernel: [<ffffffff810a3dc0>] ? do_sync_read+0xe7/0x12d
Jul 8 16:31:40 centaur kernel: [<ffffffff8106b0dc>] ? audit_filter_rules+0x6c0/0x77a
Jul 8 16:31:40 centaur kernel: [<ffffffff81046b83>] ? autoremove_wake_function+0x0/0x38
Jul 8 16:31:40 centaur kernel: [<ffffffff8128f039>] error_exit+0x0/0x51
Jul 8 16:31:40 centaur kernel: [<ffffffff8100c02e>] ? tracesys+0xb1/0xda
Jul 8 16:31:40 centaur kernel: [<ffffffff8100bfee>] ? tracesys+0x71/0xda
Jul 8 16:31:40 centaur kernel:
Jul 8 16:31:40 centaur gconfd (sgrubb-4338): Listener ID 4127195139 doesn't exist
Jul 8 16:31:40 centaur gconfd (sgrubb-4338): Listener ID 4143972356 doesn't exist
Jul 8 16:31:40 centaur kernel: mtrr: base(0xe0000000) is not aligned on a size(0xff00000) boundary

Version-Release number of selected component (if applicable):
2.6.25.9-76.fc9.x86_64
/usr/src/debug////////kernel-2.6.25/linux-2.6.25.x86_64/arch/x86/kernel/entry_64.S:327
ffffffff8100c01b: 4c 8b 3c 24       mov (%rsp),%r15
ffffffff8100c01f: 4c 8b 74 24 08    mov 0x8(%rsp),%r14
ffffffff8100c024: 4c 8b 6c 24 10    mov 0x10(%rsp),%r13
ffffffff8100c029: 4c 8b 64 24 18    mov 0x18(%rsp),%r12
ffffffff8100c02e: 48 8b 6c 24 20    mov 0x20(%rsp),%rbp   <=====
ffffffff8100c033: 48 8b 5c 24 28    mov 0x28(%rsp),%rbx
ffffffff8100c038: 48 83 c4 30       add $0x30,%rsp

tracesys:
        SAVE_REST
        movq $-ENOSYS,RAX(%rsp) /* ptrace can change this for a bad syscall */
        FIXUP_TOP_OF_STACK %rdi
        movq %rsp,%rdi
        call syscall_trace_enter
        LOAD_ARGS ARGOFFSET /* reload args from stack in case ptrace changed it */
        RESTORE_REST   <=====
        cmpq $__NR_syscall_max,%rax
        ja int_ret_from_sys_call /* RAX(%rsp) set to -ENOSYS above */
        movq %r10,%rcx /* fixup for C */
        call *sys_call_table(,%rax,8)
        movq %rax,RAX-ARGOFFSET(%rsp)
        /* Use IRET because user could have changed frame */
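As a cross-check on the listing above, here is a rough Python sketch that ties the marked instruction back to the oops. It assumes the usual convention that the byte in angle brackets on the Code: line sits at the faulting RIP; the variable names are made up for illustration, and the values are copied from the report.

```python
# Values copied from the oops report; the names are illustrative only.
RIP = 0xFFFFFFFF8100C02E      # tracesys+0xb1/0xda
OFFSET_IN_TRACESYS = 0xB1
CODE = (
    "8b 4c 24 58 48 8b 54 24 60 48 8b 74 24 68 48 8b 7c 24 70 "
    "48 8b 44 24 78 4c 8b 3c 24 4c 8b 74 24 08 4c 8b 6c 24 10 "
    "4c 8b 64 24 18 <48> 8b 6c 24 20 48 8b 5c 24 28 48 83 c4 30 "
    "48 3d 1f 01 00 00 0f"
)

# Start of tracesys, derived from the symbol+offset form in the report.
tracesys_start = RIP - OFFSET_IN_TRACESYS

# The token wrapped in <> is assumed to be the first byte at the faulting RIP.
tokens = CODE.split()
marked = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
faulting_bytes = [int(tok.strip("<>"), 16) for tok in tokens[marked:marked + 5]]

print(f"tracesys starts at {tracesys_start:#x}")
print("bytes at RIP:", " ".join(f"{b:02x}" for b in faulting_bytes))
```

The five bytes at the marker can then be compared by eye with the encoding shown for ffffffff8100c02e in the disassembly, and adding 0x71 to the computed start should land on the other tracesys entry in the second call trace.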
Either your processor is failing/overheating or something very strange is going on. That instruction should access address ffff8100749a7f78 ... and the previous instruction successfully accessed memory 8 bytes below that.
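The arithmetic behind that statement can be sketched in a few lines of Python. The register values are taken from the oops; the helper name `effective_address` is only for illustration.

```python
# Values copied from the oops report.
RSP = 0xFFFF8100749A7F58   # stack pointer at the time of the fault
CR2 = 0xFFFFFFFF8D0C5930   # address the CPU reported as faulting

# Displacements of the two neighbouring loads in the disassembly.
PREV_DISP = 0x18    # mov 0x18(%rsp),%r12  (completed without a fault)
FAULT_DISP = 0x20   # mov 0x20(%rsp),%rbp  (the marked instruction)


def effective_address(base: int, disp: int) -> int:
    """Base plus displacement, wrapped to 64 bits."""
    return (base + disp) & 0xFFFFFFFFFFFFFFFF


expected = effective_address(RSP, FAULT_DISP)
previous = effective_address(RSP, PREV_DISP)

print(f"address the load should touch: {expected:#018x}")
print(f"address of the previous load:  {previous:#018x}")
print(f"address reported in CR2:       {CR2:#018x}")
print("CR2 matches expected address:", expected == CR2)
```

The computed address is nowhere near the CR2 value, which is why the fault does not look like anything the instruction itself could have caused.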
Hard to say if it's failing. This is the only Oops/BUG I've seen since I've owned it. It is an AMD 9600, but it's not overclocked. Hmm... I will watch for other strangeness. Thanks for the analysis.
Steve, how did things work out with that hardware? Did you ever see this again?
Since reporting the bug, I have changed hardware and am now using F-10. So, I no longer see the problem. This bug can be closed unless anyone really wants it open for some reason.