Description of problem:
I'm getting a few segfaults with the latest kernel. I also suspect they share a cause with a weird X bug I've seen on a fully up-to-date Fedora 9 (and on LiveCDs), where I can't start X more than once. I managed to catch the dmesg (attached) while rebooting the other day; I hadn't done anything 'special' beforehand. There are no third-party modules and no Livna kmods/akmods.

Version-Release number of selected component (if applicable):
2.6.25.3-18.fc9.x86_64

How reproducible:
I have no idea, but I suspect always on my desktop machine.

Additional Info:

[root@localhost ~]# lspci
00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 02)
00:01.0 PCI bridge: Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation G71 [GeForce 7300 GS] (rev a1)
02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit Ethernet Adapter (rev b0)
03:00.0 IDE interface: Marvell Technology Group Ltd. 88SE6121 SATA II Controller (rev b1)

[root@localhost ~]# lsmod
Module                  Size  Used by
ppdev                  15624  0
parport_pc             33816  0
lp                     19300  0
parport                42784  3 ppdev,parport_pc,lp
bridge                 59304  0
bnep                   21632  2
rfcomm                 44448  4
l2cap                  29312  16 bnep,rfcomm
bluetooth              59044  5 bnep,rfcomm,l2cap
fuse                   51008  1
sunrpc                185000  3
ipt_REJECT             11776  2
nf_conntrack_ipv4      17416  2
iptable_filter         11392  1
ip_tables              25232  1 iptable_filter
ip6t_REJECT            12544  2
xt_tcpudp              11648  2
nf_conntrack_ipv6      22984  2
xt_state               10752  4
nf_conntrack           64528  3 nf_conntrack_ipv4,nf_conntrack_ipv6,xt_state
ip6table_filter        11264  1
ip6_tables             26640  1 ip6table_filter
x_tables               26248  6 ipt_REJECT,ip_tables,ip6t_REJECT,xt_tcpudp,xt_state,ip6_tables
ipv6                  276232  38 ip6t_REJECT,nf_conntrack_ipv6
cpufreq_ondemand       15760  1
acpi_cpufreq           16656  1
freq_table             13440  2 cpufreq_ondemand,acpi_cpufreq
dm_mirror              32004  0
dm_multipath           24976  0
dm_mod                 62104  2 dm_mirror,dm_multipath
snd_hda_intel         447540  3
snd_seq_dummy          11524  0
snd_seq_oss            39232  0
ahci                   35976  0
snd_seq_midi_event     15104  1 snd_seq_oss
snd_seq                61840  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
floppy                 66216  0
snd_seq_device         15508  3 snd_seq_dummy,snd_seq_oss,snd_seq
i2c_i801               17692  0
snd_pcm_oss            52096  0
sg                     40528  0
i2c_core               28448  1 i2c_i801
pcspkr                 11136  0
iTCO_wdt               19920  0
snd_mixer_oss          23296  1 snd_pcm_oss
snd_pcm                86024  2 snd_hda_intel,snd_pcm_oss
iTCO_vendor_support    11780  1 iTCO_wdt
atl1                   39052  0
snd_timer              29584  2 snd_seq,snd_pcm
mii                    13184  1 atl1
snd_page_alloc         16912  2 snd_hda_intel,snd_pcm
usb_storage            95008  0
button                 15776  0
snd_hwdep              16520  1 snd_hda_intel
pata_marvell           13696  0
sr_mod                 23732  0
snd                    66808  16 snd_hda_intel,snd_seq_dummy,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_hwdep
soundcore              14864  1 snd
cdrom                  40616  1 sr_mod
ata_piix               29188  4
pata_acpi              13824  0
ata_generic            14724  0
libata                149280  5 ahci,pata_marvell,ata_piix,pata_acpi,ata_generic
sd_mod                 33200  6
scsi_mod              150744  5 sg,usb_storage,sr_mod,libata,sd_mod
ext3                  130320  3
jbd                    53160  1 ext3
mbcache                15876  1 ext3
uhci_hcd               29984  0
ohci_hcd               28932  0
ehci_hcd               40588  0
Created attachment 306326 [details] dmesg output after Recursive Fault
kernel BUG at mm/filemap.c:126!
BUG_ON(page_mapped(page));

Your machine has taken a machine check error:
"Tainted: G M"

Can you run the mcelog program when that happens and see what the error is?
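For readers unfamiliar with the "Tainted:" string, the following is a minimal Python sketch of how its letters can be read. The table is a deliberately partial subset of the kernel's taint letters as I understand the kernel documentation, and `TAINT_FLAGS` and `decode_taint` are hypothetical names used only for illustration.

```python
# Partial, illustrative table of kernel taint letters. This is an assumption
# based on the kernel's taint documentation, not a complete or authoritative list.
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "G": "only GPL-compatible modules loaded",
    "F": "module was force-loaded",
    "R": "module was force-unloaded",
    "M": "machine check exception reported by the CPU",
    "B": "bad page state detected",
    "D": "kernel has already oopsed",
}


def decode_taint(line):
    """Return (letter, meaning) pairs for the flags after 'Tainted:'."""
    # partition() yields an empty flags string when 'Tainted:' is absent.
    _, _, flags = line.partition("Tainted:")
    return [
        (ch, TAINT_FLAGS.get(ch, "unknown flag"))
        for ch in flags
        if not ch.isspace()
    ]


if __name__ == "__main__":
    for letter, meaning in decode_taint("Tainted: G M"):
        print(letter, "-", meaning)
```

The 'M' is what points at a machine check here; the 'G' only says that no proprietary modules were loaded.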
It might be worth running memtest86 for a while too, just to rule out bad RAM. These things are common indications of hardware problems of some kind (bad RAM, insufficient power or cooling, or just general flakiness).
(In reply to comment #2)
> kernel BUG at mm/filemap.c:126!
> BUG_ON(page_mapped(page));
>
>
> Your machine has taken a machine check error:
> "Tainted: G M"
>
> Can you run the mcelog program when that happens and see what the error is?
>

I shall attempt this the next time I see it; I've already had two segfaults today. I also just noticed another recursive fault (Kernel Oops spotted it, in all fairness), so how am I meant to run mcelog?

(In reply to comment #3)
> it might be worth a run of memtest86 for a while too, just to rule out bad ram.
> These things are common indications of hardware problems of some kind (bad
> ram/insufficient power/cooling, or just general flakyness)

I was considering bad RAM. I recently installed an extra 2 GB, but dare I say it, it seems to run Vista okay, and I ran Fedora 8 quite happily until recently. I can understand your other points too, although in fairness the machine has worked pretty well for the last 8 or so months with both Linux and Windows. I'll run memtest86 when I go to sleep tonight or while I have dinner, though.
Created attachment 306449 [details]
dmesg | mcelog --ascii

Okay, scrap my last comment. I googled and found 'dmesg | mcelog --ascii'; this is the result. "HARDWARE ERROR" seems to be the telltale sign. I take it this refers to my nice dual-core processor and is not in fact a kernel bug? I'm a little confused here, so a pointer in the right direction would be most appreciated.
Looking at the mcelog manpage, I think you just want to run mcelog without any arguments. The old method of writing events to the syslog is obsolete.
(In reply to comment #6)
> Looking at the mcelog manpage, I think you just want to run mcelog without any
> arguments. The old method of writing events to the syslog is obsolete.

That returned absolutely nothing.

(In reply to comment #3)
> it might be worth a run of memtest86 for a while too, just to rule out bad ram.
> These things are common indications of hardware problems of some kind (bad
> ram/insufficient power/cooling, or just general flakyness)

I think you might be right. I gave up running memtest86+ after 900 errors (spread over all 4 slots) in 40 minutes. It looks like I need to fiddle with the RAM config and work out what's going on. IMO it's a 'notabug', agree?
Yeah, sounds like a hardware fault of some sort, and given the recent addition of RAM, that's a likely suspect.

We frequently see things like this where Windows runs just fine. It's purely luck, really. The access patterns of the two operating systems are completely different, and perhaps Linux caches data more aggressively (or maybe we just read more of the disk, or ...). There are so many variables that it's not really a data point worth putting any faith in.
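As a toy illustration of the "access patterns" point, the Python sketch below models RAM as numbered cells with a few faulty ones and compares two made-up workloads. Everything in it is an assumption for illustration: the cell count, the faulty addresses and the two workloads are invented, and this is not how either operating system actually allocates memory.

```python
def bad_cells_touched(touched, bad_cells):
    """Count how many of the touched addresses are faulty."""
    return sum(1 for addr in touched if addr in bad_cells)


# Hypothetical numbers: 10,000 memory cells, three of them faulty,
# all located in the upper half of the address range.
TOTAL_CELLS = 10_000
BAD_CELLS = {7000, 7001, 9500}

# A workload that only ever uses the lower half of memory.
light_workload = range(0, TOTAL_CELLS // 2)
# A workload that eventually fills all of memory (for example with cached data).
heavy_workload = range(0, TOTAL_CELLS)

light_hits = bad_cells_touched(light_workload, BAD_CELLS)
heavy_hits = bad_cells_touched(heavy_workload, BAD_CELLS)

if __name__ == "__main__":
    print("light workload touched", light_hits, "bad cells")
    print("heavy workload touched", heavy_hits, "bad cells")
```

In this model the first workload never notices the fault, which is why "it works on the other OS" says little about whether the RAM is good.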