From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050513 Fedora/1.0.4-1.3.1 Firefox/1.0.4

Description of problem:
I've got a Thinkpad T42p 2379-DYU running a fully updated FC3. I was happily typing away into a gaim window when the machine hung. It hangs like this often, and quite randomly. Using mplayer a couple of weeks ago I got it to hang about three times within an hour. When it hangs, everything stops working. No mouse, no network, no SysRq, nothing. It's dead. All I can do is hard-boot. As this is a laptop, and the hang is not reliably reproducible (it IS reproducible, just not on demand), I've not yet attempted to run a serial console to see what's failing (if anything). Nothing is printed in any of the logfiles when the hang occurs. No, I do not use USB storage. The system provides no warning prior to the hang. This sounds (to me) like bug #156627, except it's happened with all FC3 kernels, not just 2.6.11-1.14. Also, that bug asked others to create a new bug report, so I did.

Version-Release number of selected component (if applicable):
All FC3 kernels from 2.6.10-1.770 through 11-1.27

How reproducible:
Sometimes

Steps to Reproduce:
1. Work normally for anywhere from an hour to two weeks
2. System hangs
3.

Additional info:
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.
I have updated to this kernel and unfortunately it has not corrected the problem. My thinkpad crashed overnight last night. It crashed sometime after 3:01AM (the last cron.hourly entry in /var/log/cron was at 3:01:01 this morning).
I've turned off ACPI (running 2.6.12-1.1372_FC3 with acpi=off) and we'll see if this fixes the problem. This was suggested in bug #158455. Unfortunately it can sometimes be multiple weeks between failures, so it's a bit hard to debug this. I'll see how long the system lasts this time (it lasted a good couple of weeks the last time before it hung yesterday). FWIW, I haven't really ruled out a hardware problem, but ISTR that I didn't have a problem when I first installed FC3 and it only started happening on later kernels. I've left this in NEEDINFO_REPORTER on the theory that I'll still need to respond in a week or two once my machine crashes (or shows no signs of crashing). If you have other ideas for me I'd be glad to test them!
Nope, that didn't solve it. The machine hung last night. :( Any more suggestions?
This is a mass-update to all currently open Fedora Core 3 kernel bugs. Fedora Core 3 support has transitioned to the Fedora Legacy project. Due to the limited resources of this project, typically only updates for new security issues are released. As this bug isn't security related, it has been migrated to a Fedora Core 4 bug. Please upgrade to this newer release, and test if this bug is still present there. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. Thank you.
This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.
Closing per previous comment.
Hi, I just updated to FC5 and this problem is still happening. However, I think I might have narrowed it down to the speedstep/cpufreq subsystem. There appears to be either some incompatibility with the speedstep implementation in the CPU, or a bad set of CPUs that the cpuspeed code tickles. I'm currently running with cpuspeed turned off for testing, and I'll report back if it really seems to help (it has helped so far, but it's only been 15 hours). I'm reopening this bug because it IS still in FC5. I would have marked it "NEEDINFO" because I do need to reply again on whether disabling cpuspeed mitigates the problem, but I couldn't do that. I'll just have to remember to do that later. Sorry for the very late reply. I've also updated the summary and version.
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.
Nope, this doesn't seem to have fixed the problem. I was running 2.6.18-1.2220.fc5 and the machine just crashed with a pretty red stripe through the gnome taskbar at the top of the screen. So, unfortunately this bug is still there. I'm not sure whether it's a software bug or a hardware bug. And unfortunately, when it crashes/hangs while I'm sitting in X, no log is generated.
What video drivers is this using?
Created attachment 138730 [details] X Config

Using radeon driver (see attached xorg.conf). Here's the output from lsmod. I'll point out that the crash still happens without vmware, so the fact that vmware is loaded has been ruled out.

Module                  Size  Used by
fuse                   44885  6
wlan_wep                7296  1
autofs4                21573  0
rfcomm                 37849  0
l2cap                  23873  5 rfcomm
bluetooth              50085  4 rfcomm,l2cap
vmnet                  32044  13
vmmon                 175852  0
sunrpc                153725  1
ipt_REJECT              5697  1
xt_state                2625  15
ip_conntrack           52085  1 xt_state
nfnetlink               7513  1 ip_conntrack
xt_tcpudp               3521  17
iptable_filter          3392  1
ip_tables              12937  1 iptable_filter
x_tables               14405  4 ipt_REJECT,xt_state,xt_tcpudp,ip_tables
dm_mirror              29073  0
dm_mod                 57433  1 dm_mirror
video                  17221  0
sbs                    16257  0
ibm_acpi               27969  0
i2c_ec                  5569  1 sbs
dock                    8665  0
container               4801  0
button                  7249  0
battery                10565  0
asus_acpi              16857  0
ac                      5701  0
ipv6                  246113  20
lp                     13065  0
parport_pc             27493  1
parport                37001  2 lp,parport_pc
snd_intel8x0           32605  1
snd_intel8x0m          17357  0
snd_ac97_codec         91360  2 snd_intel8x0,snd_intel8x0m
snd_ac97_bus            2753  1 snd_ac97_codec
snd_seq_dummy           4293  0
snd_seq_oss            32705  0
snd_seq_midi_event      8001  1 snd_seq_oss
snd_seq                51633  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
snd_seq_device          8781  3 snd_seq_dummy,snd_seq_oss,snd_seq
snd_pcm_oss            42849  0
wlan_scan_sta          13952  1
snd_mixer_oss          16833  1 snd_pcm_oss
ath_pci                92836  0
snd_pcm                76485  4 snd_intel8x0,snd_intel8x0m,snd_ac97_codec,snd_pcm_oss
ath_rate_sample        14848  1 ath_pci
floppy                 57317  1
wlan                  186588  5 wlan_wep,wlan_scan_sta,ath_pci,ath_rate_sample
e1000                 119505  0
ehci_hcd               31693  0
ath_hal               192208  3 ath_pci,ath_rate_sample
uhci_hcd               23885  0
snd_timer              23237  2 snd_seq,snd_pcm
snd                    52933  12 snd_intel8x0,snd_intel8x0m,snd_ac97_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
i2c_i801                8013  0
serio_raw               7493  0
soundcore              10145  1 snd
i2c_core               21697  2 i2c_ec,i2c_i801
ide_cd                 38625  2
snd_page_alloc         10569  3 snd_intel8x0,snd_intel8x0m,snd_pcm
pcspkr                  3521  0
cdrom                  34913  1 ide_cd
ext3                  129737  3
jbd                    58473  1 ext3
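A listing like the lsmod output above can be checked mechanically for modules that do not ship with the stock kernel. The following is only a rough sketch: the SUSPECT_MODULES set is an assumption made for illustration (names from the listing that look like VMware and Atheros/madwifi modules), not an authoritative list of tainting modules, and the function names are hypothetical.

```python
# Assumption for illustration only: modules in the listing that appear to come
# from VMware and the Atheros (madwifi) driver rather than the stock kernel.
SUSPECT_MODULES = {
    "vmnet", "vmmon",
    "ath_pci", "ath_hal", "ath_rate_sample",
    "wlan", "wlan_wep", "wlan_scan_sta",
}


def parse_lsmod(text):
    """Parse lsmod output into {name: (size, use_count, [users])}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header.
        if len(fields) < 3 or fields[0] == "Module":
            continue
        name, size, use_count = fields[0], int(fields[1]), int(fields[2])
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[name] = (size, use_count, users)
    return modules


def suspect_modules(modules):
    """Return the loaded modules that are in SUSPECT_MODULES, sorted by name."""
    return sorted(name for name in modules if name in SUSPECT_MODULES)


# A few rows copied from the listing above.
SAMPLE = """\
Module                  Size  Used by
fuse                   44885  6
vmnet                  32044  13
vmmon                 175852  0
ath_pci                92836  0
wlan                  186588  5 wlan_wep,wlan_scan_sta,ath_pci,ath_rate_sample
e1000                 119505  0
ath_hal               192208  3 ath_pci,ath_rate_sample
"""

if __name__ == "__main__":
    print(suspect_modules(parse_lsmod(SAMPLE)))
```

On a live system the same functions could be fed the text printed by lsmod; an empty result would suggest none of the assumed modules are loaded.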
And atheros too? That thing has been almost as notoriously bad for random kernel memory corruption as the nvidia driver.
Last time I tried (which granted hasn't been with 2.6.18) I was using the e1000 driver and it crashed. If you really want I can reboot into a wired configuration and wait... But I have tried it (in 2.6.17) without atheros and it still hung.
For the sake of this bug, it'll make things a lot easier to diagnose if you don't load any of the part-binary drivers at all. (Note that merely loading them taints the kernel, even if they aren't in use.) T42p's aren't exactly uncommon, so it's somewhat unusual that only you seem to be hitting this. The lack of a serial port on those is going to make capturing debug info a bit tricky though. Netconsole might be worth a try, just to see if we get a backtrace when the hang occurs. Netconsole is fairly trivial to set up if you have a second machine you can log to. If you can't find working instructions on the internet, let me know and I'll write up a quick recipe.
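For the receiving side of a netconsole setup, a minimal sketch of a logger on the second machine might look like the following. It assumes the laptop sends kernel messages as UDP datagrams to port 6666 (netconsole's usual default target port), for example after loading the module with something like `modprobe netconsole netconsole=6665@192.168.0.2/eth0,6666@192.168.0.1/00:11:22:33:44:55`, where every address there is a placeholder. The function names, log file name, and record format are hypothetical choices for this example, not part of netconsole.

```python
import socket
from datetime import datetime


def format_record(payload, sender, now):
    """Turn one received datagram into a timestamped log line."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{now.isoformat(timespec='seconds')} {sender} {text}"


def listen(host="0.0.0.0", port=6666, logfile="netconsole.log"):
    """Append every datagram received on (host, port) to logfile.

    Runs until interrupted. The port must match the target port configured
    on the sending machine (6666 is assumed here).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a") as out:
        sock.bind((host, port))
        while True:
            payload, (addr, _port) = sock.recvfrom(65535)
            out.write(format_record(payload, addr, datetime.now()) + "\n")
            # Flush after every message so nothing is lost if this side dies.
            out.flush()
```

Calling `listen()` on the logging machine starts the receiver. Timestamping on the receiving side means that even if the hang produces no backtrace, the time of the last message received still narrows down when the machine stopped.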
I wish it were reliably reproducible. Sometimes it'll hang multiple times in a day. Sometimes it'll go two weeks between hangs. It does seem to happen more often when I have the cpuspeed daemon installed and running. But I'll try to look at netconsole while I can (in about 3 weeks I'm going to be on the road for two months). (I'll try to keep this as NEEDINFO)
Okay, it just happened again. I had set up netconsole but there was nothing in the remote logs. :(
Created attachment 140602 [details] REPORTING_BUGS information
Fedora Core 5 is no longer maintained. Is this bug still present in Fedora 7 or Fedora 8?
I'm afraid I no longer have that piece of hardware so I don't know.