Description of problem:
Have been having occasional system lockups after the 2.6.10-1.770_FC3 kernel. Seems to be pretty consistent in that it does occur, seen in all kernels since. However, the time it takes to show up varies from hours to several days. After 2 lockups on Monday (of course) while moving between gnome desktops, finally compiled Ingo's lockupcli and ran it to see if I could confirm that the nmi_watchdog=1 parameter actually worked on the box. For about 30 seconds after starting, the system did slow, but continued to take keyboard and mouse input. After 30 seconds or so, it finally hung solid. In the serial window, just prior to the lockup, the oops message did print. As soon as that popped out, the box was hung.

Version-Release number of selected component (if applicable):
all since 2.6.10-1.770_FC3

How reproducible:
run Ingo's lockupcli.

Steps to Reproduce:
1. compile Ingo's lockupcli
2. run it
3.

Actual results:
hard system hang after a few 10's of seconds

Expected results:
No hard system hangs

Additional info:
logs to follow. This is happening on 2 similar ICH5 based systems.
Created attachment 117643 [details] serial console log of oops prior to hang
I had previously reported this in bug 154190, but that never went anywhere. This seems to be a way to reproduce it, which I could not do previously.
well, running lock-up-the-box did lock up the box, but the NMI watchdog got a traceback of it. So it works as advertised. do your other hard lockups (_NOT_ the self-induced ones) produce any NMI watchdog output on the serial console?
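For what it's worth, the watchdog can also be sanity-checked without locking up the box: look for nmi_watchdog= on the kernel command line and see whether the NMI counts in /proc/interrupts go up between two samples. A rough sketch only, not a definitive test. The function names are made up for illustration, and it assumes the NMI: line layout of /proc/interrupts and that the counts advance on every CPU while the watchdog is armed.

```python
import time


def nmi_watchdog_mode(cmdline):
    """Return the value given to nmi_watchdog= on a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("nmi_watchdog="):
            return token.split("=", 1)[1]
    return None


def nmi_counts(interrupts_text):
    """Return per-CPU NMI counts parsed from /proc/interrupts text.

    Returns an empty list if there is no NMI: line. Trailing description
    text that some kernels append to the line is skipped.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []


def watchdog_is_ticking(before, after):
    """True if every CPU's NMI count increased between the two samples."""
    if not before or len(before) != len(after):
        return False
    return all(b < a for b, a in zip(before, after))


def read_text(path):
    """Read a small text file such as an entry under /proc."""
    with open(path) as fh:
        return fh.read()


def sample_live(delay_seconds=10):
    """Hypothetical helper: sample the running system twice and report.

    Assumes a Linux box with /proc mounted.
    """
    mode = nmi_watchdog_mode(read_text("/proc/cmdline"))
    first = nmi_counts(read_text("/proc/interrupts"))
    time.sleep(delay_seconds)
    second = nmi_counts(read_text("/proc/interrupts"))
    return mode, first, second, watchdog_is_ticking(first, second)
```

Calling sample_live() on the affected box returns the nmi_watchdog= value, both count samples, and whether the counts moved. If the counts sit still, a missing backtrace on a real hang may just mean the watchdog never fired.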
(In reply to comment #3)
> well, running lock-up-the-box did lock up the box, but the NMI watchdog got a
> traceback of it. So it works as advertised.

Ok. Thought it was going to dump me into the debugger or something. So the behavior is to dump some info, then hang, correct?

> do your other hard lockups (_NOT_ the self-induced ones) produce any NMI
> watchdog output on the serial console?

Yes and no. :) Did save one on the laptop that got re-installed. Will attempt another capture.

One other odd point with no hard data to back it up: the fails seem to come in clusters. I can run 2 weeks or so, and then suddenly, for a day or two, things jump the track continuously. I may get 4 or 5 lockups back to back to back. Then it goes away for a coupla weeks. This happened on Monday of this week (2005-09-12) again. Hm. I see that this was about 30 days ago when this was opened....

Yes, we can leave this in needinfo until I get a chance to bang on a system with serial console attached and logging. Just that these systems are my workstations, and the boss is rather demanding. :)
Good news and bad news. The good news is that the hang is 100% reproducible within a few seconds. Fire up gmplayer on your fav video, and start moving the window around with the mouse. It will hang in 10-15 seconds. Bad news is that there is no backtrace.
One other comment, the hang almost always involves the graphical subsys. It has happened when switching between desktops, and raising and lowering something like firefox when trying to peek at a partially obscured window. On other similar systems without graphics, have never observed it. In comment 5, I let the system sit after the last hang for ~30 minutes before giving up.
This is a mass-update to all currently open Fedora Core 3 kernel bugs. Fedora Core 3 support has transitioned to the Fedora Legacy project. Due to the limited resources of this project, typically only updates for new security issues are released. As this bug isn't security related, it has been migrated to a Fedora Core 4 bug. Please upgrade to this newer release, and test if this bug is still present there. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. Thank you.
This is a mass-update to all currently open kernel bugs. A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem. This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug. If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613. Thank you.
Closing per previous comment.