Description of problem:
After a recent kernel update (126.96.36.199-27.fc7) the system freezes randomly
for no apparent reason. It is a hard freeze and I can only reboot.
Version-Release number of selected component (if applicable):
How reproducible:
Can't reproduce it on demand; it happens randomly.
Steps to Reproduce:
Expected results:
The system doesn't freeze.
Additional info:
There is nothing useful in the logs before the freeze, so I don't know what kind
of information I could provide.
I can confirm this bug. With kernels 188.8.131.52-27 and 184.108.40.206-33, my system
freezes every time within five minutes. I can only do a hardware reset. There is
no usable output in the log files.
I have tried the nohz=off kernel option once, and the system ran a little
longer, maybe half an hour, but then froze again.
The last stable kernel was 2.6.21-1.3228.
I have also experienced random freezes under kernel 220.127.116.11-27.fc7. I'd like to
think it is related to high I/O, since it usually went down during heavier cron
jobs, but I am not certain about it. There are no messages in any logs, and since
it's a remote machine I have no way to get serial or screen output.
Reverted to kernel 2.6.20-1.2962.fc6 (yes, fc6) a few days ago, which so far has
been stable.
It's not load dependent on my system. It even freezes when it is completely idle.
It may be an NVidia proprietary driver problem. When I install it, my system has
the problems mentioned, and even after uninstalling the driver, the system still
freezes (so it might be that the NVidia driver from the Livna repo changes some
crucial libs too, I'm just guessing). I did a fresh installation, I am not using
the NVidia proprietary driver anymore and my system doesn't freeze now. OTOH it
is really hard for me to say who is "guilty" here, Fedora or the NVidia driver or
the Livna package, so I'm not sure whether I should mark this bug as NOTABUG or
WONTFIX/CANTFIX. I will leave it as it is.
This bug is not nVidia related. My example is a dedicated headless server with
an ATI Rage XL card and no custom drivers.
The only thing I had to change to make it stop freezing was the kernel.
Everything else remains exactly the same. I have not tried the newer
18.104.22.168-41.fc7 kernel yet, and won't till I have to reboot for some other reason.
I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.
I am CC'ing myself to this bug and will try and assist you in resolving it if I can.
There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel? You may wish to try some
of the following in helping diagnose the problem:
* If it's repeatable, hooking up a serial cable to a second box can be
useful for capturing kernel messages that may get printed just before the
lockup. Configure the machine being debugged to boot with console=ttyS0,115200
console=tty0 and run a terminal program such as minicom on the other end.
Configure the remote end to talk at the same baud rate (115200). (In minicom:
ctrl-a, p, i, enter.) More info on setting up a serial terminal can be found at
* Sometimes just getting lsmod output from users can yield enough clues if
there are multiple reports and common modules between them. (It also allows
filtering out reports from users of nvidia, vmware, etc.)
* Hooking up serial console / netconsole can sometimes get debug info out of
* If the hang happened whilst in X, the machine may still respond to ssh
logins from other machines. Try this to get a dmesg.
* The magic sysrq key might work. Enable it with sysctl kernel.sysrq=1 (or
put kernel.sysrq = 1 in your /etc/sysctl.conf). This will allow you to hit
ctrl-alt-sysrq and various keys to get debugging info.
m will dump information about the current state of memory
t will dump the state of every task the kernel knows about
s will sync all data pending writeback to disk. (This is useful so that this
debug info actually stands a chance of hitting the log files.)
* You can also trigger magic sysrq functions by echo'ing the relevant one
letter command to /proc/sysrq-trigger
* Booting with nmi_watchdog=2 may cause a backtrace to occur when the lockup
happens.
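For reference, the magic sysrq steps from the list above could be scripted
roughly as follows. This is only a sketch: the function names and the
SYSRQ_TRIGGER variable are made up for illustration, and on a real machine
everything here needs to run as root.

```shell
#!/bin/sh
# Sketch only: helper names and SYSRQ_TRIGGER are illustrative, not part of any tool.
# SYSRQ_TRIGGER can be pointed at a scratch file to dry-run the script safely.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

enable_sysrq() {
    # Enable the magic sysrq key for the running kernel (needs root).
    # For a persistent setting, put "kernel.sysrq = 1" in /etc/sysctl.conf.
    sysctl -w kernel.sysrq=1
}

sysrq() {
    # Send a single one-letter sysrq command to the trigger file.
    printf '%s\n' "$1" > "$SYSRQ_TRIGGER"
}

dump_debug_info() {
    sysrq m   # dump information about the current state of memory
    sysrq t   # dump the state of every task the kernel knows about
    sysrq s   # sync pending writeback so the dumps can reach the log files
}
```

After sourcing this, running "enable_sysrq" followed by "dump_debug_info" as
root should leave the memory and task dumps in the kernel log, where dmesg can
show them. The sync is issued last so the dumps have a chance of being written
to disk before a possible lockup.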
If the problem no longer exists then please close this bug or I'll do so in a
few days if there is no additional information lodged.
I am currently using kernel 22.214.171.124-65.fc7 which has been nicely stable since I
updated. I consider the problem solved, whatever it was...
So far the only kernel I personally know to be bad was 126.96.36.199-27.fc7, and
even then only on some machines. It'd hang daily on the servers, but has been
stable for months now on a desktop machine.
Machines 188.8.131.52-27.fc7 would daily hang on:
Machine 184.108.40.206-27.fc7 was stable on:
Okay, thanks for the update Tino, I'm closing this bug as suggested then.