Bug 249652
Summary: | System randomly freezes after kernel 2.6.22.1-27.fc7 update | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Rafał Polak <rafpolak> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | 7 | CC: | chris.brown, mail, redhat |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | 2.6.22.4-65.fc7 | Doc Type: | Bug Fix |
Last Closed: | 2007-09-21 08:12:18 UTC | Type: | --- |
Description
Rafał Polak
2007-07-26 07:11:19 UTC
I can confirm this bug. With kernels 2.6.22.1-27 and 2.6.22.1-33, my system freezes every time within five minutes. I can only do a hardware reset. There is no usable output in the log files. I have tried the nohz=off kernel option once, and the system ran a little longer, maybe half an hour, but then froze again. The last stable kernel was 2.6.21-1.3228. Smolt profile: http://smolt.fedoraproject.org/show?UUID=f57fd3b1-bc30-4ab8-b0d0-70ca1a4cff07

I have also experienced random freezes under kernel 2.6.22.1-27.fc7. I'd like to think it is related to high I/O, since it usually went down during heavier cron jobs, but I am not certain about it. There are no messages in any logs, and since it's a remote machine I have no way to get serial or screen output. A few days ago I reverted to kernel 2.6.20-1.2962.fc6 (yes, fc6), which so far has not crashed. Smolt: http://smolt.fedoraproject.org/show?UUID=6e55d75b-36da-47a5-8de8-c9aa233a743c

It's not load dependent on my system. It even freezes when it is completely idle. It may be an NVidia proprietary driver problem. When I install it, my system has the problems mentioned, and even after uninstalling the driver, the system still freezes (so it might be that the NVidia driver from the Livna repo changes some crucial libs too, I'm just guessing).

I did a fresh installation, I am not using the NVidia proprietary driver anymore, and my system doesn't freeze now. OTOH it is really hard for me to say who is "guilty" here, Fedora or the NVidia driver or the Livna package, so I'm not sure whether I should mark this bug as NOTABUG or WONTFIX/CANTFIX. I will leave it as it is.

This bug is not nVidia related. My example is a dedicated headless server with an ATI Rage XL card and no custom drivers. The only thing I had to change to make it stop freezing was the kernel. Everything else remains exactly the same. I have not tried the newer 2.6.22.1-41.fc7 kernel yet, and won't till I have to reboot for some other reason.
Hello, I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel. http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can. There hasn't been much activity on this bug for a while. Could you tell me if you are still having problems with the latest kernel?

You may wish to try some of the following in helping diagnose the problem:

* If it's repeatable, hooking up a serial cable to a second box can be useful for capturing kernel messages that may get printed just before the lockup. Configure the machine being debugged to boot with console=ttyS0,115200 console=tty0 and run a terminal program such as minicom on the other end. Configure the remote end to talk at the same baud rate (115200). (In minicom: ctrl-a, p, i, enter.) More info on setting up a serial terminal can be found at http://searchenterpriselinux.techtarget.com/tip/0,289483,sid39_gci1118136,00.html
* Sometimes just getting lsmod output from users can yield enough clues if there are multiple reports with modules in common. (It also allows filtering out reports from users of nvidia, vmware etc.)
* Hooking up serial console / netconsole can sometimes get debug info out of the machine.
* If the hang happened whilst in X, the machine may still respond to ssh logins from other machines. Try this to get a dmesg.
* The magic sysrq key might work. Enable it with sysctl kernel.sysrq=1 (or put kernel.sysrq = 1 in your /etc/sysctl.conf). This will allow you to hit ctrl-alt-sysrq and various keys to get debugging info:
  m will dump information about the current state of memory
  t will dump the state of every task the kernel knows about
  s will sync all data pending writeback to disk. (This is useful so that this debug info actually stands a chance of hitting the log files.)
* You can also trigger magic sysrq functions by echo'ing the relevant one-letter command to /proc/sysrq-trigger
* Booting with nmi_watchdog=2 may cause a backtrace to occur when the lockup happens.

If the problem no longer exists then please close this bug, or I'll do so in a few days if there is no additional information lodged.

Cheers
Chris

I am currently using kernel 2.6.22.4-65.fc7, which has been nicely stable since I updated. I consider the problem solved, whatever it was...

For reference: so far the only kernel I personally know to be bad was 2.6.22.1-27.fc7, and even then only on some machines. It would hang daily on the servers, but has been stable for months now on a desktop machine.

Machines 2.6.22.1-27.fc7 would hang on daily:
http://smolt.fedoraproject.org/show?UUID=22389d7f-e24a-474c-ae79-c3904112486a
http://smolt.fedoraproject.org/show?UUID=6e55d75b-36da-47a5-8de8-c9aa233a743c

Machine 2.6.22.1-27.fc7 was stable on:
http://smolt.fedoraproject.org/show?UUID=694597f3-b14b-41c0-bf56-535d9f69280f

Okay, thanks for the update Tino, I'm closing this bug as suggested then.

Cheers
Chris
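
Note for anyone landing here with a similar hang: the sysrq steps from the triage checklist above (enable kernel.sysrq, then send m, t and s through /proc/sysrq-trigger) could be scripted roughly as below. This is only a sketch under some assumptions. The function name sysrq_dump and its path parameters are made up for illustration and do not come from this report, it has to run as root on the affected machine, and it only helps if the box is still responsive enough to run a script (for example over ssh while X is hung).

```python
from pathlib import Path


def sysrq_dump(keys="mts",
               enable_path="/proc/sys/kernel/sysrq",
               trigger_path="/proc/sysrq-trigger"):
    """Enable magic sysrq, then send each command letter to the trigger file.

    The default letters follow the checklist above: m (memory state),
    t (task state), s (sync pending writeback to disk). The path
    parameters exist only so the function can be pointed at scratch
    files for a dry run; they are an assumption of this sketch.
    """
    # Same effect as `sysctl kernel.sysrq=1`.
    Path(enable_path).write_text("1\n")

    sent = []
    for key in keys:
        # Send one command letter per write, like `echo m > /proc/sysrq-trigger`.
        Path(trigger_path).write_text(key)
        sent.append(key)
    return sent
```

Calling sysrq_dump() as root should make the kernel print the memory and task dumps to its log (visible with dmesg), with the sync sent last so the output has a better chance of reaching the log files on disk.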