From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040518 Firefox/0.8

Description of problem:
When running under normal desktop load using the SMP-enabled kernel, the system freezes on a regular basis. The system becomes completely frozen (won't respond to input, pings, ctrl-alt-del, etc.) and requires a power cycle to recover. The freezes seem to occur most often when either of the following is happening:
1. Rhythmbox is playing MP3s
2. Xflame screensaver is running
The system will usually operate for some random period of time (usually between 15 minutes and an hour) before freezing.

System Specs:
Gigabyte GA-7DPXDW+ motherboard (AMD 760MPX chipset)
2x Athlon MP 2400+
2x 512 MB Registered/ECC DIMMs
80 GB Seagate ATA-100 drive on primary IDE channel
USB Logitech keyboard
USB MS optical mouse

Troubleshooting information:
When the system freezes, there is no indication in the syslog of any problem (no oops or other error message).

Attempted fixes:
I tried making sure there was a PS/2 mouse attached to the machine (as suggested in http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.3/0382.html) but it had no effect. Neither did booting with 'noapic'. However, booting the UP kernel results in a machine that can play MP3s in rhythmbox for hours (36 so far) without freezing.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.5-1.358

How reproducible:
Always

Steps to Reproduce:
1. Boot SMP kernel
2. Start rhythmbox or set xflame as screensaver
3. Wait for hang

Actual Results: System hangs

Expected Results: System doesn't hang

Additional info:
Created attachment 100743 [details] output of lspci -v
Created attachment 100744 [details] dmesg output when running with UP kernel
Created attachment 100745 [details] dmesg output when running with SMP kernel
Created attachment 100747 [details] Annotated syslog output
This annotated output from syslog spans the system booting into the SMP kernel, freezing a few minutes later, and then being rebooted into the UP kernel.
Installed kernel 2.6.6-1.427. The SMP kernel still hangs while playing music or during screensavers. The UP kernel still runs flawlessly.
My Athlon MP box throws a fit if it isn't booted with acpi=off, but doesn't get as far as yours without that. Might be worth a first try though.
Thanks, that helped a bit. acpi=off by itself had no effect, but combined with noapic, it increased stability somewhat. Crashes now seem to be occurring within 12-24 hours when using the SMP kernel, as opposed to within 1-2 hours before. Also, upgrading to 2.6.6-1.435.2.3 had no effect.
I have a similar problem running kernel-smp-2.6.5-1.358 on my Pentium 4 with hyperthreading. However, I did not get a hard freeze. I first noticed it when my keyboard stopped working. It took me a while to realize that it was not my wireless keyboard flaking out, but instead some code in a deadlock or infinite loop. It looked to me as though one of the "processors" was stuck at 100% executing something. However, I could still do any normal operations, except type. I also believe that other processes scheduled to use the busy CPU were sleeping, waiting for some CPU time. The only fix is to reboot, although a couple of times the keyboard started working again on its own, which is why I initially thought it was my keyboard. I have since been using the non-SMP kernel, but it tends to freeze hard every once in a while, which is not that helpful.
Alex, can you file yours as a separate bug? The two don't initially sound like related bugs. In the new bug, if you can attach the output of lspci -v, that would also be useful. Finally, if the board is Intel E75xx based, you might want to try turning off USB legacy support in the BIOS and/or booting with acpi=off. I don't think this one is ACPI, however.
Just an additional data point, I tried using nmi_watchdog to force an oops in case the processor was locking up. However, setting nmi_watchdog=2 in the kernel startup options didn't generate anything (oops or not) when the system froze.
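One way to confirm that the NMI watchdog is actually ticking before waiting for the next freeze is to watch the NMI line in /proc/interrupts: if the per-CPU counts stay at zero, the watchdog is likely not armed. The following is a minimal sketch, assuming the usual "NMI: <count per CPU>" layout of that file; the function name nmi_counts is made up for illustration.

```python
from pathlib import Path


def nmi_counts(interrupts_text: str) -> list:
    """Return the per-CPU NMI counts from /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Keep only the numeric columns; newer kernels append a description.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []


if __name__ == "__main__":
    path = Path("/proc/interrupts")
    if path.exists():
        print(nmi_counts(path.read_text()))
```

Running it twice a few seconds apart and comparing the two results shows whether the counts are increasing on each CPU.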
Installed 2.6.8-1.521. Still seeing same problem with SMP kernel and no problem with UP kernel...
My system freezes too... :-( I'm running FC2 on a server with dual 2 GHz Xeons and an SE7500CW2 motherboard. Kernel 2.6.8-1.610smp with noapic acpi=ht. I got the messages below in the error log right after booting:
kernel: SMP mptable: checksum error!
kernel: BIOS bug, MP table errors detected!...
kernel: ... disabling SMP support. (tell your hw vendor)
After 3 days running fine, it hung 10 minutes ago. I got no error messages in the log regarding this crash.
Re: SE7500CW2
> SMP mptable: checksum error!
Please verify that you're running the latest BIOS: http://support.intel.com/support/motherboards/server/se7500cw2/
If you still have a problem, you'll probably want to file a separate bug, as it is unlikely you've got the same problem as Cushing. Also, if you need either "noapic" or "acpi=ht" to make your machine run properly, that is also a bug.
Installed 2.6.9-1.6_FC2. Basically the same results, but with one new (interesting?) datapoint:
2.6.9-1.6_FC2 - fine
2.6.9-1.6_FC2 noapic acpi=off - fine
2.6.9-1.6_FC2smp - freeze
2.6.9-1.6_FC2smp noapic acpi=off - freeze
However, I just found out about the maxcpus and nosmp kernel boot params. Just to test, I tried 2.6.9-1.6_FC2smp with maxcpus=1. Even without the noapic and acpi=off directives, the result was a stable system with no freezes (albeit with only one processor running). Is this important, or does maxcpus=1 just end up recreating the equivalent of a UP kernel? I plan on testing with the nosmp directive this weekend.
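When cycling through boot-parameter combinations like the ones above, it is easy to lose track of which options a given boot actually used. Here is a small sketch that reads /proc/cmdline and reports the parameters mentioned in this report. The WATCHED tuple and the function name parse_cmdline are illustrative choices, not part of any existing tool.

```python
from pathlib import Path

# Parameters discussed in this report; illustrative, not exhaustive.
WATCHED = ("noapic", "nosmp", "acpi", "maxcpus", "nmi_watchdog")


def parse_cmdline(cmdline: str) -> dict:
    """Return the watched boot parameters found in a kernel command line."""
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key in WATCHED:
            # Bare flags such as "noapic" carry no value; record them as True.
            found[key] = value if sep else True
    return found


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        print(parse_cmdline(path.read_text()))
```

Logging this output at boot alongside the kernel version (uname -r) would make it simpler to match each freeze to the configuration that produced it.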
Upgraded to Fedora Core 3. Still had the same problem with my default setup. However, upon further research, this is probably an issue with NFS. One detail that I hadn't mentioned before (since it didn't seem relevant) is that my MP3s are shared from my home server via NFS. After noticing reports of SMP-unsafe behavior in NFS, I decided to try my system without any mounts in the equation. The system is totally stable (2+ days so far) when playing MP3s off of a local disk, versus lockups within 2-3 hours when they are retrieved over NFS. The NFS server is running RH7.3. Any ideas about how to get back my mounts without sacrificing stability? Is CIFS more stable under SMP than NFS?
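For anyone trying to reproduce the "no NFS mounts" test, a quick check that nothing is still mounted over NFS can be done by scanning /proc/mounts. This is a rough sketch, assuming the standard "device mount_point fstype options ..." field order of that file; the function name nfs_mounts is hypothetical.

```python
from pathlib import Path


def nfs_mounts(mounts_text: str) -> list:
    """Return (device, mount_point, fstype) for NFS entries in /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device mount_point fstype options dump pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            result.append((fields[0], fields[1], fields[2]))
    return result


if __name__ == "__main__":
    path = Path("/proc/mounts")
    if path.exists():
        for entry in nfs_mounts(path.read_text()):
            print(entry)
```

An empty result before starting the playback test suggests the run really is NFS-free.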
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.
This bug has been automatically closed as part of a mass update. It had been in NEEDINFO state since July 2005. If this bug still exists in current errata kernels, please reopen this bug. There are a large number of inactive bugs in the database, and this is the only way to purge them. Thank you.