Red Hat Bugzilla – Bug 80779
SMP kernel hangs solid, non-smp is fine
Last modified: 2013-07-02 22:09:06 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20021216
Description of problem:
After typically an hour or so, the 2.4.20-2.2smp kernel pretty
consistently freezes solid in X. I presume it hangs at the kernel level, since
even the keyboard numlock etc is totally frozen. I'm running a dual Celeron
system, and it has behaved perfectly on smp with all previous versions,
including the latest rawhide before Xmas.
The freeze happens under normal light use, for example while moving the mouse
cursor. I have found no particular correlation to user actions or programs.
Booting the 2.4.20-2.2 non-smp kernel, everything seems to work fine.
Note that I just now changed the X server to the latest Phoebe variant -- before
this I had the standard 8.0 variant.
Everything freezes solid, so I don't know how to extract debug information...
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Boot the Phoebe SMP kernel
2. Use the machine normally for an hour or so
3. Everything freezes solid (incl. keyboard, network etc.)
Actual Results: Total hang
Additional info: Please give me some hints on how to find where the crash is,
and I will assist as much as I can.
Can you paste your lsmod output into this bug (so that I can make a list of
suspects)? In addition, can you try adding "acpi=off" to the kernel command line
(press "a" in grub, or edit the vmlinuz line in /boot/grub/grub.conf)?
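As an illustration of the grub.conf change, an entry with the option appended might look like the sketch below. Only the kernel version and acpi=off come from this report; the title, the (hd0,0) boot partition, the root=LABEL=/ argument, and the initrd file name are assumptions and will differ from system to system.

```
# /boot/grub/grub.conf (fragment; device names and labels are hypothetical)
title Red Hat Linux (2.4.20-2.2smp)
        root (hd0,0)
        # acpi=off is appended to the existing kernel (vmlinuz) line
        kernel /vmlinuz-2.4.20-2.2smp ro root=LABEL=/ acpi=off
        initrd /initrd-2.4.20-2.2smp.img
```

After booting, `cat /proc/cmdline` shows the command line the running kernel was actually started with, which confirms whether the option was picked up.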
Created attachment 89002
lsmod of running smp kernel
Comment on attachment 89002
lsmod of running smp kernel
This is with acpi=off in grub.conf
Will report how this switch affects stability when I know more...
With acpi=off in grub.conf, the SMP kernel also seems to be stable, so it seems
reasonable to conclude that the problem is related to the combination of ACPI
and SMP.
Is there something that could be done to isolate the problem further?
Seems I jumped to conclusions:
The SMP system without ACPI has now hung after 26 hours and 3 minutes.
Symptoms just as before: keyboard/screen/everything completely dead.
I had an external machine logged in via telnet over Ethernet, running "top". This
display also froze.
An interesting observation: Ping from the remote machine functioned flawlessly!
Perhaps one CPU was frozen, but the other was still active, being able to serve
the ping requests. Trying to log in via telnet failed, though.
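The ping-versus-telnet observation can be turned into a small check run from the remote machine. The Python sketch below is only an illustration and not something used in this report: the function names are hypothetical, the ping flags assume Linux iputils, and it assumes the telnet server sends some bytes (option negotiation or a banner) right after the connection opens. ICMP echo replies are generated inside the kernel's network stack, so a host that answers ping but never sends data on a service port is still handling network traffic in the kernel while user-space processes may no longer be running.

```python
import socket
import subprocess


def ping_replies(host, timeout_s=2):
    """Send one ICMP echo request with the system ping; True if a reply arrives.

    Assumes Linux iputils flags: -c (count) and -W (reply timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # ping binary missing or not executable
        return False
    return result.returncode == 0


def service_sends_data(host, port=23, timeout_s=5.0):
    """Connect to a TCP port and wait for the server to send at least one byte.

    A bare connect is not enough: the kernel can complete the TCP handshake
    even when the user-space daemon is no longer scheduled.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            return len(conn.recv(1)) > 0
    except OSError:
        # refused, unreachable, or timed out waiting for data
        return False


def classify(ping_ok, service_ok):
    """Map the two probe results to a rough description of the host state."""
    if ping_ok and service_ok:
        return "responsive"
    if ping_ok:
        return "network stack answers, user space appears hung"
    if service_ok:
        return "service answers, ICMP blocked or lost"
    return "no response"
```

Calling `classify(ping_replies(host), service_sends_data(host))` in a loop with timestamps, where `host` is the address of the SMP machine, would record when the machine moved from "responsive" to the half-alive state described above.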
The last "top" status showed a CPU load of 21% user on both, and a system load
of 5 and 7%. 368M memory used, 58M free. The top processes were X at 28%,
bubblemon-gnome 8%, galeon-bin 6%, gnome-panel 4%, metacity 4%, top 1%,
evolution-mail 1%, evolution 0.5%, mixer_applet2 0.6%, gnome-session 0.1%,
xscreensaver 0.1%, magicdev 0.1%, eggcups 0.1%, yank 0.1%, init 0.0%
It happened again, this time after roughly 20 hours. Same circumstances, same thing.
Just for the record, I'm running the system with the single CPU kernel now, and
the system has been stable for the last 5 days. I think it is pretty safe to
conclude that the problem only occurs with SMP.
If there is anything at all I can do to extract more information from the crash
situation, then please let me know, and I will switch back to SMP mode again.
*** This bug has been marked as a duplicate of 82123 ***
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.