Bug 80779 - SMP kernel hangs solid, non-smp is fine
Status: CLOSED DUPLICATE of bug 82123
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Jeff Garzik
QA Contact: Brian Brock
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 79578
 
Reported: 2002-12-31 08:45 UTC by Need Real Name
Modified: 2013-07-03 02:09 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-02-21 18:50:48 UTC


Attachments
lsmod of running smp kernel (1.43 KB, text/plain)
2002-12-31 12:17 UTC, Need Real Name

Description Need Real Name 2002-12-31 08:45:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20021216

Description of problem:
After typically an hour or so, the 2.4.20-2.2smp kernel pretty
consistently freezes solid in X. I presume it hangs at the kernel level, since
even the keyboard numlock etc is totally frozen. I'm running a dual Celeron
system, and it has behaved perfectly on smp with all previous versions,
including the latest rawhide before Xmas. 

The freeze happens under normal light use - when moving mouse cursor etc. I have
found no particular correlation to user actions or programs.

Booting the 2.4.20-2.2 non-smp kernel, everything seems to work fine.

Note that I just now changed the X server to the latest Phoebe variant -- before
this I had the standard 8.0 variant.

Everything freezes solid, so I don't know how to extract debug information...

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Boot phoebe smp kernel
2. Use machine normally for an hour or so
3. Everything freezes solid (inc. keyboard, network etc.)
    

Actual Results:  Total hang

Additional info: Please give me some hints on how to find where the crash
occurs, and I will assist as much as I can.

Comment 1 Arjan van de Ven 2002-12-31 10:23:06 UTC
can you paste your lsmod information to this bug (so that I can make a list of
suspects); in addition can you try to add "acpi=off" to the kernel commandline
("a" in grub, or the vmlinuz line in /boot/grub/grub.conf)
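
The grub.conf edit suggested above can be sketched in Python as follows. This is
only an illustration, assuming a GRUB legacy config in which each boot entry has
a line starting with "kernel"; the helper name add_kernel_param, the sample
entry, its root device and its file paths are hypothetical. Only the kernel
version and the acpi=off parameter come from this report.

```python
def add_kernel_param(grub_conf_text, param="acpi=off"):
    """Append `param` to every GRUB legacy `kernel` line that lacks it."""
    out = []
    for line in grub_conf_text.splitlines():
        parts = line.split()
        # Only touch real kernel lines; commented-out lines start with '#'.
        if parts and parts[0] == "kernel" and param not in parts:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical boot entry; the root device and paths are assumptions.
SAMPLE = (
    "title Red Hat Linux (2.4.20-2.2smp)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-2.4.20-2.2smp ro root=/dev/hda2\n"
    "\tinitrd /initrd-2.4.20-2.2smp.img\n"
)

if __name__ == "__main__":
    print(add_kernel_param(SAMPLE), end="")
```

The function works on text rather than on /boot/grub/grub.conf directly, so the
result can be reviewed before anything is written back. Running it twice gives
the same output, since a line that already carries the parameter is left alone.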

Comment 2 Need Real Name 2002-12-31 12:17:07 UTC
Created attachment 89002
lsmod of running smp kernel

Comment 3 Need Real Name 2002-12-31 12:21:33 UTC
Comment on attachment 89002
lsmod of running smp kernel

This is with acpi=off in grub.conf

Will report how this switch affects stability when I know more...

Comment 4 Need Real Name 2003-01-01 10:16:03 UTC
With acpi=off in grub.conf, the SMP kernel also seems to be stable, so it
seems reasonable to conclude that the problem is related to the combination
of ACPI and SMP.

Is there something that could be done to isolate the problem further?

Comment 5 Need Real Name 2003-01-01 14:36:14 UTC
Seems I jumped to conclusions:

The SMP system without ACPI has now hung after 26 hours and 3 minutes.

Symptoms just as before: Keyboard/screen/everything completely dead.

I had an external machine logged in via telnet on Externet, running "top". This
display also froze.

An interesting observation: Ping from the remote machine functioned flawlessly!
Perhaps one CPU was frozen, but the other was still active, being able to serve
the ping requests. Trying to log in via telnet failed, though.

The last "top" status showed a CPU load of 21% user on both, and a system load
of 5 and 7%. 368M memory used, 58M free. The top processes were X at 28%,
bubblemon-gnome 8%, galeon-bin 6%, gnome-panel 4%, metacity 4%, top 1%,
evolution-mail 1%, evolution 0.5%, mixer_applet2 0.6%, gnome-session 0.1%,
xscreensaver 0.1%, magicdev 0.1%, eggcups 0.1%, yank 0.1%, init 0.0%
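
The "ping works, telnet login fails" pattern above could be checked from the
remote machine with a small probe such as the sketch below. It rests on two
points: ICMP echo replies are generated inside the kernel without any userspace
process being scheduled, and a bare TCP connect can also be completed by the
kernel alone, so the probe waits for the first byte from the service instead.
It assumes the service speaks first, as telnet servers typically do, and a
Linux-style ping accepting -c and -W. All function names and state labels here
are hypothetical.

```python
import socket
import subprocess


def classify(ping_ok, service_ok):
    """Map the two probe results to a rough (hypothetical) host state label."""
    if ping_ok and service_ok:
        return "alive"
    if ping_ok:
        return "kernel-only"   # ICMP answered, userspace service silent
    if service_ok:
        return "icmp-blocked"  # service answers, ping filtered or lost
    return "dead"


def ping_answers(host, timeout_s=2):
    """Send one ICMP echo request using the system ping (Linux-style flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def service_answers(host, port=23, timeout_s=5.0):
    """Connect and wait for one byte, so that a userspace process must respond."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            return len(conn.recv(1)) > 0
    except OSError:  # refused, unreachable, or timed out
        return False


def probe(host):
    """Run both checks against `host` and return the state label."""
    return classify(ping_answers(host), service_answers(host))
```

A "kernel-only" result would match what was seen here: the network stack still
answers, but nothing in userspace gets to run. It does not by itself show
whether one CPU is frozen while the other keeps going.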

Comment 6 Need Real Name 2003-01-02 11:10:04 UTC
It happened again, this time after, say, 20 hours. Same circumstances, same thing.

Comment 7 Need Real Name 2003-01-07 06:45:39 UTC
Just for the record, I'm running the system with the single CPU kernel now, and
the system has been stable for the last 5 days. I think it is pretty safe to
conclude that the problem only occurs with SMP.

If there is anything at all I can do to extract more information from the crash
situation, then please let me know, and I will switch back to SMP mode again.

Comment 8 Jeff Garzik 2003-01-18 23:54:15 UTC

*** This bug has been marked as a duplicate of 82123 ***

Comment 9 Red Hat Bugzilla 2006-02-21 18:50:48 UTC
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.

