Bug 32917 - RH7.1 hangs on 5+ CPU system
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2001-03-23 16:49 EST by Wendy Hung
Modified: 2007-04-18 12:32 EDT
Doc Type: Bug Fix
Last Closed: 2001-06-26 13:07:47 EDT
Description Wendy Hung 2001-03-23 16:49:37 EST
Installed RH7.1 QA0319 on Netfinity 8500R (8-way system).  SMP kernel autodetected and auto-installed.
Upon reboot, system hangs at processor initialization screen (prior to kernel boot: prompt screen).

Not a hard hang -- Num Lock still responds.
Uni-processor kernel boots OK.  Bug occurs with 5 or more CPUs.
Comment 1 Matt Wilson 2001-03-23 17:11:40 EST
Did this get to the lilo screen, or did it hang before lilo?  If it made it to
lilo, what messages were printed to the console?
Comment 2 Wendy Hung 2001-03-23 17:19:53 EST
Correction: hang occurs after selecting the 'linux' kernel screen, but before the login screen.
Screen messages refer to Starting up CPUs.
Comment 3 Arjan van de Ven 2001-03-23 17:24:26 EST
The 0319 snapshot contains a lot of debugging code to catch memory-allocation
related errors. We recently found a very important bug in the kernel that
somehow mostly triggered on SMP machines with 4 or more CPUs. We have since
fixed this bug in kernel 2.4.2-0.1.35. Hopefully this kernel will be available
to beta testers soon (QA is testing it right now), either as a "kernel rpm" or
as a "full snapshot". It would be very much appreciated if you could test
such a new kernel once it becomes available.
(This kernel is not yet present in the 0322 snapshot, but should be in any newer
ones if/when they become available.)
Comment 4 Glen Foster 2001-03-24 11:44:15 EST
I have placed the 0.1.35 kernel rpms on ftp.beta.redhat.com for your
examination/use.  Connect via ftp to ftp.beta.redhat.com and log in as user "beta".
From there, the relative paths for all the RPMs are:

pub/errata/7.1/SRPMS/kernel-2.4.2-0.1.35.src.rpm
pub/errata/7.1/i386/devfsd-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-BOOT-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-doc-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-headers-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-source-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i586/kernel-2.4.2-0.1.35.i586.rpm
pub/errata/7.1/i586/kernel-smp-2.4.2-0.1.35.i586.rpm
pub/errata/7.1/i686/kernel-2.4.2-0.1.35.i686.rpm
pub/errata/7.1/i686/kernel-enterprise-2.4.2-0.1.35.i686.rpm
pub/errata/7.1/i686/kernel-smp-2.4.2-0.1.35.i686.rpm
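
The paths above follow the usual RPM naming convention, name-version-release.arch.rpm. As a rough illustration only (the helper names parse_rpm_filename and smp_packages are hypothetical and not part of any Red Hat tool), a short Python sketch can split the names apart to show which of the listed packages are the SMP builds:

```python
def parse_rpm_filename(path):
    """Split an RPM path into (name, version, release, arch).

    Relies on the name-version-release.arch.rpm convention; the package
    name itself may contain dashes (e.g. kernel-smp, kernel-BOOT).
    """
    filename = path.rsplit("/", 1)[-1]
    if not filename.endswith(".rpm"):
        raise ValueError("not an rpm file name: " + filename)
    stem = filename[: -len(".rpm")]
    stem, arch = stem.rsplit(".", 1)          # arch follows the last dot
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


def smp_packages(paths):
    """Return (arch, path) pairs for packages whose name is kernel-smp."""
    result = []
    for path in paths:
        name, _version, _release, arch = parse_rpm_filename(path)
        if name == "kernel-smp":
            result.append((arch, path))
    return result


if __name__ == "__main__":
    listed = [
        "pub/errata/7.1/i386/kernel-2.4.2-0.1.35.i386.rpm",
        "pub/errata/7.1/i586/kernel-smp-2.4.2-0.1.35.i586.rpm",
        "pub/errata/7.1/i686/kernel-smp-2.4.2-0.1.35.i686.rpm",
    ]
    for arch, path in smp_packages(listed):
        print(arch, path)
```

Which architecture build suits a given machine depends on its processors; the list itself does not say.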
Comment 5 Wendy Hung 2001-03-26 17:47:13 EST
Installed the kernel-headers and kernel-smp rpms.  Ran mkinitrd and edited /etc/lilo.conf.
Upon reboot into the 2.4.2-0.1.35smp kernel, same hang as reported.
Messages on screen:

...
Asserting INIT
Waiting for send to finish...
+ Deasserting INIT
 Waiting for send to finish...
+# Startup loops:2
Sending STARTUP #1
After apic_write,
Startup point 1
Waiting for send to finish...
+Sending STARTUP #2
After apic_write,
Starting point 1
Waiting for send to finish...
+After Startup
Before Callout 1
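
The setup steps named at the top of this comment (running mkinitrd and editing /etc/lilo.conf) are not spelled out in the report. The Python sketch below only illustrates what they typically amount to on a Red Hat 7.x system: it builds the mkinitrd command line and a lilo.conf stanza as text, without running anything. The /boot file names follow the usual Red Hat layout, and the root device /dev/sda1 and the label linux-smp are hypothetical placeholders, not values taken from the report.

```python
def mkinitrd_command(kernel_version):
    """Return the mkinitrd invocation: output image path, then kernel version."""
    return ["/sbin/mkinitrd", f"/boot/initrd-{kernel_version}.img", kernel_version]


def lilo_stanza(kernel_version, root_device, label):
    """Return a lilo.conf image stanza for the given kernel version.

    Assumes the conventional /boot/vmlinuz-<version> and
    /boot/initrd-<version>.img file names (an assumption, not from the report).
    """
    lines = [
        f"image=/boot/vmlinuz-{kernel_version}",
        f"\tlabel={label}",
        f"\tinitrd=/boot/initrd-{kernel_version}.img",
        "\tread-only",
        f"\troot={root_device}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    version = "2.4.2-0.1.35smp"  # kernel version named in this comment
    print(" ".join(mkinitrd_command(version)))
    # Root device and label are hypothetical placeholders.
    print(lilo_stanza(version, root_device="/dev/sda1", label="linux-smp"))
```

After a stanza like this is added to /etc/lilo.conf, /sbin/lilo has to be run as root so the boot loader picks up the new entry.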
Comment 6 Wendy Hung 2001-03-29 13:05:18 EST
Same bug with QA0327.2 (2.4.2-0.1.40smp kernel).
Comment 7 Arjan van de Ven 2001-03-29 13:09:21 EST
Does a non-Red Hat 2.4 kernel boot?
If not, this sounds like a BIOS bug.
Comment 8 Wendy Hung 2001-03-29 16:26:07 EST
Yes, a non-Red Hat 2.4.2 kernel boots successfully and sees all 8 processors.
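
The report does not say how the processor count was checked. One common way on an x86 Linux system is to count the "processor" entries in /proc/cpuinfo; the Python sketch below shows that idea on a small inline sample (count_processors is a hypothetical helper name, and the sample text is illustrative, not output from the machine in this report).

```python
def count_processors(cpuinfo_text):
    """Count 'processor : N' entries in x86 /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        # Only lines whose field name is exactly "processor" are counted.
        if sep and key.strip() == "processor":
            count += 1
    return count


# Illustrative two-entry sample; on a live system the text would come
# from reading /proc/cpuinfo.
sample = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: GenuineIntel\n"
)
print(count_processors(sample))
```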
Comment 9 Wendy Hung 2001-04-06 13:04:01 EDT
Fixed in QA0404 2.4.2-0.1.49smp kernel.
Comment 10 Wendy Hung 2001-05-22 10:19:28 EDT
Not fixed in RH 7.1 Gold.
Comment 11 Arjan van de Ven 2001-05-22 10:23:45 EDT
2.4.2-0.1.49 and the gold kernel are virtually identical (except for the
corruption fix).

I'm getting confused here.
Comment 12 Arjan van de Ven 2001-06-26 09:36:13 EDT
If this is still happening with the released errata kernel, I'd like to know.
Comment 13 Wendy Hung 2001-06-26 13:07:42 EDT
Bug fixed in kernel errata (2.4.3-12smp)
http://www.redhat.com/support/errata/RHSA-2001-084.html

Also works using RH 7.1 SBE (2.4.3-6smp)
