Installed RH7.1 QA0319 on a Netfinity 8500R (8-way system). The SMP kernel was autodetected and auto-installed. Upon reboot, the system hangs at the processor initialization screen (prior to the kernel "boot:" prompt screen). Not a hard hang -- Num Lock still responds. The uniprocessor kernel boots OK. The bug occurs with 5 or more CPUs.
Did this get to the lilo screen, or did it hang before lilo? If it made it to lilo, what messages were printed to the console?
Correction: hang occurs after selecting the 'linux' kernel screen, but before the login screen. Screen messages refer to Starting up CPUs.
The 0319 snapshot contains a lot of debugging code to catch memory-allocation related errors. We recently found a very important bug in the kernel that somehow mostly triggered on SMP machines with 4 or more CPUs. We have since fixed this bug in kernel 2.4.2-0.1.35. Hopefully this kernel will be available to beta testers soon (QA is testing it right now), either as a "kernel rpm" or as a "full snapshot". It would be very much appreciated if you could test the new kernel once it becomes available. (This kernel is not yet present in the 0322 snapshot, but should be in any newer ones if/when they become available.)
I have placed the 0.1.35 kernel rpms on ftp.beta.redhat.com for your examination/use. Connect via ftp to ftp.beta.redhat.com, login as user "beta". From there, the relative paths for all the RPMs are:
pub/errata/7.1/SRPMS/kernel-2.4.2-0.1.35.src.rpm
pub/errata/7.1/i386/devfsd-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-BOOT-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-doc-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-headers-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i386/kernel-source-2.4.2-0.1.35.i386.rpm
pub/errata/7.1/i586/kernel-2.4.2-0.1.35.i586.rpm
pub/errata/7.1/i586/kernel-smp-2.4.2-0.1.35.i586.rpm
pub/errata/7.1/i686/kernel-2.4.2-0.1.35.i686.rpm
pub/errata/7.1/i686/kernel-enterprise-2.4.2-0.1.35.i686.rpm
pub/errata/7.1/i686/kernel-smp-2.4.2-0.1.35.i686.rpm
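For anyone scripting the download, here is a minimal Python sketch of picking the right package out of that list for an 8-way box. The helper name select_packages and the flavor/arch arguments are illustrative assumptions, not part of any Red Hat tool; only a subset of the paths above is repeated.

```python
# Relative paths copied from the comment above (subset shown for brevity).
PATHS = [
    "pub/errata/7.1/i386/kernel-2.4.2-0.1.35.i386.rpm",
    "pub/errata/7.1/i586/kernel-smp-2.4.2-0.1.35.i586.rpm",
    "pub/errata/7.1/i686/kernel-2.4.2-0.1.35.i686.rpm",
    "pub/errata/7.1/i686/kernel-enterprise-2.4.2-0.1.35.i686.rpm",
    "pub/errata/7.1/i686/kernel-smp-2.4.2-0.1.35.i686.rpm",
]


def select_packages(paths, flavor, arch):
    """Return the paths named kernel-<flavor>-<version>.<arch>.rpm.

    Hypothetical helper: 'flavor' is e.g. "smp" or "enterprise",
    'arch' is e.g. "i586" or "i686".
    """
    prefix = f"kernel-{flavor}-"
    suffix = f".{arch}.rpm"
    picked = []
    for path in paths:
        # Compare against the file name only, not the directory part.
        name = path.rsplit("/", 1)[-1]
        if name.startswith(prefix) and name.endswith(suffix):
            picked.append(path)
    return picked


smp_i686 = select_packages(PATHS, "smp", "i686")
```

Matching on the file name rather than the directory avoids picking up the plain uniprocessor kernel that sits in the same i686 directory.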
Installed the kernel-headers and kernel-smp rpms. Ran mkinitrd and edited /etc/lilo.conf. Upon reboot into the 2.4.2-0.1.35smp kernel, same hang as reported. Messages on screen:
...
Asserting INIT
Waiting for send to finish...
+ Deasserting INIT
Waiting for send to finish...
+# Startup loops:2
Sending STARTUP #1
After apic_write, Startup point 1
Waiting for send to finish...
+Sending STARTUP #2
After apic_write, Starting point 1
Waiting for send to finish...
+After Startup
Before Callout 1
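For reference, the mkinitrd and lilo.conf step can be sketched roughly as below. This is an assumption about what such a setup typically looks like, not a record of what was actually run on this machine: the label "linux-smp", the root device /dev/sda1, and the /boot file names are hypothetical, and both helper functions are made up for illustration.

```python
def mkinitrd_command(version):
    """Build the usual 'mkinitrd <image> <kernel-version>' invocation.

    The initrd file name under /boot is an assumed convention.
    """
    return ["mkinitrd", f"/boot/initrd-{version}.img", version]


def lilo_stanza(version, label, root_dev):
    """Render one image= stanza for /etc/lilo.conf.

    'label' and 'root_dev' are placeholders; use the values that
    match the existing stanza for the working kernel.
    """
    lines = [
        f"image=/boot/vmlinuz-{version}",
        f"\tlabel={label}",
        f"\tinitrd=/boot/initrd-{version}.img",
        "\tread-only",
        f"\troot={root_dev}",
    ]
    return "\n".join(lines) + "\n"


VERSION = "2.4.2-0.1.35smp"
command = mkinitrd_command(VERSION)
stanza = lilo_stanza(VERSION, "linux-smp", "/dev/sda1")
```

After appending a stanza like this to /etc/lilo.conf, /sbin/lilo has to be re-run before the new entry appears at the boot prompt. Keeping the old uniprocessor stanza in place leaves a fallback if the SMP kernel hangs again.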
Same bug with QA0327.2 (2.4.2-0.1.40smp kernel)
Does a non-Red Hat 2.4 kernel boot? If not, this sounds like a BIOS bug.
Yes, a non-redhat 2.4.2 kernel boots successfully and sees all 8 processors.
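One way to confirm how many processors a booted kernel actually brought up is to count the "processor" entries in /proc/cpuinfo. A minimal Python sketch follows; the function name count_processors and the sample text are illustrative assumptions, not output captured from this machine.

```python
def count_processors(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        # Each logical CPU has one line whose key is exactly "processor".
        key = line.split(":", 1)[0].strip()
        if key == "processor":
            count += 1
    return count


# Made-up sample in the x86 /proc/cpuinfo layout, for illustration only.
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: GenuineIntel\n"
)

sample_count = count_processors(SAMPLE)
```

On a live system the same function can be fed the contents of /proc/cpuinfo; on the 8-way box discussed here a fully working SMP kernel would be expected to report 8.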
Fixed in QA0404 2.4.2-0.1.49smp kernel.
Not fixed in RH 7.1 Gold.
2.4.2-0.1.49 and the gold kernel are virtually identical (except for the corruption fix)... I'm getting confused here.
If this is still happening with the released errata kernel, I'd like to know.
Bug fixed in kernel errata (2.4.3-12smp): http://www.redhat.com/support/errata/RHSA-2001-084.html Also works using RH 7.1 SBE (2.4.3-6smp).