Bug 64070 - APIC error on SMP Athlon with kernel-2.4.9-31
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: athlon Linux
Priority: medium, Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2002-04-24 22:05 EDT by tanner
Modified: 2008-08-01 12:22 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-09-30 11:39:32 EDT
Description tanner 2002-04-24 22:05:38 EDT
Description of Problem:
Installed 7.2 on the following system:

Dual (2) AMD ATHLON MP 1900+ CPUs
ASUS A7M266-D motherboard
2 sticks of Corsair CM72SD512R-2100 (1 GB RAM total)

When booting the 2.4.9-31smp kernel the box hangs with the following error:

APIC error on CPU1: 00(02)
APIC error on CPU0: 00(02)

I downgraded to the 2.4.7-10smp kernel with the same result.

Version-Release number of selected component (if applicable):
kernel-2.4.9-31
kernel-2.4.7-10

Additional Information:

I believe this might be related to Bug #58814
Comment 1 tanner 2002-04-26 20:08:05 EDT
Changed the platform to athlon.

I also tried rawhide's kernel-2.4.18-0.22, same result.

I downloaded a stock 2.4.18 kernel from ftp.us.kernel.org, same result.

I ran memtester <http://www.qcc.sk.ca/~charlesc/software/memtester/> on the box
for 25 hours with no errors, so I'm pretty sure the problem is not with the RAM.

The box runs fine on a single CPU kernel. I swapped CPUs on the motherboard (to
test the other CPU), booted an SMP kernel and the box hung. 

Booted a single CPU kernel and the box works.

Swapped BOTH CPUs out for new CPUs. Booted an SMP kernel. Box hangs.

Swapped out the motherboard. Tried both the new CPUs and the older CPUs and an
SMP kernel. Box hangs.

I'm pretty confident that it's a kernel issue and not a hardware issue.
Comment 2 Arjan van de Ven 2002-04-27 03:09:31 EDT
APIC errors are basically hardware failures. However, the kernel should recover
(e.g. by retrying). If nothing else helps, you can add "noapic" to the kernel
command line (the vmlinuz line in /boot/grub/grub.conf, or an append line in
lilo.conf).
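
As a rough illustration of the workaround above, the sketch below appends `noapic` to every `kernel` line of a GRUB (legacy) configuration passed in as text. The helper name `add_noapic`, the sample stanza, and its root device are hypothetical and not taken from the reporter's system; on a real machine you would normally just edit the file by hand.

```python
def add_noapic(grub_conf_text):
    """Return grub.conf text with 'noapic' appended to each kernel line.

    Hypothetical helper for illustration only: it edits a string and never
    touches the real boot configuration.
    """
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        # GRUB legacy boot entries put the kernel command line on a line
        # that starts with the word 'kernel'.
        if words and words[0] == "kernel" and "noapic" not in words[1:]:
            line = line.rstrip() + " noapic"
        out.append(line)
    return "\n".join(out)


# Made-up stanza for the kernel version named in this report.
sample = """title Red Hat Linux (2.4.9-31smp)
        root (hd0,0)
        kernel /vmlinuz-2.4.9-31smp ro root=/dev/hda2
        initrd /initrd-2.4.9-31smp.img"""

print(add_noapic(sample))
```

The function is idempotent, so running it on an already-patched file leaves the text unchanged.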
Comment 3 tanner 2002-04-27 03:38:18 EDT
Talking on lkml, I got some more information.

The POST shows the following:

CPU0 AMD ATHLON (TM) MP 1900+
CPU1 *AMD ATHLON (TM) MP 1900+

Not sure what the asterisk means.

BUT, 2.4.9 and 2.4.18 detect the CPU as 

CPU0 AMD ATHLON (TM) XP 1900+ stepping 02
CPU1: AMD ATHLON (TM) XP 1900+ stepping 02

Notice the POST shows an MP and the kernel an XP. I know AMD "fixed" the XP
processors so they can be set in a dual configuration. I dug out the original
boxes the CPUs came in. The boxes do say MP processors.

I plan to yank them out and look right on the chip tomorrow.

The complete on-screen output prior to the lockup follows:

CPU0 AMD ATHLON (TM) XP 1900+ stepping 02
per CPU timeslice cutoff: 731.19 usecs
task migration cache decay timeout: 10 msecs
enabled ExtINT on CPU #0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
booting processor 1/1 eip 2000
Initializing CPU#1
masked ExtINT on CPU #1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop...3198.15 BogoMIPS
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Intel machine check reporting enabled on CPU #1
CPU1: AMD ATHLON (TM) XP 1900+ stepping 02
total of 2 processors activated (6389.76 BogoMIPS)
Enabling IO-APIC IRQs
Setting 2 in the phy_id_present_map
... changing IO-APIC physical APIC ID to 2...ok
... TIMER vector 0x31 pin1=2 pin2=0
testing the IO APIC ...................................... done.
Using local APIC timer interrupts.
Calibrating APIC timer...
... CPU clock speed is 1600.0830 Mhz.
... host bus clock speed is 266.687 Mhz.
CPU: 0, clocks: 2666804, slice: 888934
CPU0 <T0:2666800, T1:1777856, D:10, S:888934, C:2666804>
CPU: 1, clocks: 2666804, slice: 888934
CPU1 <T0:2666800, T1:8889286, D:4, S:888934, C:2666804>

Comment 4 tanner 2002-04-27 03:39:48 EDT
This is what I get for trying to type at 2:40 in the morning.

CPU0 AMD ATHLON (TM) XP 1900+ stepping 02
CPU1: AMD ATHLON (TM) XP 1900+ stepping 02

Notice the POST shows an MP and the kernel an XP. I know AMD "fixed" the XP
processors so they can NOT be set in a dual configuration. I dug out the
original boxes the CPUs came in. The boxes do say MP processors.

Comment 5 compwiz 2002-12-28 17:23:21 EST
I get APIC errors a lot with my dual Athlon 1.2 GHz, but they don't seem to
produce any ill effects on the system (perhaps a slowdown? I don't know.)

APIC error on CPU1: 02(02)
APIC error on CPU0: 02(02)
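
For readers trying to interpret these messages, here is a minimal sketch that decodes the two hex values in an "APIC error on CPUn: xx(yy)" line. It assumes the two values are the local APIC Error Status Register read before and after the kernel rewrites it, and that the bit meanings match the list in the comment above smp_error_interrupt() in the 2.4 kernel's arch/i386/kernel/apic.c; both are quoted from memory, so treat them as assumptions. The names `decode_esr` and `decode_line` are made up for this example.

```python
import re

# Bit meanings as listed in the 2.4 kernel source comment (assumption,
# quoted from memory; check the kernel source or the CPU manual to confirm).
ESR_BITS = {
    0: "Send CS error",
    1: "Receive CS error",
    2: "Send accept error",
    3: "Receive accept error",
    4: "Reserved",
    5: "Send illegal vector",
    6: "Received illegal vector",
    7: "Illegal register address",
}

LINE_RE = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)"
)


def decode_esr(value):
    """Return the names of all error bits set in an 8-bit ESR value."""
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


def decode_line(line):
    """Return (cpu, errors_before, errors_after), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu = int(m.group(1))
    before = int(m.group(2), 16)
    after = int(m.group(3), 16)
    return cpu, decode_esr(before), decode_esr(after)


for line in ("APIC error on CPU1: 00(02)", "APIC error on CPU0: 02(02)"):
    print(line, "->", decode_line(line))
```

Under those assumptions, the value 02 seen in both reports in this bug has only bit 1 set, i.e. a receive checksum error on the APIC bus.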
Comment 6 Ben LaHaise 2003-01-02 11:35:16 EST
This doesn't seem to occur on my dual Athlon running the 2.4.18-19.7.xsmp
kernel.  Could you try updating to this kernel?
Comment 7 Ben LaHaise 2003-01-02 11:35:59 EST
Oh, one more point, if /proc/cpuinfo claims the CPUs are Athlon XPs, then your
BIOS needs to be updated.
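
The check described above can be sketched roughly as follows: collect the "model name" entries from /proc/cpuinfo-style text and flag any that contain a standalone "XP" token. The function names and the sample text are hypothetical, and the exact model-name strings on a given machine may differ, so this is only an illustration of the idea.

```python
def athlon_models(cpuinfo_text):
    """Return the 'model name' value of each processor entry."""
    models = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            models.append(value.strip())
    return models


def reported_as_xp(model_name):
    """True if the model name contains 'XP' as a separate word."""
    # Turn parentheses into spaces so 'Athlon(tm)' does not hide a token.
    cleaned = model_name.upper().replace("(", " ").replace(")", " ")
    return "XP" in cleaned.split()


# Made-up sample in /proc/cpuinfo layout; one CPU reported as XP, one as MP.
sample = """processor\t: 0
model name\t: AMD Athlon(tm) XP 1900+
processor\t: 1
model name\t: AMD Athlon(tm) MP 1900+
"""

for model in athlon_models(sample):
    status = "reported as XP" if reported_as_xp(model) else "not reported as XP"
    print(model, "->", status)
```

On a live system the same functions could be fed the contents of /proc/cpuinfo instead of the sample string.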
Comment 8 Bugzilla owner 2004-09-30 11:39:32 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
