From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.4.2-2 i686; en-US; rv:0.9) Gecko/20010509

Description of problem:
2.4 kernels write "APIC error on CPU0: 08(02)"

How reproducible:
Sometimes

Steps to Reproduce:
1. Run the machine under heavy load

Expected Results:
"Nothing ;-)"

Additional info:
I had to upgrade to a 2.4 kernel because of my disks (3x IBM DTLA 30GB) and the HPT366 controller (the machine hangs with 2.2 kernels in this configuration).
MB: ABIT BP6 with 2x Celeron 500MHz

CPU: L2 cache: 128K Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU: After vendor init, caps: 0183fbff 00000000 00000000 00000000 CPU: After generic, caps: 0183fbff 00000000 00000000 00000000 CPU: Common caps: 0183fbff 00000000 00000000 00000000 Enabling fast FPU save and restore... done. Checking 'hlt' instruction... OK. POSIX conformance testing by UNIFIX mtrr: v1.40 (20010327) Richard Gooch (rgooch.au) mtrr: detected mtrr type: Intel CPU: Before vendor init, caps: 0183fbff 00000000 00000000, vendor = 0 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 128K Intel machine check reporting enabled on CPU#0. CPU: After vendor init, caps: 0183fbff 00000000 00000000 00000000 CPU: After generic, caps: 0183fbff 00000000 00000000 00000000 CPU: Common caps: 0183fbff 00000000 00000000 00000000 CPU0: Intel Celeron (Mendocino) stepping 05 per-CPU timeslice cutoff: 365.67 usecs. Getting VERSION: 40011 Getting VERSION: 40011 Getting ID: 0 Getting ID: f000000 Getting LVT0: 700 Getting LVT1: 400 enabled ExtINT on CPU#0 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 CPU present map: 3 Booting processor 1/1 eip 2000 Setting warm reset code and vector. 1. 2. 3. Asserting INIT. Waiting for send to finish... +Deasserting INIT. Waiting for send to finish... +#startup loops: 2. Sending STARTUP #1. After apic_write. Initializing CPU#1 CPU#1 (phys ID: 1) waiting for CALLOUT Startup point 1. Waiting for send to finish... 
+Sending STARTUP #2. After apic_write. Startup point 1. Waiting for send to finish... +After Startup. Before Callout 1. After Callout 1. CALLIN, before setup_local_APIC(). masked ExtINT on CPU#1 ESR value before enabling vector: 00000000 ESR value after enabling vector: 00000000 Calibrating delay loop... 1019.08 BogoMIPS Stack at about c1449fb8 CPU: Before vendor init, caps: 0183fbff 00000000 00000000, vendor = 0 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 128K Intel machine check reporting enabled on CPU#1. CPU: After vendor init, caps: 0183fbff 00000000 00000000 00000000 CPU: After generic, caps: 0183fbff 00000000 00000000 00000000 CPU: Common caps: 0183fbff 00000000 00000000 00000000 OK. CPU1: Intel Celeron (Mendocino) stepping 05 CPU has booted. Before bogomips. Total of 2 processors activated (2034.89 BogoMIPS). Before bogocount - setting activated=1. Boot done. ENABLING IO-APIC IRQs ...changing IO-APICphysical APIC ID to 2 ... ok. Synchronizing Arb IDs. init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-5, 2-9, 2-10, 2-11, 2-20, 2-21, 2-22, 2-23 not connected. ..TIMER: vector=49 pin1=2 pin2=0 number of MP IRQ sources: 19. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 .... register #01: 00170011 ....... : max redirection entries: 0017 ....... : IO APIC version: 0011 .... register #02: 00000000 ....... : arbitration: 00 .... 
IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 000 00  1    0    0   0   0    0    0    00
01 003 03  0    0    0   0   0    1    1    39
02 003 03  0    0    0   0   0    1    1    31
03 003 03  0    0    0   0   0    1    1    41
04 003 03  0    0    0   0   0    1    1    49
05 000 00  1    0    0   0   0    0    0    00
06 003 03  0    0    0   0   0    1    1    51
07 003 03  0    0    0   0   0    1    1    59
08 003 03  0    0    0   0   0    1    1    61
09 000 00  1    0    0   0   0    0    0    00
0a 000 00  1    0    0   0   0    0    0    00
0b 000 00  1    0    0   0   0    0    0    00
0c 003 03  0    0    0   0   0    1    1    69
0d 003 03  0    0    0   0   0    1    1    71
0e 003 03  0    0    0   0   0    1    1    79
0f 003 03  0    0    0   0   0    1    1    81
10 003 03  1    1    0   1   0    1    1    89
11 003 03  1    1    0   1   0    1    1    91
12 003 03  1    1    0   1   0    1    1    99
13 003 03  1    1    0   1   0    1    1    A1
14 000 00  1    0    0   0   0    0    0    00
15 000 00  1    0    0   0   0    0    0    00
16 000 00  1    0    0   0   0    0    0    00
17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings: IRQ0 -> 2 IRQ1 -> 1 IRQ3 -> 3 IRQ4 -> 4 IRQ6 -> 6 IRQ7 -> 7 IRQ8 -> 8 IRQ12 -> 12 IRQ13 -> 13 IRQ14 -> 14 IRQ15 -> 15 IRQ16 -> 16 IRQ17 -> 17 IRQ18 -> 18 IRQ19 -> 19 .................................... done. calibrating APIC timer ... ..... CPU clock speed is 510.0941 MHz. ..... host bus clock speed is 68.0124 MHz. cpu: 0, clocks: 680124, slice: 226708 CPU0<T0:680112,T1:453392,D:12,S:226708,C:680124> cpu: 1, clocks: 680124, slice: 226708 CPU1<T0:680112,T1:226688,D:8,S:226708,C:680124> checking TSC synchronization across CPUs: passed. Setting commenced=1, go go go mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs PCI: PCI BIOS revision 2.10 entry at 0xfb440, last bus=1 PCI: Using configuration type 1 PCI: Probing PCI hardware Unknown bridge resource 0: assuming transparent PCI: Using IRQ router PIIX [8086/7110] at 00:07.0 PCI->APIC IRQ transform: (B0,I9,P0) -> 19 PCI->APIC IRQ transform: (B0,I13,P0) -> 17 PCI->APIC IRQ transform: (B0,I15,P0) -> 16 PCI->APIC IRQ transform: (B0,I19,P0) -> 18 PCI->APIC IRQ transform: (B0,I19,P1) -> 18 PCI->APIC IRQ transform: (B1,I0,P0) -> 16 Limiting direct PCI/PCI transfers. 
Linux NET4.0 for Linux 2.4 Based upon Swansea University Computer Society NET3.039 Initializing RT netlink socket Starting kswapd v1.8 pty: 256 Unix98 ptys configured block: queued sectors max/low 169685kB/56561kB, 512 slots per queue Uniform Multi-Platform E-IDE driver Revision: 6.31 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx PIIX4: IDE controller on PCI bus 00 dev 39 PIIX4: chipset revision 1 PIIX4: not 100% native mode: will probe irqs later ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio HPT366: onboard version of chipset, pin1=1 pin2=2 HPT366: IDE controller on PCI bus 00 dev 98 PCI: Enabling device 00:13.0 (0005 -> 0007) HPT366: chipset revision 1 HPT366: not 100% native mode: will probe irqs later ide2: BM-DMA at 0xd800-0xd807, BIOS settings: hde:DMA, hdf:pio HPT366: IDE controller on PCI bus 00 dev 99 HPT366: chipset revision 1 HPT366: not 100% native mode: will probe irqs later ide3: BM-DMA at 0xe400-0xe407, BIOS settings: hdg:pio, hdh:pio hda: IBM-DTLA-307030, ATA DISK drive hdc: IBM-DTLA-307030, ATA DISK drive hde: IBM-DTLA-307030, ATA DISK drive hdh: CRD-8520B, ATAPI CD/DVD-ROM drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 ide1 at 0x170-0x177,0x376 on irq 15 ide2 at 0xd000-0xd007,0xd402 on irq 18 ide3 at 0xdc00-0xdc07,0xe002 on irq 18 hda: 60036480 sectors (30739 MB) w/1916KiB Cache, CHS=3737/255/63, UDMA(33) hdc: 60036480 sectors (30739 MB) w/1916KiB Cache, CHS=59560/16/63, UDMA(33) hde: 60036480 sectors (30739 MB) w/1916KiB Cache, CHS=59560/16/63, UDMA(44) Partition check: hda: hda1 hda2 hda3 < hda5 hda6 > hdc: [PTBL] [3737/255/63] hdc1 hdc2 hdc3 < hdc5 hdc6 > hde: hde1 hde2 hde3 < hde5 hde6 > Serial driver version 5.05a (2001-03-20) with HUB-6 MANY_PORTS MULTIPORT SHARE_IRQ DETECT_IRQ SERIAL_PCI enabled ttyS00 at 0x03f8 (irq = 4) is a 16550A ttyS01 at 0x02f8 (irq = 3) is a 16550A Real Time Clock Driver v1.10d SCSI subsystem driver Revision: 1.00 
request_module[scsi_hostadapter]: Root fs not mounted raid1 personality registered raid5 personality registered raid5: measuring checksumming speed 8regs : 941.600 MB/sec 32regs : 459.200 MB/sec pII_mmx : 1149.200 MB/sec p5_mmx : 1196.000 MB/sec raid5: using function: p5_mmx (1196.000 MB/sec) md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27 md.c: sizeof(mdp_super_t) = 4096 autodetecting RAID arrays (read) hda2's sb offset: 56128 [events: 00000044] (read) hda5's sb offset: 2562240 [events: 00000046] (read) hda6's sb offset: 27294336 [events: 00000048] (read) hdc2's sb offset: 56128 [events: 00000044] (read) hdc5's sb offset: 2562240 [events: 00000046] (read) hdc6's sb offset: 27294336 [events: 00000048] (read) hde2's sb offset: 55360 [events: 00000000] md: invalid raid superblock magic on hde2 md: hde2 has invalid sb, not importing! could not import hde2! (read) hde5's sb offset: 2560192 [events: 00000046] (read) hde6's sb offset: 27299520 [events: 00000048] autorun ... considering hde6 ... adding hde6 ... adding hdc6 ... adding hda6 ... created md2 bind<hda6,1> bind<hdc6,2> bind<hde6,3> running: <hde6><hdc6><hda6> now! 
hde6's event counter: 00000048 hdc6's event counter: 00000048 hda6's event counter: 00000048 md: md2: raid array is not clean -- starting background reconstruction md2: max total readahead window set to 512k md2: 2 data-disks, max readahead per data-disk: 256k raid5: device hde6 operational as raid disk 2 raid5: device hdc6 operational as raid disk 1 raid5: device hda6 operational as raid disk 0 raid5: allocated 3264kB for md2 raid5: raid level 5 set md2 active with 3 out of 3 devices, algorithm 0 raid5: raid set md2 not clean; reconstructing parity RAID5 conf printout: --- rd:3 wd:3 fd:0 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:hda6 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc6 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:hde6 RAID5 conf printout: --- rd:3 wd:3 fd:0 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:hda6 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdc6 disk 2, s:0, o:1, n:2 rd:2 us:1 dev:hde6 md: updating md2 RAID superblock on device hde6 [events: 00000049](write) hde6's sb offset: 27299520 md: syncing RAID array md2 md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc. md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction. md: using 124k window, over a total of 27294336 blocks. hdc6 [events: 00000049](write) hdc6's sb offset: 27294336 hda6 [events: 00000049](write) hda6's sb offset: 27294336 . considering hde5 ... adding hde5 ... adding hdc5 ... adding hda5 ... created md1 bind<hda5,1> bind<hdc5,2> bind<hde5,3> running: <hde5><hdc5><hda5> now! 
hde5's event counter: 00000046 hdc5's event counter: 00000046 hda5's event counter: 00000046 md: md1: raid array is not clean -- starting background reconstruction md1: max total readahead window set to 124k md1: 1 data-disks, max readahead per data-disk: 124k raid1: device hde5 operational as mirror 2 raid1: device hdc5 operational as mirror 1 raid1: device hda5 operational as mirror 0 raid1: raid set md1 not clean; reconstructing mirrors raid1: raid set md1 active with 3 out of 3 mirrors md: updating md1 RAID superblock on device hde5 [events: 00000047](write) hde5's sb offset: 2560192 md: serializing resync, md1 shares one or more physical units with md2! hdc5 [events: 00000047](write) hdc5's sb offset: 2562240 hda5 [events: 00000047](write) hda5's sb offset: 2562240 . considering hdc2 ... adding hdc2 ... adding hda2 ... created md0 bind<hda2,1> bind<hdc2,2> running: <hdc2><hda2> now! hdc2's event counter: 00000044 hda2's event counter: 00000044 md: md0: raid array is not clean -- starting background reconstruction md0: max total readahead window set to 124k md0: 1 data-disks, max readahead per data-disk: 124k raid1: device hdc2 operational as mirror 1 raid1: device hda2 operational as mirror 0 raid1: raid set md0 not clean; reconstructing mirrors raid1: raid set md0 active with 2 out of 2 mirrors md: updating md0 RAID superblock on device hdc2 [events: 00000045](write) hdc2's sb offset: 56128 md: serializing resync, md0 shares one or more physical units with md2! hda2 [events: 00000045](write) hda2's sb offset: 56128 . ... autorun DONE. NET4: Linux TCP/IP 1.0 for NET4.0 IP Protocols: ICMP, UDP, TCP IP: routing cache hash table of 2048 buckets, 16Kbytes TCP: Hash tables configured (established 16384 bind 16384) NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. VFS: Mounted root (ext2 filesystem) readonly. 
Freeing unused kernel memory: 204k freed Adding Swap: 104380k swap-space (priority -1) Adding Swap: 104380k swap-space (priority -2) Adding Swap: 102776k swap-space (priority -3) raid5: switching cache buffer size, 4096 --> 1024 APIC error on CPU0: 00(02) raid5: switching cache buffer size, 1024 --> 4096 eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw.com.sg> and others eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:D0:B7:0A:95:EC, IRQ 19. Board assembly 721383-008, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x04f4518b). eth1: Intel Corporation 82557 [Ethernet Pro 100] (#2), 00:D0:B7:0A:96:3F, IRQ 17. Board assembly 721383-008, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x04f4518b). eth2: Intel Corporation 82557 [Ethernet Pro 100] (#3), 00:D0:B7:0A:96:42, IRQ 16. Board assembly 721383-008, Physical connectors present: RJ45 Primary interface chip i82555 PHY #1. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0x04f4518b). ip_tables: (c)2000 Netfilter core team ip_conntrack (2047 buckets, 16376 max) CSLIP: code copyright 1989 Regents of the University of California PPP generic driver version 2.4.1 PPP Deflate Compression module registered APIC error on CPU0: 02(02) APIC error on CPU0: 02(02) md: md2: sync done. raid5: resync finished. md: syncing RAID array md0 md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc. md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction. 
md: using 124k window, over a total of 56128 blocks. md: serializing resync, md1 shares one or more physical units with md0! md: md0: sync done. md: syncing RAID array md1 md: minimum _guaranteed_ reconstruction speed: 100 KB/sec/disc. md: using maximum available idle IO bandwith (but not more than 100000 KB/sec) for reconstruction. md: using 124k window, over a total of 2560192 blocks. APIC error on CPU1: 00(08) md: md1: sync done. APIC error on CPU1: 08(02) APIC error on CPU1: 02(02) APIC error on CPU1: 02(08) APIC error on CPU1: 08(01) APIC error on CPU1: 01(02) APIC error on CPU1: 02(02) APIC error on CPU0: 02(08) APIC error on CPU1: 02(02) inserting floppy driver for 2.4.4 Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 VFS: Disk change detected on device fd(2,0) floppy0: disk absent or changed during operation end_request: I/O error, dev 02:00 (floppy), sector 10 floppy0: disk absent or changed during operation end_request: I/O error, dev 02:00 (floppy), sector 2 VFS: Disk change detected on device fd(2,0) EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended APIC error on CPU0: 08(02) APIC error on CPU1: 02(02) APIC error on CPU1: 02(02)
"APIC error on CPU0: 08(02)" means your motherboard is of poor quality. If the kernel survives, you're in luck. There is nothing we can do about this if the wires on the motherboard aren't good enough to send signals reliably :) Sorry about that.
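For readers who want to see what the two hex values in these messages stand for, here is a minimal Python sketch that decodes them. It assumes the bit meanings listed in the comment next to the kernel's APIC error handler, and it assumes (as I understand the 2.4 code) that the first value is the local APIC Error Status Register as read before the handler clears it and the value in parentheses is the register read again afterwards. The names `ESR_BITS`, `decode_esr`, and `decode_line` are made up for this illustration.

```python
import re

# Bit meanings of the local APIC Error Status Register (ESR), following the
# comment in the kernel's APIC error interrupt handler (assumed unchanged here).
ESR_BITS = {
    0: "send checksum error",
    1: "receive checksum error",
    2: "send accept error",
    3: "receive accept error",
    4: "reserved",
    5: "send illegal vector",
    6: "received illegal vector",
    7: "illegal register address",
}

# Matches lines such as "APIC error on CPU0: 08(02)".
LINE_RE = re.compile(
    r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)"
)


def decode_esr(value):
    """Return the names of all error bits set in an ESR value."""
    return [name for bit, name in ESR_BITS.items() if value & (1 << bit)]


def decode_line(line):
    """Decode one log line into (cpu, first_value_bits, second_value_bits).

    Returns None when the line is not an APIC error message.
    """
    match = LINE_RE.search(line)
    if match is None:
        return None
    cpu = int(match.group(1))
    first = int(match.group(2), 16)
    second = int(match.group(3), 16)
    return cpu, decode_esr(first), decode_esr(second)


if __name__ == "__main__":
    for text in ("APIC error on CPU0: 08(02)", "APIC error on CPU1: 02(08)"):
        print(text, "->", decode_line(text))
```

Under those assumptions, the 08 and 02 values seen throughout the log are receive accept errors and receive checksum errors, which fits the explanation that messages on the APIC bus are not arriving reliably.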
I see this is marked closed as not a bug, yet I see a lot of people who have this problem, and after searching the web I have not found a single solution that works for me. I was running 2.2 SMP and it seemed to work fine, but now that I have upgraded to Red Hat 7.1 with 2.4, the machine crashes because of these APIC errors. Where did they come from? How can I run my machine in SMP mode? Do I have to switch back to 2.2? Is there a solution other than "you suck, you bought bad hardware"? Help would be good, if anyone knows an answer to this seemingly widespread problem. (I did try the "noapic" boot parameter, as someone else had suggested. It did not solve the problem: there was still a LOC INT on both CPUs, which I assume is what caused the countless ERRs I see in "cat /proc/interrupts" before the crash. Is there a way to turn off ALL APIC use, or a way to compile around all of this code? Is APIC the only way for SMP to work?) Please post a solution (or possible workarounds, since there are obviously some) before closing this bug. Thanks.
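To watch how quickly the ERR count mentioned above grows, the output of "cat /proc/interrupts" can be parsed with a few lines of Python. This is only a sketch: the function name `parse_interrupt_counters` is invented for the example, and the sample text uses made-up numbers in the layout a 2.4 SMP kernel typically prints (per-CPU columns for IRQs, NMI, and LOC, and a single ERR total).

```python
def parse_interrupt_counters(text):
    """Map each row label of /proc/interrupts to its leading integer counts.

    The header row ("CPU0 CPU1 ...") has no colon and is skipped. For device
    rows, parsing stops at the first non-numeric field (the controller name).
    """
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        counts = []
        for token in rest.split():
            if not token.isdigit():
                break
            counts.append(int(token))
        counters[label.strip()] = counts
    return counters


# Illustrative sample only; these numbers are not taken from the report.
SAMPLE = """\
           CPU0       CPU1
  0:       1234          0    IO-APIC-edge  timer
NMI:          0          0
LOC:       1200       1199
ERR:         12
"""

if __name__ == "__main__":
    counters = parse_interrupt_counters(SAMPLE)
    print("local timer interrupts per CPU:", counters["LOC"])
    print("APIC errors so far:", counters["ERR"][0])
```

On a live system the same function could be fed the contents of /proc/interrupts at intervals, so the ERR total can be compared between boots with different BIOS or boot-parameter settings.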
redhat: I assume you also have an Abit BP6 motherboard. My BP6 board at home is rock solid with the 2.4.2-2 kernel, so it's not that all BP6s are bad. Some BP6 boards seem very sensitive to overheating (and thus overclocking); if cables block the air circulation (which I had at home for a while), higher outside temperatures make the machine unstable. Also, the BIOS has an option for "MPS1.1" versus "MPS1.4"; changing that to the other option helps sometimes. Could you please try the latter?
Yes, I do have one of these boards. I do not think it is a heating problem, since it happens right away. Thank you for the tips on things to try. I tried them, but they all gave the same result: many ERRs. I reinstalled a 2.2 kernel, since that works fine in SMP mode for me. When it boots in SMP 2.2 mode, it displays this message at boot about the APICs (something I do not see in 2.4). Any ideas if this means that this board is not supported currently? Any ideas why 2.2 seems to work fine, but 2.4 generates endless errors? Thank you for your help.

CPU0: Intel Pentium III (Coppermine) stepping 03 calibrating APIC timer ... ..... CPU clock speed is 931.7913 MHz. ..... system bus clock speed is 133.1128 MHz. Booting processor 1 eip 2000 Calibrating delay loop... 1861.22 BogoMIPS OK. CPU1: Intel Pentium III (Coppermine) stepping 03 Total of 2 processors activated (3715.89 BogoMIPS). enabling symmetric IO mode... ...done. ENABLING IO-APIC IRQs init IO_APIC IRQs IO-APIC (apicid-pin) 2-0, 2-5, 2-9, 2-10, 2-11, 2-20, 2-21, 2-22, 2-23 not connected. number of MP IRQ sources: 17. number of IO-APIC #2 registers: 24. testing the IO APIC....................... IO APIC #2...... .... register #00: 02000000 ....... : physical APIC id: 02 .... register #01: 00178011 ....... : max redirection entries: 0017 ....... : IO APIC version: 0011 WARNING: unexpected IO-APIC, please mail to linux-smp.edu .... register #02: 00000000 ....... 
: arbitration: 00
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 000 00  1    0    0   0   0    0    0    00
01 000 00  0    0    0   0   0    1    1    59
02 0FF 0F  0    0    0   0   0    1    1    51
03 000 00  0    0    0   0   0    1    1    61
04 000 00  0    0    0   0   0    1    1    69
05 000 00  1    0    0   0   0    0    0    00
06 000 00  0    0    0   0   0    1    1    71
07 000 00  0    0    0   0   0    1    1    79
08 000 00  0    0    0   0   0    1    1    81
09 000 00  1    0    0   0   0    0    0    00
0a 000 00  1    0    0   0   0    0    0    00
0b 000 00  1    0    0   0   0    0    0    00
0c 000 00  0    0    0   0   0    1    1    89
0d 000 00  1    0    0   0   0    0    0    00
0e 000 00  0    0    0   0   0    1    1    91
0f 000 00  0    0    0   0   0    1    1    99
10 0FF 0F  1    1    0   1   0    1    1    A1
11 0FF 0F  1    1    0   1   0    1    1    A9
12 0FF 0F  1    1    0   1   0    1    1    B1
13 0FF 0F  1    1    0   1   0    1    1    B9
14 000 00  1    0    0   0   0    0    0    00
15 000 00  1    0    0   0   0    0    0    00
16 000 00  1    0    0   0   0    0    0    00
17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings: IRQ0 -> 2 IRQ1 -> 1 IRQ3 -> 3 IRQ4 -> 4 IRQ6 -> 6 IRQ7 -> 7 IRQ8 -> 8 IRQ12 -> 12 IRQ13 -> 13 IRQ14 -> 14 IRQ15 -> 15 IRQ16 -> 16 IRQ17 -> 17 IRQ18 -> 18 IRQ19 -> 19
.................................... done.
I have swapped the motherboard for another one of the same model (BP6), and the problem is the same.