Description of problem:
System had been running fine with 2.6.13-1.1532_FC4smp. Rebooted into 2.6.14-1.1644_FC4smp but system quickly restarts after some ACPI messages. Now cannot boot into older kernels either (back to kernel-smp-2.6.12-1.1447_FC4). Can boot 2.6.14-1.1644_FC4smp okay with acpi=off.

How reproducible:
very

# lspci
00:00.0 Host bridge: Intel Corporation 82875P/E7210 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82875P Processor to AGP Controller (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
02:08.0 Ethernet controller: Intel Corporation 82562EZ 10/100 Ethernet Controller (rev 02)
System also boots okay with hyper-threading turned off in BIOS (and ACPI on).
Can you attach the output of dmesg -s 128000 and dmidecode (as root)? For bonus points, if you have the means to use a serial console, the exact messages it prints before it crashes would be really useful.
Looks like maybe the CPU or motherboard has gone flakey as the kernel says it cannot talk to CPU#0. Here is the serial console output:

Linux version 2.6.14-1.1653_FC4smp (bhcompile.redhat.com) (gcc version 4.0.2 20051125 (Red Hat 4.0.2-8)) #1 SMP Tue Dec 13 21:46:01 EST 2005
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007ff74000 (usable)
 BIOS-e820: 000000007ff74000 - 000000007ff76000 (ACPI NVS)
 BIOS-e820: 000000007ff76000 - 000000007ff97000 (ACPI data)
 BIOS-e820: 000000007ff97000 - 0000000080000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fecf0000 - 00000000fecf1000 (reserved)
 BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
1151MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fe710
Using x86 segment limits to approximate NX protection
DMI 2.3 present.
Using APIC driver default
ACPI: PM-Timer IO Port: 0x808
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 88000000 (gap: 80000000:7ec00000)
Built 1 zonelists
Kernel command line: ro root=/dev/rootvg/root console=ttyS0,115200
Initializing CPU#0
CPU 0 irqstacks, hard=c0447000 soft=c0427000
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 3192.500 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 2071240k/2096592k available (2167k kernel code, 23964k reserved, 810k data, 224k init, 1179088k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6390.94 BogoMIPS (lpj=12781895)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
mtrr: v2.0 (20020519)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 09
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c0448000 soft=c0428000
Not responding.
Inquiring remote APIC #1...
... APIC #1 ID: failed
... APIC #1 VERSION: failed
... APIC #1 SPIV: failed
CPU #1 not responding - cannot use it.
Total of 1 processors activated (6390.94 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
Brought up 1 CPUs
checking if image is initramfs... it is
Freeing initrd memory: 1760k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfbb30, last bus=2
PCI: Using configuration type 1
ACPI: Subsystem revision 20050916

then it reboots. I'll attach dmesg -s 128000 from HT BIOS off boot and dmidecode next.
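For anyone triaging similar captures, the secondary-CPU failure in the console output above can be picked out mechanically. The following is only a rough sketch, not part of any kernel tooling: the function name `summarize_smp_bringup` and the `SAMPLE` excerpt are made up for illustration, and the regular expressions assume the message wording shown in this particular 2.6.14 log, which may differ in other kernel versions.

```python
import re

# Patterns assume the exact wording seen in the serial console log above.
NOT_RESPONDING_RE = re.compile(r"CPU #(\d+) not responding")
ACTIVATED_RE = re.compile(r"Total of (\d+) processors activated")


def summarize_smp_bringup(log_text):
    """Report which CPUs failed to start and how many were activated.

    Hypothetical helper: returns a dict with the CPU numbers named in
    'not responding' messages and the count from the 'Total of N
    processors activated' line (None if that line is absent).
    """
    failed = [int(n) for n in NOT_RESPONDING_RE.findall(log_text)]
    match = ACTIVATED_RE.search(log_text)
    activated = int(match.group(1)) if match else None
    return {"failed_cpus": failed, "activated": activated}


# Excerpt copied from the serial console output in this report.
SAMPLE = """\
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c0448000 soft=c0428000
Not responding.
Inquiring remote APIC #1...
... APIC #1 ID: failed
... APIC #1 VERSION: failed
... APIC #1 SPIV: failed
CPU #1 not responding - cannot use it.
Total of 1 processors activated (6390.94 BogoMIPS).
"""

if __name__ == "__main__":
    print(summarize_smp_bringup(SAMPLE))
```

Run against a full capture (for example a file saved from the serial console), this would flag the second logical processor as the one that never came up.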
Created attachment 122846
dmidecode output
Created attachment 122848
dmesg output
Well, system is not totally messed up. I can boot into Windows with HT on and run Intel's Hyper-Threading test utility and it checks out okay. Device manager shows two processors.
Can you try the 2.6.15-based test kernel at http://people.redhat.com/davej/kernels/Fedora/FC4/ and see if that makes it work again with ACPI?
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
> Processor #0 15:2 APIC version 20
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
> Processor #1 15:2 APIC version 20
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] disabled)
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] disabled)

Hmmm, dunno if the duplicate lapic_id is an issue if the phantom one is disabled.

Does the failing kernel work with "maxcpus=2"?

dmesg from the latest kernel that boots properly in SMP + ACPI mode with no cmdline params may help.
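The duplicate lapic_id noticed in the quoted lines can be checked for automatically in a saved dmesg. This is a minimal sketch under stated assumptions: `find_duplicate_lapic_ids` and `SAMPLE` are hypothetical names, and the pattern assumes the "ACPI: LAPIC (...)" line format printed by this kernel. It only reports duplicates; whether a duplicate on a disabled entry actually matters is the open question in this comment, and the script does not answer it.

```python
import re
from collections import defaultdict

# Matches boot lines of the form seen in this report, e.g.
#   ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] disabled)
LAPIC_RE = re.compile(
    r"ACPI: LAPIC \(acpi_id\[(0x[0-9a-fA-F]+)\] "
    r"lapic_id\[(0x[0-9a-fA-F]+)\] (enabled|disabled)\)"
)


def find_duplicate_lapic_ids(log_text):
    """Return {lapic_id: [(acpi_id, state), ...]} for ids listed more than once."""
    seen = defaultdict(list)
    for acpi_id, lapic_id, state in LAPIC_RE.findall(log_text):
        seen[int(lapic_id, 16)].append((int(acpi_id, 16), state))
    return {lid: entries for lid, entries in seen.items() if len(entries) > 1}


# The four LAPIC entries quoted from the reporter's console log.
SAMPLE = """\
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] disabled)
"""

if __name__ == "__main__":
    for lapic_id, entries in find_duplicate_lapic_ids(SAMPLE).items():
        print(f"lapic_id {lapic_id:#04x} appears {len(entries)} times: {entries}")
```

On the quoted entries this flags lapic_id 0x01, which is listed once as enabled (acpi_id 0x02) and once as disabled (acpi_id 0x03).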
Created attachment 123400
dmesg output

Latest test kernel does not boot. maxcpus=2 has no effect. Note that I can no longer boot to older kernels, but here is the dmesg output from an older kernel when it could boot.
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem.

This bug has been placed in NEEDINFO_REPORTER state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug.

If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613.

Thank you.
System boots fine with HT on with 2.6.15-1.1830_FC4