Description of problem:
Hardware: DELL PowerEdge 2650 with PERC 3/Di 2.7-1 3170, 1 GB RAM, 1 RAID-5 Array on 3/Di, latest BIOS A17, 1 P4 XEON

Version-Release number of selected component (if applicable):
kernel-smp-2.4.22-1.2149.nptl

How reproducible:
Always

Steps to Reproduce:
1. Install plain FC1
2. Update Kernel to kernel-smp-2.4.22-1.2149.nptl
3. Reboot

Actual results:
System hangs while adding swapspace.

Expected results:
System should boot.

Additional info:
2.4.22-1.2149.nptl works.
2.4.22-1.2115.nptlsmp works.
2.4.22-1.2115.nptl works.
Sorry, the correct steps to reproduce are:

Steps to Reproduce:
1. Install plain FC1
2. Update all non-kernel RPMs as of 20040115
3. Update Kernel to kernel-smp-2.4.22-1.2149.nptl
4. Reboot
Can you describe exactly what happens when you boot the 2149 SMP kernel? Any interesting boot messages? What happens with the -2154 kernel in -testing?
kernel-smp-2.4.22-1.2154.nptl.i686.rpm in updates/testing/1/i386/ also hangs during bootup! kernel-2.4.22-1.2154.nptl.i686.rpm boots fine!

Note that the SMP kernel only hangs after a *cold boot*. After booting the non-SMP version of the same kernel and then doing a warm start, the SMP kernels (both 2149 and 2154) boot fine. The system hangs very late in the boot, when the console shows "Adding swap space" (2149) or "Bringing up Interface eth0" (2154).

Jan 20 13:32:45 fedoratest kernel: Linux version 2.4.22-1.2154.nptlsmp (bhcompile.redhat.com) (gcc version 3.2.3 20030422 (Red Hat Linux 3.2.3-6)) #1 SMP Tue Jan 13 14:21:29 EST 2004
Jan 20 13:32:45 fedoratest kernel: BIOS-provided physical RAM map:
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 0000000000100000 - 000000003ffe0000 (usable)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 000000003ffe0000 - 000000003ffefc00 (ACPI data)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 000000003ffefc00 - 000000003ffff000 (reserved)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
Jan 20 13:32:45 fedoratest kernel: BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
Jan 20 13:32:45 fedoratest kernel: 127MB HIGHMEM available.
Jan 20 13:32:45 fedoratest kernel: 896MB LOWMEM available.
Jan 20 13:32:45 fedoratest syslog: klogd startup succeeded
Jan 20 13:32:46 fedoratest irqbalance: irqbalance startup succeeded
Jan 20 13:32:46 fedoratest kernel: found SMP MP-table at 000fe710
Jan 20 13:32:46 fedoratest random: Initializing random number generator: succeeded
Jan 20 13:32:46 fedoratest kernel: hm, page 000fe000 reserved twice.
Jan 20 13:32:46 fedoratest rc: Starting pcmcia: succeeded
Jan 20 13:32:46 fedoratest kernel: hm, page 000ff000 reserved twice.
Jan 20 13:32:46 fedoratest kernel: hm, page 000f0000 reserved twice.
Jan 20 13:32:46 fedoratest kernel: On node 0 totalpages: 262112
Jan 20 13:32:46 fedoratest kernel: zone(0): 4096 pages.
Jan 20 13:32:47 fedoratest kernel: zone(1): 225280 pages.
Jan 20 13:32:47 fedoratest netfs: Mounting other filesystems: succeeded
Jan 20 13:32:47 fedoratest kernel: zone(2): 32736 pages.
Jan 20 13:32:47 fedoratest kernel: ACPI: RSDP (v000 DELL ) @ 0x000fdc40
Jan 20 13:32:47 fedoratest autofs: automount startup succeeded
Jan 20 13:32:47 fedoratest kernel: ACPI: RSDT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdc54
Jan 20 13:32:47 fedoratest smartd[553]: smartd version 5.21 Copyright (C) 2002-3 Bruce Allen
Jan 20 13:32:48 fedoratest kernel: ACPI: FADT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdc84
Jan 20 13:32:48 fedoratest smartd[553]: Home page is http://smartmontools.sourceforge.net/
Jan 20 13:32:48 fedoratest kernel: ACPI: MADT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdcf8
Jan 20 13:32:48 fedoratest smartd[553]: Opened configuration file /etc/smartd.conf
Jan 20 13:32:48 fedoratest kernel: ACPI: SPCR (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdd80
Jan 20 13:32:48 fedoratest smartd[553]: Configuration file /etc/smartd.conf parsed.
Jan 20 13:32:48 fedoratest kernel: ACPI: DSDT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x00000000
Jan 20 13:32:48 fedoratest smartd[553]: Device: /dev/hda, opened
Jan 20 13:32:48 fedoratest kernel: ACPI: Local APIC address 0xfee00000
Jan 20 13:32:48 fedoratest smartd[553]: Device: /dev/hda, unable to read Device Identity Structure
Jan 20 13:32:48 fedoratest kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id [0x00] enabled)
Jan 20 13:32:48 fedoratest smartd[553]: Unable to register ATA device /dev/hda at line 30 of file /etc/smartd.conf
Jan 20 13:32:48 fedoratest kernel: Processor #0 Pentium 4(tm) XEON (tm) APIC version 20
Jan 20 13:32:49 fedoratest smartd[553]: Unable to register device /dev/hda (no Directive -d removable). Exiting.
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id [0x06] disabled)
Jan 20 13:32:49 fedoratest smartd: smartd startup failed
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id [0x01] enabled)
Jan 20 13:32:49 fedoratest kernel: Processor #1 Pentium 4(tm) XEON (tm) APIC version 20
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id [0x07] disabled)
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC_NMI (acpi_id[0x01] polarity[0x1] trigger[0x1] lint[0x1])
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC_NMI (acpi_id[0x02] polarity[0x1] trigger[0x1] lint[0x1])
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC_NMI (acpi_id[0x03] polarity[0x1] trigger[0x1] lint[0x1])
Jan 20 13:32:49 fedoratest kernel: ACPI: LAPIC_NMI (acpi_id[0x04] polarity[0x1] trigger[0x1] lint[0x1])
Jan 20 13:32:49 fedoratest kernel: Using ACPI for processor (LAPIC) configuration information
Jan 20 13:32:49 fedoratest sshd: succeeded
Jan 20 13:32:49 fedoratest kernel: Intel MultiProcessor Specification v1.4
Jan 20 13:32:50 fedoratest kernel: Virtual Wire compatibility mode.
Jan 20 13:32:50 fedoratest kernel: OEM ID: DELL Product ID: PE 0121 APIC at: 0xFEE00000
Jan 20 13:32:50 fedoratest kernel: I/O APIC #8 Version 17 at 0xFEC00000.
Jan 20 13:32:50 fedoratest crond: crond startup succeeded
Jan 20 13:32:50 fedoratest kernel: I/O APIC #9 Version 17 at 0xFEC01000.
Jan 20 13:32:50 fedoratest anacron: anacron startup succeeded
Jan 20 13:32:50 fedoratest kernel: I/O APIC #10 Version 17 at 0xFEC02000.
Jan 20 13:32:51 fedoratest kernel: Processors: 2
Jan 20 13:32:51 fedoratest kernel: xAPIC support is present
Jan 20 13:32:51 fedoratest kernel: Enabling APIC mode: Flat. Using 3 I/O APICs
Jan 20 13:32:51 fedoratest kernel: Kernel command line: ro root=LABEL=/
Jan 20 13:32:51 fedoratest kernel: Initializing CPU#0
Jan 20 13:32:51 fedoratest kernel: Detected 1993.643 MHz processor.
Jan 20 13:32:51 fedoratest kernel: Console: colour VGA+ 80x25
Jan 20 13:32:52 fedoratest kernel: Calibrating delay loop... 3971.48 BogoMIPS
Jan 20 13:32:52 fedoratest kernel: Memory: 1031576k/1048448k available (1613k kernel code, 16488k reserved, 1194k data, 164k init, 130944k highmem)
Jan 20 13:32:52 fedoratest kernel: Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Jan 20 13:32:52 fedoratest kernel: Inode cache hash table entries: 65536 (order: 7, 524288 bytes)
Jan 20 13:32:52 fedoratest kernel: Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Jan 20 13:32:52 fedoratest kernel: Buffer cache hash table entries: 65536 (order: 6, 262144 bytes)
Jan 20 13:32:52 fedoratest kernel: Page-cache hash table entries: 262144 (order: 8, 1048576 bytes)
Jan 20 13:32:52 fedoratest kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Jan 20 13:32:52 fedoratest kernel: CPU: L2 cache: 512K
Jan 20 13:32:52 fedoratest kernel: CPU: Physical Processor ID: 0
Jan 20 13:32:52 fedoratest kernel: Intel machine check architecture supported.
Jan 20 13:32:52 fedoratest kernel: Intel machine check reporting enabled on CPU#0.
Jan 20 13:32:52 fedoratest kernel: Enabling fast FPU save and restore... done.
Jan 20 13:32:52 fedoratest kernel: Enabling unmasked SIMD FPU exception support... done.
Jan 20 13:32:53 fedoratest kernel: Checking 'hlt' instruction... OK.
Jan 20 13:32:53 fedoratest kernel: POSIX conformance testing by UNIFIX
Jan 20 13:32:53 fedoratest kernel: mtrr: v1.40 (20010327) Richard Gooch (rgooch.au)
Jan 20 13:32:53 fedoratest kernel: mtrr: detected mtrr type: Intel
Jan 20 13:32:53 fedoratest kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Jan 20 13:32:53 fedoratest kernel: CPU: L2 cache: 512K
Jan 20 13:32:53 fedoratest kernel: CPU: Physical Processor ID: 0
Jan 20 13:32:53 fedoratest kernel: Intel machine check reporting enabled on CPU#0.
Jan 20 13:32:53 fedoratest kernel: CPU0: Intel(R) Xeon(TM) CPU 2.00GHz stepping 07
Jan 20 13:32:53 fedoratest kernel: per-CPU timeslice cutoff: 1462.63 usecs.
Jan 20 13:32:53 fedoratest kernel: task migration cache decay timeout: 10 msecs.
Jan 20 13:32:54 fedoratest kernel: enabled ExtINT on CPU#0
Jan 20 13:32:54 fedoratest kernel: ESR value before enabling vector: 00000040
Jan 20 13:32:54 fedoratest kernel: ESR value after enabling vector: 00000000
Jan 20 13:32:54 fedoratest kernel: Booting processor 1/1 eip 3000
Jan 20 13:32:54 fedoratest kernel: Initializing CPU#1
Jan 20 13:32:54 fedoratest kernel: masked ExtINT on CPU#1
Jan 20 13:32:54 fedoratest kernel: ESR value before enabling vector: 00000000
Jan 20 13:32:54 fedoratest kernel: ESR value after enabling vector: 00000000
Jan 20 13:32:54 fedoratest kernel: Calibrating delay loop... 3984.58 BogoMIPS
Jan 20 13:32:54 fedoratest kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Jan 20 13:32:54 fedoratest kernel: CPU: L2 cache: 512K
Jan 20 13:32:54 fedoratest kernel: CPU: Physical Processor ID: 0
Jan 20 13:32:54 fedoratest kernel: Intel machine check reporting enabled on CPU#1.
Jan 20 13:32:54 fedoratest kernel: CPU1: Intel(R) Xeon(TM) CPU 2.00GHz stepping 07
Jan 20 13:32:54 fedoratest kernel: Total of 2 processors activated (7956.07 BogoMIPS).
Jan 20 13:32:55 fedoratest kernel: ENABLING IO-APIC IRQs
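The log above interleaves kernel messages with output from userspace services (smartd, crond, sshd and so on), which makes the kernel boot sequence harder to follow. A minimal Python sketch for pulling out only the kernel entries might look like the following. The regular expression and the function name `kernel_messages` are assumptions made for illustration; they are not part of any tool mentioned in this report.

```python
import re

# Assumed layout: "Mon DD HH:MM:SS host tag[pid]: message" (classic syslog format).
SYSLOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<tag>[^:\[\s]+)(?:\[\d+\])?:\s?(?P<msg>.*)$"
)

def kernel_messages(lines):
    """Return the message text of every entry whose tag is 'kernel'."""
    out = []
    for line in lines:
        m = SYSLOG_LINE.match(line.strip())
        if m and m.group("tag") == "kernel":
            out.append(m.group("msg"))
    return out

# A few entries copied from the log in this comment.
sample = [
    "Jan 20 13:32:46 fedoratest kernel: found SMP MP-table at 000fe710",
    "Jan 20 13:32:46 fedoratest random: Initializing random number generator: succeeded",
    "Jan 20 13:32:48 fedoratest smartd[553]: Device: /dev/hda, opened",
    "Jan 20 13:32:55 fedoratest kernel: ENABLING IO-APIC IRQs",
]

for msg in kernel_messages(sample):
    print(msg)
```

Run against the full log, this would leave just the kernel boot sequence, ending at the last kernel line captured before the hang.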
If you boot with nmi_watchdog=1, we may get a backtrace shortly after it hangs.
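To confirm that the option actually reached the kernel, the running command line can be inspected. The sketch below is only an illustration: the helper names `kernel_param` and `read_cmdline` are hypothetical, and the first example string is the command line from the log above with nmi_watchdog=1 appended as an assumption about how it would look.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` on a kernel command line.

    Gives '' for a bare flag without '=', and None if the parameter is absent.
    """
    for token in cmdline.split():
        key, _sep, value = token.partition("=")
        if key == name:
            return value
    return None

def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()

# Command line from the boot log, with the watchdog option appended (assumed).
print(kernel_param("ro root=LABEL=/ nmi_watchdog=1", "nmi_watchdog"))
# Command line exactly as it appears in the boot log.
print(kernel_param("ro root=LABEL=/", "nmi_watchdog"))
```

On the affected machine, `kernel_param(read_cmdline(), "nmi_watchdog")` would show whether the boot loader passed the option through.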
With nmi_watchdog=1, both SMP kernels (2149 and 2154) now boot up perfectly, even after a cold boot :-O

[root@fedoratest root]# uname -a
Linux fedoratest 2.4.22-1.2154.nptlsmp #1 SMP Tue Jan 13 14:21:29 EST 2004 i686 i686 i386 GNU/Linux
[root@fedoratest root]# uptime
 16:22:13 up 10 min, 1 user, load average: 0.00, 0.00, 0.00
[root@fedoratest root]# cat /proc/interrupts
           CPU0       CPU1
  0:      34185      32032    IO-APIC-edge  timer
  1:         47         99    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  5:          0          0   IO-APIC-level  usb-ohci
  8:          1          0    IO-APIC-edge  rtc
 14:         22          0    IO-APIC-edge  ide0
 28:       7503          0   IO-APIC-level  eth0
 30:       5233        608   IO-APIC-level  aacraid
NMI:      66164      66164
LOC:      66119      66142
ERR:          0
MIS:          0
Annoying. Leave it like that for a few days; if we do get a hang, we'll get a backtrace (provided you aren't in X at the time).
Update: It seems that the SMP kernels do not boot iff the RAID array is in the "Scrubbing" state at boot time. When the kernel hangs, even with nmi_watchdog=1, I do not get a backtrace! It also seems that the nmi_watchdog=1 option does not work at all on Fedora:

[root@proxy root]# cat /proc/interrupts
           CPU0
  0:   41674728          XT-PIC  timer
  1:        527          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  usb-ohci
  7:    3520670          XT-PIC  aacraid
  8:        117          XT-PIC  rtc
 10:  139328905          XT-PIC  eth1
 11:  116986841          XT-PIC  eth0
 14:         20          XT-PIC  ide0
NMI:          0
ERR:          0

Shouldn't the NMI counter be > 0 if nmi_watchdog is active? (With RH8 and RH9, nmi_watchdog works on the same hardware.)
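The check described here (is the NMI counter greater than zero?) can be scripted. The following Python sketch parses the NMI row of /proc/interrupts-style text; the function name `nmi_counts` is hypothetical, and the two snapshots are abbreviated from the outputs quoted in this report. Note that the output from "proxy" shows a single CPU column and only XT-PIC entries, unlike the earlier SMP output from "fedoratest"; whether that difference explains the zero count is not established in this thread.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            # Keep only numeric fields; some kernels append a description.
            return [int(f) for f in fields[1:] if f.isdigit()]
    return []

# Abbreviated from the SMP boot on "fedoratest" quoted earlier.
smp_snapshot = "NMI: 66164 66164\nLOC: 66119 66142\nERR: 0\nMIS: 0"
# Abbreviated from the output on "proxy" in this comment.
up_snapshot = "NMI: 0\nERR: 0"

for name, text in (("fedoratest", smp_snapshot), ("proxy", up_snapshot)):
    counts = nmi_counts(text)
    active = any(n > 0 for n in counts)
    print(name, counts, "watchdog ticking" if active else "no NMIs seen")
```

A single snapshot only shows whether any NMIs have arrived since boot; comparing two snapshots taken a few seconds apart would be a stronger check that the watchdog is still firing.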
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/