Bug 50185 - Unexpected reboots at boot time with SMP and multiple CPUs installed
Summary: Unexpected reboots at boot time with SMP and multiple CPUs installed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-07-27 19:19 UTC by Chris Page
Modified: 2007-04-18 16:35 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-07-27 19:32:10 UTC
Embargoed:



Description Chris Page 2001-07-27 19:19:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.51 [en] (WinNT; U)

Description of problem:
See Additional info for all experiments.  Installation of the SMP kernel failed until I specified "noapic".  Now that it is installed, the system reboots on most occasions during the boot sequence.  The SMP kernel with a single CPU works fine.

How reproducible:
Sometimes

Steps to Reproduce:
1. Install 2 CPUs
2. Configure for SMP kernel
3. Boot
	

Actual Results:  On most occasions, the system unexpectedly reboots during the boot process.  When it does boot, it often reboots unexpectedly later, during normal operation.  Nothing telling in /var/log/messages.

Expected Results:  Successful boot, services start, login prompt.

Additional info:


Machine Configuration:

Compaq ProLiant DL360
CPU: 2 PIII 933
MEM: 1G RAM
BIOS: P21 (latest)
CARDS: Remote Insight Board
DISKS: 2 18.2G 15K ULTRA3 SCSI in RAID0 Configuration
RAID: Integrated Smart Array Controller
LINUX: RedHat 7.1
KERNEL: 2.4.2-2
KERNEL BOOT OPTIONS: noapic (see Problem #1 below)


The problems:

Problem #1) Error during the install process when the enterprise version of the kernel is selected during the Post-Install stage.

Resolution #1): Resolved by starting the install process via

		boot: linux noapic


Problem #2) Machine intermittently reboots during the boot process when I use two CPUs.  When the machine reboots, the last lines on the screen look like this:

ServerWorks....
	ide0...
	ide1...
hdc Compaq ATAPI...
<reboot occurs>


I've tried the following configurations, with the results indicated for each boot attempt (kernel option 'noapic' is used in all configs):

CPUs: 2
KERNEL: SMP
MEM: 128M
(1) succeeds to login prompt
(2) reboots after printing "hdc" line
(3) succeeds to login prompt

CPUs: 2
KERNEL: Enterprise
MEM: 128M
(1) reboots after printing "hdc" line
(2) reboots after printing "hdc" line
(3) reboots after printing "hdc" line
(4) succeeds to login prompt

CPUs: 2
KERNEL: SMP
MEM: 1G
(1) reboots after printing "hdc" line
(2) reboots after printing "hdc" line
(3) succeeds to login prompt

CPUs: 2
KERNEL: Enterprise
MEM: 1G
(1) reboots after printing "hdc" line
(2) reboots after printing "hdc" line
(3) succeeds to login prompt

CPUs: 1
KERNEL: SMP
MEM: 1G
(1) succeeds to login prompt
(2) succeeds to login prompt
(3) succeeds to login prompt

CPUs: 1
KERNEL: Enterprise
MEM: 1G
(1) succeeds to login prompt
(2) succeeds to login prompt
(3) succeeds to login prompt
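
For reference, the boot attempts listed above can be tallied per CPU count. This is only a minimal sketch in Python: the `trials` table is transcribed by hand from the results above, and the function name `reboot_counts_by_cpus` is made up for illustration; none of it comes from any tool used in this report.

```python
# Boot attempts transcribed from the configurations above.
# True  = succeeded to login prompt
# False = rebooted after printing the "hdc" line
# Keys are (CPUs, KERNEL, MEM); the layout is an assumption for illustration.
trials = {
    (2, "SMP", "128M"): [True, False, True],
    (2, "Enterprise", "128M"): [False, False, False, True],
    (2, "SMP", "1G"): [False, False, True],
    (2, "Enterprise", "1G"): [False, False, True],
    (1, "SMP", "1G"): [True, True, True],
    (1, "Enterprise", "1G"): [True, True, True],
}


def reboot_counts_by_cpus(trials):
    """Return {cpu_count: (reboots, total_attempts)} summed over all configs."""
    totals = {}
    for (cpus, _kernel, _mem), outcomes in trials.items():
        reboots, total = totals.get(cpus, (0, 0))
        totals[cpus] = (reboots + outcomes.count(False), total + len(outcomes))
    return totals


if __name__ == "__main__":
    for cpus, (reboots, total) in sorted(reboot_counts_by_cpus(trials).items()):
        print(f"{cpus} CPU(s): {reboots}/{total} boot attempts ended in a reboot")
```

With the numbers above this gives 8 reboots in 13 attempts with two CPUs and 0 in 6 attempts with one CPU, independent of kernel flavor and memory size.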


Using a base config of:
CPUs: 2
KERNEL: SMP
MEM: 1G
OPTIONS: noapic

Experiments:
(*) Remove Remote Insight PCI board
	-Reboot problem remains

(*) Add "ide=nodma" boot option
	-Reboot problem remains

(*) Remove "noapic" option and use "ide=nodma" only
	-Reboot problem remains

End of Experiments.

Comment 1 Chris Page 2001-07-27 19:24:25 UTC
More info:  I've got a stack of DL360s and the problem reproduces on each (Ok - at least two, I haven't seen a need to try it on others).

Comment 2 Chris Page 2001-07-27 19:32:06 UTC
Info from /var/log/messages:

Jul 27 11:58:28 acre000 kernel: block: queued sectors max/low 682474kB/551402kB, 2048 slots per queue
Jul 27 11:58:28 acre000 kernel: RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Jul 27 11:58:28 acre000 kernel: Uniform Multi-Platform E-IDE driver Revision: 6.31
Jul 27 11:58:28 acre000 kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Jul 27 11:58:28 acre000 kernel: ServerWorks OSB4: IDE controller on PCI bus 00 dev 79
Jul 27 11:58:28 acre000 kernel: ServerWorks OSB4: chipset revision 0
Jul 27 11:58:28 acre000 kernel: ServerWorks OSB4: not 100%% native mode: will probe irqs later
Jul 27 11:58:28 acre000 kernel:     ide0: BM-DMA at 0x2800-0x2807, BIOS settings: hda:pio, hdb:pio
Jul 27 11:58:28 acre000 kernel:     ide1: BM-DMA at 0x2808-0x280f, BIOS settings: hdc:pio, hdd:pio
Jul 27 11:58:28 acre000 kernel: hdc: Compaq CRN-8241B, ATAPI CD/DVD-ROM drive
<reboot most often occurs here>
Jul 27 11:58:28 acre000 kernel: ide1 at 0x170-0x177,0x376 on irq 15
Jul 27 11:58:28 acre000 kernel: Floppy drive(s): fd0 is 1.44M
Jul 27 11:58:28 acre000 kernel: FDC 0 is a National Semiconductor PC87306


Comment 3 Chris Page 2001-08-02 14:35:28 UTC
Problem resolved by using a new heat sink on second CPU.

