Bug 31038 - kernel does a 'machine check' on bootup, hangs
Status: CLOSED WORKSFORME
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: alpha Linux
Priority: medium, Severity: high
Assigned To: Phil Copeland
QA Contact: Brock Organ
Reported: 2001-03-07 19:44 EST by Elliot Lee
Modified: 2007-04-18 12:32 EDT
Doc Type: Bug Fix
Last Closed: 2001-04-07 20:12:45 EDT

Attachments
Some sort of boot log (5.59 KB, text/plain)
2001-03-12 18:54 EST, Elliot Lee

Description Elliot Lee 2001-03-07 19:44:20 EST
During kernel startup, a machine check occurs and the system hangs.

This happens with 2.4 kernels only; the kernel in 7.0 works.

This is a Digital Ultimate Workstation box; ostrich-deluxe.labs.redhat.com
Comment 1 Arjan van de Ven 2001-03-08 04:38:04 EST
A Machine Check Exception (I assume that is what you mean) is the way for the
CPU to indicate that it is toast.

Sounds like broken hardware to me.

If this is not what you mean, please clarify what you mean.
Comment 2 Elliot Lee 2001-03-11 15:25:59 EST
I do not know whether this is what you call a machine check exception. I do know
that a dump of a bunch of register-type things happens. Here's the last section
from the output:

IOD 1 Register Subpacket - Bridge Base Address fbe0000000
WHOAMI = 2fa
PCI_REV = 6002032
CAP_CTRL = 6480ff1
HAE_MEM = 0
HAE_IO = 0
INT_CTL = 3
INT_REG = 0
INT_MASK0 = fe0000
INT_MASK1 = 0
MC_ERR0 = e0000000
MC_ERR1 = e88ff
CAP_ERR = 0
PCI_ERR1 = 0
MDPA_STAT = 0
MDPA_SYN = 0
MDPB_STAT = 0
MDPB_SYN = 0
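
For anyone comparing dumps like the one above across boots, here is a minimal Python sketch that parses the "NAME = hexvalue" lines and lists the non-zero error/status registers. The function names are hypothetical, and treating any register whose name contains ERR, STAT or SYN as an error/status register is an assumption based on the names alone, not on the chipset documentation.

```python
import re

# A few lines copied from the dump above; the full dump parses the same way.
DUMP = """\
IOD 1 Register Subpacket - Bridge Base Address fbe0000000
WHOAMI = 2fa
PCI_REV = 6002032
MC_ERR0 = e0000000
MC_ERR1 = e88ff
CAP_ERR = 0
PCI_ERR1 = 0
MDPA_STAT = 0
"""

REG_LINE = re.compile(r"\s*([A-Z0-9_]+)\s*=\s*([0-9a-fA-F]+)\s*$")


def parse_registers(text):
    """Return {name: int} for every 'NAME = hexvalue' line in text."""
    regs = {}
    for line in text.splitlines():
        m = REG_LINE.match(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def nonzero_error_registers(regs):
    """Keep non-zero registers whose name looks like an error/status register.

    The ERR/STAT/SYN name match is a heuristic (assumption), nothing more.
    """
    tags = ("ERR", "STAT", "SYN")
    return {n: v for n, v in regs.items() if v and any(t in n for t in tags)}


if __name__ == "__main__":
    for name, value in nonzero_error_registers(parse_registers(DUMP)).items():
        print(f"{name} = {value:#x}")
```

On this dump only MC_ERR0 and MC_ERR1 are non-zero among the registers matched by that heuristic; what their bits mean still has to come from the hardware documentation.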
Comment 3 Jason Duerstock 2001-03-11 15:43:22 EST
I have an Alphastation 4/233 (Avanti) running ARC firmware and MILO
2.0.35.  I only get a screen-wide dotted line, a pause, and then
the screen flickers and goes back to MILO.
Comment 4 Elliot Lee 2001-03-12 13:57:44 EST
When I boot up using serial console (to try to get the full messages), I don't
get the machine check spewage. Instead, the last two lines on the screen are:

SMP: Total of 2 processors activated (1855.17 BogoMIPS)
  got res[8000:801f] for resource 0 of Creative Labs SB Live! EMU10000

The last line doesn't get logged to the serial console. This happens with
2.4.2-0.1.22smp.

With bryce's generic.img on serial console, which I think is 2.4.2-0.1.25smp,
the error is different:
<2>MCPCIA machine check: vector=0x670 pc=0xfffffc00008203f4 code=0x980001

and after repeating that for a while,
<1>Unable to handle kernel paging request at virtual address fffff8f980b15420

(this doesn't show up on serial console either, only the main screen).
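
Since the same machine check line repeats many times, a small Python sketch like the following could be used to summarize a captured log by (vector, pc, code). The regular expression is written against the one message format quoted above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the message format quoted in this comment; other kernels may differ.
MCHECK_RE = re.compile(
    r"MCPCIA machine check: vector=(0x[0-9a-f]+) pc=(0x[0-9a-f]+) code=(0x[0-9a-f]+)"
)


def summarize_mchecks(log_text):
    """Count machine checks per (vector, pc, code) triple."""
    counts = Counter()
    for m in MCHECK_RE.finditer(log_text):
        vector, pc, code = (int(g, 16) for g in m.groups())
        counts[(vector, pc, code)] += 1
    return counts


sample = (
    "<2>MCPCIA machine check: vector=0x670 pc=0xfffffc00008203f4 code=0x980001\n"
    "<2>MCPCIA machine check: vector=0x670 pc=0xfffffc00008203f4 code=0x980001\n"
    "<1>Unable to handle kernel paging request at virtual address fffff8f980b15420\n"
)

if __name__ == "__main__":
    for (vector, pc, code), n in summarize_mchecks(sample).items():
        print(f"{n}x vector={vector:#x} pc={pc:#x} code={code:#x}")
```

If every entry shares one pc, that points at a single faulting access rather than scattered failures.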
Comment 5 Elliot Lee 2001-03-12 18:54:56 EST
Created attachment 12502
Some sort of boot log
Comment 6 Elliot Lee 2001-03-13 16:27:38 EST
2.4.2-ac20 with a very minimal configuration shows the same behaviour.
Comment 7 Elliot Lee 2001-03-13 17:43:04 EST
(minimal config includes no SMP, so it does seem to happen with or without SMP)

When I use addr2line to find the code that is generating the mcpcia check, it
shows up as line 231 of include/asm-alpha/core_mcpcia.h - the next-to-last line
of the mcpcia_inb routine.
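
For reference, the addr2line lookup can be wrapped roughly as in the Python sketch below. The "vmlinux" path is a placeholder for an uncompressed kernel image with debug info that matches the booted kernel, and the pc value is simply the one reported in comment 4, used here as an example. The helper names are hypothetical; -f and -e are standard binutils addr2line options, and an Alpha-capable binutils build is assumed.

```python
import subprocess


def addr2line_command(vmlinux, pc):
    """Build the addr2line argv for one program-counter value."""
    # -f also prints the enclosing function name; -e names the image to read.
    return ["addr2line", "-f", "-e", vmlinux, hex(pc)]


def resolve_pc(vmlinux, pc):
    """Run addr2line and return its output (function name, then file:line)."""
    result = subprocess.run(
        addr2line_command(vmlinux, pc),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the command; call resolve_pc() on a machine that has the image.
    print(" ".join(addr2line_command("vmlinux", 0xFFFFFC00008203F4)))
```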
Comment 8 Elliot Lee 2001-03-14 14:49:58 EST
It happens in the initialize_kbd function (drivers/char/pc_keyb.c), which does
the I/O that would call mcpcia_inb where the mcheck actually happens...

about to do initialize_kbd
MCPCIA machine check: vector=0x670 pc=0xfffffc00008d74a4 code=0x980001
machine check type: unknown

It might be useful to find out what the machine check code means... Anyone know
where to get that info?
Comment 9 Elliot Lee 2001-04-29 13:50:29 EDT
This appears to be fixed in the latest kernel (the one in wolverine-alpha2).
Other RAID problems happen, but that is probably not an Alpha bug.
Comment 10 jim halloran 2003-12-28 20:24:21 EST
I have tried 3 installs of RH-9 and each install succeeds, but on
bootup it hangs at a line that says:  INIT: version 2.84 booting.

That is as far as it gets.  I'm not a programmer, but I would appreciate
any idea of what might be wrong.

I have a Gateway computer w/Athlon 700 MHz, 128 MB RAM and a Western
Digital 75GB HD.  RH9 is the only software on the HD.

jimhllrn@yahoo.com
