Bug 31038
Summary: kernel does a 'machine check' on bootup, hangs
Product: [Retired] Red Hat Linux
Component: kernel
Status: CLOSED WORKSFORME
Severity: high
Priority: medium
Version: 7.3
Hardware: alpha
OS: Linux
Reporter: Elliot Lee <sopwith>
Assignee: Phil Copeland <copeland>
QA Contact: Brock Organ <borgan>
CC: jimhllrn
Doc Type: Bug Fix
Last Closed: 2001-04-08 00:12:45 UTC
Description
Elliot Lee
2001-03-08 00:44:20 UTC
A Machine Check Exception (I assume that is what you mean) is the way for the CPU to indicate that it is toast. Sounds like broken hardware to me... If this is not what you mean, please clarify what you mean.

I do not know whether this is what you call a machine check exception. I do know that a dump of a bunch of register-type things happens. Here's the last section from the output:

    IOD 1 Register Subpacket - Bridge Base Address fbe0000000
    WHOAMI = 2fa
    PCI_REV = 6002032
    CAP_CTRL = 6480ff1
    HAE_MEM = 0
    HAE_IO = 0
    INT_CTL = 3
    INT_REG = 0
    INT_MASK0 = fe0000
    INT_MASK1 = 0
    MC_ERR0 = e0000000
    MC_ERR1 = e88ff
    CAP_ERR = 0
    PCI_ERR1 = 0
    MDPA_STAT = 0
    MDPA_SYN = 0
    MDPB_STAT = 0
    MDPB_SYN = 0

I have an Alphastation 4/233 (Avanti) running ARC firmware and MILO 2.0.35. I only get a screen-wide dotted line, a pause, and then the screen flickers and goes back to MILO.

When I boot up using serial console (to try to get the full messages), I don't get the machine check spewage. Instead, the last two lines on the screen are:

    SMP: Total of 2 processors activated (1855.17 BogoMIPS)
    got res[8000:801f] for resource 0 of Creative Labs SB Live! EMU10000

The last line doesn't get logged to the serial console. This happens with 2.4.2-0.1.22smp.

With bryce's generic.img on serial console, which I think is 2.4.2-0.1.25smp, the error is different:

    <2>MCPCIA machine check: vector=0x670 pc=0xfffffc00008203f4 code=0x980001

and after repeating that for a while,

    <1>Unable to handle kernel paging request at virtual address fffff8f980b15420

(this doesn't show up on serial console either, only the main screen).

Created attachment 12502
Some sort of boot log
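
When a capture repeats the machine check line many times, it can help to pull the fields out mechanically so the pc values can be looked up afterwards. The following is only a minimal sketch: the pattern is inferred from the "MCPCIA machine check" lines quoted in this report, and the names `MCHECK_RE`, `parse_mcheck` and `unique_pcs` are hypothetical helpers, not part of any kernel tooling.

```python
import re

# Pattern inferred from the "MCPCIA machine check" lines quoted in this
# report; other kernel versions may print a different format (assumption).
MCHECK_RE = re.compile(
    r"MCPCIA machine check:\s+"
    r"vector=(0x[0-9a-fA-F]+)\s+"
    r"pc=(0x[0-9a-fA-F]+)\s+"
    r"code=(0x[0-9a-fA-F]+)"
)


def parse_mcheck(line):
    """Return (vector, pc, code) as ints, or None if the line does not match."""
    match = MCHECK_RE.search(line)
    if match is None:
        return None
    return tuple(int(group, 16) for group in match.groups())


def unique_pcs(lines):
    """Collect the distinct pc values from a log, in first-seen order."""
    seen = []
    for line in lines:
        parsed = parse_mcheck(line)
        if parsed is not None and parsed[1] not in seen:
            seen.append(parsed[1])
    return seen
```

Feeding the lines of a saved console log to `unique_pcs` gives the short list of faulting addresses worth resolving against the kernel image.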
2.4.2-ac20 with a very minimal configuration shows the same behaviour. (The minimal config includes no SMP, so it does seem to happen with or without SMP.)

When I use addr2line to find the code that is generating the MCPCIA machine check, it shows up as line 231 of include/asm-alpha/core_mcpcia.h, the next-to-last line of the mcpcia_inb routine. It happens in the initialize_kbd function (drivers/char/pc_keyb.c), which does the I/O that calls mcpcia_inb, where the mcheck actually happens:

    about to do initialize_kbd
    MCPCIA machine check: vector=0x670 pc=0xfffffc00008d74a4 code=0x980001
    machine check type: unknown

It might be useful to find out what the machine check code means... Anyone know where to get that info?

This appears to be fixed in the latest kernel (the one in wolverine-alpha2). Other RAID problems happen, but that is probably not an Alpha bug.

I have tried 3 installs of RH-9 and each install is successful, but on bootup it hangs at a line that says:

    INIT: version 2.84 booting

That is as far as it gets. I'm not a programmer, but if you have an idea of what might be wrong, please let me know. I have a Gateway computer with an Athlon 700 MHz, 128 MB RAM and a Western Digital 75 GB HD. RH9 is the only software on the HD.

jimhllrn
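
For anyone repeating the addr2line lookup described earlier in this report, the sketch below shows one way to drive it from a script. It assumes GNU binutils `addr2line` is installed, that it was built with support for Alpha binaries if run on another architecture, and that an unstripped `vmlinux` with debug info from the same kernel build is available. The path `vmlinux` and the helper names `addr2line_cmd` and `resolve` are hypothetical.

```python
import shutil
import subprocess


def addr2line_cmd(pc, vmlinux="vmlinux"):
    """Build an addr2line command line for one kernel address."""
    # -e selects the image to read debug info from;
    # -f prints the enclosing function name as well as file:line.
    return ["addr2line", "-f", "-e", vmlinux, hex(pc)]


def resolve(pc, vmlinux="vmlinux"):
    """Run addr2line if it is on PATH; return its output text, else None."""
    if shutil.which("addr2line") is None:
        return None
    result = subprocess.run(
        addr2line_cmd(pc, vmlinux), capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

With the pc from the log above, `addr2line_cmd(0xfffffc00008d74a4)` builds the command; whether the output matches the line reported here depends on using the exact kernel image that produced the machine check.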