Bug 151934

Summary: Running lshw causes MCA on Olympia rx8620
Product: Red Hat Enterprise Linux 3 Reporter: Julie Kosakowski <julie.kosakowski>
Component: kernel Assignee: Ernie Petrides <petrides>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 3.0 CC: anderson, arne.romo, jbaron, petrides, rick.hester, riel, suzanne.pherigo
Target Milestone: ---   
Target Release: ---   
Hardware: ia64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-05-18 13:29:21 UTC Type: ---
Attachments:
Description Flags
mca log none

Description Julie Kosakowski 2005-03-23 17:56:04 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux ia64; en-US; rv:1.7.3) Gecko/20041007 Debian/1.7.3-5

Description of problem:
I'm using the command "lshw" to find out specific hardware information on an Olympia, rx8620. 
lshw is a tool downloaded from http://ezix.sourceforge.net/software/files/lshw-B.02.02.tar.gz

#lshw -version
B.02.02

#lshw -short (produces MCA as well)

It's possibly caused by lshw poking around in parts of /proc that the kernel does not like. After probing around, I suspected /proc/bus/pci, but that is not the culprit.

[root@hera1 log]# lshw
DMI
 MCA EVENT occurred : SAL error processing

  Logging TOGO Errors ....
    Size of TOGO Errors : 0x0
                                                Complete
 Finish the Error Event Logging ....
                                                Complete
 Flush the cpu cache ....
                                                Complete
 ReEnabling CPU Poison Check ....
                                                Complete
Cpu12: MCA Rendez Always flag set to 1.
Cpu12: Perform rendezvous started.
Cpu12: Sent Rendez vector number 0xe8 to 3.
Cpu12: Rendezvous timeout : 20000
Cpu12: Waiting processors to acknowledge MCA vector.
Cpu12: All the processors acknowledged MCA vector.
Cpu12: Perform rendezvous complete, RendezState : 0x1 .
  Cell local CallGate Pointer 0x721ff409c00
  CallGate Pointer after moving to CoreCell 0x721ff409c00
 Firmware executing from Main Memory .......
Calling OS_MCA at 0x000000000444d6c0...

Granite Firmware (build 001.022) -- server rx8620 -- CellId: 1

Build date: Fri Aug 13 15:35:18 2004
Build directory: /home/jhack/Olyger/rom.1.22/thunderdome-mckinley
Built by: jhack

Current Date/Time:  3/23/2005 17:28:23
PAL_A version: 7.31/1.26
PAL_B version: 1.26

  Initialize flasher ...                        Complete
  Initialize options ...                        Complete
  Initialize memory ...                         Complete
  Initialize pdh ...                            Complete
  Initialize fabric ...                         Complete
  Initialize cell1 ...                          Complete
  Merge cell1 into partition tree ...           Complete
  Initialize cell3 ...                          Complete
  Merge cell3 into partition tree ...           Complete
  Initializing partition memory ...             Complete
  Enable VGA routing ...                        Complete
Partition tree creation ...                     Complete
Processors:
  cpu8   cpu12  cpu24  cpu28
Memory:  16384 MB
Building SLIT information ...                   Complete
Building CPEP information ...                   Complete
Loading Clients ...
  Loading SAL_ABI ...                           Complete
  Loading ACPI ......                           Complete
  Loading EFI ......                            Complete
  Loading SAL_PMI ...                           Complete
  MMAllocate: 4194304 bytes free
Build Tree Acpi ...
  Initialize ACPI objects ...                   Complete
  Verifying sram malloc sanity ...              Complete
  Verifying registry sanity ...                 Complete
  Verifying device tree sanity ...              Complete
  Verifying ACPI configuration consistency ...  Complete
Starting Clients ...
Initializing ACPI tables ...
  Initializing FACS table    (721ff040028/40)   Complete
  Initializing SPCR table    (721ff040068/50)   Complete
  Initializing DBGP table    (721ff0400b8/34)   Complete
  Initializing HPET table    (721ff0400f0/38)   Complete
  Initializing GSP  table    (721ff040128/74)   Complete
  Initializing MADT table    (721ff0401a0/19c)  Complete
  Initializing DSDT table    (721ff040340/198d9)Complete
  Initializing SPMI table    (721ff059c20/50)   Complete
  Initializing FADT table    (721ff059c70/f4)   Complete
  Initializing SLIT table    (721ff059d68/3c)   Complete
  Initializing CPEP table    (721ff059da8/4c)   Complete
  Initializing SRAT table    (721ff059df8/110)  Complete
  Initializing XSDT table    (721ff059f08/ec)   Complete
  Initializing SSDT table 00 (721ff059ff8/18738)Complete
  Initializing RSDP table    (721ff040000/24)   Complete
ACPI Initialization complete: table size = 206640

  Initializing HCDP table    (721ff072730/b8)   Complete
  Initializing SBCT table    (721ff0727e8/30)   Complete

Initializing SAL_PMI ...                        Complete
Installing SAL_PMI handlers ...                 Complete
Installing Cell Add handler ...                 Complete
Installing Cell Delete handler ...              Complete
Installing ACPI CellAdd handler ...             Complete

  Processing firmware memory tables ...         Complete
  Relocating PAL ...                            Complete
  Processing platform I/O information ...       Complete
  Loading ACPI tables ...                       Complete
  Building EFI memory table ...                 Complete
  Building PCI cache structure ...              Complete

Welcome to EFI 1.1!







Version-Release number of selected component (if applicable):
kernel-2.4.21-27.EL

How reproducible:
Always

Steps to Reproduce:
1. run #lshw 
2. MCA immediately occurs

  

Actual Results:  An MCA occurred

Expected Results:  Should see list of hardware configuration - exact memory configuration, firmware version, mainboard configuration, CPU version and speed, cache configuration, bus speed, etc. on DMI-capable x86 or EFI (IA-64) systems and on some PowerPC machines (PowerMac G4 is known to work).

Additional info:

Comment 1 Julie Kosakowski 2005-03-23 18:00:14 UTC
Created attachment 112270 [details]
mca log

Comment 2 Ernie Petrides 2005-03-29 23:47:23 UTC
Hi, Julie.  Could you please check if "lshw" opens up /proc/kcore?
Just let us know whether "strings lshw | grep /proc/kcore" produces
any output.  Thanks in advance.
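
For systems where strings(1) is not installed, the same check can be approximated with a short script. This is only a sketch: the function names are illustrative, and the scan is a rough imitation of strings(1) rather than a reimplementation. Finding the path embedded in the binary suggests, but does not prove, that the program opens that file at run time.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII of at least min_len bytes.

    Roughly what strings(1) reports; this is an approximation only.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def references_path(data, needle="/proc/kcore"):
    """Return True if any printable string in data contains needle."""
    return any(needle in s for s in printable_strings(data))
```

To use it, read the lshw binary in binary mode (its location depends on how it was installed) and pass the bytes to references_path().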

Comment 3 Julie Kosakowski 2005-03-30 16:32:17 UTC
Hi,
Here's the output:

[root@whale2 sbin]# strings lshw | grep /proc/kcore
/proc/kcore

Thanks

Comment 4 Ernie Petrides 2005-03-31 02:18:17 UTC
Excellent!  Thanks, Julie.  Please verify that the U5 beta kernel
(2.4.21-31.EL, which is available in the RHN beta channel) resolves
this problem.  We discovered and fixed a flaw in the kernel's handling
for /proc/kcore that could corrupt kernel data structures (and file
buffer and user pages).  In the meantime, I'm going to close this as
a dup of the relevant bug.


Comment 5 Ernie Petrides 2005-03-31 02:30:05 UTC
Actually, I'll just put this bug into MODIFIED state.  The related BZs are
132838, 133905, 134988, 136317, and 145563 (although some of them might have
access restrictions).

The /proc/kcore fix was committed to the RHEL3 U5 patch pool on 28-Jan-2005
(in kernel version 2.4.21-27.10.EL).
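
To check whether a given kernel release (for example, the output of "uname -r") is at or past the version that first carried the fix, a comparison along these lines may help. It is a minimal sketch: the helper names are hypothetical, and it compares only the leading numeric fields, which is a simplification and not RPM's own version comparison algorithm.

```python
FIXED_IN = "2.4.21-27.10.EL"  # first kernel with the /proc/kcore fix, per this comment


def parse_release(release):
    """Split a release such as '2.4.21-27.10.EL' into two integer tuples."""
    version, _, build = release.partition("-")
    version_key = tuple(int(n) for n in version.split("."))
    build_key = []
    for field in build.split("."):
        # Stop at the first non-numeric field, e.g. 'EL' or 'ELsmp'.
        if not field.isdigit():
            break
        build_key.append(int(field))
    return version_key, tuple(build_key)


def has_kcore_fix(release, fixed_in=FIXED_IN):
    """Return True if release is at or after fixed_in (naive comparison)."""
    return parse_release(release) >= parse_release(fixed_in)
```

Under this comparison, the kernel from the original report (2.4.21-27.EL) sorts before the fix and the U5 beta kernel (2.4.21-31.EL) sorts after it.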


Comment 6 Tim Powers 2005-05-18 13:29:21 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html