Bug 185762 - Problems with EDAC module during first boot
Summary: Problems with EDAC module during first boot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Alan Cox
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 174891 183352
Depends On:
Blocks: 198694 200936
 
Reported: 2006-03-17 18:30 UTC by Linda Wang
Modified: 2018-10-19 20:43 UTC (History)
CC List: 10 users

Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-05-08 00:47:04 UTC
Target Upstream Version:
Embargoed:


Attachments
Upstream fix (1.57 KB, patch)
2006-05-15 17:17 UTC, Alan Cox
Patch from upstream 2.6.17 rebased for 2.6.9-36.1 (1.54 KB, patch)
2006-06-20 19:48 UTC, Gary Case


Links
System: Red Hat Product Errata
ID: RHBA-2007:0304
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 5
Last Updated: 2007-04-28 18:58:50 UTC

Comment 1 Jason Baron 2006-03-17 18:39:06 UTC
if we're using this bug to track EDAC issues also see bug 182137 comment 17

Comment 2 Alan Cox 2006-04-24 13:10:53 UTC
If this is an AMI BIOS, please raise the issue with AMI and Intel. According to
their linux-kernel posting on that case (which looks a lot like your report),
this is a BIOS interaction problem (they hide devices under us arbitrarily on an
SMI occurrence). Intel indicate they will be working with BIOS vendors on the
general issue. Until then, disabling EDAC and not having any EDAC support on the
platform is the only immediate safe option.



Comment 3 Alan Cox 2006-05-15 16:59:01 UTC
*** Bug 183352 has been marked as a duplicate of this bug. ***

Comment 4 Alan Cox 2006-05-15 17:05:09 UTC
*** Bug 174891 has been marked as a duplicate of this bug. ***

Comment 5 Alan Cox 2006-05-15 17:15:09 UTC
I've folded all these bugs together as they all get triggered by the same
underlying issue, where the BIOS SMI code steals the device from us and hides it.
I'll attach the proposed (and upstream) fix in a moment. Basically, if the BIOS
has hidden the device we don't unhide it but tell the user to go chat to their
BIOS vendor.
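
The behaviour described above can be sketched in user-space C. This is only an illustration of the logic, not the attached patch: the register is simulated by a plain variable, and the names DEV_PRESENT_ERR_FUNC, probe_error_function and the bit position are assumptions made for the example. The force_unhide flag stands in for the force_function_unhide option mentioned later in this bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit in a "device present" register: set means the
 * error-reporting function is visible, clear means the BIOS hid it. */
#define DEV_PRESENT_ERR_FUNC (1u << 5)

enum probe_result { PROBE_OK = 0, PROBE_HIDDEN_BY_BIOS = -1 };

/*
 * Simulated probe step. `devpres` stands in for the chipset register;
 * `force_unhide` stands in for an explicit override from the user.
 */
static enum probe_result probe_error_function(uint8_t *devpres, bool force_unhide)
{
    assert(devpres != NULL);

    if (!(*devpres & DEV_PRESENT_ERR_FUNC)) {
        if (!force_unhide) {
            /* BIOS hid the device: leave it hidden and refuse to load. */
            fprintf(stderr, "EDAC: error registers hidden by BIOS; "
                            "contact your BIOS vendor\n");
            return PROBE_HIDDEN_BY_BIOS;
        }
        /* Explicit override only: make the function visible again. */
        *devpres |= DEV_PRESENT_ERR_FUNC;
    }
    return PROBE_OK;
}
```

Calling probe_error_function with a cleared bit and force_unhide set to false leaves the simulated register untouched and returns PROBE_HIDDEN_BY_BIOS; only the explicit override sets the bit.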



Comment 6 Alan Cox 2006-05-15 17:17:14 UTC
Created attachment 129099
Upstream fix

Comment 13 Gary Case 2006-06-20 19:49:00 UTC
Created attachment 131220
Patch from upstream 2.6.17 rebased for 2.6.9-36.1

Comment 18 Jay Turner 2006-08-25 18:16:59 UTC
QE ack for 4.5.

Comment 19 RHEL Program Management 2006-09-07 19:26:45 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 22 Jason Baron 2006-09-22 00:33:43 UTC
committed in stream U5 build 42.13. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 29 Aristeu Rozanski 2006-10-12 18:23:09 UTC
Gary, Mark Gross's answer didn't make it into this BZ; I only noticed it now while
checking Issue Tracker, sorry about that.
Please try loading the edac_mc module with the panic_on_ue=0 option (either by
specifying it when loading the module or by adding a module option to the modutils
configuration) and paste the complete dmesg output here.
Thanks

Comment 31 Aristeu Rozanski 2006-10-13 15:45:10 UTC
Following comment #26, I checked the RPMs and that option is there, so the
patch appears to be correctly applied. Using this option will keep the
machine from panicking so we can get the complete dmesg.


Comment 32 Aristeu Rozanski 2006-10-13 15:51:05 UTC
To make my last comment clear: the panic_on_ue option (on the edac_mc module)
is needed so we can get all the kernel messages and check what's happening. The
force_function_unhide option is the one added by the patch (which comment #26
asserts is present in the e752x_edac module on Jason's kernel).
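
As a sketch of how the two options map to modules in modprobe.conf (the /etc/modprobe.conf path is an assumption; the option and module names are the ones given in this comment):

```
# /etc/modprobe.conf (path assumed)
# panic_on_ue is an option of the core edac_mc module
options edac_mc panic_on_ue=0
# force_function_unhide is an option of the e752x_edac chipset driver;
# shown commented out, enable it only if deliberately testing the override
# options e752x_edac force_function_unhide=1
```

Each option has to be listed under the module that owns it.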


Comment 34 Aristeu Rozanski 2006-10-16 13:10:26 UTC
(In reply to comment #33)
(...)
> running the test with modprobe.conf option line:
> options e7552x_edac fouce_function_unhide=1 panic_on_ue=0
> results in no messages and no crashes.  (looking at edac_mc.c it looks like
> there isn't any messages that will get logged.
Please note that "panic_on_ue" is an edac_mc module option.

> I looked in the /proc/mc
> directory but didn't find any inodes.
Known problem; I'm working on it.


Comment 37 Alan Cox 2006-11-14 15:53:55 UTC
force_unhide should not be set. If the problem only occurs when force_unhide is
set this is a BIOS bug and the kernel change is not needed.


Comment 38 Red Hat Bugzilla 2007-02-08 19:42:32 UTC
Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2007:9073-03.
http://errata.devel.redhat.com/errata/showrequest.cgi?advisory=4730

Comment 40 Red Hat Bugzilla 2007-05-08 00:47:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html

