Bug 580836
| Summary: | EDAC driver error on system with bad memory [rhel-5.5.z] | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | RHEL Program Management <pm-rhel> |
| Component: | kernel | Assignee: | Jiri Pirko <jpirko> |
| Status: | CLOSED ERRATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 5.6 | CC: | bhavna.sarathy, bnagendr, cward, dave.love, dhoward, jan.gerrit, jwest, jwilson, jzhenyon, kzhang, paul.lowrie, pm-eus, prarit, rdoty, rkhan, vincent |
| Target Milestone: | rc | Keywords: | OtherQA, ZStream |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 590691 | Environment: | |
| Last Closed: | 2010-05-06 18:49:57 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 569938 | | |
| Bug Blocks: | 590691 | | |
Description
RHEL Program Management
2010-04-09 08:06:07 UTC
in 2.6.18-194.1.1.el5

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0398.html

I can't see the original bug report, but the problem reported in the erratum appears not to be fixed in the 2.6.18-194.3.1.el5 kernel, like this (one of many reports on multiple Barcelona nodes in our cluster):

    Jun 24 00:32:39 lvgig025 kernel: Northbridge Error, node 1, core: -1
    Jun 24 00:32:39 lvgig025 kernel: K8 ECC error.
    Jun 24 00:32:39 lvgig025 kernel: EDAC amd64 MC1: CE ERROR_ADDRESS= 0x375a77cc0
    Jun 24 00:32:39 lvgig025 kernel: EDAC MC1: CE page 0x375a77, offset 0xcc0, grain 0, syndrome 0x4951, row 7, channel 0, label "": amd64_edac
    Jun 24 00:32:39 lvgig025 kernel: EDAC MC1: CE - no information available: amd64_edacError Overflow

The problem is not fixed with 2.6.18-274.7.1.el5 either.

    Linux racdbmc1ldv 2.6.18-274.7.1.el5 #1 SMP Mon Oct 17 11:57:14 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
    Apr 15 16:05:54 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x404b9f3a0
    Apr 15 16:05:54 racdbmc1ldv kernel: EDAC MC0: CE page 0x404b9f, offset 0x3a0, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:05:58 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x28ea0dd40
    Apr 15 16:05:58 racdbmc1ldv kernel: EDAC MC0: CE page 0x28ea0d, offset 0xd40, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:06:04 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x40382bfa0
    Apr 15 16:06:04 racdbmc1ldv kernel: EDAC MC0: CE page 0x40382b, offset 0xfa0, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:06:18 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x212788140
    Apr 15 16:06:18 racdbmc1ldv kernel: EDAC MC0: CE page 0x212788, offset 0x140, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:06:34 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x2d222c6e0
    Apr 15 16:06:34 racdbmc1ldv kernel: EDAC MC0: CE page 0x2d222c, offset 0x6e0, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:07:34 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x3d491c8f0
    Apr 15 16:07:34 racdbmc1ldv kernel: EDAC MC0: CE page 0x3d491c, offset 0x8f0, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac
    Apr 15 16:07:46 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x330d2d000
    Apr 15 16:07:46 racdbmc1ldv kernel: EDAC MC0: CE page 0x330d2d, offset 0x0, grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac

Kernel 2.6.18-398.el5 on RHEL 5.11 shows the same kind of messages.
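
For anyone triaging similar logs: in the messages quoted in this report, the page and offset fields look like the preceding ERROR_ADDRESS split at a 4 KiB boundary (0x375a77cc0 gives page 0x375a77, offset 0xcc0). The following is a minimal sketch, assuming that message format and 4 KiB pages, for tallying corrected errors (CE) per memory controller, row and channel. The regex and the function names `split_address` and `tally_corrected_errors` are illustrative only and are not part of any EDAC tool.

```python
import re
from collections import Counter

# Matches the "EDAC MCn: CE page ..." lines quoted in this report.
# The format is taken from those excerpts; other kernels may log differently.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE page 0x(?P<page>[0-9a-f]+), "
    r"offset 0x(?P<offset>[0-9a-f]+), grain \d+, "
    r"syndrome 0x(?P<syndrome>[0-9a-f]+), "
    r"row (?P<row>\d+), channel (?P<channel>\d+)"
)

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def split_address(addr):
    """Split a physical address into (page, offset) for 4 KiB pages."""
    return addr >> PAGE_SHIFT, addr & ((1 << PAGE_SHIFT) - 1)


def tally_corrected_errors(lines):
    """Count CE reports per (controller, row, channel)."""
    counts = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            key = (int(match["mc"]), int(match["row"]), int(match["channel"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Apr 15 16:05:54 racdbmc1ldv kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x404b9f3a0",
        'Apr 15 16:05:54 racdbmc1ldv kernel: EDAC MC0: CE page 0x404b9f, offset 0x3a0, '
        'grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac',
        'Apr 15 16:05:58 racdbmc1ldv kernel: EDAC MC0: CE page 0x28ea0d, offset 0xd40, '
        'grain 0, syndrome 0x2b8, row 4, channel 0, label "": amd64_edac',
    ]
    page, offset = split_address(0x404B9F3A0)
    print(hex(page), hex(offset))
    for key, count in tally_corrected_errors(sample).items():
        print(key, count)
```

All seven racdbmc1ldv reports quoted here share MC0, row 4, channel 0 and syndrome 0x2b8, so a tally like this would put them in a single bucket, which would be consistent with one failing memory location rather than scattered errors.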