Description of problem:
mcelogd fails to start at boot, with the message:
"AMD Processor family 16: Please load edac_mce_amd module"
Version-Release number of selected component (if applicable):
How reproducible:
Reboot system. Always happens.
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
I have another server running RHEL 6.2, with the exact same hardware, and mcelogd starts with no problems. This one's running 6.5. I checked lsmod, and the module is loaded.
>AMD Processor family 16: Please load edac_mce_amd module
David,
The above message is correct. In order to get the "best" ECC error information on AMD systems you should be using the edac_mce_amd module and not mcelogd.
P.
Does that mean I should disable the service?
I believe this is still a bug. The error message gives the impression that the service depends on the module being loaded.
Why is mcelog enabled by default, instead of edac, on a system that doesn't support it? This is very misleading.
Until now, I knew nothing about either of these services, and I spent a lot of time searching for information without really finding anything.
And if the two are incompatible, why are they both running on my CentOS 6.2 server?
$ edac-ctl --status
edac-ctl: drivers are loaded
$ service mcelogd status
/dev/mcelog not active
Checking for mcelog
mcelog is running
But on 6.5:
$ edac-ctl --status
edac-ctl: drivers not loaded
$ lsmod | grep edac
edac_core 46581 0
edac_mce_amd 14705 0
So it appears that the modules are loaded. The edac-ctl manpage shows a load option, but:
$ edac-ctl --load
Unknown option: load
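For anyone wanting to repeat the module check above without relying on edac-ctl, here is a minimal sketch. It only looks at the first column of lsmod-style output; the function names module_listed and check_edac are hypothetical helpers made up for this example, not part of edac-utils or mcelog.

```shell
# Hypothetical helper: reads lsmod-style output on stdin and succeeds
# if the module named in $1 appears in the first column.
module_listed() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Hypothetical helper: reports whether edac_mce_amd is in the listing on stdin.
check_edac() {
    if module_listed edac_mce_amd; then
        echo "edac_mce_amd: loaded"
    else
        echo "edac_mce_amd: not loaded"
    fi
}

# On a live system (assumes lsmod is available):
#   lsmod | check_edac
```

This only confirms that the kernel lists the module. It says nothing about why edac-ctl reports the drivers as not loaded.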