Bug 1885114

Summary: Unable to start mcelog service
Product: [Fedora] Fedora
Reporter: Ashish Kumar <ashish.dav99>
Component: mcelog
Assignee: Prarit Bhargava <prarit>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 33
CC: ashish.dav99, jonathan, prarit, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Last Closed: 2021-04-01 13:13:13 UTC
Type: Bug

Description Ashish Kumar 2020-10-05 06:25:58 UTC
Description of problem:
The mcelog service does not start automatically on boot, and I am unable to start it manually either.
Tested on a VMware virtual machine running on an AMD Ryzen 3900X CPU.

Version-Release number of selected component (if applicable):
mcelog-168-2.fc33.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. systemctl --all --failed
2. modprobe edac_mce_amd
3. systemctl restart mcelog
4. systemctl status mcelog

Actual results:
  UNIT           LOAD   ACTIVE SUB    DESCRIPTION                           
● mcelog.service loaded failed failed Machine Check Exception Logging Daemon

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
---------------------------------------------------------------------------
modprobe: ERROR: could not insert 'edac_mce_amd': Invalid argument
---------------------------------------------------------------------------
● mcelog.service - Machine Check Exception Logging Daemon
     Loaded: loaded (/usr/lib/systemd/system/mcelog.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2020-10-05 11:49:10 IST; 5s ago
    Process: 2846 ExecStart=/usr/sbin/mcelog --ignorenodev --daemon --foreground (code=exited, status=1/FAILURE)
   Main PID: 2846 (code=exited, status=1/FAILURE)
        CPU: 1ms

Oct 05 11:49:10 localhost.localdomain systemd[1]: Started Machine Check Exception Logging Daemon.
Oct 05 11:49:10 localhost.localdomain mcelog[2846]: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
Oct 05 11:49:10 localhost.localdomain mcelog[2846]: CPU is unsupported
Oct 05 11:49:10 localhost.localdomain systemd[1]: mcelog.service: Main process exited, code=exited, status=1/FAILURE
Oct 05 11:49:10 localhost.localdomain systemd[1]: mcelog.service: Failed with result 'exit-code'.
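
The journal lines above point at the cause: mcelog refuses to run on this CPU (AMD family 23). A rough pre-check along the following lines can tell whether a machine is likely to hit the same refusal. This is only a sketch: the helper names are made up for illustration, and the family cutoff of 16 is an assumption about mcelog's behavior that this report does not state (it only confirms family 23).

```shell
#!/bin/sh
# Hypothetical helpers, not part of mcelog or Fedora.

# Print "<vendor_id> <cpu family>" for the first CPU in a cpuinfo-style file.
# Defaults to /proc/cpuinfo when no file is given.
cpu_vendor_family() {
    awk -F': *' '
        /^vendor_id/  && !v { v = $2 }
        /^cpu family/ && !f { f = $2 }
        END { print v, f }
    ' "${1:-/proc/cpuinfo}"
}

# Succeed (exit 0) when the CPU looks like one mcelog will refuse.
# ASSUMPTION: AMD families 16 and above are rejected; the report only
# confirms the rejection for family 23.
mcelog_likely_unsupported() {
    set -- $(cpu_vendor_family "$1")
    [ "$1" = "AuthenticAMD" ] && [ "${2:-0}" -ge 16 ]
}
```

For example, `mcelog_likely_unsupported && echo "expect mcelog to exit with 'CPU is unsupported'"` prints the warning only on a matching AMD system.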


Expected results:
The service should start on boot, or when started manually.

Additional info:

Comment 1 Thomas Neuber 2020-12-23 20:22:39 UTC
I observed the same issue in a VMware virtual machine hosted on an AMD Ryzen 7 3700U. I replaced the mcelog package with rasdaemon.

Comment 2 Prarit Bhargava 2021-04-01 13:13:13 UTC
Thanks @Thomas Neuber.  That is the correct course of action.  mcelog is known not to work on newer AMD platforms.

P.
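
The replacement described in Comment 1 could look roughly like the following on Fedora. This is a sketch under assumptions, not a procedure taken from this report: it assumes the package is named `rasdaemon` and ships a `rasdaemon.service` unit, and the `run` and `switch_to_rasdaemon` helpers are hypothetical names. Setting `DRY_RUN` prints the commands instead of executing them, which makes it easy to review them first.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Swap mcelog for rasdaemon. Needs root when not in dry-run mode.
# ASSUMPTION: package "rasdaemon" provides the unit "rasdaemon.service".
switch_to_rasdaemon() {
    run systemctl disable --now mcelog.service   # stop the failing unit
    run dnf -y remove mcelog                     # drop the unsupported tool
    run dnf -y install rasdaemon                 # install the replacement
    run systemctl enable --now rasdaemon.service # start it now and on boot
}
```

Running `DRY_RUN=1 switch_to_rasdaemon` lists the four commands; running `switch_to_rasdaemon` as root applies them, after which `systemctl status rasdaemon` should show whether the daemon started.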