Bug 1166978 - mcelog: AMD Processor family 21: CPU is unsupported
Summary: mcelog: AMD Processor family 21: CPU is unsupported
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: mcelog
Version: 29
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Prarit Bhargava
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-11-22 09:42 UTC by bob
Modified: 2023-07-05 15:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-27 22:07:02 UTC
Type: Bug
Embargoed:


Description bob 2014-11-22 09:42:09 UTC
Description of problem:

mcelog fails with processor not supported error on AMD FX-8350 8-core processor.


Version-Release number of selected component (if applicable):

mcelog.x86_64 2:1.0-0.13.f0d7654.fc21  

How reproducible:

always


Additional info:

# dmesg | grep mce
[    0.156169] mce: CPU supports 7 MCE banks
[    8.166267] systemd[1]: Unit mcelog.service entered failed state.
[    8.177209] systemd[1]: mcelog.service failed.

# systemctl --failed
  UNIT           LOAD   ACTIVE SUB    DESCRIPTION
● mcelog.service loaded failed failed Machine Check Exception Logging Daemon
● rngd.service   loaded failed failed Hardware RNG Entropy Gatherer Daemon

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

2 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.


# systemctl status mcelog.service
● mcelog.service - Machine Check Exception Logging Daemon
   Loaded: loaded (/usr/lib/systemd/system/mcelog.service; enabled)
   Active: failed (Result: exit-code) since Fri 2014-11-21 20:48:33 CST; 6h ago
  Process: 664 ExecStart=/usr/sbin/mcelog --ignorenodev --daemon --foreground (code=exited, status=1/FAILURE)
  Process: 641 ExecStartPre=/etc/mcelog/mcelog.setup (code=exited, status=0/SUCCESS)
 Main PID: 664 (code=exited, status=1/FAILURE)

Nov 21 20:48:32 ************ systemd[1]: Starting Machine Check Exception Logging Daemon...
Nov 21 20:48:33 ************ systemd[1]: Started Machine Check Exception Logging Daemon.
Nov 21 20:48:33 ************ systemd[1]: mcelog.service: main process exited, code=exited, status=1/FAILURE
Nov 21 20:48:36 ************ mcelog.setup[641]: CPU is unsupported
Nov 21 20:48:36 ************ mcelog[664]: mcelog: AMD Processor family 21: Please load edac_mce_amd module.
Nov 21 20:48:36 ************ mcelog[664]: : Success
Nov 21 20:48:36 ************ mcelog[664]: CPU is unsupported

See:
https://bugzilla.redhat.com/show_bug.cgi?id=1069652

This error message is incredibly obtuse.  If AMD CPUs are not supported and this error is to be ignored, then the error messages need to be toned down.  There's no point in always flagging systemd with failures.

Comment 1 bob 2014-11-22 09:50:50 UTC
Is the rngd.service failure coupled to the mcelog.service failure?

How are AMD users supposed to deal with these problems?  Should both the mcelog and rngd services be disabled on AMD boxes?  If so, could this be done automatically so that the users aren't burdened with these problems?

thanks.

Comment 2 bob 2014-11-23 00:23:57 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1069652#c10

In regard to Comment 10 on Bug 1069652, which resulted in the bug being closed IN ERROR:

>you can safely disable rngd since it tries to use a feature that is present only on certain CPUs and is not present on yours.

thanks.  that fixes one problem.

>As for the mcelog.service, you may want to leave it there, since it just informs you (the warning) that your CPU does not support the intel MCE logging facility and that it has thus kicked in the AMD module, which handles that very same functionality in AMD CPUs.

that may be the way it's supposed to work, but my submission clearly indicates that it IS NOT working that way.  please don't dismiss this bug report in a cursory fashion by assuming that the system works -- IT DOESN'T.

whether or not an AMD module ever gets loaded, the mcelog.service NEVER starts, and no logs are ever generated on the system.  

the bug on AMD family 21 is real -- the daemon DOES NOT work, there is no logging functionality, and the failure reports are accurate.

Comment 3 Prarit Bhargava 2015-01-28 11:54:55 UTC
(In reply to bob from comment #2)
> https://bugzilla.redhat.com/show_bug.cgi?id=1069652#c10
> 
> In regard to Comment 10 on Bug 1069652, which resulted in the bug being
> closed IN ERROR:
> 
> >you can safely disable rngd since it tries to use a feature that is present only on certain CPUs and is not present on yours.
> 
> thanks.  that fixes one problem.
> 
> >As for the mcelog.service, you may want to leave it there, since it just informs you (the warning) that your CPU does not support the intel MCE logging facility and that it has thus kicked in the AMD module, which handles that very same functionality in AMD CPUs.
> 
> that may be the way it's supposed to work, but my submission clearly
> indicates that it IS NOT working that way.  please don't dismiss this bug
> report in a cursory fashion by assuming that the system works -- IT DOESN'T.
> 
> whether or not an AMD module ever gets loaded, the mcelog.service NEVER
> starts, and no logs are ever generated on the system.  
> 
> the bug on AMD family 21 is real -- the daemon DOES NOT work, there is no
> logging functionality, and the failure reports are accurate.

So what would you expect to happen here?

P.

Comment 4 Malar Kannan 2015-04-02 08:09:28 UTC
Apparently it is present since fedora 20

https://bugzilla.redhat.com/show_bug.cgi?id=1138923

I am experiencing the same bug.

Comment 5 customercare 2015-08-10 07:36:55 UTC
duplicate of bug 1207383

Comment 6 customercare 2015-08-10 07:47:33 UTC
@ Prarit Bhargava 2015-01-28 06:54:55 EST:

you wanted to know what to do about : 

"mcelog: AMD Processor family 21: Please use the edac_mce_amd module instead.
: Success
CPU is unsupported
"

advising the user to use the amd module and then printing a "Success" message, just to end with an error message, is illogical at best.

TODO List

1. stop after "mcelog: AMD Processor family 21: Please use the edac_mce_amd module instead."

2. remove that error message completely, as it's wrong. With and without the edac_mce_amd module loaded, the outcome is the same. Why tell the admin to do something that does not change anything?


All AMD Users : 

systemctl disable mcelog
yum -y erase mcelog

one update less to worry about :)
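
The workaround above is aimed at AMD machines only. A minimal sketch of guarding it by CPU vendor follows. The function names `is_amd_cpuinfo` and `suggest_mcelog_workaround` are hypothetical, and the sketch assumes the `vendor_id` field of `/proc/cpuinfo` reads `AuthenticAMD` on AMD processors. It only prints the command it would suggest and does not run anything.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the cpuinfo text on stdin reports an AMD CPU.
# Assumes the vendor_id field reads "AuthenticAMD" on AMD processors.
is_amd_cpuinfo() {
    grep -q '^vendor_id[[:space:]]*:[[:space:]]*AuthenticAMD'
}

# Print (never run) the workaround from this comment when the CPU is AMD.
# $1: path to a cpuinfo-style file, defaults to /proc/cpuinfo.
suggest_mcelog_workaround() {
    if is_amd_cpuinfo < "${1:-/proc/cpuinfo}"; then
        echo "systemctl disable mcelog"
    else
        echo "no change"
    fi
}
```

Calling `suggest_mcelog_workaround` with no argument reads `/proc/cpuinfo` on the running machine.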

Comment 7 Prarit Bhargava 2015-08-10 11:58:35 UTC
(In reply to customercare from comment #6)
> @ Prarit Bhargava 2015-01-28 06:54:55 EST:
> 
> you wanted to know what to do about : 
> 
> "mcelog: AMD Processor family 21: Please use the edac_mce_amd module instead.
> : Success
> CPU is unsupported
> "
> 
> advising the user to use the amd module and then printing a "Success"
> message, just to end with an error message, is illogical at best.
> 
> TODO List
> 
> 1. stop after "mcelog: AMD Processor family 21: Please use the edac_mce_amd
> module instead."
> 
> 2. remove that error message completely, as it's wrong. With and without the
> edac_mce_amd module loaded, the outcome is the same. Why tell the admin
> to do something that does not change anything?
> 
> 
> All AMD Users : 
> 
> systemctl disable mcelog
> yum -y erase mcelog
> 
> one update less to worry about :)

This isn't correct.  AMD _family 21/16h_ are not supported by mcelog.

P.
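
A note on family numbering: the mcelog messages in this thread give the CPU family in decimal (for example "family 25"), while a kernel boot message quoted later in this thread gives it in hexadecimal ("family: 0x19"), and AMD family names are usually written in hex with an `h` suffix. A small sketch for converting a decimal family number follows; `family_to_hex` is a hypothetical helper name, not part of mcelog.

```shell
#!/bin/sh
# Hypothetical helper: print a decimal CPU family number in hex with an
# "h" suffix, the notation commonly used for AMD families.
family_to_hex() {
    printf '%xh\n' "$1"
}
```

For example, `family_to_hex 21` prints `15h` and `family_to_hex 25` prints `19h`.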

Comment 8 Fedora End Of Life 2015-11-04 10:50:42 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 21 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 9 Fedora End Of Life 2015-12-02 05:10:01 UTC
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 10 Luan Cestari 2016-04-17 13:46:17 UTC
(In reply to Fedora End Of Life from comment #9)
> Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
> no longer maintained, which means that it will not receive any further
> security or bug fix updates. As a result we are closing this bug.
> 
> If you can reproduce this bug against a currently maintained version of
> Fedora please feel free to reopen this bug against that version. If you
> are unable to reopen this bug, please file a new report against the
> current release. If you experience problems, please add a comment to this
> bug.
> 
> Thank you for reporting this bug and we are sorry it could not be fixed.

Hi,

I can reproduce this on Fedora 23. Could you reopen the issue please?

Thanks in advance,
Luan

Comment 11 Luan Cestari 2016-04-17 13:55:56 UTC
I think the AMD processor support in https://github.com/andikleen/mcelog/blob/master/mcelog.c is outdated. I'm not sure whether the changes needed go beyond this file, but my guess is that it is the most critical one for making the CPU monitoring daemon work for AMD family 16 and above (mine is family 21 and isn't a new processor, by the way).

Comment 12 Balázs Meskó 2018-02-18 14:38:23 UTC
I am using Fedora 27, and I have bumped into this issue too. As far as I can tell, mcelog should not be used at all on AMD hardware. So I have the same question as Comment #1:

> How are AMD users supposed to deal with these problems?  Should both the
> mcelog and rngd services be disabled on AMD boxes?  If so, could this be
> done automatically so that the users aren't burdened with these problems?

Of course I can disable these services manually, but why should I? Can't/shouldn't we disable these services during installation?

Comment 13 Gerald Cox 2018-06-20 05:27:35 UTC
This problem still exists... it is happening on Fedora 28.

Comment 14 Prarit Bhargava 2018-08-02 11:51:53 UTC
Is the bug happening on every boot, or every installation?

P.

Comment 15 Gerald Cox 2018-08-02 16:59:09 UTC
(In reply to Prarit Bhargava from comment #14)
> Is the bug happening on every boot, or every installation?
> 
> P.

It happens on every boot and is causing systemctl to report it is running in degraded status - which goes against the recently revised Fedora policy:

https://fedoraproject.org/wiki/Packaging:DefaultServices?rd=DefaultServices

It needs to be fixed or approved as a FESCo exception.

Comment 16 Jan Kurik 2018-08-14 10:31:57 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.

Comment 17 Tom Chiverton 2019-02-06 21:43:46 UTC
Every boot here too :


Feb 06 21:20:33 bookcase systemd[1]: Started Machine Check Exception Logging Daemon.
Feb 06 21:20:33 bookcase mcelog[535]: mcelog: ERROR: AMD Processor family 21: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Feb 06 21:20:33 bookcase mcelog[535]: CPU is unsupported
Feb 06 21:20:33 bookcase systemd[1]: mcelog.service: Main process exited, code=exited, status=1/FAILURE
Feb 06 21:20:33 bookcase systemd[1]: mcelog.service: Failed with result 'exit-code'.

Comment 18 Anton Maklakov 2019-04-25 10:10:25 UTC
I have the same thing on my Ryzen 7 2700X, every boot.

Apr 25 16:52:19 slug mcelog[3250]: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor.  Please use the edac_mce_amd module instead.
Apr 25 16:52:19 slug mcelog[3250]: CPU is unsupported


calling "modprobe edac_mce_amd" doesn't help

Comment 19 Ben Cotton 2019-10-31 19:33:42 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 20 Ben Cotton 2019-11-27 22:07:02 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 21 Charlie Candergart 2020-10-21 11:00:51 UTC
I am experiencing this same issue in Fedora Server 32 now.
A completely fresh install.


12:50 PM  systemd: mcelog.service: Failed with result 'exit-code'.
12:50 PM  systemd: mcelog.service: Main process exited, code=exited, status=1/FAILURE
12:50 PM  mcelog: CPU is unsupported
12:50 PM  mcelog: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
12:50 PM  systemd: Started Machine Check Exception Logging Daemon.
12:50 PM  systemd: mcelog.service: Failed with result 'exit-code'.
12:50 PM  systemd: mcelog.service: Main process exited, code=exited, status=1/FAILURE
12:50 PM  mcelog: CPU is unsupported
12:50 PM  mcelog: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.
12:50 PM  Started Machine Check Exception Logging Daemon.

Comment 22 Kyle 2021-05-19 15:08:28 UTC
This is a funny thread. 
Same issue with Fedora 34 and "AMD Processor family 25" 
"Use the edac_mce_amd module instead" doesn't help me. The module is loaded but I see no logging.

Comment 23 Mike 2023-07-05 14:08:08 UTC
I have the same problem on Fedora 38 Server :-)
mcelog: ERROR: AMD Processor family 20: mcelog does not support this processor.  Please use the edac_mce_amd module instead.

# sudo modprobe edac_mce_amd
# sudo systemctl restart mcelog.service

× mcelog.service - Machine Check Exception Logging Daemon
     Loaded: loaded (/usr/lib/systemd/system/mcelog.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: exit-code) since Wed 2023-07-05 17:03:17 MSK; 2min 12s ago
   Duration: 242ms
  Condition: start condition failed at Wed 2023-07-05 17:05:20 MSK; 8s ago
             └─ ConditionPathExists=!/sys/module/edac_mce_amd/initstate was not met
    Process: 668 ExecStart=/usr/sbin/mcelog --daemon --foreground (code=exited, status=1/FAILURE)
   Main PID: 668 (code=exited, status=1/FAILURE)
        CPU: 8ms

Comment 24 Kyle 2023-07-05 15:03:27 UTC
Same error here, but while the "edac_mce_amd" module loads, "amd64_edac" does not: "No such device".
My boot says "AMD Ryzen 7 5800X 8-Core Processor (family: 0x19, model: 0x21, stepping: 0x0)"
The systemctl error message makes no sense. File is definitely there:
#  cat  /sys/module/edac_mce_amd/initstate 
live
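
For context on the condition shown in comment 23: in systemd unit files, a `!` prefix on the path in `ConditionPathExists=` negates the test, so the condition is met only when the path does not exist. The sketch below mimics that check in shell. `negated_path_condition` is a hypothetical name, and this only approximates how systemd evaluates the condition.

```shell
#!/bin/sh
# Hypothetical helper mimicking ConditionPathExists=!PATH:
# the condition is met only when PATH does NOT exist.
negated_path_condition() {
    if [ -e "$1" ]; then
        echo "not met"
    else
        echo "met"
    fi
}
```

Under that reading, `negated_path_condition /sys/module/edac_mce_amd/initstate` prints `not met` whenever the file is present, which would be consistent with the "was not met" line in the status output.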

