Bug 621669

Summary: mcelog: Unknown Intel CPU type for Xeon X5650
Product: Red Hat Enterprise Linux 5
Reporter: Brian Pitts <brian>
Component: mcelog
Assignee: Prarit Bhargava <prarit>
Status: CLOSED ERRATA
QA Contact: Evan McNabb <emcnabb>
Severity: medium
Docs Contact:
Priority: low
Version: 5.5
CC: cdrh, fred.new2911, havard.moen, jaeshin, matthias, m.c.dixon, mpoole, sia, simon.fayer05, spurrier, vince.borrego, xiaomingzone
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The mcelog daemon shipped with Red Hat Enterprise Linux 5 does not support all processors. Previously, mcelog did not check whether the system is supported or not before adding a cronjob. Consequent to this, an attempt to use it on an unsupported system caused the following email message to be sent to a system administrator every hour:

  mcelog: Unknown Intel CPU type family [cpu_family] model [model]

With this update, mcelog has been adapted to ensure that the system is supported before adding a cronjob, so that system administrators no longer receive these messages.
Last Closed: 2011-03-23 08:59:41 UTC
Type: ---
Attachments:
Description                 Flags
RHEL5 fix for this issue    none
mcelog.spec patch           none

Description Brian Pitts 2010-08-05 19:06:01 UTC
Description of problem:
mcelog.cron reports 'mcelog: Unknown Intel CPU type family 6 model 2c'.


Version-Release number of selected component (if applicable):
mcelog-0.9pre-1.29.el5

How reproducible:
Always.

Steps to Reproduce:
1. Install RHEL 5.5 on a server with an Intel X5650 CPU
  
Actual results:
Error message about unknown CPU type.

Expected results:
No error message about unknown CPU type.

Additional info:
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
stepping        : 2
cpu MHz         : 2660.082
cache size      : 12288 KB
physical id     : 1
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 32
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx pdpe1gb rdtscp lm constant_tsc ida nonstop_tsc arat pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5320.16
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: [8]
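
Note that the cron message gives the model as 2c while /proc/cpuinfo shows model 44: these are the same number, in hexadecimal and decimal respectively. A minimal Python sketch of that mapping follows. It assumes the message prints the model in hexadecimal (family 6 reads the same in either base); the helper names are hypothetical and this is not mcelog's own code.

```python
def parse_cpuinfo(text):
    """Parse 'key : value' lines from /proc/cpuinfo-style text into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def unknown_cpu_message(fields):
    """Rebuild the reported message, printing the model in hex (assumption)."""
    family = int(fields["cpu family"])
    model = int(fields["model"])
    return "mcelog: Unknown Intel CPU type family %x model %x" % (family, model)


# Excerpt of the cpuinfo output quoted in this report.
sample = """\
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
"""

print(unknown_cpu_message(parse_cpuinfo(sample)))
# prints: mcelog: Unknown Intel CPU type family 6 model 2c
```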

Comment 1 Fred New 2010-10-30 10:12:57 UTC
I'm also getting this error for a Xeon(R) X5670.  Is there a workaround, such as setting --cpu=cputype?

Comment 2 Fred New 2010-11-02 07:32:16 UTC
Upon closer inspection, it seems that I only receive this message when there are other problems encountered.  That is, in my server's /var/log/mcelog there are indications that I have a memory error.  When mcelog doesn't detect any machine checks, it doesn't email me the "unknown CPU type" message.

Comment 3 Matthias Saou 2010-12-10 09:54:37 UTC
Same here with a dual X5660. I see the same odd behavior: the following cron email arrives only once in a while (maybe twice a week or so), definitely not every hour, and it contains only that message, with nothing else reported:

From: root@foo (Cron Daemon)
To: root@foo
Subject: Cron <root@foo> run-parts /etc/cron.hourly
Date: Fri, 10 Dec 2010 09:01:01 +0100 (CET)

/etc/cron.hourly/mcelog.cron:

mcelog: Unknown Intel CPU type family 6 model 2c

Comment 4 Brian Pitts 2010-12-17 18:04:03 UTC
You're seeing the error message when a machine check exception occurs. The cron job runs every hour to check if there is anything new in /dev/mcelog and, if so, write a human-readable report to /var/log/mcelog.

Check if your server's manufacturer has released a BIOS update that incorporates the newest microcode from Intel for the X5600 series of CPUs; that may fix the memory errors.

mcelog still needs to be updated to recognize this series of CPUs.
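
The hourly behavior described in this comment can be sketched roughly as follows. This is an illustration in Python, not the actual mcelog.cron script: read_new_records is a hypothetical stand-in for reading and decoding /dev/mcelog, and the log path is a temporary file rather than /var/log/mcelog. It shows why nothing is logged (and no cron email is produced) in hours with no machine check events.

```python
import os
import tempfile


def hourly_check(read_new_records, log_path):
    """Append decoded records to log_path only when there is something new.

    read_new_records is a hypothetical callable standing in for reading and
    decoding /dev/mcelog; it returns a list of report strings.
    """
    records = read_new_records()
    if not records:
        return 0  # nothing new: no log entry and no output for cron to mail
    with open(log_path, "a") as log:
        for record in records:
            log.write(record + "\n")
    return len(records)


# Usage with stand-in data instead of the real device and log file.
log_path = os.path.join(tempfile.mkdtemp(), "mcelog")
quiet = hourly_check(lambda: [], log_path)
noisy = hourly_check(lambda: ["record one", "record two"], log_path)
print(quiet, noisy)
```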

Comment 5 Vince Borrego 2010-12-21 16:52:22 UTC
We are experiencing the same issue with Intel E5620 CPUs as well, on Dell R710 servers running the 2.2.10 BIOS.

Version: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz

According to the readme.txt for the 2.2.10 BIOS for the Dell R710, the following fixes were implemented, but unfortunately the error message "Unknown Intel CPU type family 6 model 2c" still appears.

From readme.txt for the 2.2.10 Dell R710 BIOS:

* Added support for an option to disable predictive memory failure reporting using the Deployment Toolkit.
* Added TXT-SX support
* Increased single-bit error logging threshold
* Ensure BIOS has enabled processor AES-NI before booting to the operating system.
* Updated the iDRAC Configuration Utility
* Updated the embedded 5709 UEFI driver to version 6.0.0
* Updated MRC
* Updated Intel(R) Xeon(R) Processor 5600 Series B1 stepping microcode (Patch ID=0x13)
* Added SR-IOV support
* Updated the embedded 5709C PXE/iSCSI option ROM to version 6.0.11

Comment 6 Prarit Bhargava 2011-01-21 12:32:48 UTC
RHEL5 has base support for the Intel Xeon 56xx series processors.  If you require MCE or RAS support for this processor, please upgrade to RHEL6 which has full MCE support.

P.

Comment 7 Prarit Bhargava 2011-01-21 12:34:15 UTC
Actually ... I'm reopening this.  I see the issue is that the mcelog cron job is filling the log & root email with error messages.

I'll see what I can do about cleaning that up.

P.

Comment 8 clive darra 2011-01-21 14:48:29 UTC
MANY THANKS !

Comment 10 Prarit Bhargava 2011-01-24 16:47:51 UTC
Created attachment 474989 [details]
RHEL5 fix for this issue

Comment 11 Prarit Bhargava 2011-01-24 16:48:59 UTC
Created attachment 474991 [details]
mcelog.spec patch

Comment 14 Mark Dixon 2011-02-08 11:04:17 UTC
I don't understand this.

RHEL5 is supposed to remain in production phase 1 until the end of the year. Why won't Red Hat support new Intel processors on it?

Comment 15 Prarit Bhargava 2011-02-08 14:08:35 UTC
(In reply to comment #14)
> I don't understand this.
> 
> RHEL5 is supposed to remain in production phase 1 until the end of the year.
> Why won't Red Hat support new Intel processors on it?

RHEL5 continues to support the latest Intel processors but does not support advanced RAS (i.e., MCE) features for those processors.

P.

Comment 23 Jaromir Hradilek 2011-02-16 12:56:45 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
The mcelog daemon shipped with Red Hat Enterprise Linux 5 does not support all processors. Previously, mcelog did not check whether the system is supported or not before adding a cronjob. Consequent to this, an attempt to use it on an unsupported system caused the following email message to be sent to a system administrator every hour:

  mcelog: Unknown Intel CPU type family [cpu_family] model [model]

With this update, mcelog has been adapted to ensure that the system is supported before adding a cronjob, so that system administrators no longer receive these messages.
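
As a rough illustration of the check described in this note, the sketch below decides whether to set up the cron job based on a table of supported CPUs. It is written in Python under stated assumptions: the function name, the table, and its single entry are hypothetical, and the actual fix is in the attached patches, which are not reproduced here.

```python
def should_install_cron(vendor, family, model, supported):
    """Return True only if (vendor, family, model) is in the supported set."""
    return (vendor, family, model) in supported


# Hypothetical table for illustration only; not mcelog's real list.
SUPPORTED = {("ExampleVendor", 1, 1)}

# The CPU from this report (family 6, model 0x2c) is not in the table,
# so under this sketch no cron job would be added for it.
print(should_install_cron("GenuineIntel", 6, 0x2C, SUPPORTED))
print(should_install_cron("ExampleVendor", 1, 1, SUPPORTED))
```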

Comment 24 Brian Pitts 2011-02-16 14:17:24 UTC
Wait, does this mean that the information generated in /var/log/mcelog by the mcelog cronjob on a Xeon X5650 is not valid?

Comment 25 Prarit Bhargava 2011-02-16 14:28:51 UTC
(In reply to comment #24)
> Wait, does this mean that the information generated in /var/log/mcelog by the
> mcelog cronjob on a Xeon X5650 is not valid?

Brian,

For errors reported in the first six MCE banks the data is valid.

The problem is that errors reported in the upper MCE banks will NOT be reported back to the user.

P.
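
The bank distinction above can be sketched as a simple split. This Python example assumes banks are numbered from zero, so "the first six" means banks 0 through 5; the function name and the sample bank numbers are hypothetical.

```python
DECODED_BANKS = 6  # per the comment above: only the first six banks are valid


def split_by_bank(bank_numbers, decoded_banks=DECODED_BANKS):
    """Split bank numbers into those reported and those silently dropped."""
    reported = [b for b in bank_numbers if b < decoded_banks]
    dropped = [b for b in bank_numbers if b >= decoded_banks]
    return reported, dropped


# Sample bank numbers, made up for illustration.
reported, dropped = split_by_bank([0, 3, 5, 6, 8])
print(reported, dropped)
# prints: [0, 3, 5] [6, 8]
```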

Comment 27 errata-xmlrpc 2011-03-23 08:59:41 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0377.html