Bug 814101

Summary: [RFE] microcode_ctl doesn't have microcode for AMD FX CPUs
Product: [Fedora] Fedora Reporter: Zoltan Boszormenyi <zboszor>
Component: microcode_ctl Assignee: Anton Arapov <anton>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 16 CC: antillon.maurizio, anton, jonathan, nobody
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-05 23:07:52 UTC Type: Bug

Description Zoltan Boszormenyi 2012-04-19 09:06:07 UTC
Description of problem:

A month ago, I upgraded my computer to an AMD FX-8120 CPU, an ASUS M5A99X-EVO motherboard and 32GB of DDR3-1600 RAM.

memtest86+ showed no RAM errors, but since the upgrade I have been seeing these strange symptoms:

1. When compiling some projects with "make -j8", I got signal 11 errors from GCC, about once in every 5 runs in the kernel source.
2. Corrupted data when copying or compressing large amounts of data. The size of the individual files does not seem to matter: I got errors in small files while copying or moving large source trees between filesystems. Once, when gzipping a 4GB disk image of a USB pendrive, the compressed file was corrupted and could not be decompressed.
3. Occasional kernel panics in 3.3.x while running these kernels:
kernel-3.3.0-8.fc16.x86_64
kernel-3.3.1-3.fc16.x86_64
kernel-3.3.1-5.fc16.x86_64

Signal 11 in GCC during "make -j8" showed up occasionally with all 3 BIOS versions I tried: 0813 (the version the motherboard was shipped with), 0901 and the latest, 1102.

I found this in the syslog:

Apr 18 19:03:35 localhost kernel: [   12.444977] microcode: CPU0: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.934369] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.934401] microcode: CPU1: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.938289] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.938344] microcode: CPU2: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.942047] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.942070] microcode: CPU3: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.945725] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.945817] microcode: CPU4: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.949358] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.949388] microcode: CPU5: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.953257] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.953348] microcode: CPU6: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.957163] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.957222] microcode: CPU7: patch_level=0x06000623
Apr 18 19:03:35 localhost kernel: [   12.961038] microcode: failed to load file amd-ucode/microcode_amd_fam15h.bin
Apr 18 19:03:35 localhost kernel: [   12.961174] microcode: Microcode Update Driver: v2.00 <tigran.co.uk>, Peter Oruba
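
Log output like the above can be condensed with a small shell helper. The following is only a sketch: the function name missing_microcode is made up for illustration, and it assumes the kernel message has exactly the "microcode: failed to load file <name>" form shown in this report.

```shell
# Hypothetical helper: count "failed to load file" microcode messages
# per firmware file. Reads kernel log lines on stdin and prints one
# "<count> <file>" line for each missing firmware file.
missing_microcode() {
    # Keep only the file name from each failure message, then count
    # how often each name occurs. awk strips the padding that
    # "uniq -c" puts in front of the count.
    sed -n 's/.*microcode: failed to load file \(.*\)$/\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}
```

It could be fed from the running kernel's log, for example with "dmesg | missing_microcode", or from a saved syslog file on stdin.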

I also found that the latest AMD microcode package at
    http://www.amd64.org/support/microcode.html
contains a file with the name the kernel expects above. This microcode fixes 7 CPU errata, 2 of which can hang the system and 1 of which can cause "unpredictable system behaviour, likely leading to a system hang".

I put microcode_amd_fam15h.bin into the expected directory and restarted the system. The syslog then showed this:

Apr 18 19:46:08 localhost kernel: [   10.744921] microcode: CPU0: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.287275] microcode: CPU0: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.305806] microcode: CPU1: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.310949] microcode: CPU1: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.318040] microcode: CPU2: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.322948] microcode: CPU2: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.346073] microcode: CPU3: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.357031] microcode: CPU3: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.369115] microcode: CPU4: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.374137] microcode: CPU4: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.397257] microcode: CPU5: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.414044] microcode: CPU5: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.420907] microcode: CPU6: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.425630] microcode: CPU6: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.448816] microcode: CPU7: patch_level=0x06000623
Apr 18 19:46:08 localhost kernel: [   11.476105] microcode: CPU7: new patch_level=0x06000624
Apr 18 19:46:08 localhost kernel: [   11.476645] microcode: Microcode Update Driver: v2.00 <tigran.co.uk>, Peter Oruba
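
To check that every core actually received the update, the old and new patch levels can be paired up per CPU. This is a rough sketch under the assumption that the messages look exactly like the ones in this report ("CPUn: patch_level=..." followed by "CPUn: new patch_level=..."); the function name patch_levels is hypothetical.

```shell
# Hypothetical helper: pair the old and new microcode patch level of
# each CPU. Reads kernel log lines on stdin and prints one line per CPU
# in the order the CPUs first appear.
patch_levels() {
    awk '
    /microcode: CPU[0-9]+: (new )?patch_level=/ {
        line = $0
        sub(/.*microcode: /, "", line)   # e.g. "CPU0: new patch_level=0x06000624"
        n = split(line, f, /[: =]+/)     # f[1] is the CPU, f[n] is the level
        cpu = f[1]
        if (line ~ /new patch_level/) {
            new[cpu] = f[n]
        } else {
            if (!(cpu in old)) order[++count] = cpu
            old[cpu] = f[n]
        }
    }
    END {
        # CPUs that only logged a "new" level are not listed.
        for (i = 1; i <= count; i++) {
            cpu = order[i]
            if (cpu in new) print cpu, old[cpu], "->", new[cpu]
            else            print cpu, old[cpu], "(not updated)"
        }
    }'
}
```

A CPU reported as "(not updated)" would indicate that the firmware file was not found or not applied for that core.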

I ran this script overnight in the kernel source:

    while true; do
        make clean >/dev/null
        make -j8 all >/dev/null
    done

so that only the stderr output from GCC was shown, and no signal 11 error occurred.
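
A variant of this loop that stops at the first failed build and reports which run failed makes an overnight test easier to read the next morning. The sketch below is an illustration, not the script that was actually run; the function name repeat_until_failure and the run limit are assumptions.

```shell
# Hypothetical helper: run a command repeatedly and stop at the first
# non-zero exit status.
# Usage: repeat_until_failure MAX_RUNS COMMAND [ARGS...]
repeat_until_failure() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        if ! "$@"; then
            # Report the failing iteration and give up.
            echo "run $run failed"
            return 1
        fi
        run=$((run + 1))
    done
    echo "all $max_runs runs passed"
}
```

In the kernel source this might be invoked as: repeat_until_failure 100 sh -c 'make clean >/dev/null && make -j8 all >/dev/null'. A compiler killed by signal 11 makes "make" exit with a non-zero status, which ends the loop.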

Version-Release number of selected component (if applicable):

microcode_ctl-1.17-20.fc16.x86_64

How reproducible:

Often but not always.

Steps to Reproduce:
1. Fedora 16
2. AMD FX CPU
3. make -j8 in the kernel or copy/compress large files, several GB in size.
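
The copy and compression part of step 3 can be checked mechanically instead of waiting for a corrupted file to be noticed. The following is a minimal sketch; the function names verified_copy and verified_gzip are made up for illustration, and it assumes the standard cp, cmp and gzip utilities are available.

```shell
# Hypothetical helpers for detecting silent data corruption.

# Copy SRC to DST, then compare the two files byte for byte.
# Returns non-zero if the copy fails or the contents differ.
verified_copy() {
    cp -- "$1" "$2" && cmp -s -- "$1" "$2"
}

# Compress FILE to FILE.gz (keeping the original), then decompress the
# result to a pipe and compare it with the original.
# Returns non-zero if the round trip does not reproduce FILE.
verified_gzip() {
    gzip -c -- "$1" > "$1.gz" && gzip -dc -- "$1.gz" | cmp -s - "$1"
}
```

Running either helper in a loop over files of several GB should, on an affected system, eventually return a non-zero status.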
  
Actual results:

Signal 11 or data corruption.

Expected results:

No errors.

Additional info:

Comment 1 Anton Arapov 2012-04-19 11:17:18 UTC
Yep. We don't have new AMD microcode in f16's package ... And that's weird.

I will push an update in a few.

Comment 2 Fedora Update System 2012-04-19 11:29:04 UTC
microcode_ctl-1.17-24.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/microcode_ctl-1.17-24.fc16

Comment 3 Fedora Update System 2012-04-22 03:27:58 UTC
Package microcode_ctl-1.17-24.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing microcode_ctl-1.17-24.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-6363/microcode_ctl-1.17-24.fc16
then log in and leave karma (feedback).

Comment 4 Fedora Update System 2012-06-05 23:07:52 UTC
microcode_ctl-1.17-24.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.