Bug 1398698 - microcode_ctl update causes duplicate package problems and non-bootable system
Summary: microcode_ctl update causes duplicate package problems and non-bootable system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: microcode_ctl
Version: 7.3
Hardware: All
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Petr Oros
QA Contact: Rachel Sibley
URL:
Whiteboard:
Duplicates: 1397567 1405247
Depends On:
Blocks: 1382443 1402147 1402512
 
Reported: 2016-11-25 15:55 UTC by Blair Aitken
Modified: 2018-01-22 12:35 UTC (History)

Fixed In Version: microcode_ctl-2.1-18.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1402147 1402512
Environment:
Last Closed: 2017-08-01 20:18:17 UTC




Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2017:1851 | normal | SHIPPED_LIVE | microcode_ctl bug fix and enhancement update | 2017-08-01 18:22:41 UTC
Red Hat Bugzilla | 1397567 | None | CLOSED | Updating / reinstalling 2:microcode_ctl-2.1-16.el7.x86_64 causes a hard crash on UCS B200 blade | 2019-10-18 02:58:21 UTC
Red Hat Knowledge Base (Solution) | 2787471 | None | None | None | 2016-12-19 13:20:08 UTC

Internal Links: 1397567

Description Blair Aitken 2016-11-25 15:55:19 UTC
Description of problem:
=======================
Multiple customers are opening cases in which a RHEL 7 system is updated to 7.3 and the update is interrupted during the microcode_ctl package update. Because the update attempts to rebuild every initramfs, the interruption can leave the system unable to boot into one or more of its kernels. It also leaves an unfinished yum transaction, which puts the system's RPM database in an inconsistent state.
 
This is observed in the duplicate bugzilla #1397567
 
  Updating / reinstalling 2:microcode_ctl-2.1-16.el7.x86_64 causes a hard crash on UCS B200 blade
  https://bugzilla.redhat.com/show_bug.cgi?id=1397567
 
Version-Release number of selected component (if applicable):
=============================================================
microcode_ctl-2.1-16.el7
 
How reproducible:
=================
Intermittently
 
Steps to Reproduce:
====================
1. Install multiple kernel revisions
2. Update microcode_ctl
3. Verify the length of time needed to complete the transaction
 
Actual results:
===============
* failed yum transaction
* duplicate packages problem
* multiple kernels, or all, do not boot due to problems with initramfs
 
Expected results:
=================
* completed yum transaction
* no duplicate packages
* all kernels booting
 
Additional info:
================
* This issue has been extremely difficult to reproduce; it has only been directly observed by end users. Some instances seem to be due to users forcing a restart because the system appeared to stall during the update. Others resulted from a reboot triggered by external sources because of the exceptionally long time the update takes. The increase in the time needed to update this package was introduced with the bug report below:
 
        Bug 1292158 – postinstall scriptlet rebuilds initramfs only for the running kernel
        https://bugzilla.redhat.com/show_bug.cgi?id=1292158

Comment 3 Robin Edser 2016-12-01 12:08:01 UTC
Regarding reproducibility, we are currently affected by this on multiple production HP BL460c Gen9 servers. This is reproducible on demand:

1) Install RHEL 7.2
2) Perform a yum update
3) Server halts at "Updating : 2:microcode_ctl-2.1-16.el7.x86_64"
4) Within a minute or so server crashes, reboots and is inaccessible.

Comment 4 Petr Oros 2016-12-01 13:03:27 UTC
Hi,

How many kernels did you have installed on RHEL 7.2 before the yum update?
Which CPU is in the server? (cpu family/model from /proc/cpuinfo)
How did you run the update (over ssh, directly)?

Thanks
-Petr

Comment 5 Robin Edser 2016-12-01 13:12:36 UTC
Hi Petr

One kernel installed:

~ # rpm -q kernel
kernel-3.10.0-327.el7.x86_64

CPU info:

~ # lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2643 v4 @ 3.40GHz
Stepping:              1
CPU MHz:               3400.000
BogoMIPS:              6819.94
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

Update performed by SSH, then "yum upgrade"

Thanks

Robin

Comment 6 Petr Oros 2016-12-01 14:40:58 UTC
What exactly do you mean by "server crashes, reboots and is inaccessible"?
The microcode_ctl update needs some time to regenerate the initramfs images. Is it possible that you just lost the ssh connection, so the yum transaction ended unfinished?
In that case, the update can break the initramfs image and the system will not boot.

Thanks,
-Petr

Comment 8 Stanislav Kozina 2016-12-01 14:57:07 UTC
Robin,

Thank you for the detailed report on the reproducer. Can you please also provide the log from the console? Does the kernel really crash due to a kernel problem (i.e. there's a backtrace printed on the console), or is it killed by the watchdog, for example?

The installation of the microcode_ctl package currently performs two things:
1) First, load the provided microcode into the CPU
2) Then, re-generate all initramfs images on /boot partition. This is because the CPU microcode also needs to be updated in the initramfs environment. This can take a while.

We need to identify which of these steps is causing issues. Thank you!
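
To get a feel for how much work step 2 involves on a given machine, the rebuild commands can be listed without running them. The following is only a sketch, not the package's actual scriptlet: the function name is hypothetical, and it assumes one directory per installed kernel under /lib/modules and the usual /boot/initramfs-<version>.img naming.

```shell
#!/bin/sh
# Dry run: print the dracut command that would rebuild each kernel's initramfs.
# The modules directory is a parameter so the function can be pointed at a
# test tree; /lib/modules is the assumed default on a real system.
list_initramfs_rebuilds() {
    modules_dir=${1:-/lib/modules}
    for dir in "$modules_dir"/*/; do
        [ -d "$dir" ] || continue      # glob matched nothing
        kver=$(basename "$dir")
        # Nothing is executed here; the command is only printed.
        printf 'dracut -f /boot/initramfs-%s.img %s\n' "$kver" "$kver"
    done
}
```

Each printed line corresponds to one initramfs regeneration, so the number of lines gives a rough idea of why the transaction can take a long time on systems with many kernels installed.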

Comment 9 Robin Edser 2016-12-01 16:28:03 UTC
Hi Petr, Stanislav

Sorry, by crash I mean it "instantly resets" as if one pressed the physical hard reset button on the server. No backtrace is printed.

I've just tried updating only the microcode_ctl package on its own and it reproduces the problem (which is handy because it still allows me to boot back into the server). During a full upgrade from 7.2 -> 7.3 it left the system without an initramfs image for the new kernel, a large unfinished yum transaction and a pretty broken system. Now I can boot back into the system no problem, so it seems the problem is occurring during step 1. You can even remove and reinstall microcode_ctl-2.1-16.el7.x86_64 to reproduce.

I've also done it now directly in the iLO console and saved a recording of the steps if it helps: https://www.dropbox.com/s/v73qzfntqo83n5h/microcode_ctl.ilo?dl=0

Robin

Comment 10 Stanislav Kozina 2016-12-01 17:06:37 UTC
Hi Robin,

Thank you for the information on the hard reset, that's indeed very interesting. So far we've tried (unsuccessfully) to reproduce the problem on the same CPU model which you're running. We haven't tried reproducing on the same system so far.

Unfortunately we are unable to open the iLO console dump you kindly provided. It just looks like a random binary file. Could you please advise how to open this dump, or extract the raw console text out of it for us?

Thank you!
-Stanislav

Comment 11 Robin Edser 2016-12-01 18:26:16 UTC
Hi Stanislav

You can replay the iLO file with the standalone console (for windows unfortunately) - http://h20564.www2.hpe.com/hpsc/swd/public/detail?swItemId=MTX_4f842ceb31cf48d392e22705a8

Basically there's nothing interesting to see though, here's a copy of the commands and output:

bl460c ~ # rpm -q kernel
kernel-3.10.0-327.el7.x86_64

bl460c ~ # lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2643 v4 @ 3.40GHz
Stepping:              1
CPU MHz:               3400.000
BogoMIPS:              6806.86
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23
bl460c ~ # yum list installed microcode_ctl
Loaded plugins: aliases, changelog, langpacks, ovl, product-id, rhnplugin, search-disabled-repos, subscription-manager,
              : tmprepo, verify, versionlock
This system is receiving updates from RHN Classic or Red Hat Satellite.
Installed Packages
microcode_ctl.x86_64                                 2:2.1-12.el7                                 @rhel-x86_64-server-7

bl460c ~ # yum update -y microcode_ctl
Loaded plugins: aliases, changelog, langpacks, ovl, product-id, rhnplugin, search-disabled-repos, subscription-manager,
              : tmprepo, verify, versionlock
This system is receiving updates from RHN Classic or Red Hat Satellite.
Resolving Dependencies
--> Running transaction check
---> Package microcode_ctl.x86_64 2:2.1-12.el7 will be updated
---> Package microcode_ctl.x86_64 2:2.1-16.el7 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

=======================================================================================================================
 Package                     Arch                 Version                     Repository                          Size
=======================================================================================================================
Updating:
 microcode_ctl               x86_64               2:2.1-16.el7                rhel-x86_64-server-7               744 k

Transaction Summary
=======================================================================================================================
Upgrade  1 Package

Total download size: 744 k
Downloading packages:
No Presto metadata available for rhel-x86_64-server-7
microcode_ctl-2.1-16.el7.x86_64.rpm                                                             | 744 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : 2:microcode_ctl-2.1-16.el7.x86_64                                                                   1/2 


*** As soon as it gets here it's just a couple of seconds or so and server hard resets O_o ****



Robin

Comment 12 vishal agrawal 2016-12-01 23:25:51 UTC
Hello Robin,

I am Vishal Agrawal. Since you are able to reproduce the issue, could you please try the steps below?

1) Remove the currently installed microcode_ctl rpm

# yum remove microcode_ctl

2) Download the microcode_ctl rpm locally "microcode_ctl-2.1-16.el7.x86_64"

# yumdownloader microcode_ctl

3) Now install the rpm using below command:

# rpm -ivh microcode_ctl-2.1-16.el7.x86_64.rpm -vv

If the system gets hung, reset it and again remove the rpm

# yum remove microcode_ctl

Next, install the same rpm again using '--noscripts'

# rpm -ivh --noscripts microcode_ctl-2.1-16.el7.x86_64.rpm -vv

It should probably install without any issue.

Once done, execute the command below:

# systemctl preset microcode.service >/dev/null 2>&1 || :

Next,

# echo 1 > /sys/devices/system/cpu/microcode/reload

The last command will probably hang the box. I tried it on a customer's UCS B200 M4
box and it hung.

Thanks,

- Vishal Agrawal.

Comment 13 Robin Edser 2016-12-02 03:39:16 UTC
Hi Vishal

1) Remove the currently installed microcode_ctl rpm
# yum remove microcode_ctl
2) Download the microcode_ctl rpm locally "microcode_ctl-2.1-16.el7.x86_64"
# yumdownloader microcode_ctl
3) Now install the rpm using below command:
# rpm -ivh microcode_ctl-2.1-16.el7.x86_64.rpm -vv


- As you said, this resulted in the hard reset, last few lines shown in the console:

D: adding 1 entries to Sigmd5 index.
D: adding "ad61fa9edbcf0401a54dda470cd9ddbc768963d5" to Sha1header index.
D: %post(microcode_ctl-2:2.1-16.el7.x86_64): scriptlet start
D: %post(microcode_ctl-2:2.1-16.el7.x86_64): execv(/bin/sh) pid 4371
+ '[' 1 -eq 1 ']'
+ systemctl preset microcode.service



Next, install the same rpm again using '--noscripts'
# rpm -ivh --noscripts microcode_ctl-2.1-16.el7.x86_64.rpm -vv
It should probably install without any issue.
Once done, execute the command below:
# systemctl preset microcode.service >/dev/null 2>&1 || :
Next,
# echo 1 > /sys/devices/system/cpu/microcode/reload


- Exactly as you said here too. The last command hard reset the server. It didn't even print the command on the console:

D: closed   db index       /var/lib/rpm/Basenames
D: closed   db index       /var/lib/rpm/Name
D: closed   db index       /var/lib/rpm/Packages
D: closed   db environment /var/lib/rpm
# systemctl preset microcode.service >/dev/null 2>&1 || :
# 



Thanks

Robin

Comment 16 Petr Oros 2016-12-02 09:03:28 UTC
Hi Robin,
Here is a build for testing:

http://people.redhat.com/poros/38af5c54926b620264ab1501150cf189/


The build contains the latest microcode (20161104) and a fix for initramfs regeneration.
Please test it and let me know.

Thanks,
-Petr

Comment 18 Robin Edser 2016-12-03 12:05:39 UTC
Hi Petr

Sorry for the delay. I've managed to test the updated rpm (microcode_ctl-2.1-17.bz1397567.el7.x86_64.rpm) and unfortunately the problem is the same.

I've verified on 2 machines using both yum and the rpm --noscripts method posted by Vishal.

Please let me know if you need any more information.

Thanks

Robin

Comment 21 Stephan Dühr 2016-12-06 08:54:12 UTC
I just wanted to add that this issue also appears to depend on the BIOS version. I've seen it on an HP ProLiant DL360 Gen9 with a Xeon E5-2637 v4 and BIOS version P89 07/18/2016, but after updating the BIOS to P89 09/13/2016 it did not happen.

Comment 22 Petr Oros 2016-12-06 09:14:18 UTC
Which microcode version for the Intel CPU is in the firmware (for both 09/13/2016 and 07/18/2016)?

Comment 24 Petr Oros 2016-12-07 07:31:44 UTC
You can check it with this command:
$ dmesg | grep -i microcode
Please run it before the upgrade on an affected system.
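
On machines with many logical CPUs the output of that command is long and repetitive. As a convenience, the per-CPU lines can be collapsed into one line per distinct revision. This is a minimal sketch; the function name is hypothetical and it assumes the "microcode: CPUn ... revision=0x..." line format produced by the RHEL 7 kernel.

```shell
#!/bin/sh
# Reads "dmesg | grep -i microcode" style lines on stdin and prints each
# distinct revision followed by the number of CPUs reporting it.
summarize_microcode_revisions() {
    # Keep only the per-CPU lines and extract the revision value.
    sed -n 's/.*microcode: CPU[0-9]*.*revision=\(0x[0-9a-fA-F]*\).*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
# Usage (not run here):
#   dmesg | grep -i microcode | summarize_microcode_revisions
```

A single output line means every CPU reports the same revision; more than one line would indicate mixed revisions across CPUs.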

Thanks,
-Petr

Comment 25 Robin Edser 2016-12-07 07:38:52 UTC
Hi Petr

Here is the output from both our affected systems:

bl460c01 ~ # dmesg | grep -i microcode
[    0.976566] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976569] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976574] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976579] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976584] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976591] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976598] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976605] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976611] microcode: CPU8 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976616] microcode: CPU9 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976622] microcode: CPU10 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976628] microcode: CPU11 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976633] microcode: CPU12 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976638] microcode: CPU13 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976643] microcode: CPU14 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976648] microcode: CPU15 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976652] microcode: CPU16 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976659] microcode: CPU17 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976664] microcode: CPU18 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976669] microcode: CPU19 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976674] microcode: CPU20 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976685] microcode: CPU21 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976690] microcode: CPU22 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976695] microcode: CPU23 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.976723] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

bl460c02 ~ # dmesg | grep -i microcode
[    0.970303] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970306] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970311] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970316] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970321] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970328] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970335] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970341] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970347] microcode: CPU8 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970353] microcode: CPU9 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970359] microcode: CPU10 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970365] microcode: CPU11 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970370] microcode: CPU12 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970375] microcode: CPU13 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970380] microcode: CPU14 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970385] microcode: CPU15 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970389] microcode: CPU16 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970395] microcode: CPU17 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970400] microcode: CPU18 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970405] microcode: CPU19 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970410] microcode: CPU20 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970415] microcode: CPU21 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970420] microcode: CPU22 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970425] microcode: CPU23 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.970448] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Thanks

Robin

Comment 27 Stephan Dühr 2016-12-07 09:02:24 UTC
(In reply to Petr Oros from comment #22)
> Which microcode version for intel cpu  is in firmware? (in both 09/13/2016
> and 07/18/2016)

HP ProLiant DL360 Gen9 Intel(R) Xeon(R) CPU E5-2637 v4 @ 3.50GHz

Bios HP P89 07/18/2016
$ dmesg | grep -i microcode
[    0.589312] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589317] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589322] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589327] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589332] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589336] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589341] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589345] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb00001a
[    0.589363] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Bios HP P89 09/13/2016
$ dmesg | grep -i microcode
[    2.420883] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420887] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420893] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420898] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420904] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420909] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420915] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420920] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb00001e
[    2.420937] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
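
The two BIOS versions differ only in the reported revision (0xb00001a versus 0xb00001e). Revisions are hexadecimal numbers, so they can be compared with shell arithmetic rather than by eye. A small sketch with a hypothetical helper name:

```shell
#!/bin/sh
# Succeeds (exit status 0) when the first revision is numerically higher
# than the second. Arguments are hex strings such as 0xb00001e.
revision_is_newer() {
    [ "$(( $1 ))" -gt "$(( $2 ))" ]
}
# Usage (not run here):
#   revision_is_newer 0xb00001e 0xb00001a && echo "first revision is newer"
```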

Comment 28 Maxime Veroone 2016-12-07 17:16:33 UTC
Same behaviour on a freshly kickstarted (only one kernel + rescue) RHEL 7.2 on a Dell PowerEdge R630 
$ dmesg | grep -i microcode
[    1.156154] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156158] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156163] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156173] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156178] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156184] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156190] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156195] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156201] microcode: CPU8 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156206] microcode: CPU9 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156212] microcode: CPU10 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156218] microcode: CPU11 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156223] microcode: CPU12 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156229] microcode: CPU13 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156234] microcode: CPU14 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156243] microcode: CPU15 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156248] microcode: CPU16 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156254] microcode: CPU17 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156259] microcode: CPU18 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156264] microcode: CPU19 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156269] microcode: CPU20 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156273] microcode: CPU21 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156278] microcode: CPU22 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156283] microcode: CPU23 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156288] microcode: CPU24 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156293] microcode: CPU25 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156297] microcode: CPU26 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156305] microcode: CPU27 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156310] microcode: CPU28 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156315] microcode: CPU29 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156319] microcode: CPU30 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156324] microcode: CPU31 sig=0x406f1, pf=0x1, revision=0xb000017
[    1.156345] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Comment 30 Petr Oros 2016-12-12 14:59:21 UTC
Hi Maxime,
Can you provide more info?
I need the /proc/cpuinfo output and the BIOS version of your Dell PowerEdge R630.

Thanks,
-Petr

Comment 31 Maxime Veroone 2016-12-12 15:10:32 UTC
Hi, bear with me, I only copied the /proc/cpuinfo of a single core, but that's 2 physical CPUs, with 8 cores each + hyperthreading. (so 32 logical cores)

BIOS Version : 2.1.7
Microcode version : 2.30.30.30


# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 79
model name      : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
stepping        : 1
microcode       : 0xb000017
cpu MHz         : 1839.875
cache size      : 25600 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 20
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc
bogomips        : 6400.58
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Comment 32 Petr Oros 2016-12-16 11:49:22 UTC
Hi,
Could you please check this package on an affected system:
http://people.redhat.com/poros/e6a8cc19d733b1526c2519fbaa1e231d/
Follow these steps:
1) You need to have microcode_ctl-2.1-12 or older installed, or no microcode_ctl at all

2) Run this command and save the output:
   $ dmesg | grep -i microcode

3) Install the provided package

4) Run this command and save the output:
   $ dmesg | grep -i microcode

5) Reboot

6) Run this command and save the output:
   $ dmesg | grep -i microcode

After these steps, please send the saved command output.
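
The three captures above are easier to compare if they end up in one file with a label per stage. The following is only a sketch of one way to do that; the function name, the labels and the output file name are all hypothetical.

```shell
#!/bin/sh
# Appends a labelled section to a results file. The text to record is read
# from stdin, so any command's output can be piped in.
record_stage() {
    label=$1
    outfile=$2
    {
        printf '=== %s ===\n' "$label"
        cat        # pass the piped-in lines through unchanged
    } >> "$outfile"
}
# Usage (not run here), once per stage:
#   dmesg | grep -i microcode | record_stage "before install" test.out
#   dmesg | grep -i microcode | record_stage "after install"  test.out
#   dmesg | grep -i microcode | record_stage "after reboot"   test.out
```

Because the file is opened in append mode, the sections written before the reboot are still there when the last stage is recorded afterwards.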

Thanks,
-Petr

Comment 33 vishal agrawal 2016-12-16 19:02:11 UTC
Hello Petr,

I have good news from my customer with your new test package.


Customer's update

~~~~~
Before installing the new RPM, there was no output to the console or test.out file. Here is the output after installing the RPM and rebooting...

# cat test.out
microcode_ctl-2.1-18.el7.x86_64
[   28.236688] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.236690] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.236693] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.236696] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.280868] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.280872] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.280875] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.280878] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb000014
[   28.280891] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
After Reboot
microcode_ctl-2.1-18.el7.x86_64
[    0.000000] microcode: microcode updated early to revision 0xb00001d, date = 2016-06-06
[   29.889075] microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889085] microcode: CPU1 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889092] microcode: CPU2 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889102] microcode: CPU3 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889111] microcode: CPU4 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889121] microcode: CPU5 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.889132] microcode: CPU6 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.931667] microcode: CPU7 sig=0x406f1, pf=0x1, revision=0xb00001d
[   29.931707] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba

Looks like it updated, and the Cisco UCS blade did not crash during the process.
~~~~~

Let me know if you require any other outputs from the same system.

Thank you,

- Vishal Agrawal.

Comment 35 Stanislav Kozina 2016-12-20 08:32:59 UTC
*** Bug 1397567 has been marked as a duplicate of this bug. ***

Comment 36 Maxime Veroone 2016-12-22 15:59:11 UTC
Hi,

We have successfully installed the 2.1-18 version provided above on a configuration previously affected by this bug on version 2.1-16.

 revision=0xb000014 before update
 revision=0xb00001d after (by early update)

Comment 37 Michael 2017-01-04 18:49:20 UTC
Is it safe to deploy microcode_ctl-2.1-18 to customer systems?  When will this propagate to CentOS?

Comment 39 James Hartig 2017-01-10 21:50:29 UTC
According to the Red Hat solution, microcode_ctl-2.1-16.1.el7_3.x86_64 fixes this issue, but after we upgraded, microcode.service fails to start with the following error:
[/usr/lib/systemd/system/microcode.service:10] Trailing garbage, ignoring.
microcode.service lacks both ExecStart= and ExecStop= setting. Refusing.

The ExecStart in the service file contains double quotes inside double quotes:
ExecStart=/usr/bin/bash -c "grep -l GenuineIntel /proc/cpuinfo | xargs grep -l "model.*79" > /dev/null || echo 1 > /sys/devices/system/cpu/microcode/reload".

We're running CentOS 7.3.1611.

Comment 40 Stanislav Kozina 2017-01-11 08:42:15 UTC
James,
This is a known problem and is harmless; the microcode is still updated. Please see bz1411232. Thanks.

Comment 43 Stanislav Kozina 2017-01-24 12:59:51 UTC
*** Bug 1405247 has been marked as a duplicate of this bug. ***

Comment 44 marcindulak 2017-02-22 10:13:33 UTC
another related issue to watch: kickstart installation of 7.3 hangs on microcode_ctl-2.1-16.el7.x86_64

Comment 45 Stanislav Kozina 2017-06-05 13:00:20 UTC
We have identified a workaround for installing rhel-7.3 GA on one of the affected systems. If a kickstart file is used for the installation, it's possible to add additional repositories using the 'repo' command:

http://pykickstart.readthedocs.io/en/latest/kickstart-docs.html#id44

If a repository containing a fixed microcode_ctl package (for example from rhel-7.3.z stream) is configured in the kickstart file for the installation, Anaconda will install the updated microcode_ctl package instead of the one present on the rhel-7.3 GA and thus it's possible to install the system without any issues.
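
As an illustration of the workaround, the extra line in the kickstart file could be generated like this. This is a sketch only: the helper name is hypothetical, and the repository name and URL in the usage comment are placeholders that must be replaced with a repository that really carries the fixed microcode_ctl package.

```shell
#!/bin/sh
# Prints a kickstart "repo" line for an additional installation repository.
# $1 is the repository name, $2 is its base URL; both are chosen by the caller.
kickstart_repo_line() {
    printf 'repo --name=%s --baseurl=%s\n' "$1" "$2"
}
# Usage (not run here; name and URL are placeholders):
#   kickstart_repo_line rhel-7.3.z-updates http://repo.example.com/rhel-7.3.z/ >> ks.cfg
```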

Comment 46 Rachel Sibley 2017-06-05 16:12:37 UTC
Looks like the results are still the same for hardware availability to test:
https://bugzilla.redhat.com/show_bug.cgi?id=1402512#c8

I verified this as SanityOnly; I also see it was verified by the customer in comment 33.

$ rhpkg clone microcode_ctl
$ cd microcode_ctl/
$ rhpkg switch-branch rhel-7.4
$ rhpkg prep

* Fri Dec 16 2016 Petr Oros <poros@redhat.com> - 2.1-18
- Fix issue with hot microcode cpu reload.
- Resolves: #1398698

[rasibley@localhost microcode_ctl]$ git show 9cfdb7
commit 9cfdb7fb0a29d454536170fb78f0091cd0861158
Author: Petr Oros <poros@redhat.com>
Date:   Fri Dec 16 12:34:14 2016 +0100

    Fix issue with hot microcode cpu reload.
    
    Signed-off-by: Petr Oros <poros@redhat.com>

diff --git a/microcode_ctl.spec b/microcode_ctl.spec
index aa557d9..218907e 100644
--- a/microcode_ctl.spec
+++ b/microcode_ctl.spec
@@ -3,7 +3,7 @@
 Summary:        Tool to transform and deploy CPU microcode update for x86.
 Name:           microcode_ctl
 Version:        2.1
-Release:        17%{?dist}
+Release:        18%{?dist}
 Epoch:          2
 Group:          System Environment/Base
 License:        GPLv2+ and Redistributable, no modification permitted
@@ -46,6 +46,7 @@ install -m 644 %{SOURCE2} %{buildroot}/usr/lib/dracut/dracut.conf.d
 %systemd_post microcode.service
 # "reload" file is not presented on a certain virtualized hw
 if [ -f /sys/devices/system/cpu/microcode/reload ] ; then
+       grep -l GenuineIntel /proc/cpuinfo | xargs grep -l "model.*79" > /dev/null || \
        echo 1 > /sys/devices/system/cpu/microcode/reload
 fi
 
@@ -70,6 +71,10 @@ rm -rf %{buildroot}
 
 
 %changelog
+* Fri Dec 16 2016 Petr Oros <poros@redhat.com> - 2.1-18
+- Fix issue with hot microcode cpu reload.
+- Resolves: #1398698
+
 * Wed Nov 30 2016 Petr Oros <poros@redhat.com> - 2.1-17
 - Move dracut call into posttrans phase.
 - Resolves: #1398698
diff --git a/sources b/sources
index abbfa04..bf0ebf8 100644
--- a/sources
+++ b/sources
@@ -1,3 +1,3 @@
 b5410acbdc41239d461169a3f5cdb0e3  01-microcode.conf
 fc3d62be01333a1df032592cae3a9ca7  microcode_ctl-2.1-10.tar.xz
-81742055ca17ce1c4beba603dc9ff9e6  microcode.service
+a8a9857c6335db4dbec317f36c4ea561  microcode.service
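
The guard added to the scriptlet in the diff above skips the hot reload when /proc/cpuinfo reports GenuineIntel and matches "model.*79". The same idea can be sketched as a standalone function for checking a machine by hand. This is not the packaged code: the function name is hypothetical, and the match is deliberately stricter than the spec's pattern, which would also match a "79" appearing in the model name line.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the cpuinfo file describes an Intel CPU with
# "model : 79", i.e. the case in which the scriptlet skips the hot reload.
# The file path is a parameter so the check can be run against a saved copy.
should_skip_hot_reload() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -q '^vendor_id[[:space:]]*:[[:space:]]*GenuineIntel' "$cpuinfo" &&
        grep -Eq '^model[[:space:]]*:[[:space:]]*79[[:space:]]*$' "$cpuinfo"
}
# Usage (not run here):
#   if should_skip_hot_reload; then echo "hot reload would be skipped"; fi
```

On such a machine the new microcode is expected to be applied at the next boot through the early update path, as seen in the "microcode updated early" line in comment 33.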

Comment 47 errata-xmlrpc 2017-08-01 20:18:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1851

