Bug 1532283
| Summary: | System hang during boot following update to microcode_ctl-1.17-25.2.el6_9.x86_64 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Kyle Walker <kwalker> |
| Component: | microcode_ctl | Assignee: | Petr Oros <poros> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Rachel Sibley <rasibley> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.9 | CC: | cdonnell, ionut, jbastian, jpriddy, o.freyermuth, poros, riehecky, sfroemer, skozina, tgummels, tomek, vagrawal, wienemann, williamverzal1, woodard |
| Target Milestone: | rc | | |
| Target Release: | 6.10 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | microcode_ctl-1.17-25.3.el6_9.x86_64 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-06-21 12:45:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1425544 | | |

Same here after the update. Downgrading the microcode fixes the issue. Hardware platform is: Supermicro X10DDW-i with 2x E5-2667v4 CPUs stepping 1, BIOS 2.0a from 8/17/2016 (the latest available).

Incidentally, it appears that various kernels may or may not cause the hang to happen, and also the behavior is not consistent between machines with otherwise identical hardware. Case in point:

- on one machine with the above specs, booting kernel-2.6.32-696.16.1.el6.x86_64 causes a hard hang when loading the microcode.
- on the same machine, booting kernel-2.6.32-696.18.7.el6.x86_64 causes a hard reset when loading the microcode.
- on the same machine, booting a custom (locally built) kernel based on 3.10.107 boots up fine.

However:

- on another machine with the above specs, booting the custom 3.10.107-based kernel causes a hard hang.

Downgrading the microcode fixes the problem in all the problem cases encountered.

We also see this on downstream distros, e.g. CentOS 6, SL 6, CentOS 7 etc.

Checking the version of 06-4f-01, it seems the revision packaged in RHEL6 is 0xb000025, while the last revision released officially by Intel in the last microcode package from 2018-01-08 is 0xb000021. So I understand some pre-production version has been included and shipped to enterprise customers? Debian, Gentoo and others still ship 0xb000021 (as released officially by Intel)...

Guidance we've received from Intel directly suggests that they're aware of problems in the 0xb000025 firmware for Broadwell E/EP (as well as in other firmware revisions for other CPUs) and they recommend delaying the deployment of this firmware into production.

So we applied this on ~1300 servers last week (a mix of physical and virtual). What is the impact on ESX based OS instances?
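
For readers who want to repeat the revision check on 06-4f-01 mentioned in the comments above, the following is a minimal sketch, not a tool used by any of the commenters. It assumes the file is in Intel's binary microcode format, whose 48-byte header (as described in the Intel Software Developer's Manual) starts with nine little-endian 32-bit fields: header version, update revision, date, processor signature, checksum, loader revision, processor flags, data size and total size. The function name `read_ucode_header` is hypothetical, and the sketch reads only the first update in a file, although a file may hold several concatenated updates.

```python
import struct

# Nine little-endian uint32 fields followed by 12 reserved bytes: 48 bytes total.
HEADER = struct.Struct("<9I12x")


def read_ucode_header(blob):
    """Decode the first microcode update header found at the start of blob."""
    (hdrver, rev, date, sig, cksum, ldrver, pf,
     datasize, totalsize) = HEADER.unpack_from(blob, 0)
    return {
        "revision": rev,
        # The date field is BCD-encoded as 0xMMDDYYYY; render it as YYYY-MM-DD.
        "date": "%04x-%02x-%02x" % (date & 0xFFFF, date >> 24, (date >> 16) & 0xFF),
        "signature": sig,
        "processor_flags": pf,
    }
```

A possible use, assuming the firmware lives at the hypothetical path `/lib/firmware/intel-ucode/06-4f-01`, is to read the file in binary mode and print `hex(read_ucode_header(data)["revision"])`, then compare the result against the 0xb000021 and 0xb000025 values discussed above.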

Description of problem:
Following an update of the microcode_ctl package, a system hangs during boot with messages related to the microcode load operation. Observed at this time only on systems with the following CPU model information:

    $ awk '/model/||/stepping/||/family/||/microcode/ {if(!seen[$0]++) print $0}' proc/cpuinfo
    cpu family : 6
    model : 79
    model name : Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
    stepping : 1
    microcode : 184549407

Version-Release number of selected component (if applicable):
1:microcode_ctl-1.17-25.2.el6_9.x86_64

How reproducible:
Easily

Steps to Reproduce:
1. On a system with the above CPU information, install a base RHEL 6.9 deployment
2. Issue a "yum update"
3. Reboot

Actual results:

    <snip>
    microcode: CPU0 sig=0x406f1, pf=0x1, revision=0xb00001f
    platform microcode: firmware: requesting intel-ucode/06-4f-01
    <snip>
    ^- Hangs here with no further output visible

Expected results:
A normal boot operation with no hang observed.

Additional info:
A downgrade of the microcode to the previous revision resolves the hang.
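
The values in this report are the same facts in different notations: the decimal `microcode : 184549407` from cpuinfo equals the `revision=0xb00001f` in the boot log, and `sig=0x406f1` encodes family 6, model 79, stepping 1, which in turn gives the requested file name `intel-ucode/06-4f-01`. The sketch below shows that arithmetic, assuming the standard x86 CPUID signature layout (stepping in bits 0-3, model in bits 4-7, family in bits 8-11, extended model in bits 16-19, extended family in bits 20-27). The function names are hypothetical and are not part of microcode_ctl or the kernel.

```python
def decode_signature(sig):
    """Split an x86 CPUID signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is used only for base families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def ucode_filename(family, model, stepping):
    """Build the family-model-stepping firmware name, each part as two hex digits."""
    return "intel-ucode/%02x-%02x-%02x" % (family, model, stepping)


family, model, stepping = decode_signature(0x406F1)
print(family, model, stepping)                  # 6 79 1
print(ucode_filename(family, model, stepping))  # intel-ucode/06-4f-01
print(hex(184549407))                           # 0xb00001f
```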