Bug 1652519 - host does not meet the cluster's minimum CPU level. Missing CPU features : spec_ctrl
Summary: host does not meet the cluster's minimum CPU level. Missing CPU features : sp...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: redhat-virtualization-host
Version: 4.2.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.3.1
Target Release: 4.3.0
Assignee: Yuval Turgeman
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-22 09:38 UTC by Kumar Mashalkar
Modified: 2020-02-25 13:18 UTC (History)
CC List: 17 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, during an upgrade, dracut running inside chroot did not detect the cpuinfo and the kernel config files because /proc was not mounted and /boot was bindmounted. As a result, the correct microcode was missing from the initramfs. The current release bindmounts /proc to the chroot and removes the --hostonly flag. This change inserts both AMD and Intel microcodes into the initramfs and boots the host after an upgrade.
Clone Of:
Environment:
Last Closed: 2019-05-08 12:32:19 UTC
oVirt Team: Node
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:1053 0 None None None 2019-05-08 12:32:38 UTC
oVirt gerrit 98033 0 master MERGED osupdater: remove hostonly from dracut 2020-07-25 06:44:40 UTC
oVirt gerrit 98035 0 master MERGED osupdater: bind mount /proc for dracut 2020-07-25 06:44:40 UTC
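
The two gerrit changes above correspond to the fix described in the Doc Text: bind-mount /proc into the chroot and stop passing --hostonly to dracut. A minimal sketch of that command sequence follows. It is not the actual osupdater code; the function name, the chroot path, and the /boot/initramfs-<kver>.img naming are assumptions used for illustration only. The commands are built as lists and not executed.

```python
def regenerate_initramfs_commands(chroot, kver, hostonly=False):
    """Build (but do not run) the commands to rebuild an initramfs in a chroot.

    Hypothetical helper: 'chroot' is the root of the new image layer and
    'kver' is the kernel version string. Both are illustrative parameters.
    """
    proc_target = chroot.rstrip("/") + "/proc"
    # Assumed default dracut image naming; adjust if the image lives elsewhere.
    initramfs = "/boot/initramfs-%s.img" % kver

    dracut = ["chroot", chroot, "dracut", "--force"]
    if hostonly:
        # Behaviour before the fix: host-specific image built without /proc.
        dracut.append("--hostonly")
    dracut += [initramfs, kver]

    return [
        # Make /proc/cpuinfo visible to dracut inside the chroot.
        ["mount", "--bind", "/proc", proc_target],
        dracut,
        ["umount", proc_target],
    ]
```

Each returned list could be passed to subprocess.run(cmd, check=True) as root. Whether the resulting image is fully generic when --hostonly is omitted also depends on the dracut configuration inside the chroot.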

Description Kumar Mashalkar 2018-11-22 09:38:57 UTC
Description of problem:
Host xxx moved to Non-Operational state as host does not meet the cluster's minimum CPU level. Missing CPU features : spec_ctrl

Version-Release number of selected component (if applicable):
imgbased-1.0.29-1.el7ev.noarch

How reproducible:
100% at the customer site

Steps to Reproduce:
1. Upgrade host to rhvh-4.2.7.3
2. Activate the host.
3. Host goes non-operational in cluster with IBRS.

Actual results:
Host goes to Non-Operational state due to missing CPU features : spec_ctrl

Expected results:
Host should be active.

Additional info:
This issue was reported in https://bugzilla.redhat.com/show_bug.cgi?id=1624453, but only one of the missing flags was resolved there; the spec_ctrl flag is still missing. As mentioned in that bug, I am opening a new Bugzilla here.

I can also see one upstream Bugzilla reporting the same issue: https://bugzilla.redhat.com/show_bug.cgi?id=1595378

Comment 1 Michal Skrivanek 2018-11-23 06:04:37 UTC
You need to add the same details and investigate as in the original bug. Otherwise this report is not very helpful.

Also, please add an SQL dump of the corresponding host and cluster tables. Thanks.

Comment 2 Ryan Barry 2018-11-23 06:49:27 UTC
To follow up, please provide, at a minimum, the output of lscpu, a sql dump or screenshot of the cluster CPU level, and the host CPU details as reported by RHVM

Comment 4 Michal Skrivanek 2018-11-23 08:47:21 UTC
It's still the same thing: the CPU is a plain Haswell-noTSX without any microcode update (record 3 in your list seems to be updated, record 4 does not).
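
For anyone checking this on a host, a minimal sketch of the flag check follows. It assumes the feature appears in the "flags" line of /proc/cpuinfo under the same name the engine reports (spec_ctrl); the function name is illustrative.

```python
def missing_cpu_flags(cpuinfo_text, required):
    """Return the required flags that are absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return sorted(set(required) - present)
    # No 'flags' line found: report every required flag as missing.
    return sorted(required)
```

On a host this could be called as missing_cpu_flags(open("/proc/cpuinfo").read(), ["spec_ctrl"]); an empty list means the flag is exposed by the running kernel and microcode.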

Comment 5 Ryan Barry 2018-11-28 23:25:01 UTC
rhvh 4.2.7 ships with microcode_ctl-2.1-47.el7.x86_64, which should be up to date unless some change in 7.6 disabled default mitigation.

Please post dmesg and the output of /proc/cmdline

Additionally, please ensure that 'rpm -q microcode_ctl' shows the version above

Comment 7 Michal Skrivanek 2018-12-10 11:30:43 UTC
You still do not have the latest microcode applied (the host is running 0x38); AFAICT the latest in that microcode_ctl is 0x3d.
Ryan, is it possible that it's wrong in the dracut image of rhvh? The microcode_ctl version looks OK; it just does not seem to be getting applied.
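
A minimal sketch of that comparison, assuming the running revision is read from the "microcode" lines of /proc/cpuinfo and the expected revision (0x3d here) is supplied by the caller. The function names are illustrative.

```python
def microcode_revisions(cpuinfo_text):
    """Collect the distinct microcode revisions reported across all CPUs."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # Values look like '0x38'; int() with base 16 accepts the prefix.
            revisions.add(int(value.strip(), 16))
    return revisions


def is_outdated(cpuinfo_text, expected):
    """True if any CPU reports a revision lower than the expected one."""
    return any(rev < expected for rev in microcode_revisions(cpuinfo_text))
```

With the values from this comment, is_outdated(text, 0x3d) is true for a host reporting 0x38 on any CPU.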

Comment 8 Ryan Barry 2018-12-10 11:53:07 UTC
It's possible, but would require user intervention.

Kumar, can the customer please try unpacking the initrd to verify which firmware files are present?
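
As a quick alternative to fully unpacking the image, the sketch below looks for the early-microcode entries by name. It is a heuristic only: it relies on early microcode being stored in an uncompressed cpio archive at the start of the initramfs, so the entry names kernel/x86/microcode/GenuineIntel.bin and kernel/x86/microcode/AuthenticAMD.bin appear as plain bytes. The function name is illustrative.

```python
MICROCODE_PATHS = (
    b"kernel/x86/microcode/GenuineIntel.bin",
    b"kernel/x86/microcode/AuthenticAMD.bin",
)


def early_microcode_entries(image_bytes):
    """Return the early-microcode entry names found in raw initramfs bytes.

    Heuristic byte search; it does not parse the cpio archive or validate
    the microcode blobs themselves.
    """
    return [path.decode() for path in MICROCODE_PATHS if path in image_bytes]
```

To use it, read the initramfs in binary mode and pass the bytes in. An empty result suggests the image carries no early microcode at all, though it does not prove it.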

Comment 9 Ryan Barry 2018-12-10 13:39:41 UTC
Moving back to Node, since the engine is doing what it's supposed to be doing

Comment 16 Huijuan Zhao 2018-12-20 06:44:01 UTC
cshao@ reproduced this issue in https://bugzilla.redhat.com/show_bug.cgi?id=1624453#c17
However, with the same RHVH versions (upgrading from rhvh-4.2.4.3-0.20180622 to rhvh-4.2.5.2-0.20180813), I could not reproduce this issue with rhvm-4.2.7.5-0.1.el7ev, so it may be related to the older rhvm-4.2.5.

According to https://bugzilla.redhat.com/show_bug.cgi?id=1624453#c17, QE will flag qa_ack+

Comment 21 Sandro Bonazzola 2019-02-18 07:57:56 UTC
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.

Comment 23 Huijuan Zhao 2019-02-26 06:10:20 UTC
The bug is fixed in rhvh-4.3.0.5-0.20190225.0 with rhvm-4.2.8.2-0.1.el7ev

Test version:
# imgbase layout
rhvh-4.2.4.3-0.20180622.0
 +- rhvh-4.2.4.3-0.20180622.0+1
rhvh-4.3.0.5-0.20190225.0
 +- rhvh-4.3.0.5-0.20190225.0+1

Test steps:
1. Install rhvh-4.2.4.3-0.20180622.0
2. Add rhvh to rhvm
3. Upgrade rhvh to rhvh-4.3.0.5-0.20190225.0 from rhvm side

Test results:
After step 3, rhvh is active in rhvm


Moving status to VERIFIED.

Comment 26 errata-xmlrpc 2019-05-08 12:32:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1053

Comment 27 Daniel Gur 2019-08-28 13:14:06 UTC
sync2jira

Comment 28 Daniel Gur 2019-08-28 13:18:22 UTC
sync2jira

