Bug 1870040 - model_Skylake-Server - vmx, nx host feature not detected
Summary: model_Skylake-Server - vmx, nx host feature not detected
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.25
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: bugs@ovirt.org
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-19 09:13 UTC by Oleh Horbachov
Modified: 2020-08-19 11:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1854922
Environment:
Last Closed: 2020-08-19 11:48:05 UTC
oVirt Team: Virt
Embargoed:



Description Oleh Horbachov 2020-08-19 09:13:25 UTC
+++ This bug was initially created as a clone of Bug #1854922 +++

Description of problem:

I've upgraded our engine to 4.4 and wanted to upgrade our host to 4.4 as well. I've reinstalled the machine with CentOS 8.2 and wanted to add it back to the cluster but it fails with a message indicating that some needed cpu flags are missing.

Initially I tried to add it to an existing 4.3 cluster with CPU type Intel Skylake Server IBRS SSBD MDS Family, but this failed. I then created a new 4.4 cluster with CPU type Secure Intel Skylake Server Family, but this also fails with the message:

The host CPU does not match the Cluster CPU Type and is running in a degraded mode. It is missing the following CPU flags: vmx, ssbd, md_clear, model_Skylake-Server, spec_ctrl. Please update the host CPU microcode or change the Cluster CPU Type.

When I look at the detected features in the vdsm log:

'info': {'kvmEnabled': 'true', 'cpuCores': '10', 'cpuThreads': '20', 'cpuSockets': '1', 'onlineCpus':
'0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19', 'cpuSpeed': '1002.074', 'cpuModel': 'Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz', 'cpuFlags': 'tpr_shadow,monitor,nx,rtm,cqm_mbm_local,pat,fxsr,adx,rdtscp,pku,cat_l3,avx,smap,x2apic,pcid,mca,ht,cmov,dts,popcnt,fsgsbase,rdseed,erms,art,avx2,fma,invpcid_single,fpu,pdpe1gb,bmi1,ospke,flush_l1d,msr,sep,vmx,sse2,md_clear,tm,invpcid,clflushopt,intel_pt,smep,epb,aes,pclmulqdq,ssse3,avx512f,pts,stibp,mtrr,vme,xsaveopt,ept,pbe,flexpriority,hypervisor,hle,smx,dca,tsc_adjust,sse4_2,ibpb,rdt_a,skip-l1dfl-vmentry,clwb,sse4_1,avx512dq,cpuid_fault,ds_cpl,pdcm,arat,apic,avx512bw,de,pae,vnmi,cqm,f16c,cqm_llc,xtopology,amd-ssbd,avx512cd,pge,xtpr,constant_tsc,pse,arch-capabilities,nopl,sse,clflush,xsaves,cqm_occup_llc,pti,xgetbv1,sdbg,ss,xsave,pebs,cx16,mmx,syscall,lahf_lm,abm,ssbd,aperfmperf,cpuid,pse36,3dnowprefetch,mce,mba,dtes64,dtherm,mpx,intel_ppin,tsc_deadline_timer,tm2,vpid,nonstop_tsc,arch_perfmon,movbe,umip,md-clear,est,tsc,rdrand,cqm_mbm_total,pni,cdp_l3,cx8,acpi,rep_good,lm,bts,xsavec,bmi2,ida,pln,invtsc,avx512vl,ibrs,model_Broadwell-IBRS,model_Skylake-Server-IBRS,model_Skylake-Client-IBRS,model_n270,model_Penryn,model_Opteron_G2,model_coreduo,model_Westmere,model_Skylake-Client,model_Nehalem,model_Westmere-IBRS,model_Opteron_G1,model_qemu32,model_Nehalem-IBRS,model_SandyBridge,model_pentium2,model_SandyBridge-IBRS,model_Haswell-noTSX-IBRS,model_Haswell-IBRS,model_IvyBridge,model_qemu64,model_pentium,model_Haswell,model_kvm64,model_Broadwell-noTSX-IBRS,model_pentium3,model_Broadwell-noTSX,model_Broadwell,model_IvyBridge-IBRS,model_Conroe,model_Haswell-noTSX,model_core2duo,model_486,model_Skylake-Server,model_kvm32', 'version_name': 'Snow Man', 'software_version': '4.40.22', 'software_revision': '1', 'supportedENGINEs': ['4.2', '4.3', '4.4'], 'clusterLevels': ['4.2', '4.3', '4.4']

The "missing" features are all there except spec_ctrl. According to https://bugzilla.redhat.com/show_bug.cgi?id=1837266 this flag should be added automatically on IBRS CPUs, but it seems it isn't in this case.

With the CPU type set to Intel Skylake Server Family, it also complains about other missing features that are clearly present in /proc/cpuinfo (such as vmx and nx).
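
The comparison described above can be reproduced by hand. The following is a minimal sketch, assuming the flags are available as the comma-separated 'cpuFlags' string from the vdsm log; the helper name `missing_flags` is hypothetical and is not part of Vdsm or the engine, and the flag string is a short excerpt of the list quoted above, not the full list.

```python
def missing_flags(required, cpu_flags):
    """Return the required flags that do not appear in a comma-separated flag string."""
    present = {flag.strip() for flag in cpu_flags.split(",") if flag.strip()}
    return sorted(flag for flag in required if flag not in present)


# Flags named in the engine message for the Secure Intel Skylake Server Family cluster.
REQUIRED = ["vmx", "ssbd", "md_clear", "model_Skylake-Server", "spec_ctrl"]

# Excerpt of the 'cpuFlags' value from the vdsm log (shortened for readability).
CPU_FLAGS_EXCERPT = (
    "tpr_shadow,monitor,nx,vmx,md_clear,ssbd,ibrs,ibpb,stibp,"
    "model_Skylake-Server-IBRS,model_Skylake-Server"
)

print(missing_flags(REQUIRED, CPU_FLAGS_EXCERPT))  # ['spec_ctrl']
```

On this excerpt only spec_ctrl comes back as absent, which matches the observation that the other "missing" flags are in fact reported by the host.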


Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8.x86_64

How reproducible:


Steps to Reproduce:
1. Add an Intel(R) Xeon(R) Silver 4114 CPU host to a 4.4 cluster
2.
3.

Actual results:
oVirt complains about missing cpu features

Expected results:
Host added to cluster

Additional info:

--- Additional comment from Michal Skrivanek on 2020-07-09 04:32:29 UTC ---

Can you please attach full vdsm.log and engine.log? And virsh domcapabilities and virsh capabilities output if you can. Thanks!
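
For anyone collecting the same data: the physical CPU model is listed in `virsh capabilities`, and the model libvirt would use for guests is listed under the host-model mode of `virsh domcapabilities`. Below is a minimal sketch for pulling both values out of saved output. The XML snippets are trimmed, illustrative samples and are not taken from the attachments on this bug; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative, heavily trimmed sample of `virsh capabilities` output.
CAPABILITIES_XML = """
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>Skylake-Server-IBRS</model>
      <vendor>Intel</vendor>
    </cpu>
  </host>
</capabilities>
"""

# Illustrative, heavily trimmed sample of `virsh domcapabilities` output.
DOMCAPABILITIES_XML = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vmx'/>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_cpu_model(capabilities_xml):
    """Return the physical host CPU model from `virsh capabilities` XML."""
    root = ET.fromstring(capabilities_xml)
    return root.findtext("./host/cpu/model")


def guest_host_model(domcapabilities_xml):
    """Return the host-model guest CPU model from `virsh domcapabilities` XML."""
    root = ET.fromstring(domcapabilities_xml)
    return root.findtext("./cpu/mode[@name='host-model']/model")


print(host_cpu_model(CAPABILITIES_XML))
print(guest_host_model(DOMCAPABILITIES_XML))
```

If the two names differ, flag handling that keys off the model name may behave differently for the host and for guests.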

--- Additional comment from Rik Theys on 2020-07-09 06:13:21 UTC ---



--- Additional comment from Rik Theys on 2020-07-09 06:13:42 UTC ---



--- Additional comment from Rik Theys on 2020-07-09 06:14:02 UTC ---



--- Additional comment from Rik Theys on 2020-07-09 06:15:00 UTC ---

Hi,

I've attached the requested logs and command output.

The logs will show a lot of attempts to get this host up as I'm having multiple issues.

Regards,
Rik

--- Additional comment from Michal Skrivanek on 2020-07-13 15:46:30 UTC ---

It could be because your host is Cascadelake-Server and bug 1837266  is adding it only for names ending with -IBRS...but in this case it doesn't. Milan, it may need another exception or maybe blacklist rather than a whitelist for this...

--- Additional comment from Rik Theys on 2020-07-14 11:15:04 UTC ---

Hi Michal,

(In reply to Michal Skrivanek from comment #6)
> It could be because your host is Cascadelake-Server and bug 1837266  is
> adding it only for names ending with -IBRS...but in this case it doesn't.
> Milan, it may need another exception or maybe blacklist rather than a
> whitelist for this...

Are you sure my cpu is a Cascadelake? According to the 'virsh capabilities' my cpu model is Skylake-Server-IBRS. Since it ends with -IBRS, it makes me believe the feature should have been automatically added to my feature list already.

According to https://ark.intel.com/content/www/us/en/ark/products/123550/intel-xeon-silver-4114-processor-13-75m-cache-2-20-ghz.html my cpu is a Skylake cpu.

Regards,
Rik

--- Additional comment from Milan Zamazal on 2020-07-14 14:52:56 UTC ---

Hi Rik,

indeed your physical CPU model is reported as Skylake-Server-IBRS. For some reason, libvirt apparently decides to use Cascadelake model for your guests, as reported in `virsh domcapabilities'. Both the models should report spec_ctrl, but Vdsm currently reports it only for *-IBRS. So I think Michal's analysis above still applies and we need one of the suggested fixes.
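
To illustrate the difference between the two approaches discussed here, the following is a rough sketch and not Vdsm's actual code. All function names are hypothetical, and the denylist entry is a placeholder rather than a real model name.

```python
def flags_with_suffix_rule(model, flags):
    """Allowlist-style rule: add spec_ctrl only for model names ending in -IBRS."""
    result = set(flags)
    if model.endswith("-IBRS"):
        result.add("spec_ctrl")
    return result


# Placeholder entry only; which models really lack spec_ctrl is not stated in this bug.
MODELS_WITHOUT_SPEC_CTRL = frozenset({"example-model-without-spec-ctrl"})


def flags_with_denylist_rule(model, flags, denylist=MODELS_WITHOUT_SPEC_CTRL):
    """Denylist-style rule: add spec_ctrl unless the model is explicitly excluded."""
    result = set(flags)
    if model not in denylist:
        result.add("spec_ctrl")
    return result


base = {"ibrs", "ssbd", "md_clear"}
print("spec_ctrl" in flags_with_suffix_rule("Skylake-Server-IBRS", base))  # True
print("spec_ctrl" in flags_with_suffix_rule("Cascadelake-Server", base))   # False
print("spec_ctrl" in flags_with_denylist_rule("Cascadelake-Server", base)) # True
```

With the suffix rule, a model named Cascadelake-Server never gets spec_ctrl because its name does not end in -IBRS; the denylist variant covers it without another per-model exception.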

--- Additional comment from Yaning Wang on 2020-08-11 06:22:33 UTC ---

Verified on:

rhv-4.4.2-2
vdsm-4.40.24-1

Steps:

1. Add an Intel(R) Xeon(R) Silver 4110 CPU host to a 4.4 cluster


Actual results:
hosts successfully added to cluster without any complaints

--- Additional comment from Oleh Horbachov on 2020-08-17 09:58:26 UTC ---

I created a cluster on v4.4.0, and after upgrading I had the same problem on oVirt v4.4.1.4. I found this bug report and updated to version 4.4.2-pre, but the problem remained, with the error:

The host CPU does not match the Cluster CPU Type and is running in a degraded mode. It is missing the following CPU flags: vmx, model_Skylake-Server, nx. Please update the host CPU microcode or change the Cluster CPU Type.

In addition, I tried a reinstall.
CPU: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
ovirt-engine-4.4.2.2-1.el8.noarch
vdsm-4.40.25-1.el8.x86_64

--- Additional comment from Oleh Horbachov on 2020-08-17 09:59:45 UTC ---



--- Additional comment from Oleh Horbachov on 2020-08-17 10:03:40 UTC ---

Sorry, I missed some text:
I tried reinstalling the existing host and got the same error.

Comment 1 Michal Skrivanek 2020-08-19 11:37:59 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1854922#c13

Please do not clone bugs; just open a new one and link it in a comment.

