
Bug 2030006

Summary: [RHEL8.2Z - OSP16.1] Wrongly support "Cascadelake-Server" on physical host without avx512_vnni cpu flag
Product: Red Hat Enterprise Linux 8
Reporter: Priscila <pveiga>
Component: libvirt
Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Luyao Huang <lhuang>
Severity: high
Docs Contact:
Priority: medium
Version: 8.2
CC: ailan, cmayapka, dhill, dyuan, gveitmic, jdenemar, jiyan, lhuang, lijin, lmen, rbalakri, smooney, vasanth.mohanraj, virt-bugs, virt-maint, xuzhang, yalzhang
Target Milestone: rc
Keywords: Triaged, Upstream
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1761678
Environment:
Last Closed: 2022-07-04 23:23:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1761678
Bug Blocks: 1840010

Comment 4 John Ferlan 2021-12-08 18:24:23 UTC
Updating the needinfo to Jiri, who resolved the cloned-from bug 1761678.

Comment 5 Jiri Denemark 2021-12-10 12:35:59 UTC
I'm confused. This bug was already backported and fixed in libvirt-6.0.0-21 and
released with RHEL-AV 8.2.1 in July 2020 (see bug 1840010), so I would say
there's nothing left to do here.

But just to be sure: what issues do you see, and with which libvirt release
exactly?

Comment 6 David Hill 2021-12-22 17:38:03 UTC
We get the following failure when migrating from cascadelake to skylake:

2021-11-23 11:21:07.243 7 INFO nova.compute.manager [req-a8f868ee-6dff-49b1-aa6e-bc08b6b56014 - - - - -] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] During the sync_power process the instance has moved from host overcloud-novacompute-8.localdomain to host overcloud-novacompute-7.localdomain
2021-11-23 11:21:13.232 7 INFO nova.compute.manager [req-582a0e19-5770-43c4-a48a-9d9a207507db f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Post operation of migration started
2021-11-23 11:21:30.561 7 WARNING nova.compute.manager [req-b20eb4d0-27c6-4925-81a6-630741b33996 1edce6bc10fa4a13bebea68dd17ce68e 3aa0f6362c874915b069ebddb82aea3f - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Received unexpected event network-vif-plugged-a90fe5f2-012e-4e65-b3e3-48574f729860 for instance with vm_state active and task_state None.
2021-11-23 11:21:32.615 7 WARNING nova.compute.manager [req-8ae6142f-acdc-4213-a568-58a7ad826cfb 1edce6bc10fa4a13bebea68dd17ce68e 3aa0f6362c874915b069ebddb82aea3f - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Received unexpected event network-vif-plugged-a90fe5f2-012e-4e65-b3e3-48574f729860 for instance with vm_state active and task_state None.
2021-11-23 11:33:36.592 7 INFO nova.virt.libvirt.driver [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] Instance launched has CPU info: {"arch": "x86_64", "model": "Cascadelake-Server", "vendor": "Intel", "topology": {"cells": 2, "sockets": 1, "cores": 24, "threads": 2}, "features": ["tsc-deadline", "mca", "acpi", "apic", "stibp", "smep", "pat", "monitor", "mce", "xsaveopt", "bmi2", "vme", "mpx", "avx512f", "mtrr", "rdctl-no", "vmx", "3dnowprefetch", "dtes64", "avx512dq", "rdtscp", "avx2", "xgetbv1", "cmov", "avx512cd", "sse4.1", "rdrand", "intel-pt", "tsc", "erms", "pni", "cx8", "cx16", "xtpr", "ht", "tsc_adjust", "clflushopt", "fpu", "xsavec", "pku", "bmi1", "md-clear", "invpcid", "avx512vl", "tm", "arat", "skip-l1dfl-vmentry", "ds_cpl", "adx", "smap", "ss", "clflush", "syscall", "fsgsbase", "sse4.2", "spec-ctrl", "avx512bw", "nx", "lahf_lm", "msr", "de", "pse36", "clwb", "pdpe1gb", "fxsr", "est", "arch-capabilities", "f16c", "aes", "pbe", "abm", "ssbd", "ds", "rtm", "pse", "mds-no", "x2apic", "dca", "sep", "smx", "pcid", "pclmuldq", "hle", "popcnt", "fma", "sse", "ssse3", "pge", "lm", "pdcm", "tm2", "invtsc", "avx", "tsx-ctrl", "xsaves", "mmx", "ibrs-all", "rdseed", "movbe", "pae", "xsave", "sse2", "avx512vnni"]}
2021-11-23 11:33:36.595 7 ERROR nova.virt.libvirt.driver [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] CPU doesn't have compatibility.

even with:

        nova :: compute :: libvirt :: libvirt_cpu_model: 'Skylake-Server-IBRS'

set.

Comment 7 yalzhang@redhat.com 2022-01-10 05:37:11 UTC
(In reply to David Hill from comment #6)
> "mmx", "ibrs-all", "rdseed", "movbe", "pae", "xsave", "sse2", "avx512vnni"]}
> 2021-11-23 11:33:36.595 7 ERROR nova.virt.libvirt.driver
> [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8
> 7ff9802d352043b0b0f6648441b12324 - default default] CPU doesn't have
> compatibility.

Cascadelake is newer than Skylake, and avx512vnni is not supported by the Skylake CPU model, so I think this is the expected result.

> even with:
> 
>         nova :: compute :: libvirt :: libvirt_cpu_model:
> 'Skylake-Server-IBRS'
> 
> set.
Do you mean that migrating from Cascadelake-Server to Skylake fails with a guest XML like the one below? I will try to reproduce it when I get the appropriate hardware.
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Server-IBRS</model>
  </cpu>
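The model mismatch described above can be sketched as a simple set comparison. The feature lists below are abbreviated, illustrative assumptions drawn from the log in comment #6; the authoritative lists live in libvirt's CPU map XML, not here:

```python
# Sketch: why a guest requiring the Cascadelake-Server model cannot run on a
# Skylake-Server host. Feature sets are abbreviated/hypothetical examples.
CASCADELAKE_FEATURES = {"avx512f", "avx512dq", "avx512cd", "avx512bw",
                        "avx512vl", "avx512vnni", "clwb", "pku"}
SKYLAKE_SERVER_FEATURES = {"avx512f", "avx512dq", "avx512cd", "avx512bw",
                           "avx512vl", "clwb", "pku"}

def missing_features(guest_model, host_model):
    """Return the guest-required features the host CPU cannot provide."""
    return sorted(guest_model - host_model)

missing = missing_features(CASCADELAKE_FEATURES, SKYLAKE_SERVER_FEATURES)
# A non-empty result is what makes libvirt refuse the guest with an error
# like the "CPU doesn't have compatibility" message in comment #6.
print(missing)
```

Under these assumed feature sets, the only missing feature is avx512vnni, matching the explanation above.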

Comment 9 Germano Veit Michel 2022-03-24 04:20:35 UTC
(In reply to David Hill from comment #6)
> 7ff9802d352043b0b0f6648441b12324 - default default] Instance launched has
> CPU info: {"arch": "x86_64", "model": "Cascadelake-Server", ......
> 
> even with:
> 
>         nova :: compute :: libvirt :: libvirt_cpu_model:
> 'Skylake-Server-IBRS'

If nova sets the CPU to Skylake, why does the log above say it launched an instance with Cascadelake? Isn't this nova launching the VM with the wrong CPU?

I've just tested the hypothesis in comment #7, and it works for me:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Server-noTSX-IBRS</model>
  </cpu>

The guest migrates fine from one host machine to another, both hosts being Cascadelake-Server-noTSX.

David, could you please provide more information so we can move on with this bug or close it?

Thanks!

Comment 10 Germano Veit Michel 2022-06-21 23:00:05 UTC
I think this could be another victim of the issue described in this KCS article:
https://access.redhat.com/solutions/2891431

cpu_model_extra_flags needs to be set in nova, not just cpu_model.
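As a rough illustration only: the option names vary by nova release (older releases use cpu_model, Train-based releases use cpu_models), and which extra flags are actually needed depends on the hardware and workload, so treat every value below as an example, not a recommendation:

```ini
[libvirt]
cpu_mode = custom
# "cpu_model" on older nova releases, "cpu_models" on Train and later
cpu_models = Skylake-Server-IBRS
# Flags the named model does not include must be listed explicitly;
# "pdpe1gb" here is purely a placeholder example.
cpu_model_extra_flags = pdpe1gb
```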

So I'm afraid this is not a bug, just a somewhat incorrect configuration.

Can someone still reproduce this? If not, I suggest we close this bug.