Bug 2030006
Summary: | [RHEL8.2Z - OSP16.1] Wrongly support "Cascadelake-Server" on physical host without avx512_vnni cpu flag | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Priscila <pveiga> |
Component: | libvirt | Assignee: | Virtualization Maintenance <virt-maint> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Luyao Huang <lhuang> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 8.2 | CC: | ailan, cmayapka, dhill, dyuan, gveitmic, jdenemar, jiyan, lhuang, lijin, lmen, rbalakri, smooney, vasanth.mohanraj, virt-bugs, virt-maint, xuzhang, yalzhang |
Target Milestone: | rc | Keywords: | Triaged, Upstream |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1761678 | Environment: | |
Last Closed: | 2022-07-04 23:23:03 UTC | Type: | Bug |
Bug Depends On: | 1761678 | ||
Bug Blocks: | 1840010 |
Comment 4
John Ferlan
2021-12-08 18:24:23 UTC
I'm confused. The bug was already backported and fixed in libvirt-6.0.0-21 and released with RHEL-AV 8.2.1 in July 2020. See bug 1840010. So I would say there's nothing to do here. But just to be sure... what issues do you see and with what libvirt release exactly?

We get the following failure when migrating from cascadelake to skylake:

    2021-11-23 11:21:07.243 7 INFO nova.compute.manager [req-a8f868ee-6dff-49b1-aa6e-bc08b6b56014 - - - - -] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] During the sync_power process the instance has moved from host overcloud-novacompute-8.localdomain to host overcloud-novacompute-7.localdomain
    2021-11-23 11:21:13.232 7 INFO nova.compute.manager [req-582a0e19-5770-43c4-a48a-9d9a207507db f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Post operation of migration started
    2021-11-23 11:21:30.561 7 WARNING nova.compute.manager [req-b20eb4d0-27c6-4925-81a6-630741b33996 1edce6bc10fa4a13bebea68dd17ce68e 3aa0f6362c874915b069ebddb82aea3f - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Received unexpected event network-vif-plugged-a90fe5f2-012e-4e65-b3e3-48574f729860 for instance with vm_state active and task_state None.
    2021-11-23 11:21:32.615 7 WARNING nova.compute.manager [req-8ae6142f-acdc-4213-a568-58a7ad826cfb 1edce6bc10fa4a13bebea68dd17ce68e 3aa0f6362c874915b069ebddb82aea3f - default default] [instance: e9cd1ff8-56f3-496e-a948-6d90c5532487] Received unexpected event network-vif-plugged-a90fe5f2-012e-4e65-b3e3-48574f729860 for instance with vm_state active and task_state None.
    2021-11-23 11:33:36.592 7 INFO nova.virt.libvirt.driver [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] Instance launched has CPU info: {"arch": "x86_64", "model": "Cascadelake-Server", "vendor": "Intel", "topology": {"cells": 2, "sockets": 1, "cores": 24, "threads": 2}, "features": ["tsc-deadline", "mca", "acpi", "apic", "stibp", "smep", "pat", "monitor", "mce", "xsaveopt", "bmi2", "vme", "mpx", "avx512f", "mtrr", "rdctl-no", "vmx", "3dnowprefetch", "dtes64", "avx512dq", "rdtscp", "avx2", "xgetbv1", "cmov", "avx512cd", "sse4.1", "rdrand", "intel-pt", "tsc", "erms", "pni", "cx8", "cx16", "xtpr", "ht", "tsc_adjust", "clflushopt", "fpu", "xsavec", "pku", "bmi1", "md-clear", "invpcid", "avx512vl", "tm", "arat", "skip-l1dfl-vmentry", "ds_cpl", "adx", "smap", "ss", "clflush", "syscall", "fsgsbase", "sse4.2", "spec-ctrl", "avx512bw", "nx", "lahf_lm", "msr", "de", "pse36", "clwb", "pdpe1gb", "fxsr", "est", "arch-capabilities", "f16c", "aes", "pbe", "abm", "ssbd", "ds", "rtm", "pse", "mds-no", "x2apic", "dca", "sep", "smx", "pcid", "pclmuldq", "hle", "popcnt", "fma", "sse", "ssse3", "pge", "lm", "pdcm", "tm2", "invtsc", "avx", "tsx-ctrl", "xsaves", "mmx", "ibrs-all", "rdseed", "movbe", "pae", "xsave", "sse2", "avx512vnni"]}
    2021-11-23 11:33:36.595 7 ERROR nova.virt.libvirt.driver [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8 7ff9802d352043b0b0f6648441b12324 - default default] CPU doesn't have compatibility.

even with:

    nova :: compute :: libvirt :: libvirt_cpu_model: 'Skylake-Server-IBRS'

set.

(In reply to David Hill from comment #6)
> "mmx", "ibrs-all", "rdseed", "movbe", "pae", "xsave", "sse2", "avx512vnni"]}
> 2021-11-23 11:33:36.595 7 ERROR nova.virt.libvirt.driver
> [req-310c7a72-80ba-41e0-ad90-0e538d8cf291 f04d44cea47a4197b14111e32f22f3c8
> 7ff9802d352043b0b0f6648441b12324 - default default] CPU doesn't have
> compatibility.

Cascadelake is newer than skylake, and avx512vnni is not supported on skylake, so I think it is expected result.

> even with:
>
> nova :: compute :: libvirt :: libvirt_cpu_model:
> 'Skylake-Server-IBRS'
>
> set.

Do you mean migrate from Cascadelake-Server to Skylake with guest xml as below failed? I will try to reproduce it when I get the appropriate hardware.

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>Skylake-Server-IBRS</model>
    </cpu>

(In reply to David Hill from comment #6)
> 7ff9802d352043b0b0f6648441b12324 - default default] Instance launched has
> CPU info: {"arch": "x86_64", "model": "Cascadelake-Server", ......
>
> even with:
>
> nova :: compute :: libvirt :: libvirt_cpu_model:
> 'Skylake-Server-IBRS'

If nova set the CPU to Skylake, why is the log above saying it launched an instance with Cascadelake? Isn't this nova launching a VM with the wrong CPU?

I've just tested the hypothesis in comment #7, and it works for me

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>Skylake-Server-noTSX-IBRS</model>

Migrates fine from one host machine to another, both being Cascadelake-Server-noTSX.

David, could you please provide more information so we can move on with this bug or close it? Thanks!

I think this could be another victim of this KCS.
https://access.redhat.com/solutions/2891431
cpu_model_extra_flags needs to be set in nova, not just cpu_model.

So I'm afraid this is not a bug, just somewhat incorrect configuration.

Can someone still reproduce this? If not, I suggest we close this bug.
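For anyone triaging a similar "CPU doesn't have compatibility" error, the feature mismatch discussed in this thread can be checked by hand with a small script. The following is only a rough sketch, not what nova or libvirt do internally: the function name `missing_on_destination` is made up for illustration, the source CPU info is an abbreviated subset of the JSON logged above, and the destination feature set is a hypothetical Skylake host assumed to have the same features except avx512vnni.

```python
from __future__ import annotations

import json


def missing_on_destination(source_cpu_info: str, destination_features: set[str]) -> list[str]:
    """Return the source CPU features that the destination host lacks.

    source_cpu_info is a JSON document shaped like the "CPU info" blob
    that nova logs (it must contain a "features" list).
    """
    info = json.loads(source_cpu_info)
    return sorted(set(info["features"]) - destination_features)


# Abbreviated subset of the CPU info from the nova log in this bug.
SOURCE_CPU_INFO = """
{"arch": "x86_64", "model": "Cascadelake-Server", "vendor": "Intel",
 "features": ["avx512f", "avx512dq", "avx512cd", "avx512bw", "avx512vl",
              "avx2", "sse4.2", "avx512vnni"]}
"""

# Hypothetical destination host (assumption): a Skylake machine exposing
# the same subset of features, but without avx512vnni.
DESTINATION_FEATURES = {
    "avx512f", "avx512dq", "avx512cd", "avx512bw", "avx512vl",
    "avx2", "sse4.2",
}

if __name__ == "__main__":
    print(missing_on_destination(SOURCE_CPU_INFO, DESTINATION_FEATURES))
```

A non-empty result means the guest was started with features the destination cannot provide, which matches the explanation given above for the Cascadelake to Skylake case.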