Bug 2107503
Summary: | RHEL 8.6 VM with "qemu64" CPU model can't start because "the CPU is incompatible with host CPU: Host CPU does not provide required features: svm" | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | Vera <vwu>
Component: | virt-v2v | Assignee: | Laszlo Ersek <lersek>
Status: | CLOSED ERRATA | QA Contact: | Vera <vwu>
Severity: | medium | Priority: | medium
Version: | 9.1 | CC: | chhu, hongzliu, juzhou, lersek, mxie, rjones, tyan, tzheng, xiaodwan
Target Milestone: | rc | Keywords: | FutureFeature, Triaged
Hardware: | Unspecified | OS: | Unspecified
Fixed In Version: | virt-v2v-2.0.7-3.el9 | Type: | Feature Request
Last Closed: | 2022-11-15 09:56:15 UTC | |
Attachments: |
|
Changing the summary because I don't think this has anything to do with -oo compressed. It's similar to bug 2076013. Vera, could you attach the full debug output from the virt-v2v conversion please?

I guess we need to figure out, based on the libvirt documentation, what the difference is between

    <cpu mode='custom' match='exact' check='none'>
      <model fallback='forbid'>qemu64</model>
    </cpu>

and

    <cpu match='minimum'>
      <model fallback='allow'>qemu64</model>
    </cpu>

The qemu64 model is supposed to work on all hosts, per <https://gitlab.com/qemu-project/qemu/-/blob/master/docs/system/cpu-models-x86.rst.inc>.

From https://libvirt.org/formatdomain.html#cpu-model-and-topology:

- mode='custom' is the default, so there is no difference in that regard
- match='minimum' is laxer than match='exact', so the conversion output is more lenient than the original
- model fallback='allow' is laxer than fallback='forbid', so again the conversion output is more lenient
- the culprit seems to be that we don't output check='none'.

In the original domain def, check='none' means:

> Libvirt does no checking and it is up to the hypervisor to refuse to start the domain if it cannot provide the requested CPU. With QEMU this means no checking is done at all since the default behavior of QEMU is to emit warnings, but start the domain anyway.

However, when we omit it in the conversion output, check='none' is *not* the default:

> Once the domain starts, libvirt will automatically change the check attribute to the best supported value to ensure the virtual CPU does not change when the domain is migrated to another host

and indeed we can see that libvirt fills it in as "partial":

> Libvirt will check the guest CPU specification before starting a domain [...]

I think we need to add the check='none' attribute to the "cpu" element in the libvirt output, as we don't care about migratability here.
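The attribute-by-attribute comparison above can be reproduced mechanically. The following is a minimal sketch, not part of virt-v2v: the helper names `cpu_settings` and `diff_settings` are made up for illustration, and the two XML fragments are the source and converted `<cpu>` elements quoted in this report.

```python
import xml.etree.ElementTree as ET

# <cpu> element of the source domain, as quoted in this report.
SOURCE_CPU = """<cpu mode='custom' match='exact' check='none'>
  <model fallback='forbid'>qemu64</model>
</cpu>"""

# <cpu> element written by the conversion before the fix, as quoted above.
CONVERTED_CPU = """<cpu match='minimum'>
  <model fallback='allow'>qemu64</model>
</cpu>"""


def cpu_settings(xml_text):
    """Collect the attributes discussed above; None means 'not written'."""
    cpu = ET.fromstring(xml_text)
    model = cpu.find("model")
    return {
        "mode": cpu.get("mode"),
        "match": cpu.get("match"),
        "check": cpu.get("check"),
        "fallback": model.get("fallback") if model is not None else None,
        "model": model.text if model is not None else None,
    }


def diff_settings(before, after):
    """Return only the settings whose values differ, as (before, after) pairs."""
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}


differences = diff_settings(cpu_settings(SOURCE_CPU), cpu_settings(CONVERTED_CPU))
for name, (before, after) in differences.items():
    print(f"{name}: {before!r} -> {after!r}")
```

The model name does not appear among the differences, while `check` goes from `'none'` to `None` (absent), which is the omission the comment identifies as the culprit.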
wild guess (untested):

    diff --git a/output/create_libvirt_xml.ml b/output/create_libvirt_xml.ml
    index 531a4f75bf3e..0343d3194268 100644
    --- a/output/create_libvirt_xml.ml
    +++ b/output/create_libvirt_xml.ml
    @@ -192,6 +192,7 @@ let create_libvirt_xml ?pool source inspect
            List.push_back cpu_attrs ("mode", "host-passthrough");
         | Some model ->
            List.push_back cpu_attrs ("match", "minimum");
    +       List.push_back cpu_attrs ("check", "none");
            (match source.s_cpu_vendor with
             | None -> ()
             | Some vendor ->

Turns out I can reproduce, and therefore test, this myself.

[v2v PATCH] output/create_libvirt_xml: generate @check='none' CPU attribute
Message-Id: <20220720110913.14058-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-July/029504.html

[v2v PATCH v2] output/create_libvirt_xml: relax VCPU feature checking for "qemu64"
Message-Id: <20220722073627.6511-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-July/029519.html

(In reply to Laszlo Ersek from comment #8)
> [v2v PATCH v2] output/create_libvirt_xml: relax VCPU feature checking for "qemu64"
> Message-Id: <20220722073627.6511-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-July/029519.html

Upstream commit e5297c3180fd.

Verified with the versions:
qemu-img-7.0.0-9.el9.x86_64
nbdkit-1.30.8-1.el9.x86_64
libvirt-8.5.0-3.el9.x86_64
libnbd-1.12.6-1.el9.x86_64
virt-v2v-2.0.7-4.el9.x86_64

Steps:
1. Prepare a VM with "qemu64" CPU model

    # virsh list --all |grep nvme
     -     esx6.7-rhel8.6-nvme-disk     shut off
    # virsh dumpxml esx6.7-rhel8.6-nvme-disk
    ...
    <cpu mode='custom' match='exact' check='none'>
      <model fallback='forbid'>qemu64</model>
    </cpu>
    ...

2. Convert with -oo compressed option.
    # virt-v2v -i libvirt -ic qemu:///system esx6.7-rhel8.6-nvme-disk -o libvirt -os default -oo compressed -on esx6.7-rhel8.6-nvme-disk-compress -of qcow2
    [   0.0] Setting up the source: -ic qemu:///system -i libvirt esx6.7-rhel8.6-nvme-disk
    [   1.1] Opening the source
    [   5.7] Inspecting the source
    [  14.6] Checking for sufficient free disk space in the guest
    [  14.6] Converting Red Hat Enterprise Linux 8.6 Beta (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 119.3] Mapping filesystem data to avoid copying unused and blank areas
    [ 121.5] Closing the overlay
    [ 121.8] Assigning disks to buses
    [ 121.8] Checking if the guest needs BIOS or UEFI to boot
    [ 121.8] Setting up the destination: -o libvirt -os default
    [ 123.0] Copying disk 1/1
    █ 100% [****************************************]
    [ 259.5] Creating output metadata
    [ 259.5] Finishing off

3. Start VM and check the checkpoints.

    # virsh list --all |grep nvme
     -     esx6.7-rhel8.6-nvme-disk            shut off
     -     esx6.7-rhel8.6-nvme-disk-compress   shut off
    # virsh start esx6.7-rhel8.6-nvme-disk-compress
    Domain 'esx6.7-rhel8.6-nvme-disk-compress' started
    # virsh list --all |grep nvme
     4     esx6.7-rhel8.6-nvme-disk-compress   running
     -     esx6.7-rhel8.6-nvme-disk            shut off
    # virsh dumpxml esx6.7-rhel8.6-nvme-disk-compress
    ...
    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>Cascadelake-Server</model>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='vmx'/>
      <feature policy='require' name='pdcm'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='umip'/>
      <feature policy='require' name='pku'/>
      <feature policy='require' name='md-clear'/>
      <feature policy='require' name='stibp'/>
      <feature policy='require' name='arch-capabilities'/>
      <feature policy='require' name='xsaves'/>
      <feature policy='require' name='ibpb'/>
      <feature policy='require' name='ibrs'/>
      <feature policy='require' name='amd-stibp'/>
      <feature policy='require' name='amd-ssbd'/>
      <feature policy='require' name='rdctl-no'/>
      <feature policy='require' name='ibrs-all'/>
      <feature policy='require' name='skip-l1dfl-vmentry'/>
      <feature policy='require' name='mds-no'/>
      <feature policy='require' name='pschange-mc-no'/>
      <feature policy='require' name='tsx-ctrl'/>
      <feature policy='disable' name='hle'/>
      <feature policy='disable' name='rtm'/>
      <feature policy='disable' name='mpx'/>
    </cpu>
    ...

It is working with this version.

Laszlo, I have a question on this. Judging from the patch, when the source domain specifies a CPU model, is @check='full' correct?

Hi Vera, something doesn't add up here. In your comment 16, the source domain seems to have an explicit "qemu64" CPU model, but the converted domain's CPU model is "Cascadelake-Server". virt-v2v should never do this.

Can you please repeat the test and:
- while the converted domain is running, attach the output of "virsh dumpxml --inactive" (not just "virsh dumpxml"),
- provide the full conversion log?

From my experimentation, what happens is that libvirt modifies the <cpu> element, but only for the *running* instance of the domain.
For example, if I have a local domain with

    <cpu mode='custom' match='exact' check='none'>
      <model fallback='forbid'>qemu64</model>
    </cpu>

and I start it, I get the following command outputs (with the domain running):

- virsh dumpxml --inactive:

    <cpu mode='custom' match='exact' check='none'>
      <model fallback='forbid'>qemu64</model>
    </cpu>

- virsh dumpxml:

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>qemu64</model>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='lahf_lm'/>
      <feature policy='disable' name='svm'/>
    </cpu>

Note that @check changes from "none" to "full" between the config version and the live version of the domain XML. Additionally, some CPU flags get explicitly listed, such as the disablement of "svm". However, the model name itself does *not* change from qemu64 to anything else.

So this is the justification for my two requests above:
- please pass "--inactive" to "virsh dumpxml" (we care about the config version, not the live version, of the domain XML),
- please upload the full conversion log (that's how we best see what virt-v2v creates from what).

Thank you.

Hi Laszlo, please check the 2 attachments with the logs as requested. Thanks.

Thank you. I think everything is fine. The conversion output and the "virsh dumpxml --inactive" command record the following output domain XML fragments (respectively):

    <cpu match='minimum' check='none'>
      <model fallback='allow'>qemu64</model>
    </cpu>

    <cpu mode='custom' match='minimum' check='none'>
      <model fallback='allow'>qemu64</model>
    </cpu>

This is what the patch intends to do. Given that the guest actually boots and works, I think the only lesson here is that for a *running* (live) domain, the "live" domain XML may contain a <cpu> element that significantly differs from the "config" (= inactive) variant of the same element.
Apparently libvirtd translates the "config" fragment

    <cpu mode='custom' match='minimum' check='none'>
      <model fallback='allow'>qemu64</model>
    </cpu>

to a very different "live" fragment, but, importantly, with the correct, desired effect. So I think it's fine to set this BZ to VERIFIED.

Laszlo, thanks for the confirmation. Marking the bug as VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: virt-v2v security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:7968 |
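The config-versus-live distinction discussed in the comments can also be checked with a few lines of script. This is only an illustrative sketch: `summarize_cpu` is a hypothetical helper, and the two fragments are the "virsh dumpxml --inactive" and "virsh dumpxml" outputs quoted in the qemu64 example in the thread, pasted in as strings rather than fetched from libvirt.

```python
import xml.etree.ElementTree as ET

# Config (inactive) <cpu> element from the example in the thread.
CONFIG_CPU = """<cpu mode='custom' match='exact' check='none'>
  <model fallback='forbid'>qemu64</model>
</cpu>"""

# Live <cpu> element of the same running domain, from the same example.
LIVE_CPU = """<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>qemu64</model>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='lahf_lm'/>
  <feature policy='disable' name='svm'/>
</cpu>"""


def summarize_cpu(xml_text):
    """Extract the model name, the check attribute and the feature policies."""
    cpu = ET.fromstring(xml_text)
    return {
        "model": cpu.findtext("model"),
        "check": cpu.get("check"),
        "features": {f.get("name"): f.get("policy") for f in cpu.findall("feature")},
    }


config = summarize_cpu(CONFIG_CPU)
live = summarize_cpu(LIVE_CPU)

same_model = config["model"] == live["model"]
added_features = sorted(set(live["features"]) - set(config["features"]))

print("model unchanged:", same_model)
print("check:", config["check"], "->", live["check"])
print("features only in live XML:", added_features)
```

For these two fragments the model name is the same in both, while the `check` value and the feature list differ, which matches the observation that only the config version reflects what virt-v2v wrote.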
Created attachment 1897349
the output of "virsh capabilities" on the conversion host

Description of problem:
VM with "qemu64" CPU model can't start after "-oo compressed" conversion via virt-v2v

Version-Release number of selected component (if applicable):
qemu-guest-agent-7.0.0-8.el9.x86_64
libnbd-1.12.5-1.el9.x86_64
virt-v2v-2.0.7-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a VM with "qemu64" CPU model

    # virsh list --all |grep nvme
     -     esx6.7-rhel8.6-nvme-disk     shut off
    # virsh dumpxml esx6.7-rhel8.6-nvme-disk
    ...
    <cpu mode='custom' match='exact' check='none'>
      <model fallback='forbid'>qemu64</model>
    </cpu>
    ...

2. Convert with -oo compressed option.

    # virt-v2v -i libvirt -ic qemu:///system esx6.7-rhel8.6-nvme-disk -o libvirt -os default -oo compressed -on esx6.7-rhel8.6-nvme-disk-compress -of qcow2
    [   0.2] Setting up the source: -ic qemu:///system -i libvirt esx6.7-rhel8.6-nvme-disk
    [   1.3] Opening the source
    [   5.8] Inspecting the source
    [  15.2] Checking for sufficient free disk space in the guest
    [  15.2] Converting Red Hat Enterprise Linux 8.6 Beta (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 122.1] Mapping filesystem data to avoid copying unused and blank areas
    [ 123.7] Closing the overlay
    [ 124.0] Assigning disks to buses
    [ 124.0] Checking if the guest needs BIOS or UEFI to boot
    [ 124.0] Setting up the destination: -o libvirt -os default
    [ 125.2] Copying disk 1/1
    █ 100% [****************************************]
    [ 221.4] Creating output metadata
    [ 221.4] Finishing off

3. Start VM and check the checkpoints.

    # virsh list --all |grep nvme
     -     esx6.7-rhel8.6-nvme-disk            shut off
     -     esx6.7-rhel8.6-nvme-disk-compress   shut off
    # virsh start esx6.7-rhel8.6-nvme-disk-compress
    error: Failed to start domain 'esx6.7-rhel8.6-nvme-disk-compress'
    error: the CPU is incompatible with host CPU: Host CPU does not provide required features: svm
    # virsh dumpxml esx6.7-rhel8.6-nvme-disk-compress
    ...
    <cpu mode='custom' match='minimum' check='partial'>
      <model fallback='allow'>qemu64</model>
    </cpu>
    ...

Actual results:
VM can't start into OS successfully

Expected results:
VM can start into OS successfully

Additional info:
Please check the attachment for the output of "virsh capabilities"
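When triaging a similar start failure, the `check` attribute of the persistent definition is the first thing worth reading. The sketch below is an assumption-laden illustration rather than a libvirt API: `describe_check` is a hypothetical helper, its wording paraphrases the libvirt documentation excerpts quoted in this report, and the two fragments are the failing and the fixed `<cpu>` elements shown in this bug.

```python
import xml.etree.ElementTree as ET


def describe_check(cpu_xml):
    """Describe what the check attribute of a <cpu> element asks libvirt to do."""
    check = ET.fromstring(cpu_xml).get("check")
    if check is None:
        # Per the documentation quoted in this report, libvirt fills in a
        # value itself once the domain starts.
        return "unset: libvirt chooses a value when the domain starts"
    if check == "none":
        return "none: libvirt does no checking"
    if check == "partial":
        return "partial: libvirt checks the CPU specification before start"
    if check == "full":
        return "full: the created virtual CPU is checked against the specification"
    return f"unrecognized value: {check}"


# <cpu> element of the domain that failed to start (from the description).
FAILING_CPU = (
    "<cpu mode='custom' match='minimum' check='partial'>"
    "<model fallback='allow'>qemu64</model></cpu>"
)

# <cpu> element recorded by "virsh dumpxml --inactive" after the fix.
FIXED_CPU = (
    "<cpu mode='custom' match='minimum' check='none'>"
    "<model fallback='allow'>qemu64</model></cpu>"
)

failing_summary = describe_check(FAILING_CPU)
fixed_summary = describe_check(FIXED_CPU)
print("before fix:", failing_summary)
print("after fix: ", fixed_summary)
```

In practice the XML strings would come from "virsh dumpxml --inactive" output rather than being hard-coded as they are here.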