Bug 2051564
| Summary: | [RFE] Limiting the maximum number of disks per guest for v2v conversions | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | mxie <mxie> |
| Component: | virt-v2v | Assignee: | Laszlo Ersek <lersek> |
| Status: | CLOSED ERRATA | QA Contact: | mxie <mxie> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 9.0 | CC: | chhu, hongzliu, juzhou, kkiwi, lersek, mzhan, rjones, tyan, tzheng, vwu, xiaodwan |
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | virt-v2v-2.0.7-1.el9 | Doc Type: | Enhancement |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-15 09:55:51 UTC | Type: | Feature Request |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
mxie@redhat.com
2022-02-07 14:05:03 UTC
Thanks for documenting this! I'll leave it on the backlog, with Rich on cc since he started an internal discussion about it too.

What is the expected result here? Thanks.

(In reply to Laszlo Ersek from comment #4)
> What is the expected result here? Thanks.

Hi Laszlo, RHV can't support more than 23 disks because of the limitation on PCI slots or the amount of I/O space. Maybe the v2v conversion should fail directly, with a clear error message, when converting a guest with more than 23 disks.

Right, fail early. Later on we may consider fixing this - switching to virtio-scsi etc., but that's significantly hard work.

I should say, fail early only if the output mode (e.g. RHV + virtio-blk) doesn't support it. For output modes which do support it (e.g. probably libvirt), we don't want it to fail.

Right, that was going to be my next question -- so it's an output-specific "fail early". I'll investigate.

It looks like we'd best catch this in the RHV modules' "parse_options" function. Based on the names, it is not really a great fit ("setup" would fit better). However, between calling "Output_module.parse_options" and "Output_module.setup", v2v does the conversion (calls Convert.convert), and that's a lot of time wasted if we already know we have too many disks. And in "parse_options", we *can* know that, as we already pass "source" to "parse_options". In fact the call site of "Output_module.parse_options" is documented with "Do this before starting conversion to catch errors early".

Now here's one doubt I have: in "Output_rhv_upload.setup", we don't actually use "source.s_disks"; we use (via "Output.get_disks") the input-mode NBD sockets that the Input module created. In "parse_options", we *could* use that too -- but should we? I think it would be safer, as (I think?) "source.s_disks" may not be fully mapped to input-mode NBD sockets, but the latter is what is actually used for copying and for populating the OVF.

"Output_module.parse_options" is intentionally called after "Input_module.setup":

    (* Check and parse the output options on the command line.
     * Do this before starting conversion to catch errors early, but
     * we have to do it after creating the source above.
     *)

therefore I think "Output.get_disks" should be meaningfully callable from "parse_options" too. (Well, a variant of it -- we don't need the disk sizes, just their count, so we only need to count the input sockets, and not call NBD.get_size on each.)

Hmmm... at the same time, there's prior art for checking nr_disks only in "setup" [output/output_rhv_upload.ml]:

    let disk_uuids =
      match rhv_disk_uuids with
      | Some uuids ->
          let nr_disks = List.length disks in
          if List.length uuids <> nr_disks then
            error (f_"the number of ‘-oo rhv-disk-uuid’ parameters passed on the command line has to match the number of guest disk images (for this guest: %d)")
              nr_disks;
          uuids
      | None -> List.map (fun _ -> uuidgen ()) disks in

This is a check that could in fact be performed in "parse_options"; both the input sockets and (obviously) the "-oo rhv-disk-uuid" parameters are available there! Anyway, this existing practice gives me a license to perform the check in "setup". Much easier there; just hoist "nr_disks" near

    let rec setup dir options source =
      let disks = get_disks dir in

and check whether the list is longer than 23. (Based on "guestcaps.gcaps_block_bus", in "lib/create_ovf.ml" we create each disk description either as VirtIO (virtio-blk) or "IDE" -- and IDE is even stricter than virtio!)

I'll try a patch tomorrow.
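To make the plan above concrete, here is a minimal sketch of the setup-time check, assuming the module's existing `get_disks`, `error` and `f_` helpers and the `setup` signature quoted above; the constant name `max_disks` is an illustrative choice, and this is a sketch of the idea rather than the actual upstream patch:

```ocaml
(* Sketch only: abort before any copying starts if the guest has more
   disks than the RHV OVF output (virtio-blk or IDE addressing) allows. *)
let max_disks = 23

let rec setup dir options source =
  let disks = get_disks dir in
  let nr_disks = List.length disks in
  if nr_disks > max_disks then
    error (f_"this output module doesn't support copying more than %d disks")
      max_disks
  (* ... the rest of the existing setup body continues unchanged ... *)
```

The error string is the one that later appears in the test logs below.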
Affected modules (I think): all those that call "create_ovf" in "finalize":

- output/output_rhv.ml
- output/output_rhv_upload.ml
- output/output_vdsm.ml

It's probably worth introducing "Output.get_disk_count"! Because the input modules create the "in%d" sockets consecutively (starting with 0), if in<N> exists, that's equivalent to the domain having more than N disks that are going to be copied. Therefore we don't even need to enumerate all the "in%d" sockets (like get_disks does); we only need to check whether "in23" exists. If it does, we have at least 24 disks, so abort.

I kind of liked the idea of just checking List.length source.s_disks in Output_*.setup (comment 9), isn't that sufficient?

The patch I'm about to post is only slightly more complicated than that; my concern is that some entries in "source.s_disks" may be filtered out and not be represented as input NBD sockets (i.e., may not be copied ultimately). I couldn't reassuringly say that I'd investigated all the spots in all input modules that filter and/or translate "source.s_disks" to input NBD sockets. (I'd found it very hard to track the "disk list" across the various types and representations -- the variable names don't help (they're all called "disks") and the types are only inferred, not spelled out in the source.) Because the rhv-upload module already uses the input NBD socket count as "nr_disks" (for comparison against the "rhv-disk-uuid" count), I think it's safe to check for socket existence for this purpose as well. It is not computationally expensive and it sidesteps a lot of uncomfortable auditing.
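As a sketch of the socket-existence shortcut just described (hypothetical helper name, not the code that was actually merged): because the input side creates the NBD sockets consecutively as "in0", "in1", ... in the v2v temporary directory, a 24th disk exists exactly when the file "in23" does, so a single existence test suffices. `error` and `f_` are assumed to be the usual virt-v2v helpers.

```ocaml
(* Hypothetical sketch: [dir] is the v2v temporary directory in which the
   input module created the "in<N>" NBD sockets, one per disk to be copied. *)
let max_disks = 23

let check_disk_count dir =
  let socket = Filename.concat dir (Printf.sprintf "in%d" max_disks) in
  (* "in23" is the 24th socket; if it exists, the limit is exceeded. *)
  if Sys.file_exists socket then
    error (f_"this output module doesn't support copying more than %d disks")
      max_disks
```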
[v2v PATCH] RHV outputs: limit copied disk count to 23
Message-Id: <20220617090852.7534-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029250.html

[v2v PATCH v2] RHV outputs: limit copied disk count to 23
Message-Id: <20220617095337.9122-1-lersek>
https://listman.redhat.com/archives/libguestfs/2022-June/029254.html

(In reply to Laszlo Ersek from comment #15)
> [v2v PATCH v2] RHV outputs: limit copied disk count to 23
> Message-Id: <20220617095337.9122-1-lersek>
> https://listman.redhat.com/archives/libguestfs/2022-June/029254.html

Upstream commit e186cc2bea99.

Tested the bug with the builds below:

- virt-v2v-2.0.6-3.el9.x86_64
- libguestfs-1.48.3-4.el9.x86_64
- guestfs-tools-1.48.2-4.el9.x86_64
- nbdkit-server-1.30.6-1.el9.x86_64
- libnbd-1.12.4-2.el9.x86_64
- libvirt-libs-8.5.0-1.el9.x86_64
- qemu-img-7.0.0-8.el9.x86_64

Steps:

1. Convert a guest which has 41 disks from VMware to rhv via vddk by v2v

    # virt-v2v -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk7.0.3 -io vddk-thumbprint=76:75:59:0E:32:F5:1E:58:69:93:75:5A:7B:51:32:C5:D1:6D:F1:21 -o rhv-upload -of qcow2 -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data -b ovirtmgmt -ip /home/passwd esx7.0-rhel8.5-x86_64-num41-disks
    [ 0.0] Setting up the source: -i libvirt -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk esx7.0-rhel8.5-x86_64-num41-disks
    [ 46.4] Opening the source
    [ 96.3] Inspecting the source
    [ 107.9] Checking for sufficient free disk space in the guest
    [ 107.9] Converting Red Hat Enterprise Linux 8.5 (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 228.7] Mapping filesystem data to avoid copying unused and blank areas
    [ 232.6] Closing the overlay
    [ 233.4] Assigning disks to buses
    [ 233.4] Checking if the guest needs BIOS or UEFI to boot
    [ 233.4] Setting up the destination: -o rhv-upload -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data
    virt-v2v: error: this output module doesn't support copying more than 23 disks
    If reporting bugs, run virt-v2v with debugging enabled and include the complete output:
    virt-v2v -v -x [...]
    [root@dell-per740-53 home]# nbdkit: error: VDDK_PhoneHome: VddkVacPersistSessionData : Failed to persist session data at line 1420.
    nbdkit: error: VDDK_PhoneHome: VddkVacPersistSessionData : Failed to persist session data at line 1420.
    [... the same nbdkit VDDK_PhoneHome error repeated many more times, partly interleaved ...]
2. Convert a guest which has 41 disks from VMware to rhv without vddk by v2v

    # virt-v2v -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk7.0.3 -io vddk-thumbprint=76:75:59:0E:32:F5:1E:58:69:93:75:5A:7B:51:32:C5:D1:6D:F1:21 -o rhv -os 10.73.195.48:/home/nfs_export -of qcow2 -b ovirtmgmt -ip /home/passwd esx7.0-rhel8.5-x86_64-num41-disks
    [ 0.0] Setting up the source: -i libvirt -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk esx7.0-rhel8.5-x86_64-num41-disks
    [ 46.4] Opening the source
    [ 96.6] Inspecting the source
    [ 108.2] Checking for sufficient free disk space in the guest
    [ 108.2] Converting Red Hat Enterprise Linux 8.5 (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 230.8] Mapping filesystem data to avoid copying unused and blank areas
    [ 234.5] Closing the overlay
    [ 235.4] Assigning disks to buses
    [ 235.4] Checking if the guest needs BIOS or UEFI to boot
    [ 235.4] Setting up the destination: -o rhv
    virt-v2v: error: this output module doesn't support copying more than 23 disks
    If reporting bugs, run virt-v2v with debugging enabled and include the complete output:
    virt-v2v -v -x [...]
    [root@dell-per740-53 home]# nbdkit: error: VDDK_PhoneHome: VddkVacPersistSessionData : Failed to persist session data at line 1420.
    nbdkit: error: VDDK_PhoneHome: VddkVacPersistSessionData : Failed to persist session data at line 1420.
    [... the same nbdkit VDDK_PhoneHome error repeated many more times, partly interleaved ...]

Hi Laszlo, v2v reports the correct error about RHV not supporting the copying of more than 23 disks, but unexpected errors pop up after the virt-v2v error; bug 2083617 is a similar problem. For details please check the debug log.

Hi mxie,

two comments:

(1) Your two scenarios do not actually differ in vddk use. Both command lines use the exact same input parameters, and even some of the output parameters; the difference between them is the output mode:

scenario 1:

    # -o rhv-upload -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data

scenario 2:

    # -o rhv -os 10.73.195.48:/home/nfs_export

I'm highlighting this because you wrote:

> 1. Convert a guest which has 41 disks from VMware to rhv via vddk by v2v
> 2. Convert a guest which has 41 disks from VMware to rhv without vddk by v2v

but the true difference between the two scenarios is not vddk-or-not, but rhv-upload vs. rhv. This matters because the second scenario still emits the VDDK_PhoneHome messages, and that would be inexplicable if vddk were not in use at all. So, I'm just saying that vddk *is* used in scenario #2 as well.

(2) I think that the new VDDK_PhoneHome error messages are unrelated to this BZ. What happens is that, during output module setup, virt-v2v exits due to the domain having too many disks, and then (a) virt-v2v kills its nbdkit children (see "On_exit.kill pid" in "input/input_vddk.ml"), and (b) the nbdkit children were started with "--exit-with-parent" anyway [lib/nbdkit.ml], so they'd exit themselves too. And this is when the vddk library (threads) in nbdkit throw a fit. So the only thing that matters here is that the nbdkit processes exit -- due to their parent virt-v2v process exiting -- when the vddk library (threads) don't expect that. It's not specific to virt-v2v at all, IMO.
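For illustration, the teardown pattern described in point (2) as a generic OCaml sketch; this is not virt-v2v's actual On_exit module or lib/nbdkit.ml, just the shape of the mechanism: the parent registers an exit handler that signals its helper child, so an early error exit (such as the new too-many-disks check) also brings the nbdkit children down, and that is the moment the VDDK threads complain.

```ocaml
(* Generic sketch: ensure a helper child process (e.g. an nbdkit instance)
   is terminated whenever the parent exits, normally or via an early error. *)
let kill_child_on_exit pid =
  at_exit (fun () ->
    try Unix.kill pid Sys.sigterm
    with Unix.Unix_error (_, _, _) -> () (* the child may already be gone *))
```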
I wonder if we should generalize the patch for bug 2083617 (that is, commit 270ee75ede38, "vddk: Demote another useless phone-home error message to debug", 2022-05-10) -- should we demote *all* vddk error messages that contain "VDDK_PhoneHome" to debug messages? This "phone home" feature of vddk ought to be something that we *never* care about, right? That would require a separate BZ. Rich, what's your take? Thanks.

The new VDDK_PhoneHome messages do indeed represent a new variation of bug 2083617, and can be fixed by generalising the strstr expression. Can you (mxie) file a new bug about that one?

(In reply to Laszlo Ersek from comment #19)
> Hi mxie,
>
> two comments:
>
> (1) Your two scenarios do not actually differ in vddk use. Both command
> lines use the exact same input parameters, and even some of the output
> parameters; the difference between them is the output mode:
> I'm highlighting this because you wrote:
>
> > 1. Convert a guest which has 41 disks from VMware to rhv via vddk by v2v
> > 2. Convert a guest which has 41 disks from VMware to rhv without vddk by v2v
>
> but the true difference between the two scenarios is not vddk-or-not, but
> rhv-upload vs. rhv.

Sorry, that was my mistake.

(In reply to Richard W.M. Jones from comment #20)
> The new VDDK_PhoneHome messages do indeed represent a new variation
> of bug 2083617, and can be fixed by generalising the strstr expression.
> Can you (mxie) file a new bug about that one?

Filed bug 2104720, thanks.

Verified the bug with the builds below:

- virt-v2v-2.0.7-4.el9.x86_64
- libguestfs-1.48.4-1.el9.x86_64
- guestfs-tools-1.48.2-5.el9.x86_64
- nbdkit-server-1.30.8-1.el9.x86_64
- libnbd-1.12.6-1.el9.x86_64
- libvirt-libs-8.5.0-4.el9.x86_64
- qemu-img-7.0.0-9.el9.x86_64

Steps:

1. Convert a guest which has 41 disks from VMware to rhv via '-o rhv-upload' by v2v

    # virt-v2v -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk7.0.3 -io vddk-thumbprint=76:75:59:0E:32:F5:1E:58:69:93:75:5A:7B:51:32:C5:D1:6D:F1:21 -o rhv-upload -of qcow2 -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data -b ovirtmgmt -ip /home/passwd Auto-esx7.0-rhel8.7-with-41-disks
    [ 0.3] Setting up the source: -i libvirt -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk Auto-esx7.0-rhel8.7-with-41-disks
    [ 46.7] Opening the source
    [ 97.8] Inspecting the source
    [ 117.7] Checking for sufficient free disk space in the guest
    [ 117.7] Converting Red Hat Enterprise Linux 8.7 Beta (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 260.0] Mapping filesystem data to avoid copying unused and blank areas
    [ 263.7] Closing the overlay
    [ 264.7] Assigning disks to buses
    [ 264.7] Checking if the guest needs BIOS or UEFI to boot
    [ 264.7] Setting up the destination: -o rhv-upload -oc https://dell-per740-22.lab.eng.pek2.redhat.com/ovirt-engine/api -os nfs_data
    virt-v2v: error: this output module doesn't support copying more than 23 disks
    If reporting bugs, run virt-v2v with debugging enabled and include the complete output:
    virt-v2v -v -x [...]
2. Convert a guest which has 41 disks from VMware to rhv via '-o rhv' by v2v

    # virt-v2v -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk -io vddk-libdir=/home/vddk7.0.3 -io vddk-thumbprint=76:75:59:0E:32:F5:1E:58:69:93:75:5A:7B:51:32:C5:D1:6D:F1:21 -o rhv -os 10.73.195.48:/home/nfs_export -b ovirtmgmt -ip /home/passwd Auto-esx7.0-rhel8.7-with-41-disks
    [ 0.2] Setting up the source: -i libvirt -ic vpx://root.227.27/data/10.73.199.217/?no_verify=1 -it vddk Auto-esx7.0-rhel8.7-with-41-disks
    [ 46.4] Opening the source
    [ 125.0] Inspecting the source
    [ 138.0] Checking for sufficient free disk space in the guest
    [ 138.0] Converting Red Hat Enterprise Linux 8.7 Beta (Ootpa) to run on KVM
    virt-v2v: This guest has virtio drivers installed.
    [ 247.6] Mapping filesystem data to avoid copying unused and blank areas
    [ 252.6] Closing the overlay
    [ 253.6] Assigning disks to buses
    [ 253.6] Checking if the guest needs BIOS or UEFI to boot
    [ 253.6] Setting up the destination: -o rhv
    virt-v2v: error: this output module doesn't support copying more than 23 disks
    If reporting bugs, run virt-v2v with debugging enabled and include the complete output:
    virt-v2v -v -x [...]

Result: the bug has been fixed, so the bug is moved from ON_QA to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: virt-v2v security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7968