Bug 1916176 - Virt-v2v can't convert guest from ESXi6.0 and ESXi6.5 via (vddk6.7 or vddk7.0) + rhv-upload to rhv4.4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: virt-v2v
Version: 8.3
Hardware: x86_64
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.4
Assignee: Richard W.M. Jones
QA Contact: liuzi
URL:
Whiteboard:
Depends On: 1911568
Blocks: 1939375
 
Reported: 2021-01-14 11:59 UTC by mxie@redhat.com
Modified: 2021-10-13 09:28 UTC
CC List: 14 users

Fixed In Version: virt-v2v-1.42.0-7.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1911568
: 1939375 (view as bug list)
Environment:
Last Closed: 2021-02-22 15:39:42 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
engine.log (55.30 KB, text/plain), 2021-01-14 15:11 UTC, mxie@redhat.com
vdsm.log (140.45 KB, text/plain), 2021-01-14 15:12 UTC, mxie@redhat.com


Links
oVirt gerrit 113088 (master, NEW): image_transfer: Add more options for testing (last updated 2021-02-08 05:28:45 UTC)

Description mxie@redhat.com 2021-01-14 11:59:46 UTC
This bug blocks some v2v test cases and can be reproduced on rhel8.3.1, so it is cloned here.

Packages:
virt-v2v-1.42.0-6.module+el8.3.0+7898+13f907d5.x86_64
libguestfs-1.42.0-2.module+el8.3.0+6798+ad6e66be.x86_64
libvirt-6.6.0-11.module+el8.3.1+9196+74a80ca4.x86_64
qemu-kvm-5.1.0-17.module+el8.3.1+9213+7ace09c3.x86_64
nbdkit-1.22.0-2.module+el8.3.0+8203+18ecf00e.x86_64
kernel-4.18.0-240.el8.x86_64



+++ This bug was initially created as a clone of Bug #1911568 +++

Description of problem:
Can't convert guest from ESXi6.5 host to rhv4.4 via rhv-upload by virt-v2v 

Version-Release number of selected component (if applicable):
virt-v2v-1.42.0-6.module+el8.4.0+8855+a9e237a9.x86_64
libguestfs-1.42.0-2.module+el8.4.0+8855+a9e237a9.x86_64
libvirt-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64
qemu-kvm-5.2.0-2.module+el8.4.0+9186+ec44380f.x86_64
nbdkit-1.22.0-2.module+el8.4.0+8855+a9e237a9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Convert a guest from ESXi6.5 host to rhv4.4 via rhv-upload and vddk7.0 by virt-v2v
# virt-v2v -ic esx://root.196.89/?no_verify=1 -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io  vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93 --password-file /home/esxpw  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx6.5-rhel8.3-x86_64
[   0.6] Opening the source -i libvirt -ic esx://root.196.89/?no_verify=1 esx6.5-rhel8.3-x86_64 -it vddk  -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93
[   1.9] Creating an overlay to protect the source from being modified
[   5.5] Opening the overlay
[  12.6] Inspecting the overlay
[  23.9] Checking for sufficient free disk space in the guest
[  23.9] Estimating space required on target for each disk
[  23.9] Converting Red Hat Enterprise Linux 8.3 (Ootpa) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  87.8] Mapping filesystem data to avoid copying unused and blank areas
[  89.1] Closing the overlay
[  89.4] Assigning disks to buses
[  89.4] Checking if the guest needs BIOS or UEFI to boot
[  89.4] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[  90.7] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.klWPv9/nbdkit4.sock", "file.export": "/" } (qcow2)
nbdkit: python[1]: error: /var/tmp/rhvupload.i9oNvl/rhv-upload-plugin.py: pwrite: error: Traceback (most recent call last):
   File "/var/tmp/rhvupload.i9oNvl/rhv-upload-plugin.py", line 94, in wrapper
    return func(h, *args)
   File "/var/tmp/rhvupload.i9oNvl/rhv-upload-plugin.py", line 230, in pwrite
    r = http.getresponse()
   File "/usr/lib64/python3.6/http/client.py", line 1361, in getresponse
    response.begin()
   File "/usr/lib64/python3.6/http/client.py", line 311, in begin
    version, status, reason = self._read_status()
   File "/usr/lib64/python3.6/http/client.py", line 280, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
 http.client.RemoteDisconnected: Remote end closed connection without response

qemu-img: error while writing at byte 0: Input/output error

virt-v2v: error: qemu-img command failed, see earlier errors

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

2. Convert a guest from ESXi6.5 host to rhv4.4 via rhv-upload and vddk6.7 by virt-v2v
#  virt-v2v -ic esx://root.196.89/?no_verify=1 -it vddk -io vddk-libdir=/root/vmware-vix-disklib-distrib -io  vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93 --password-file /home/esxpw  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx6.5-rhel8.3-x86_64
[   0.5] Opening the source -i libvirt -ic esx://root.196.89/?no_verify=1 esx6.5-rhel8.3-x86_64 -it vddk  -io vddk-libdir=/root/vmware-vix-disklib-distrib -io vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93
[   1.7] Creating an overlay to protect the source from being modified
[   5.1] Opening the overlay
[  12.5] Inspecting the overlay
[  23.8] Checking for sufficient free disk space in the guest
[  23.8] Estimating space required on target for each disk
[  23.8] Converting Red Hat Enterprise Linux 8.3 (Ootpa) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  89.9] Mapping filesystem data to avoid copying unused and blank areas
[  91.2] Closing the overlay
[  91.4] Assigning disks to buses
[  91.4] Checking if the guest needs BIOS or UEFI to boot
[  91.4] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[  92.7] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.8T1pH3/nbdkit4.sock", "file.export": "/" } (qcow2)
nbdkit: python[1]: error: /var/tmp/rhvupload.Op62uQ/rhv-upload-plugin.py: pwrite: error: Traceback (most recent call last):
   File "/var/tmp/rhvupload.Op62uQ/rhv-upload-plugin.py", line 94, in wrapper
    return func(h, *args)
   File "/var/tmp/rhvupload.Op62uQ/rhv-upload-plugin.py", line 230, in pwrite
    r = http.getresponse()
   File "/usr/lib64/python3.6/http/client.py", line 1361, in getresponse
    response.begin()
   File "/usr/lib64/python3.6/http/client.py", line 311, in begin
    version, status, reason = self._read_status()
   File "/usr/lib64/python3.6/http/client.py", line 280, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
 http.client.RemoteDisconnected: Remote end closed connection without response

qemu-img: error while writing at byte 0: Input/output error

virt-v2v: error: qemu-img command failed, see earlier errors

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]



Actual results:
As described above

Expected results:
Can convert guests from ESXi6.5 host to rhv4.4 via rhv-upload by virt-v2v

Additional info:
1. Can convert a guest from ESXi6.5 host to rhv4.3 via rhv-upload and vddk6.7 by virt-v2v
#  virt-v2v -ic esx://root.196.89/?no_verify=1 -it vddk -io vddk-libdir=/root/vmware-vix-disklib-distrib -io  vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93 --password-file /home/esxpw  -o rhv-upload -of qcow2 -os nfs_data -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  esx6.5-rhel8.3-x86_64
[   0.5] Opening the source -i libvirt -ic esx://root.196.89/?no_verify=1 esx6.5-rhel8.3-x86_64 -it vddk  -io vddk-libdir=/root/vmware-vix-disklib-distrib -io vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93
[   2.3] Creating an overlay to protect the source from being modified
[   6.6] Opening the overlay
[  14.0] Inspecting the overlay
[  25.5] Checking for sufficient free disk space in the guest
[  25.5] Estimating space required on target for each disk
[  25.5] Converting Red Hat Enterprise Linux 8.3 (Ootpa) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  91.9] Mapping filesystem data to avoid copying unused and blank areas
[  93.1] Closing the overlay
[  93.4] Assigning disks to buses
[  93.4] Checking if the guest needs BIOS or UEFI to boot
[  93.4] Initializing the target -o rhv-upload -oc https://ibm-x3250m5-03.rhts.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[  94.6] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.ZnxySH/nbdkit4.sock", "file.export": "/" } (qcow2)
    (100.00/100%)
[1105.2] Creating output metadata
[1106.6] Finishing off

2. Can convert a guest from ESXi7.0 host to rhv4.4 via rhv-upload by virt-v2v
# virt-v2v -ic esx://root.199.217/?no_verify=1 -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib -io  vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75 --password-file /home/esxpw  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx7.0-rhel8.3-x86_64
[   0.5] Opening the source -i libvirt -ic esx://root.199.217/?no_verify=1 esx7.0-rhel8.3-x86_64 -it vddk  -io vddk-libdir=/home/vmware-vix-disklib-distrib -io vddk-thumbprint=C2:99:4E:B8:87:75:E8:41:71:6B:38:DA:07:C4:6B:0E:66:18:C0:75
[   1.6] Creating an overlay to protect the source from being modified
[   2.2] Opening the overlay
[   6.2] Inspecting the overlay
[  16.4] Checking for sufficient free disk space in the guest
[  16.4] Estimating space required on target for each disk
[  16.4] Converting Red Hat Enterprise Linux 8.3 (Ootpa) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  68.5] Mapping filesystem data to avoid copying unused and blank areas
[  68.9] Closing the overlay
[  69.2] Assigning disks to buses
[  69.2] Checking if the guest needs BIOS or UEFI to boot
[  69.2] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[  70.5] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.B0OElw/nbdkit4.sock", "file.export": "/" } (qcow2)
    (100.00/100%)
[ 403.4] Creating output metadata
[ 405.8] Finishing off

3. Can convert a guest from ESXi6.5 host to rhv4.4 via rhv by virt-v2v
# virt-v2v -ic esx://root.196.89/?no_verify=1 -it vddk -io vddk-libdir=/root/vmware-vix-disklib-distrib -io  vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93 --password-file /home/esxpw -o rhv -os 10.73.224.29:/home/iscsi_export esx6.5-win2008r2-x86_64 -on esx6.5-win2008r2-x86_64-juzhou
[   0.0] Opening the source -i libvirt -ic esx://root.196.89/?no_verify=1 esx6.5-win2008r2-x86_64 -it vddk  -io vddk-libdir=/root/vmware-vix-disklib-distrib -io vddk-thumbprint=23:4D:35:12:8A:34:64:B2:53:5F:EA:E9:E0:6D:48:CC:9B:4E:48:93
[   1.4] Creating an overlay to protect the source from being modified
[   4.7] Opening the overlay
[ 633.9] Inspecting the overlay
[ 882.9] Checking for sufficient free disk space in the guest
[ 882.9] Estimating space required on target for each disk
[ 882.9] Converting Windows Server 2008 R2 Standard to run on KVM
virt-v2v: warning: /usr/share/virt-tools/pnp_wait.exe is missing.  
Firstboot scripts may conflict with PnP.
[ 990.5] Mapping filesystem data to avoid copying unused and blank areas
[ 999.1] Closing the overlay
[ 999.4] Assigning disks to buses
[ 999.4] Checking if the guest needs BIOS or UEFI to boot
[ 999.4] Initializing the target -o rhv -os 10.73.224.29:/home/iscsi_export
[ 999.6] Copying disk 1/1 to /tmp/v2v.PqWFZA/153d6d22-8da8-43a9-ae8d-16d432f7614e/images/51c670f9-a434-4fa0-95d5-d8c0214a0a7e/0045a4dd-a24a-492b-a17a-ebe204404541 (raw)
    (100.00/100%)
[1627.0] Creating output metadata
[1627.0] Finishing off

--- Additional comment from mxie on 2021-01-04 02:55:18 UTC ---

Adding more info; still didn't find the root cause:

1. Can convert a guest from ESXi6.5 host to rhv4.4 via rhv-upload by virt-v2v if vddk is not used in the conversion
# virt-v2v -ic vpx://root.73.141/data/10.73.196.89/?no_verify=1  --password-file /home/passwd  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx6.5-rhel8.3-x86_64
[   1.2] Opening the source -i libvirt -ic vpx://root.73.141/data/10.73.196.89/?no_verify=1 esx6.5-rhel8.3-x86_64
[   4.1] Creating an overlay to protect the source from being modified
[   4.6] Opening the overlay
[  40.4] Inspecting the overlay
[ 109.1] Checking for sufficient free disk space in the guest
[ 109.1] Estimating space required on target for each disk
[ 109.1] Converting Red Hat Enterprise Linux 8.3 (Ootpa) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 994.6] Mapping filesystem data to avoid copying unused and blank areas
[ 997.6] Closing the overlay
[ 997.9] Assigning disks to buses
[ 997.9] Checking if the guest needs BIOS or UEFI to boot
[ 997.9] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[ 999.5] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.RmFM5N/nbdkit4.sock", "file.export": "/" } (qcow2)
    (100.00/100%)
[1716.2] Creating output metadata
[1718.8] Finishing off

2. Can convert a guest from ESXi6.7 host to rhv4.4 via vddk7.0 + rhv-upload by virt-v2v
# virt-v2v -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 -it vddk -io vddk-libdir=/home/vmware-vix-disklib-distrib/ -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA esx6.7-rhel6.10-x86_64 -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS -ip /home/passwd 
[   1.2] Opening the source -i libvirt -ic vpx://root.73.141/data/10.73.75.219/?no_verify=1 esx6.7-rhel6.10-x86_64 -it vddk  -io vddk-libdir=/home/vmware-vix-disklib-distrib/ -io vddk-thumbprint=1F:97:34:5F:B6:C2:BA:66:46:CB:1A:71:76:7D:6B:50:1E:03:00:EA
[   3.2] Creating an overlay to protect the source from being modified
[   4.3] Opening the overlay
[  11.1] Inspecting the overlay
[  29.8] Checking for sufficient free disk space in the guest
[  29.8] Estimating space required on target for each disk
[  29.8] Converting Red Hat Enterprise Linux Server release 6.10 (Santiago) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 172.5] Mapping filesystem data to avoid copying unused and blank areas
[ 173.4] Closing the overlay
[ 173.8] Assigning disks to buses
[ 173.8] Checking if the guest needs BIOS or UEFI to boot
[ 173.8] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[ 175.3] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.pOtojF/nbdkit4.sock", "file.export": "/" } (qcow2)
    (100.00/100%)
[ 763.8] Creating output metadata
[ 766.3] Finishing off

3. Can't convert a guest from ESXi6.0 host to rhv4.4 via vddk6.7 + rhv-upload by virt-v2v
# virt-v2v -ic vpx://root.73.148/data/10.73.72.61/?no_verify=1 -it vddk -io vddk-libdir=/root/vmware-vix-disklib-distrib -io  vddk-thumbprint=AA:F5:4C:48:C9:BF:75:1A:94:41:61:4C:D5:EC:DF:46:48:B5:9B:4D  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx6.0-rhel7.9-x86_64 -ip /home/passwd -oo rhv-direct
[   0.8] Opening the source -i libvirt -ic vpx://root.73.148/data/10.73.72.61/?no_verify=1 esx6.0-rhel7.9-x86_64 -it vddk  -io vddk-libdir=/root/vmware-vix-disklib-distrib -io vddk-thumbprint=AA:F5:4C:48:C9:BF:75:1A:94:41:61:4C:D5:EC:DF:46:48:B5:9B:4D
[   2.4] Creating an overlay to protect the source from being modified
[   5.9] Opening the overlay
[  13.6] Inspecting the overlay
[  44.6] Checking for sufficient free disk space in the guest
[  44.6] Estimating space required on target for each disk
[  44.6] Converting Red Hat Enterprise Linux Server 7.9 (Maipo) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 228.5] Mapping filesystem data to avoid copying unused and blank areas
[ 228.9] Closing the overlay
[ 229.1] Assigning disks to buses
[ 229.1] Checking if the guest needs BIOS or UEFI to boot
[ 229.1] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[ 230.4] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.0nzR8E/nbdkit4.sock", "file.export": "/" } (qcow2)
nbdkit: python[1]: error: /var/tmp/rhvupload.0OKqWA/rhv-upload-plugin.py: pwrite: error: Traceback (most recent call last):
   File "/var/tmp/rhvupload.0OKqWA/rhv-upload-plugin.py", line 94, in wrapper
    return func(h, *args)
   File "/var/tmp/rhvupload.0OKqWA/rhv-upload-plugin.py", line 230, in pwrite
    r = http.getresponse()
   File "/usr/lib64/python3.6/http/client.py", line 1361, in getresponse
    response.begin()
   File "/usr/lib64/python3.6/http/client.py", line 311, in begin
    version, status, reason = self._read_status()
   File "/usr/lib64/python3.6/http/client.py", line 280, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
 http.client.RemoteDisconnected: Remote end closed connection without response

qemu-img: error while writing at byte 0: Input/output error

virt-v2v: error: qemu-img command failed, see earlier errors

If reporting bugs, run virt-v2v with debugging enabled and include the 
complete output:

  virt-v2v -v -x [...]

4. Can convert a guest from ESXi5.5 host to rhv4.4 via vddk6.7 + rhv-upload by virt-v2v
# virt-v2v -ic vpx://root.73.148/data/10.73.3.19/?no_verify=1 -it vddk -io vddk-libdir=/root/vmware-vix-disklib-distrib -io  vddk-thumbprint=AA:F5:4C:48:C9:BF:75:1A:94:41:61:4C:D5:EC:DF:46:48:B5:9B:4D  -o rhv-upload -of qcow2 -os nfs_data -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api  -op /home/rhvpasswd  -oo rhv-cluster=NFS esx5.5-rhel7.9-x86_64 -ip /home/passwd -oo rhv-direct
[   0.7] Opening the source -i libvirt -ic vpx://root.73.148/data/10.73.3.19/?no_verify=1 esx5.5-rhel7.9-x86_64 -it vddk  -io vddk-libdir=/root/vmware-vix-disklib-distrib -io vddk-thumbprint=AA:F5:4C:48:C9:BF:75:1A:94:41:61:4C:D5:EC:DF:46:48:B5:9B:4D
[   2.7] Creating an overlay to protect the source from being modified
nbdkit: vddk[1]: error: [NFC ERROR] NfcFssrvr_Close: Received unexpected message: NFC_SESSION_COMPLETE from server. Expected message: NFC_FSSRVR_CLOSE
[   6.1] Opening the overlay
nbdkit: vddk[2]: error: [NFC ERROR] NfcFssrvr_IOEx: Received unexpected message: NFC_SESSION_COMPLETE from server. Expected message: NFC_FSSRVR_MULTIIO_EX
nbdkit: vddk[2]: error: VixDiskLib_Read: Unknown error
nbdkit: vddk[2]: error: [NFC ERROR] NfcNetTcpSetError: Broken pipe
nbdkit: vddk[2]: error: [NFC ERROR] NfcNetTcpWrite: bWritten: -1. Errno: 32.
nbdkit: vddk[2]: error: [NFC ERROR] NfcSendMessage: NfcNet_Send failed: NFC_NETWORK_ERROR
nbdkit: vddk[2]: error: [NFC ERROR] NfcFssrvr_Close: Failed to send close message: The operation experienced a network error (NFC_NETWORK_ERROR)
nbdkit: vddk[2]: error: SSL: Unknown SSL Error
nbdkit: vddk[2]: error: [NFC ERROR] NfcNetTcpSetError: Success
nbdkit: vddk[2]: error: [NFC ERROR] NfcNetTcpWrite: bWritten: -1. Errno: 0.
nbdkit: vddk[2]: error: [NFC ERROR] NfcSendMessage: NfcNet_Send failed: NFC_NETWORK_ERROR
[  18.3] Inspecting the overlay
[  36.8] Checking for sufficient free disk space in the guest
[  36.8] Estimating space required on target for each disk
[  36.8] Converting Red Hat Enterprise Linux Server 7.9 (Maipo) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 154.0] Mapping filesystem data to avoid copying unused and blank areas
[ 154.5] Closing the overlay
[ 154.7] Assigning disks to buses
[ 154.7] Checking if the guest needs BIOS or UEFI to boot
[ 154.7] Initializing the target -o rhv-upload -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /home/rhvpasswd -os nfs_data
[ 156.0] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/tmp/v2vnbdkit.OZt1Nj/nbdkit4.sock", "file.export": "/" } (qcow2)
nbdkit: vddk[3]: error: [NFC ERROR] NfcFssrvr_IOEx: Received unexpected message: NFC_SESSION_COMPLETE from server. Expected message: NFC_FSSRVR_MULTIIO_EX
nbdkit: vddk[3]: error: VixDiskLib_Read: Unknown error
nbdkit: vddk[3]: error: [NFC ERROR] NfcNetTcpSetError: Broken pipe
nbdkit: vddk[3]: error: [NFC ERROR] NfcNetTcpWrite: bWritten: -1. Errno: 32.
nbdkit: vddk[3]: error: [NFC ERROR] NfcSendMessage: NfcNet_Send failed: NFC_NETWORK_ERROR
nbdkit: vddk[3]: error: [NFC ERROR] NfcFssrvr_Close: Failed to send close message: The operation experienced a network error (NFC_NETWORK_ERROR)
nbdkit: vddk[3]: error: SSL: Unknown SSL Error
nbdkit: vddk[3]: error: [NFC ERROR] NfcNetTcpSetError: Success
nbdkit: vddk[3]: error: [NFC ERROR] NfcNetTcpWrite: bWritten: -1. Errno: 0.
nbdkit: vddk[3]: error: [NFC ERROR] NfcSendMessage: NfcNet_Send failed: NFC_NETWORK_ERROR
    (100.00/100%)
[ 768.4] Creating output metadata
[ 770.7] Finishing off

--- Additional comment from mxie on 2021-01-12 08:26:10 UTC ---

Used virt-v2v to convert guests from different versions of ESXi hosts via different versions of vddk and rhv-upload to rhv4.4; the test results are summarized below:

           VDDK6.5    VDDK6.7    VDDK7.0

ESXi5.5     PASS       PASS       PASS

ESXi6.0     PASS       FAIL       FAIL

ESXi6.5     PASS       FAIL       FAIL

ESXi6.7     PASS       PASS       PASS

ESXi7.0     PASS       PASS       PASS


According to the above results, virt-v2v can't convert guests from ESXi6.0 and ESXi6.5 via (vddk6.7 or vddk7.0) + rhv-upload to rhv4.4.

Comment 3 Richard W.M. Jones 2021-01-14 14:07:35 UTC
Sorry, I haven't been paying proper attention to this bug.  The
actual error appears to be on the RHV side.  It unexpectedly
drops the connection.

Do we have RHV logs when the error occurs?

Comment 4 mxie@redhat.com 2021-01-14 15:11:45 UTC
Created attachment 1747449 [details]
engine.log

Comment 5 mxie@redhat.com 2021-01-14 15:12:20 UTC
Created attachment 1747451 [details]
vdsm.log

Comment 6 Richard W.M. Jones 2021-01-15 09:05:36 UTC
We think what is happening here is:

(1) VDDK >= 6.7 supports querying for extents.  This causes qemu-img convert to
spend a long time at the beginning of the transfer mapping out the extents of the
disk before starting to copy anything.  This is why the bug only happens with VDDK >= 6.7.

(2) On the RHV side, we obtain a ticket when we first connect to RHV:

https://github.com/libguestfs/virt-v2v/blob/96a5d7d058aeb107f785f52e166f42c2a1e797a1/v2v/rhv-upload-plugin.py#L115

(3) Because the actual copying of data is delayed so much (by point (1)), by the time we get
to writing to RHV the ticket has expired.

https://github.com/libguestfs/virt-v2v/blob/96a5d7d058aeb107f785f52e166f42c2a1e797a1/v2v/rhv-upload-plugin.py#L216

(4) The write to RHV fails abruptly.

I'm not sure why this is only seen with ESXi 6.0 and 6.5.  It might be that those
systems are much slower at answering or rejecting extents queries, or simply that
the particular systems we have are slower in general.

In the long term we will fix this using nbdcopy which has a better approach to
extent querying compared to qemu-img, but that work is still TBD upstream.

Nir: Is it possible to request a longer timeout for transfer tickets?  Ideally we
don't want them to expire at all, or we would set a very very long timeout, eg 24h.

Comment 8 Nir Soffer 2021-01-15 21:49:38 UTC
(In reply to Richard W.M. Jones from comment #6)
> We think what is happening here is:
...
> (4) The write to RHV fails abruptly.

imageio uses a 300 second timeout for tickets, and extends the ticket
automatically on every request. If the client is idle for 5 minutes, the
ticket will expire.

However, the engine extends the ticket every minute, so the ticket should
not expire until inactivity_timeout seconds have passed.

So it sounds like we have a bug in ticket monitoring, or mapping extents
takes more than 3600 seconds?

We need imageio log to be sure about this failure:
/var/log/ovirt-imageio/daemon.log.

Looking in vdsm.log, we see only one transfer (e03763d3-d561-4778-b380-222d480ed7db).

The ticket was added here:

2021-01-14 23:01:18,576+0800 INFO  (jsonrpc/7) [vdsm.api] START add_image_ticket(ticket={'dirty': False, 'ops': ['write'], 'size': 16106127360, 'sparse': True, 'transfer_id': 'e03763d3-d561-4778-b380-222d480ed7db', 'uuid': '3328764f-74b9-444b-a4d4-bae3b725b9bc', 'timeout': 300, 'url': 'nbd:unix:/run/vdsm/nbd/3328764f-74b9-444b-a4d4-bae3b725b9bc.sock'}) from=::ffff:10.73.72.65,43588, flow_id=70bc85c8-aea0-43a5-9c1a-2c83d7144299, task_id=a37412de-071c-4782-ad45-f9ce767b5dd7 (api:48)

It was extended here, one minute after it was added (expected):

2021-01-14 23:02:23,724+0800 INFO  (jsonrpc/6) [vdsm.api] START extend_image_ticket(uuid='3328764f-74b9-444b-a4d4-bae3b725b9bc', timeout=300) from=::ffff:10.73.72.65,43588, flow_id=70bc85c8-aea0-43a5-9c1a-2c83d7144299, task_id=b809e8f1-d1f8-4b4e-bb49-71898c1ec0dd (api:48)

Finally it was removed here, 2 minutes after the ticket was added:

2021-01-14 23:03:03,931+0800 INFO  (jsonrpc/6) [vdsm.api] START remove_image_ticket(uuid='3328764f-74b9-444b-a4d4-bae3b725b9bc') from=::ffff:10.73.72.65,43588, flow_id=70bc85c8-aea0-43a5-9c1a-2c83d7144299, task_id=72fe3c45-ab50-460e-9650-3ae18792b349 (api:48)

I think what happened is this:

1. rhv-upload-plugin opened a connection and sent OPTIONS request, but
   never send any other request.

2. after 60 seconds, the idle connection is closed by imageio.
   This is a fix in 4.4. In older versions imageio did not specify
   socket timeout, so the builtin default socket timeout was used
   (value was several minutes). This could cause long delays in other
   flows and can increase resources usage for idle connections.

3. engine detected that the client was disconnected and assumed that the
   transfer was aborted by the client.

I think we need to fix engine to respect the inactivity timeout we specify
when starting a transfer, even if the ticket has no connected clients.

...
> Nir: Is it possible to request a longer timeout for transfer tickets? 
> Ideally we
> don't want them to expire at all, or we would set a very very long timeout,
> eg 24h.

The current code already sets inactivity_timeout, so this should be fixed
in RHV.

Can you file ovirt-engine bug for this?

Comment 9 Nir Soffer 2021-01-15 22:18:01 UTC
(In reply to Nir Soffer from comment #8)
> 1. rhv-upload-plugin opened a connection and sent OPTIONS request, but
>    never send any other request.

Actually after we send an OPTIONS request, we replace the http/tcp connection
with http/unix connection (optimize_http). This connection is first used when
pwrite() or zero() are called.

Another way to solve this issue on virt-v2v side is to open another
connection and send an OPTIONS request every 50 seconds to make sure the
ticket is kept alive. This thread can be stopped when qemu starts to
write data.
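
For illustration only, a minimal sketch of this keep-alive idea, assuming a hypothetical start_keepalive() helper and ignoring TLS/certificate details; this is not the plugin's actual code:

import threading
import http.client

KEEPALIVE_INTERVAL = 50  # seconds, just under imageio's 60 second idle timeout

def start_keepalive(host, port, path, stop_event):
    """Send OPTIONS requests on a separate connection until stop_event is set."""
    def loop():
        # Event.wait() returns True once stop_event is set, which ends the loop.
        while not stop_event.wait(KEEPALIVE_INTERVAL):
            conn = http.client.HTTPSConnection(host, port)
            try:
                conn.request("OPTIONS", path)
                conn.getresponse().read()  # any request keeps the ticket active
            except OSError:
                pass  # best effort; the data connection is separate
            finally:
                conn.close()
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t

# In open():                 stop = threading.Event(); start_keepalive(host, 54322, "/images/<ticket>", stop)
# In first pwrite()/zero():  stop.set()  # qemu has started writing, keep-alive no longer needed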

Comment 11 mxie@redhat.com 2021-01-18 08:20:41 UTC
 > Can you file ovirt-engine bug for this?

Hi Nir, do you want to fix the bug on ovirt-engine? If yes, I think we could change the component of the bug to ovirt-engine directly.

Comment 12 Nir Soffer 2021-01-18 08:50:38 UTC
(In reply to mxie from comment #11)
This should be fixed in ovirt-engine. You can move the bug, but then
it will not be tested with virt-v2v.

Comment 13 Richard W.M. Jones 2021-01-18 09:49:07 UTC
(In reply to Nir Soffer from comment #8)
> I think what happened is this:
> 
> 1. rhv-upload-plugin opened a connection and sent OPTIONS request, but
>    never send any other request.
> 
> 2. after 60 seconds, the idle connection is closed by imageio.
>    This is a fix in 4.4. In older versions imageio did not specify
>    socket timeout, so the builtin default socket timeout was used
>    (value was several minutes). This could cause long delays in other
>    flows and can increase resources usage for idle connections.
> 
> 3. engine detected that the client was disconnected and assumed that the
>    transfer was aborted by the client.
> 
> I think we need to fix engine to respect the inactivity timeout we specify
> when starting a transfer, even if the ticket has no connected clients.

Looking at the code, this sounds very plausible to me.

From the qemu/v2v side, what's going on is that we open a connection in rhv-upload-plugin:open().
Then qemu-img spends ages doing extent mapping on the source side (easily longer
than 60 seconds), and only then writes the first block to the target (calling
pwrite or zero in the plugin).

> Actually after we send an OPTIONS request, we replace the http/tcp connection
> with http/unix connection (optimize_http). This connection is first used when
> pwrite() or zero() are called.

I see the "imageio features" message in the virt-v2v log, which is printed
after OPTIONS.  Then much later pwrite() is called (after extents mapping)
and that fails.

> Another way to solve this issue on virt-v2v side is to open another
> connection and send an OPTIONS request every 50 seconds to make sure the
> ticket is kept alive. This thread can be stopped when qemu starts to
> write data.

Tricky as we don't really control any of this.  It's all driven from qemu-img
by this point.

BTW nbdcopy will take a slightly different approach.  We will have probably 4
threads each opening their own connection.  In parallel each copies 128M blocks
from the source to the target, only synchronizing in order to choose which block
to copy next.  Extent mapping is done by each thread at the start of the block.
So extent mapping is spread over the disk rather than all being done up front.

The aim is to replace qemu-img convert with nbdcopy in the next version of virt-v2v.
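
As a rough sketch of the work distribution described above (a few worker threads, fixed-size blocks, synchronizing only to pick the next block); read_block() and write_block() are hypothetical placeholders, not nbdcopy's real API:

import threading

BLOCK_SIZE = 128 * 1024 * 1024  # 128M blocks, as described above

def copy_disk(disk_size, nr_threads, read_block, write_block):
    lock = threading.Lock()
    next_offset = 0

    def worker():
        nonlocal next_offset
        while True:
            with lock:                      # the only synchronization point
                offset = next_offset
                next_offset += BLOCK_SIZE
            if offset >= disk_size:
                return
            length = min(BLOCK_SIZE, disk_size - offset)
            # each worker would map extents for its own block here and
            # copy only the allocated ranges
            write_block(offset, read_block(offset, length))

    threads = [threading.Thread(target=worker) for _ in range(nr_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()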

Comment 14 mxie@redhat.com 2021-01-18 10:47:35 UTC
Changing the component of the bug to ovirt-engine. I will test the bug from the v2v side when the bug is fixed.

Comment 15 Lukas Svaty 2021-01-19 11:18:42 UTC
Hi Ming,

can you check the flags are correct, currently they are indicating this should block RHV releases?
TestBlocker - some test cases are blocked from Testing and you were not able to WA this issue.
Urgent - basic processes broken, loss of data, setup/upgrade issue, etc.

Please align these.

Comment 16 Sandro Bonazzola 2021-01-19 12:04:51 UTC
Moving to storage since it seems Nir Soffer is on it on RHV side.

Comment 17 mxie@redhat.com 2021-01-19 12:33:24 UTC
(In reply to Lukas Svaty from comment #15)
> Hi Ming,
> 
> can you check the flags are correct, currently they are indicating this
> should block RHV releases?
> TestBlocker - some test cases are blocked from Testing and you were not able
> to WA this issue.
> Urgent - basic processes broken, loss of data, setup/upgrade issue, etc.
> 
> Please align these.

Hi Lukas, this issue is really blocking v2v testing, so it's urgent for v2v testing, Nir is working on the bug, maybe the bug can be fixed before RHV releasing?

Hi Nir, could you please help to confirm this?

Comment 18 Daniel Gur 2021-01-19 12:45:24 UTC
As this bug is not related to IMS mass migration with CFME but to the older single-VM migration, I am moving it back to RHV QE (Nisim, I believe, or maybe the reporter mxie).
Migration QE does not handle such scenarios.

Comment 19 Nir Soffer 2021-01-19 12:57:14 UTC
(In reply to mxie from comment #17)
> (In reply to Lukas Svaty from comment #15)
> > Hi Ming,
> > 
> > can you check the flags are correct, currently they are indicating this
> > should block RHV releases?
> > TestBlocker - some test cases are blocked from Testing and you were not able
> > to WA this issue.
> > Urgent - basic processes broken, loss of data, setup/upgrade issue, etc.
> > 
> > Please align these.
> 
> Hi Lukas, this issue is really blocking v2v testing, so it's urgent for v2v
> testing, Nir is working on the bug,

I'm not working on the bug; this is an engine bug and I don't maintain
this code.

> maybe the bug can be fixed before RHV releasing?

This bug has probably existed since 4.4.1, so there is no reason
to block the 4.4.4 release.

The issue seems to be a change in virt-v2v which does not work with 4.4,
so it may block the virt-v2v release, since an earlier version did work with
4.4.

I think the earliest fix we can have is in 4.4.5.

Comment 20 Nir Soffer 2021-01-19 14:01:51 UTC
(In reply to Richard W.M. Jones from comment #13)
> (In reply to Nir Soffer from comment #8)
...
> > Another way to solve this issue on virt-v2v side is to open another
> > connection and send an OPTIONS request every 50 seconds to make sure the
> > ticket is kept alive. This thread can be stopped when qemu starts to
> > write data.
> 
> Tricky as we don't really control any of this.  It's all driven from qemu-img
> by this point.

We can start a "keep alive" thread in open(), keeping connection alive until
a global flag is set.

In pwrite()/zero() we can set the flag to signal the keep alive thread to
exit.

Otherwise we need to wait for ovirt-engine 4.4.5 for a fix.

> BTW nbdcopy will take a slightly different approach.  We will have probably 4
> threads each opening their own connection.  In parallel each copies 128M
> blocks
> from the source to the target, only synchronizing in order to choose which
> block
> to copy next.  Extent mapping is done by each thread at the start of the
> block.
> So extent mapping is spread over the disk rather than all being done up
> front.

Sounds good, we also use 128MiB chunks in the imageio client as the maximum chunk
size.

Another way (used by the imageio client) is to get the extents in a separate
thread (e.g. the main thread), doing a block status command for every 1 GiB, and
pushing requests to the workers to copy or zero ranges.

Main thread handling extents:
https://github.com/oVirt/ovirt-imageio/blob/5c0101c6797017197f4e8524a5461f3aaabde354/daemon/ovirt_imageio/_internal/io.py#L88

Worker loop:
https://github.com/oVirt/ovirt-imageio/blob/5c0101c6797017197f4e8524a5461f3aaabde354/daemon/ovirt_imageio/_internal/io.py#L228

imageio supports up to 8 connections. In my tests more than 4 connections do
not seem to give any improvement. You need to check the imageio OPTIONS response;
it reports how many writers are allowed.

You also need to use format="raw" in the transfer, otherwise you work with
the file backend, which does not support multiple writers, and will corrupt
image data or fail if you use multiple writers.

With this setup I expect to see a 2-3x speedup compared to qemu-img.
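
A rough sketch of the extent-producer / worker-queue pattern described above; get_extents(), copy_range() and zero_range() are hypothetical placeholders, not the imageio client API:

import queue
import threading

def transfer(disk_size, nr_workers, get_extents, copy_range, zero_range):
    requests = queue.Queue(maxsize=nr_workers * 2)

    def worker():
        while True:
            item = requests.get()
            if item is None:                # sentinel: no more work
                return
            op, offset, length = item
            (zero_range if op == "zero" else copy_range)(offset, length)

    workers = [threading.Thread(target=worker) for _ in range(nr_workers)]
    for w in workers:
        w.start()

    step = 1024**3                          # query block status for every 1 GiB
    for start in range(0, disk_size, step):
        length = min(step, disk_size - start)
        for offset, count, is_zero in get_extents(start, length):
            requests.put(("zero" if is_zero else "copy", offset, count))

    for _ in workers:
        requests.put(None)
    for w in workers:
        w.join()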

Comment 21 Richard W.M. Jones 2021-01-19 14:06:54 UTC
I don't think this needs to be a full TestBlocker.  We can easily wait
for it to be fixed on the oVirt side in the next release.  Changes like
adding background threads to virt-v2v are invasive and highly risky.

Comment 22 Nir Soffer 2021-01-19 14:15:10 UTC
Tal, since this blocks virt-v2v, I think we need to fix this in 4.4.5.

This should be a small (but may be tricky) change in engine. We need
to make sure we don't break the code handling download from browser,
when we do want to consider a transfer as aborted as soon as the
connection is closed.

Comment 23 Nir Soffer 2021-01-19 22:53:49 UTC
I reproduced the issue locally with the image_transfer.py example
and it looks like the engine is correct.

I added this change to simulate a long delay after connecting
to imageio, before sending any request:
https://gerrit.ovirt.org/c/ovirt-engine-sdk/+/113088

Running this command reproduces the failure:

$ ./image_transfer.py -c engine-dev --inactivity-timeout 300 --read-delay 130 upload 398241a6-0ce4-46be-9384-e970502f0d23
[   0.0 ] Connecting to engine...
[   0.0 ] Looking up disk 398241a6-0ce4-46be-9384-e970502f0d23
[   0.1 ] Creating image transfer for upload
[   1.3 ] Transfer ID: 7eb17bc8-dbdb-42df-b7a9-6f5d2cf2fde4
[   1.3 ] Transfer host name: host4
[   1.3 ] Transfer URL: https://host4:54322/images/bb5d6885-869e-4953-bdf6-16066ec3a1a3
[   1.3 ] Proxy URL: https://engine-dev:54323/images/bb5d6885-869e-4953-bdf6-16066ec3a1a3
[   1.3 ] Conneted to imageio server
[   1.3 ] Reading from server...
[ 131.4 ] Reading from server...
[ 131.4 ] Finalizing image transfer...
Traceback (most recent call last):
  File "./image_transfer.py", line 177, in <module>
    client.read(0, buf)
  File "/usr/lib64/python3.8/site-packages/ovirt_imageio/client/_api.py", line 393, in read
    return self._backend.readinto(buffer)
  File "/usr/lib64/python3.8/site-packages/ovirt_imageio/_internal/backends/http.py", line 234, in readinto
    res = self._get(length)
  File "/usr/lib64/python3.8/site-packages/ovirt_imageio/_internal/backends/http.py", line 428, in _get
    res = self._con.getresponse()
  File "/usr/lib64/python3.8/http/client.py", line 1347, in getresponse
    response.begin()
  File "/usr/lib64/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python3.8/http/client.py", line 276, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response


Looking in imageio log, we see:

1. Ticket was added

2021-01-19 21:50:36,089 INFO    (Thread-75) [tickets] [local] ADD ticket={'dirty': False, 'ops': ['write'], 'size': 1073741824, 'sparse': True, 'transfer_id': '7eb17bc8-dbdb-42df-b7a9-6f5d2cf2fde4', 'uuid': 'bb5d6885-869e-4953-bdf6-16066ec3a1a3', 'timeout': 300, 'url': 'nbd:unix:/run/vdsm/nbd/bb5d6885-869e-4953-bdf6-16066ec3a1a3.sock'}

2. Client connects to the server and gets OPTIONS

2021-01-19 21:50:37,127 INFO    (Thread-76) [images] [::ffff:192.168.122.1] OPTIONS ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3

3. Client gets image extents

2021-01-19 21:50:37,131 INFO    (Thread-76) [extents] [::ffff:192.168.122.1] EXTENTS ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3 context=zero

4. Client makes first read

2021-01-19 21:50:37,133 DEBUG   (Thread-76) [images] [::ffff:192.168.122.1] READ size=4096 offset=0 close=False ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3

(Client starts 130 seconds sleep at this point)

5. Engine queries ticket status

2021-01-19 21:50:38,048 DEBUG   (Thread-77) [tickets] [local] GET ticket={'active': False, 'canceled': False, 'connections': 1, 'expires': 4458269, 'idle_time': 1, 'ops': ['write'], 'size': 1073741824, 'sparse': True, 'dirty': False, 'timeout': 300, 'url': 'nbd:unix:/run/vdsm/nbd/bb5d6885-869e-4953-bdf6-16066ec3a1a3.sock', 'uuid': 'bb5d6885-869e-4953-bdf6-16066ec3a1a3', 'transfer_id': '7eb17bc8-dbdb-42df-b7a9-6f5d2cf2fde4', 'transferred': 4096}

(This is repeated every 10 seconds)

6. Timeout reading the next request from the client; the server closes the connection

2021-01-19 21:51:37,185 WARNING (Thread-76) [http] Timeout reading or writing to socket: The read operation timed out

2021-01-19 21:51:37,185 INFO    (Thread-76) [http] CLOSE connection=76 client=::ffff:192.168.122.1 [connection 1 ops, 60.058543 s] [dispatch 3 ops, 0.005019 s] [extents 1 ops, 0.001644 s] [read 1 ops, 0.001409 s, 4.00 KiB, 2.77 MiB/s] [read.read 1 ops, 0.001231 s, 4.00 KiB, 3.17 MiB/s] [read.write 1 ops, 0.000135 s, 4.00 KiB, 28.86 MiB/s]

7. Engine extends the ticket

2021-01-19 21:51:40,155 INFO    (Thread-84) [tickets] [local] EXTEND timeout=300 ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3

8. Engine extends the ticket again

2021-01-19 21:52:40,319 INFO    (Thread-91) [tickets] [local] EXTEND timeout=300 ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3

At this point the client wakes up and tries to read from the closed connection.
It fails with http.client.RemoteDisconnected and finalizes the transfer.

9. Engine removes the ticket

2021-01-19 21:52:50,399 INFO    (Thread-93) [tickets] [local] REMOVE ticket=bb5d6885-869e-4953-bdf6-16066ec3a1a3


This proves that the engine respects inactivity_timeout, extending the
ticket every 60 seconds. If the engine were ignoring the timeout, it would
stop the transfer instead of extending the ticket.

The issue is that rhv-upload-plugin is idle for more than 60 seconds,
and it cannot handle a connection closed by the server.

Handling disconnection automatically is tricky, since we detect this
after sending the request, so we don't know what happened on the server
side, and we don't know if changes done before the disconnection were
flushed.

I think the best way to handle this is to open the imageio connection
used for the actual transfer lazily, only when qemu is ready to write
to the server.
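
A minimal sketch of that lazy-connection idea (not the actual virt-v2v patch; connect_to_imageio(), send_write_request() and send_zero_request() are hypothetical placeholders, and the real plugin carries more state):

# nbdkit plugin-style sketch: defer opening the imageio data connection
# until the first write, when the transfer ticket is known to be fresh.

def open(readonly):
    # cheap setup only; no data connection yet
    return {"http": None}

def _http(h):
    if h["http"] is None:
        h["http"] = connect_to_imageio()    # opened lazily, on first use
    return h["http"]

def pwrite(h, buf, offset, flags):
    conn = _http(h)
    # PUT request details (path, Content-Range header, flush) elided
    send_write_request(conn, buf, offset)

def zero(h, count, offset, flags):
    conn = _http(h)
    send_zero_request(conn, count, offset)

With this shape, a slow extent-mapping phase in qemu-img no longer leaves an idle data connection for the server to close.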

I posted a fix to virt-v2v here:
https://www.redhat.com/archives/libguestfs/2021-January/msg00016.html

Moving the bug back to virt-v2v.

Comment 25 Richard W.M. Jones 2021-01-20 10:47:04 UTC
Removing TestBlocker and blocker? flags.  We do not intend this bug to block
RHEL releases.

Comment 26 Nir Soffer 2021-01-21 13:04:35 UTC
Fixed upstream in:
https://github.com/libguestfs/virt-v2v/commit/1d5fc257765c444644e5bfc6525e86ff201755f0

Turns out that this happens only when not using unix socket:
- Running virt-v2v on non-ovirt host (e.g. your laptop)
- Running virt-v2v on ovirt host which is not up
- Running virt-v2v on ovirt host in another data center
  (e.g. importing vm to dc1, host belongs in to dc2)
- Using -oo rhv_direct=false

So this does not affect IMS, when we run the import on ovirt
host in the right data center, and we always use rhv_direct=true.

Testing this change:

1. Run virt-v2v with vddk and a vmware version that has the slow extents
   issue, using -oo rhv_direct=false, or on a non-ovirt host.

Without this change, if qemu-img needs more than 60 seconds to get image
extents, it will fail on the first write (comment 0).

With this change the import will succeed.

2. Run other flows to make sure there are no regressions.

3. Run sanity tests with IMS?

Note: Does QE have tests *using* unix socket? This is the most
important use case, and we should test both cases.

Comment 27 Richard W.M. Jones 2021-01-21 14:39:21 UTC
Adding needinfo about Nir's question above:

> Note: Does QE have tests *using* unix socket? This is the most
> important use case, and we should test both cases.

A second question that I have:

Should we fix this in 8.3.1, or only in 8.4.0?  I would prefer 8.4.0
unless this is really very urgent.

Comment 28 Richard W.M. Jones 2021-01-21 15:06:50 UTC
After discussion with mxie, setting this to 8.3.1

Comment 31 mxie@redhat.com 2021-01-22 10:01:13 UTC
(In reply to Nir Soffer from comment #26)
> Fixed upstream in:
> https://github.com/libguestfs/virt-v2v/commit/
> 1d5fc257765c444644e5bfc6525e86ff201755f0
> 
> Turns out that this happens only when not using unix socket:
> - Running virt-v2v on non-ovirt host (e.g. your laptop)
> - Running virt-v2v on ovirt host which is not up
> - Running virt-v2v on ovirt host in another data center
>   (e.g. importing vm to dc1, host belongs in to dc2)
> - Using -oo rhv_direct=false
> 
> So this does not affect IMS, when we run the import on ovirt
> host in the right data center, and we always use rhv_direct=true.

You're right, the v2v conversion can finish successfully when converting a guest from ESXi6.5/ESXi6.0 with vddk>=6.7.3 on a rhv node and using rhv-direct=true in the v2v command line.
   
> Note: Does QE have tests *using* unix socket? This is the most
> important use case, and we should test both cases.

Yes, we will test this feature during v2v integration testing with rhv; sorry for not testing the bug on a rhv node when filing the bug.

Comment 32 liuzi 2021-01-27 02:51:45 UTC
Verified the bug with the builds below:
virt-v2v-1.42.0-7.module+el8.3.1+9562+c3ede7c6.x86_64
libvirt-6.6.0-13.module+el8.3.1+9548+0a8fede5.x86_64
qemu-kvm-5.1.0-18.module+el8.3.1+9507+32d6953c.x86_64
nbdkit-1.22.0-2.module+el8.3.0+8203+18ecf00e.x86_64
virtio-win-1.9.14-4.el8.noarch
vdsm-4.40.40-1.el8ev.x86_64
rhv4.4.4.6-0.1.el8ev


Steps:
 
Scenario1: On a standalone v2v conversion server, use virt-v2v to convert guests from different versions of ESXi hosts via different versions of vddk and rhv-upload to rhv4.4; the test results are summarized below:

1.1  NOT use rhv-direct=true option in v2v command line  

           VDDK6.5   VDDK6.7.3    VDDK7.0
 
ESXi6.0    PASS       PASS        PASS
 
ESXi6.5    PASS       PASS        PASS

ESXi6.7    PASS       PASS        PASS

ESXi7.0    PASS       PASS        PASS


1.2 Use rhv-direct=true option in v2v command line  

           VDDK6.5   VDDK6.7.3    VDDK7.0

ESXi6.0    PASS       PASS        PASS

ESXi6.5    PASS       PASS        PASS

ESXi6.7    PASS       PASS        PASS

ESXI7.0    PASS       PASS        PASS

Scenario2: On a rhv4.4 node, use virt-v2v to convert guests from different versions of ESXi hosts via different versions of vddk and rhv-upload to rhv4.4; the test results are summarized below:

2.1  NOT use rhv-direct=true option in v2v command line  

           VDDK6.5   VDDK6.7.3    VDDK7.0

ESXi6.0    PASS        PASS       PASS


ESXi6.5    PASS        PASS       PASS

ESXi6.7    PASS        PASS       PASS

ESXi7.0    PASS        PASS       PASS  

2.2 Use rhv-direct=true option in v2v command line  

           VDDK6.5    VDDK6.7.3    VDDK7.0

ESXi6.0    PASS         PASS        PASS

ESXi6.5    PASS         PASS        PASS

ESXi6.7    PASS         PASS        PASS  

ESXi7.0    PASS         PASS        PASS



Scenario3: Change the rhv4.4 node to maintenance status, then use virt-v2v on the rhv4.4 node to convert guests from ESXi6.0 and ESXi6.5 hosts via vddk6.7/vddk7.0 and rhv-upload to rhv4.4; the test results are summarized below:

3.1 NOT use rhv-direct=true option in v2v command line  

           VDDK6.7.3    VDDK7.0     

ESXi6.0    PASS            PASS

ESXi6.5    PASS            PASS


3.2 Use rhv-direct=true option in v2v command line  

           VDDK6.7.3    VDDK7.0     

ESXi6.0    PASS            PASS

ESXi6.5    PASS            PASS  



Scenario4: On a rhv4.4 node, use virt-v2v to convert guests from ESXi6.0 and ESXi6.5 hosts via vddk6.7/vddk7.0 and rhv-upload to another datacenter (e.g. running v2v on rhv node1 and converting the guest to rhv node2, where node1 belongs to datacenter1 and node2 belongs to datacenter2); the test results are summarized below:

4.1 NOT use rhv-direct=true option in v2v command line  

           VDDK6.7.3    VDDK7.0     

ESXi6.0     PASS        PASS

ESXi6.5     PASS        PASS


4.2 Use rhv-direct=true option in v2v command line  

           VDDK6.7.3    VDDK7.0     

ESXi6.0   PASS          PASS

ESXi6.5   PASS          PASS  


Result:
     According to the above results, the bug has been fixed, so moving the bug from ON_QA to VERIFIED.

Comment 34 errata-xmlrpc 2021-02-22 15:39:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0639

