Bug 1976024 - [RFE][v2v] [upload/download disk/CBT] Failed to attach disk to the VM - disk is OK but image transfer still holds a lock on the disk
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: virt-v2v
Version: 8.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.5
Assignee: Richard W.M. Jones
QA Contact: Xiaodai Wang
URL:
Whiteboard:
Depends On: 1849861 1976020
Blocks: 2014017
 
Reported: 2021-06-25 02:05 UTC by Xiaodai Wang
Modified: 2021-12-23 16:39 UTC
CC List: 17 users

Fixed In Version: virt-v2v-1.42.0-14.el8
Doc Type: Enhancement
Doc Text:
Clone Of: 1976020
Environment:
Last Closed: 2021-11-16 07:54:24 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:4684 0 None None None 2021-11-16 07:55:16 UTC

Description Xiaodai Wang 2021-06-25 02:05:52 UTC
+++ This bug was initially created as a clone of Bug #1976020 +++

This bug should also be fixed on the v2v side.

> Pavel is working on image transfer examples, so I hope he can also fix virt-v2v.

+++ This bug was initially created as a clone of Bug #1849861 +++

Description of problem:
Failed to attach disk to the VM

Version-Release number of selected component (if applicable):
virt-v2v-1.40.2-23.module+el8.2.1+6797+0db00a40.x86_64
ovirt-engine-4.3.10-0.2.el7.noarch

How reproducible:
50%

Steps to Reproduce:
1. Run the auto script with 'avocado run --vt-type v2v --vt-machine-type q35 convert_from_file.positive_test.linux.input_mode.ova.parse.invalid_line_in_manifest.output_mode.rhev.rhv_upload --vt-connect-uri qemu:///system'.
   1) The actual v2v command executed by the script is '/bin/virt-v2v -i ova -ic qemu:///system -o rhv-upload -os nfs_data -of raw -b ovirtmgmt -n ovirtmgmt /var/lib/libvirt/images/v2v/ova-images/case84967/invalid_line_in_manifest.ova -on ova_vm_toT -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /tmp/rhv_upload_passwd_file -oo rhv-cafile=/tmp/rhv_upload_ca.pem -oo rhv-cluster=NFS -oo rhv-direct -v -x'.
   2) The OVA files used in the script are listed below.
# ll -h /var/lib/libvirt/images/v2v/ova-images/case84967/
total 20M
-rwxr-xr-x. 1 nobody nobody 3.9M Apr 17  2017 invalid_line_in_manifest-missing_space.ova
-rwxr-xr-x. 1 nobody nobody 3.9M Apr 14  2017 invalid_line_in_manifest.ova
-rw-r--r--. 1 nobody nobody 3.9M Mar 29  2019 SHA1.ova
-rwxr-xr-x. 1 nobody nobody 3.9M Apr 14  2017 SHA1_SHA256.ova
-rw-r--r--. 1 nobody nobody 3.9M Mar 29  2019 SHA256.ova

Actual results:
2020-06-22 11:41:37,183+08 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1295) [bd848c1] EVENT_ID: VM_IMPORT_FROM_CONFIGURATION_ATTACH_DISKS_FAILED(175), VM ova_vm_toT has been imported from the given configuration but the following disk(s) failed to attach: a49ea084-1cc6-4f80-8ea8-79f8fdd3c0dc.

Expected results:
The disk should be attached to the VM successfully.

Additional info:
1) I can only reproduce it by running the script. I tried to reproduce it by running the v2v
command directly in the terminal but failed.
2) The ova disk is not bootable.
3) Please see attachment for the v2v debug log and ovirt-engine log.

--- Additional comment from Xiaodai Wang on 2020-06-23 03:31:37 UTC ---

Here is the discussion:

http://post-office.corp.redhat.com/archives/v2v-devel/2020-June/msg00035.html

--- Additional comment from Xiaodai Wang on 2020-06-23 03:36:09 UTC ---



--- Additional comment from Nir Soffer on 2020-06-23 09:49:57 UTC ---

Just to make the command more clear:

/bin/virt-v2v \
    -i ova \
    -ic qemu:///system \
    -o rhv-upload \
    -os nfs_data \
    -of raw \
    -b ovirtmgmt \
    -n ovirtmgmt /var/lib/libvirt/images/v2v/ova-images/case84967/invalid_line_in_manifest.ova \
    -on ova_vm_toT \
    -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api \
    -op /tmp/rhv_upload_passwd_file \
    -oo rhv-cafile=/tmp/rhv_upload_ca.pem \
    -oo rhv-cluster=NFS \
    -oo rhv-direct \
    -v \
    -x

The command looks broken, see the -n argument.

Why do you use ova? Are you testing ova support?

Can you reproduce using local disk?

virt-v2v \
        -i disk disk.img \
        -o rhv-upload \
        -oc https://my.engine/ovirt-engine/api \
        -op password-file \
        -on v2v \
        -os my-domain \
        -of raw \
        -oa sparse \
        -oo rhv-cafile=/etc/pki/vdsm/certs/cacert.pem \
        -oo rhv-cluster=my-cluster \
        -oo rhv-direct=true

The issue described sounds like an engine bug, so it should be reproducible
using local disk.

--- Additional comment from Xiaodai Wang on 2020-06-23 10:22:58 UTC ---

(In reply to Nir Soffer from comment #3)
> Just to make the command more clear:
> 
> /bin/virt-v2v \
>     -i ova \
>     -ic qemu:///system \
>     -o rhv-upload \
>     -os nfs_data \
>     -of raw \
>     -b ovirtmgmt \
>     -n ovirtmgmt
> /var/lib/libvirt/images/v2v/ova-images/case84967/invalid_line_in_manifest.
> ova \
>     -on ova_vm_toT \
>     -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api \
>     -op /tmp/rhv_upload_passwd_file \
>     -oo rhv-cafile=/tmp/rhv_upload_ca.pem \
>     -oo rhv-cluster=NFS \
>     -oo rhv-direct \
>     -v \
>     -x
> 
> The command looks broken, see the -n argument.

It's not a broken argument; '-n' duplicates '-b'.

> 
> Why do you use ova? Are you testing ova support?

Yes, we also have ova testing cases for v2v.

> 
> Can you reproduce using local disk?
> 
> virt-v2v \
>         -i disk disk.img \
>         -o rhv-upload \
>         -oc https://my.engine/ovirt-engine/api \
>         -op password-file \
>         -on v2v \
>         -os my-domain \
>         -of raw \
>         -oa sparse \
>         -oo rhv-cafile=/etc/pki/vdsm/certs/cacert.pem \
>         -oo rhv-cluster=my-cluster \
>         -oo rhv-direct=true
> 
> The issue described sounds like an engine bug, so it should be reproducible
> using local disk.

Yes, I reproduced it with a local disk.

Steps:
1) Convert the OVA file to a qcow2 file.
qemu-img 'create' '-q' '-f' 'qcow2' '-b' 'json:{ "file": { "driver": "raw", "offset": 6656, "size": 4063232, "file": { "driver": "file", "filename": "/var/lib/libvirt/images/v2v/ova-images/case84967/invalid_line_in_manifest.ova" } } }' '-o' 'compat=1.1,backing_fmt=vmdk' '/var/tmp/v2v.qcow2'
2) export LIBGUESTFS_BACKEND=direct
3) Run the v2v command (I copied '/var/tmp/v2v.qcow2' to '/root/' in case it gets deleted by mistake).
# virt-v2v -i disk -ic qemu:///system -o rhv-upload -os nfs_data -of raw -b ovirtmgmt -n ovirtmgmt /root/v2v.qcow2 -on testvm -oc https://hp-dl360eg8-03.lab.eng.pek2.redhat.com/ovirt-engine/api -op /root/rhv_upload_passwd_file -oo rhv-cafile=/root/rhv_upload_ca.pem -oo rhv-cluster=NFS -oo rhv-direct -v -x

I'm going to upload the v2v and ovirt-engine logs and also the local qcow2 disk in case you'd like to reproduce it in your environment.

--- Additional comment from Xiaodai Wang on 2020-06-23 10:30:52 UTC ---



--- Additional comment from Tal Nisan on 2020-06-29 14:14:40 UTC ---

Richard, any insights about what is the issue here?

--- Additional comment from Richard W.M. Jones on 2020-07-01 11:24:29 UTC ---

(In reply to Tal Nisan from comment #6)
> Richard, any insights about what is the issue here?

From the virt-v2v point of view the conversion is completely successful in
both cases.

So the possibilities can only be:

* It's a RHV engine problem of some kind.

* virt-v2v's -o rhv-upload mode should wait a bit longer or
check the disk status in a different way from how we're doing
it at the moment to give time for the disk to "become available"
(whatever that means).  For reference here's what we're doing now
to wait for the disk to appear:

https://github.com/libguestfs/virt-v2v/blob/c1caf7132000a4560c3e20c2753978e8dd10036a/v2v/rhv-upload-plugin.py#L593

and the general VM creation code here which runs after the above:

https://github.com/libguestfs/virt-v2v/blob/master/v2v/rhv-upload-createvm.py
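
For readers without the plugin source handy, the wait described above amounts to polling the disk status reported by the engine until it reaches OK. A minimal sketch, assuming the python ovirt-engine-sdk4; the function name, timeout, and error handling here are illustrative and not the plugin's actual code:

import time
import ovirtsdk4.types as types

def wait_for_disk_ok(connection, disk_id, timeout=300, poll_interval=1):
    """Poll the disk until the engine reports it as OK, or give up after timeout."""
    disk_service = connection.system_service().disks_service().disk_service(disk_id)
    deadline = time.monotonic() + timeout
    while True:
        status = disk_service.get().status
        if status == types.DiskStatus.OK:
            return
        if status == types.DiskStatus.ILLEGAL:
            raise RuntimeError("disk %s is in ILLEGAL state" % disk_id)
        if time.monotonic() > deadline:
            raise RuntimeError("timed out waiting for disk %s to become OK" % disk_id)
        time.sleep(poll_interval)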

--- Additional comment from Richard W.M. Jones on 2020-07-01 11:27:21 UTC ---

To add that in both cases in this bug we print this message:

  2020-06-21 23:35:41,381 process          L0458 DEBUG| [stderr] finalized after 1.02907133102417 seconds

  finalized after 1.0110740661621094 seconds

which given the code above seems to mean that the disk status did
change to types.DiskStatus.OK.

--- Additional comment from Nir Soffer on 2020-07-01 11:58:56 UTC ---

(In reply to Richard W.M. Jones from comment #7)
> (In reply to Tal Nisan from comment #6)
> > Richard, any insights about what is the issue here?
> 
> From the virt-v2v point the conversion is completely successful in
> both cases.
> 
> So the possibilities can only be:
> 
> * It's a RHV engine problem of some kind.

Very likely. I have seen cases in the past where finalizing the transfer
just hangs forever. This is not reproducible and the next transfer 
succeeds.

> * virt-v2v's -o rhv-upload mode should wait a bit longer or
> check the disk status in a different way from how we're doing
> it at the moment to give time for the disk to "become available"
> (whatever that means).  For reference here's what we're doing now
> to wait for the disk to appear:
> 
> https://github.com/libguestfs/virt-v2v/blob/
> c1caf7132000a4560c3e20c2753978e8dd10036a/v2v/rhv-upload-plugin.py#L593

This should be the correct way to wait for the disk, but I don't 
know engine internals.

> and the general VM creation code here which runs after the above:
> 
> https://github.com/libguestfs/virt-v2v/blob/master/v2v/rhv-upload-createvm.py

This does not attach the disk but creates a VM from an OVF; maybe there is an
engine bug handling this OVF, or some issue with the OVF.

--- Additional comment from Nir Soffer on 2020-07-01 12:04:26 UTC ---

(In reply to Richard W.M. Jones from comment #8)
> To add that in the both cases in this bug we print this message:
> 
>   2020-06-21 23:35:41,381 process          L0458 DEBUG| [stderr] finalized
> after 1.02907133102417 seconds
> 
>   finalized after 1.0110740661621094 seconds
> 
> which given the code above seems to mean that the disk status did
> change to types.DiskStatus.OK.

This is not the same message as in:
https://github.com/libguestfs/virt-v2v/blob/c1caf7132000a4560c3e20c2753978e8dd10036a/v2v/rhv-upload-plugin.py#L593

The relevant logs from finalize_transfer are:

    transfer %s finalized in %.3f seconds

On success, and in case of timeout (60 seconds?), we raise:

   RuntimeError("timed out waiting for transfer %s to finalize")

Can you point me to the virt-v2v code used here?
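
For reference, a finalize/wait helper of this kind typically calls finalize on the transfer service and then polls until the engine removes the transfer or reports a terminal state. The sketch below is illustrative only, assuming the python ovirt-engine-sdk4; it is not the actual rhv-upload-plugin.py or SDK helper, and the names, messages, and 60-second timeout are assumptions:

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

def finalize_transfer(connection, transfer, disk_id, timeout=60):
    """Finalize an image transfer and wait until the engine reports completion."""
    start = time.monotonic()
    transfer_service = (connection.system_service()
                        .image_transfers_service()
                        .image_transfer_service(transfer.id))
    transfer_service.finalize()

    disk_service = connection.system_service().disks_service().disk_service(disk_id)
    while True:
        time.sleep(1)
        try:
            transfer_service.get()
        except sdk.NotFoundError:
            # The engine removed the transfer; check that the disk ended up OK.
            if disk_service.get().status == types.DiskStatus.OK:
                print("transfer %s finalized in %.3f seconds"
                      % (transfer.id, time.monotonic() - start))
                return
            raise RuntimeError("disk %s is not OK after transfer" % disk_id)
        if time.monotonic() - start > timeout:
            raise RuntimeError(
                "timed out waiting for transfer %s to finalize" % transfer.id)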

--- Additional comment from Richard W.M. Jones on 2020-07-01 12:14:51 UTC ---

Oh you're right (and I can't read ...).  I note also this is the old version
of virt-v2v (1.40.2, new version is 1.42.0).

This message changed in the following upstream commit:

  https://github.com/libguestfs/virt-v2v/commit/75fcf46379ec8cad31684f3ca3dd24caa7aaaa46

which means (as expected) we're not carrying all of the patches
needed in this branch.

Is there a chance to move to RHEL AV 8.3.0?

--- Additional comment from Nir Soffer on 2020-07-01 13:04:29 UTC ---

(In reply to Richard W.M. Jones from comment #11)
> Oh you're right (and I can't read ...).  I note also this is the old version
> of virt-v2v (1.40.2, new version is 1.42.0).
> 
> This message changed in the following upstream commit:
>  
> https://github.com/libguestfs/virt-v2v/commit/
> 75fcf46379ec8cad31684f3ca3dd24caa7aaaa46
> 
> which means (as expected) we're not carrying all of the patches
> needed in this branch.

I think the issue with the old code was not handling the ILLEGAL state,
which could end in an endless loop instead of a fast failure, but otherwise
we check for the LOCKED and OK states, so based on the logs we detected
that the disk was OK.

Looking in engine log:

1. The transfer takes an exclusive lock on the disk

2020-06-23 18:18:27,005+08 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-1340) [539a982e-8973-4783-a56e-298622c811ec] Lock Acquired to object 'EngineLock:{exclusiveLocks='[700b798a-4434-44c7-b500-a03ff9363faf=DISK]', sharedLocks='[]'}'

2. The transfer was finalized

2020-06-23 18:18:29,638+08 INFO  [org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater] (default task-1340) [1ff2ecec-eb40-46f3-9868-bf903f8f9db8] Updating image transfer 9996fe27-2881-41a0-99c1-80d0005ed3ea (image 700b798a-4434-44c7-b500-a03ff9363faf) phase to Finalizing Success

3. The engine starts importing the VM from configuration

2020-06-23 18:18:33,098+08 INFO  [org.ovirt.engine.core.bll.exportimport.ImportVmFromConfigurationCommand] (default task-1340) [b313fbb8-c1c7-45c1-b919-2da431bb94ca] Lock Acquired to object 'EngineLock:{exclusiveLocks='[a65fddb4-48e1-46e1-a080-074d156cab8b=VM, testvm=VM_NAME]', sharedLocks='[a65fddb4-48e1-46e1-a080-074d156cab8b=REMOTE_VM]'}'

4. The import VM command fails to acquire the lock on the disk

2020-06-23 18:18:33,349+08 INFO  [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-1340) [57cc33d7] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[a65fddb4-48e1-46e1-a080-074d156cab8b=VM_DISK_BOOT, 700b798a-4434-44c7-b500-a03ff9363faf=DISK]', sharedLocks=''}'

2020-06-23 18:18:33,349+08 WARN  [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-1340) [57cc33d7] Validation of action 'AttachDiskToVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__ATTACH_ACTION_TO,VAR__TYPE__DISK,ACTION_TYPE_FAILED_DISK_IS_LOCKED


5. The transfer is marked as finished

2020-06-23 18:18:35,322+08 INFO  [org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater] (EE-ManagedThreadFactory-engineScheduled-Thread-12) [539a982e-8973-4783-a56e-298622c811ec] Updating image transfer 9996fe27-2881-41a0-99c1-80d0005ed3ea (image 700b798a-4434-44c7-b500-a03ff9363faf) phase to Finished Success


6. The transfer releases the exclusive lock on the disk

2020-06-23 18:18:35,398+08 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-12) [539a982e-8973-4783-a56e-298622c811ec] Lock freed to object 'EngineLock:{exclusiveLocks='[700b798a-4434-44c7-b500-a03ff9363faf=DISK]', sharedLocks='[]'}'


We have two ways to lock disks:
- set imagestatus to LOCKED (2) in the database
- take in-memory locks (e.g. EngineLock:{exclusiveLocks='[700b798a-4434-44c7-b500-a03ff9363faf=DISK]')

When starting a transfer, the transfer command takes an exclusive lock on the disk
and changes the disk status to LOCKED.

When ending the transfer, the transfer command releases the exclusive lock and
changes the disk status to OK.

Based on the logs, we change the disk status to OK before releasing the exclusive
lock, so when v2v continues the import the disk is still locked.

This is not the first time we have had such issues.
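
One way a client can narrow this window (an illustration only, not necessarily the fix shipped for this bug) is to wait for the image transfer itself to reach a terminal phase before importing the VM, rather than returning as soon as the disk status flips to OK. A minimal sketch, again assuming the python ovirt-engine-sdk4; the helper name, timeout, and polling interval are hypothetical:

import time
import ovirtsdk4.types as types

def wait_for_transfer_finished(transfer_service, timeout=60, poll_interval=1):
    """Wait for the transfer to reach a terminal phase before using the disk."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        phase = transfer_service.get().phase
        if phase == types.ImageTransferPhase.FINISHED_SUCCESS:
            return
        if phase == types.ImageTransferPhase.FINISHED_FAILURE:
            raise RuntimeError("transfer failed")
        time.sleep(poll_interval)
    raise RuntimeError("transfer did not reach a terminal phase in time")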

--- Additional comment from Nir Soffer on 2020-07-01 13:10:24 UTC ---

Moving to RHV and changing severity to high since this breaks basic
functionality.

--- Additional comment from Avihai on 2020-07-08 09:39:47 UTC ---

(In reply to Nir Soffer from comment #13)
> Moving to RHV and changing severity to high since fails basic 
> functionallity.

Tal, this is a severe bug that will affect customers as well. Please retarget to 4.4.1 (if possible) or at least 4.4.2.

--- Additional comment from Ilan Zuckerman on 2020-07-08 09:52:04 UTC ---

Encountered a similar issue when downloading a disk with the help of python3-ovirt-engine-sdk4:

When using the SDK to download a qcow2 disk, the transfer times out while finalizing, with the following error from the SDK [1],
the following error from the engine [2],
and the following error from the vdsm host [3].


[1]:
2020-07-05 10:37:18,495 - MainThread - RemoteExecutor - DEBUG - [root.16.37/qum5net] Results of command: python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py download fac3911d-95b6-41be-a538-dad345691502 --password-file /root/pass.txt --engine-url https://hosted-engine-06.lab.eng.tlv2.redhat.com --username admin@internal --cafile /root/ca.crt --backup-dir /root/full_backup/ --backup-uuid 44ae937c-8c30-41a1-bb68-d4daf3177d7e
2020-07-05 10:37:18,496 - MainThread - RemoteExecutor - DEBUG - [root.16.37/qum5net]   OUT: [   0.0 ] Downloading VM fac3911d-95b6-41be-a538-dad345691502 disks
[   0.2 ] Creating image transfer for disk 2d075957-7481-4af3-9920-e63d7759af83
[   2.9 ] Image transfer 391d9654-0e59-4e24-9524-e59a37cb8b53 is ready
[ ------- ] 0 bytes, 0.00 seconds, 0 bytes/s                                   
Formatting '/root/full_backup/2d075957-7481-4af3-9920-e63d7759af83.202007051036.full.qcow2', fmt=qcow2 size=1073741824 cluster_size=65536 lazy_refcounts=off refcount_bits=16
[  99.89% ] 1022.88 MiB, 0.22 seconds, 4.56 GiB/s                              
[  99.92% ] 1023.19 MiB, 0.33 seconds, 3.00 GiB/s                              
[  99.99% ] 1023.88 MiB, 0.44 seconds, 2.29 GiB/s                              
[ 100.00% ] 1.00 GiB, 0.46 seconds, 2.17 GiB/s                                 
[   3.3 ] Finalizing image transfer

2020-07-05 10:37:18,496 - MainThread - RemoteExecutor - DEBUG - [root.16.37/qum5net]   ERR: Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 476, in <module>
    main()
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 173, in main
    args.command(args)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 244, in cmd_download
    connection, args.backup_uuid, args, incremental=args.incremental)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 385, in download_backup
    download_disk(connection, backup_uuid, disk, disk_path, args, incremental=incremental)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py", line 430, in download_disk
    imagetransfer.finalize_transfer(connection, transfer, disk)
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/helpers/imagetransfer.py", line 296, in finalize_transfer
    .format(transfer.id))
RuntimeError: Timed out waiting for transfer 391d9654-0e59-4e24-9524-e59a37cb8b53 to finalize

---------------------------------------------------------------------------------------------

[2]:
2020-07-05 10:37:20,801+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetImageTicketVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [b434516f-9c7f-46fc-9f0c-50ee4ffc3775] START, GetImageTicketVDSCommand(HostName = host_mixed_1, GetImageTicketVDSCommandParameters:{hostId='9fb37721-708b-4327-9148-d3f302e51436', ticketId='0287b970-6b73-4b9e-b683-47c1f44dbe86', timeout='null'}), log id: 382fd9d6
2020-07-05 10:37:20,807+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetImageTicketVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [b434516f-9c7f-46fc-9f0c-50ee4ffc3775] FINISH, GetImageTicketVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.ImageTicketInformation@dc80e36, log id: 382fd9d6
2020-07-05 10:37:20,825+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [b434516f-9c7f-46fc-9f0c-50ee4ffc3775] EVENT_ID: DOWNLOAD_IMAGE_CANCELED_TIMEOUT(1,072), Download was canceled by system. Reason: timeout due to transfer inactivity.
2020-07-05 10:37:20,826+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.ImageTransferUpdater] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [b434516f-9c7f-46fc-9f0c-50ee4ffc3775] Updating image transfer 391d9654-0e59-4e24-9524-e59a37cb8b53 (image 2d075957-7481-4af3-9920-e63d7759af83) phase to Cancelled

---------------------------------------------------------------------------------------------

[3]:
2020-07-05 10:37:19,686+0300 INFO  (jsonrpc/7) [api.virt] FINISH stop_backup return={'status': {'code': 0, 'message': 'Done'}} from=::ffff:10.46.16.252,60486, flow_id=backups_syncAction_9850ddba-f8b8-4b8, vmId=fac3911d-95b6-41be-a538-dad345691502 (api:54)
2020-07-05 10:37:19,686+0300 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.stop_backup succeeded in 0.01 seconds (__init__:312)
2020-07-05 10:37:19,694+0300 ERROR (libvirt/events) [virt.vm] (vmId='fac3911d-95b6-41be-a538-dad345691502') Block job (untracked) type BACKUP for drive sda was canceled (vm:5306)
2020-07-05 10:37:19,710+0300 ERROR (libvirt/events) [virt.vm] (vmId='fac3911d-95b6-41be-a538-dad345691502') Block job (untracked) type BACKUP for drive vdb was canceled (vm:5306)
2020-07-05 10:37:19,732+0300 ERROR (libvirt/events) [virt.vm] (vmId='fac3911d-95b6-41be-a538-dad345691502') Block job (untracked) type BACKUP for drive sdb was canceled (vm:5306)
2020-07-05 10:37:19,783+0300 ERROR (libvirt/events) [virt.vm] (vmId='fac3911d-95b6-41be-a538-dad345691502') Block job (untracked) type BACKUP for drive vdc was canceled (vm:5306)

--- Additional comment from Tomáš Golembiovský on 2021-01-12 12:49:53 UTC ---



--- Additional comment from Ilan Zuckerman on 2021-02-02 07:11:17 UTC ---

I recently opened a new bug [1] which is apparently a duplicate of this one.
Just to mention that the issue Nir is explaining in his comments might affect not only v2v, but also any automated tests that involve uploading/downloading disks, CBT test cases (as they also involve downloading and uploading), and other manual efforts that use the SDK.

[1]:
https://bugzilla.redhat.com/1923178

--- Additional comment from RHEL Program Management on 2021-02-02 07:29:25 UTC ---

This bug report has Keywords: Regression or TestBlocker.

Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release.

Please resolve ASAP.

--- Additional comment from Tal Nisan on 2021-02-04 10:15:19 UTC ---



--- Additional comment from Ilan Zuckerman on 2021-02-15 09:05:54 UTC ---

(In reply to Ilan Zuckerman from comment #17)
> I recently open a new bug [1] which is apperantly a duplicate of this one.
> Just to mention that the issue Nir is explaining in his comments might
> affect not only v2v, but also any Automated tests which involve upload /
> download disks, and also CBT test cases (as they also involve downloading
> and uploading), and also other manual efforts which involve the usage of SDK.
> 
> [1]:
> https://bugzilla.redhat.com/1923178

Just making this clearer:

This issue (reported in 1923178) was first seen in rhv-4.4.5-4, and was reproduced once more (by automated regression runs) in rhv-4.4.5-5.
I noticed this only on NFS/Gluster flavors.

--- Additional comment from Pavel Bar on 2021-02-19 12:02:03 UTC ---

Adding additional details just in case (might be helpful) - there was a PR that was closed:
https://github.com/oVirt/ovirt-ansible-collection/pull/215

--- Additional comment from Eyal Shenitzky on 2021-02-21 07:43:31 UTC ---

(In reply to Pavel Bar from comment #21)
> Adding additional details just in case (might be helpful) - there was a PR
> that was closed:
> https://github.com/oVirt/ovirt-ansible-collection/pull/215

The PR was closed because we are waiting for the fix in the engine code instead of fixing it in the Ansible collection.

Comment 6 Xiaodai Wang 2021-07-19 07:42:13 UTC
I tried to run the script in the bug description 100 times; the problem was not reproduced.
I will also trigger all our v2v jobs to keep verifying the bug.

Comment 8 Xiaodai Wang 2021-07-23 07:19:55 UTC
I finished the verification of this bug, executed 200+ auto scripts
in our environment, and did not find the issue again. I think this bug
can be verified.

But I found another issue: after v2v has already finished, starting
or removing the VM immediately afterwards reports a 'VM xxx is being imported' error.
The problem cannot be reproduced every time.

Comment 9 Nir Soffer 2021-07-25 15:46:46 UTC
(In reply to Xiaodai Wang from comment #8)
> I finished the verification of this bug and executed 200+ auto scripts
> in our environment and not found the issue again. I think this bug
> could be verified.
> 
> But I found another issue that v2v has already finished off, then start
> or remove the VM at once, it will report 'VM xxx is being imported' error.
> The problem cannot be reproduced every time.

Please file a new bug for the new issue. This fix affects only the upload finalization
step. The error you mention sounds like incorrect waiting for import completion
and is not related to the upload.

Comment 10 Xiaodai Wang 2021-07-26 02:12:36 UTC
Thanks!
9.0: https://bugzilla.redhat.com/show_bug.cgi?id=1985830      
8.5av: https://bugzilla.redhat.com/show_bug.cgi?id=1985827

Move the bug to VERIFIED.

Comment 12 errata-xmlrpc 2021-11-16 07:54:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684

