Bug 1798072 - Live storage migration fails in LiveMigrateDiskCommand - Failed during live storage migration, attempting to end replication before deleting the target disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.40.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ovirt-4.4.1
Target Release: ---
Assignee: Fedor Gavrilov
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On: 1803092 1805143 1820068
Blocks:
 
Reported: 2020-02-04 14:17 UTC by Evelina Shames
Modified: 2021-05-13 06:25 UTC (History)
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-26 10:21:44 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.4+


Attachments (Terms of Use)
logs (1.03 MB, application/zip)
2020-02-04 14:17 UTC, Evelina Shames
logs (2.02 MB, application/gzip)
2020-02-15 19:32 UTC, Yosi Ben Shimon
logs for comment 10 (246.46 KB, application/gzip)
2020-02-16 16:08 UTC, Nir Soffer
Logs (113.67 KB, application/zip)
2020-03-09 07:37 UTC, Evelina Shames

Description Evelina Shames 2020-02-04 14:17:50 UTC
Created attachment 1657566 [details]
logs

Description of problem:
Live storage migration fails with the following errors:

Engine:
2020-02-04 15:48:41,813+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-93) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
2020-02-04 15:48:42,828+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-49) [b2746547-7938-4236-9888-a50a734b2049] Command 'CopyImageGroupVolumesData' id: '8ad6e4b1-0f25-432e-80a4-bb493d555dd1' child commands '[137933c6-f8d2-4981-b703-9758b207110e]' executions were completed, status 'FAILED'
2020-02-04 15:48:43,845+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-63) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand' with failure.
2020-02-04 15:48:43,924+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-63) [b2746547-7938-4236-9888-a50a734b2049] Command 'LiveMigrateDisk' id: '41bf4f8a-d9a7-49e3-85e8-b62b5e2b7c90' child commands '[9fa26ec8-01c4-4535-84fa-57996a39600d, 7af45bec-7f1d-484e-89af-05bbbf4c0ae1, 8ad6e4b1-0f25-432e-80a4-bb493d555dd1]' executions were completed, status 'FAILED'
2020-02-04 15:48:45,032+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.
2020-02-04 15:48:45,033+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Failed during live storage migration of disk '1f14a95a-eac9-4cde-b48f-84e9d8bfe19c' of vm 'f41f6b88-f4c0-4649-a7df-3c5a69f60726', attempting to end replication before deleting the target disk
2020-02-04 15:48:45,035+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] START, VmReplicateDiskFinishVDSCommand(HostName = host_mixed_2, VmReplicateDiskParameters:{hostId='a0563ece-3dd8-48e6-8e34-8c704456cbf2', vmId='f41f6b88-f4c0-4649-a7df-3c5a69f60726', storagePoolId='970b8cda-8492-42cc-bad8-1c3557950dc6', srcStorageDomainId='abfd8f3e-d1e2-470e-80c4-5229f9f1fb87', targetStorageDomainId='abfd8f3e-d1e2-470e-80c4-5229f9f1fb87', imageGroupId='1f14a95a-eac9-4cde-b48f-84e9d8bfe19c', imageId='25b916f9-2e55-4781-baaa-6d4c564e825a'}), log id: 604ce349
2020-02-04 15:48:45,609+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] FINISH, VmReplicateDiskFinishVDSCommand, return: , log id: 604ce349
2020-02-04 15:48:45,609+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Attempting to delete the target of disk '1f14a95a-eac9-4cde-b48f-84e9d8bfe19c' of vm 'f41f6b88-f4c0-4649-a7df-3c5a69f60726'
2020-02-04 15:48:46,324+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk latest-rhel-guest-image-8.1-infra to domain nfs_0.

vdsm:
2020-02-04 08:48:36,104-0500 ERROR (tasks/6) [root] Job '15da5400-1991-4226-97a1-ef8eea9db226' failed (jobs:223)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/copy_data.py", line 86, in _run
    self._operation.run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/qemuimg.py", line 344, in run
    for data in self._operation.watch():
  File "/usr/lib/python3.6/site-packages/vdsm/storage/operation.py", line 106, in watch
    self._finalize(b"", err)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/operation.py", line 179, in _finalize
    raise cmdutils.Error(self._cmd, rc, out, err)
vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/abfd8f3e-d1e2-470e-80c4-5229f9f1fb87/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd', '-O', 'qcow2', '-o', 'compat=1.1,backing_file=80c1e0ae-8037-4bb9-b3d3-4d707fc0d3fc,backing_fmt=qcow2', '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd]?\n')

Version-Release number of selected component (if applicable):
4.4.0-0.17.master.el7

How reproducible:
Most of the time. It reproduces mainly with an NFS or Gluster destination storage domain, but there were failures with iSCSI as well.

Steps to Reproduce:
1. Create a new VM with a disk (preferably on iSCSI)
2. Start the VM
3. Move its disk to a Gluster (or NFS) storage domain

Actual results:
Operation fails.
(It also takes a while until the generated snapshot is removed; there is no task in the UI for this operation.)

Expected results:
Operation should succeed

Additional info:
Logs are attached.

Comment 2 Eyal Shenitzky 2020-02-10 15:46:08 UTC
Nir, can you please have a look?

Comment 3 Nir Soffer 2020-02-14 10:40:26 UTC
Moving to vdsm since this is clearly not an engine bug.

This is most likely a libvirt bug; I will add more details later.

Comment 4 Nir Soffer 2020-02-14 11:02:48 UTC
Evelina, to understand this bug better, we need to know who holds the lock that breaks
qemu-img convert.

If you can reproduce this manually, please do this:

0. Enable libvirt logs using this configuration in /etc/libvirt/libvirtd.conf:

log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

Restart libvirtd to activate the log. Check that /var/log/libvirt/libvirtd.log
was created.

1. Start logging lslocks output in one shell

    while true; do
        date >> lslocks.log
        lslocks >> lslocks.log
        echo --- >> lslocks.log
        sleep 5
    done

2. Start logging libvirt xml in another shell

First find the vm id using:

$ virsh -r list

And run this script in a new shell:

    while true; do
        date >> vm-xml.log
        virsh -r dumpxml vm-id >> vm-xml.log
        echo --- >> vm-xml.log
        sleep 5
    done

3. Start the live storage migration

4. When the live storage migration fails in vdsm, stop the scripts

Attach the collected logs to the bug
(lslocks.log, vm-xml.log, vdsm.log, libvirtd.log).
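When going through a collected lslocks.log, the snapshots written by the loop above (a `date` line, the lslocks table, then a `---` separator) can be filtered mechanically. A small sketch; the sample snapshot is synthetic and abbreviated, not real output:

```python
# Sift an lslocks.log made of snapshots separated by '---' lines,
# keeping only lock entries that touch /rhev/ (synthetic sample data).
SAMPLE_LOG = """\
Sun Feb 16 10:45:03 EST 2020
COMMAND   PID  TYPE  SIZE MODE  M START END PATH
qemu-kvm 13267 POSIX 192K READ  0 100   100 /rhev/data-center/mnt/nfs/.../d26848d8
crond     1234 FLOCK   4B WRITE 0 0     0   /run/crond.pid
---
"""

def rhev_locks(log_text):
    """Yield (timestamp, lock_line) pairs for locks under /rhev/."""
    for snapshot in log_text.split("---"):
        lines = [l for l in snapshot.splitlines() if l.strip()]
        if not lines:
            continue
        timestamp = lines[0]  # the 'date' line that opens each snapshot
        for line in lines[1:]:
            if "/rhev/" in line:
                yield timestamp, line

for ts, line in rhev_locks(SAMPLE_LOG):
    print(ts, "|", line.split()[0], line.split()[-1])
```

This only narrows the haystack; the interesting part is which PID/command holds the lock at the moment qemu-img fails.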

Comment 5 Nir Soffer 2020-02-14 13:01:31 UTC
Changing severity to urgent since this is a regression and a loss of basic
functionality: it would require shutting down the VM to move its disks to
another storage domain.

Comment 7 Yosi Ben Shimon 2020-02-15 19:32:43 UTC
Created attachment 1663280 [details]
logs

Logs attached.
Please note that I've reproduced this bug twice:
1. manually *without* the scripts and libvirtd.log
2. manually with the required steps (libvirtd.log and scripts).
Therefore, you might see this scenario twice in vdsm.log.

Let me know if you need any further assistance.

Comment 8 Nir Soffer 2020-02-15 22:02:13 UTC
(In reply to Yosi Ben Shimon from comment #7)
> Created attachment 1663280 [details]
What we see in the logs:

qemu-img fails to copy volume d26848d8-ab50-4e7a-ad8e-8b53e247075b from block
storage domain a62157c0-e7a0-4199-9cc1-e24bc6d9aa36 to gluster domain
76e32f13-b6e9-45a5-a177-0b0f773a29e8:

vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/a62157c0-e7a0-4199-9cc1-e24bc6d9aa36/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b]?\n')


Looking in vm-xml.log, we see that this volume is the base volume on the 
destination storage during a mirror job:

      <mirror type='file' file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/58055544-064f-46e1-a3a2-cbbe3d0171ac' format='qcow2' job='copy' ready='abort'>
        <format type='qcow2'/>
        <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/58055544-064f-46e1-a3a2-cbbe3d0171ac' index='7'>
          <seclabel model='dac' relabel='no'/>
        </source>
        <backingStore type='file' index='8'>
          <format type='qcow2'/>
          <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'>
            <seclabel model='dac' relabel='no'/>
          </source>
          <backingStore/>
        </backingStore>
      </mirror>

Looking in lslocks.log, we don't see any locks on anything in /rhev/.

Did you run it on the host the VM was running on?

Please run this again on the same host running the VM. Let's use a smaller
sleep value to make sure we don't miss the locks if the failure and cleanup
are very quick.
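To repeat this kind of analysis without eyeballing the XML, the copy job's top and base volumes can be extracted mechanically from `virsh -r dumpxml` output. A minimal sketch; the element layout is modeled on the `<mirror>` snippet above, and the shortened `/rhev/.../` paths are placeholders, not real mount paths:

```python
import xml.etree.ElementTree as ET

# Abbreviated <disk> element as seen in vm-xml.log during a copy job
# (placeholder paths; the real ones are the long /rhev/data-center/... paths).
DISK_XML = """
<disk>
  <mirror type='file' file='/rhev/.../58055544' format='qcow2' job='copy' ready='abort'>
    <format type='qcow2'/>
    <source file='/rhev/.../58055544'/>
    <backingStore type='file' index='8'>
      <format type='qcow2'/>
      <source file='/rhev/.../d26848d8'/>
      <backingStore/>
    </backingStore>
  </mirror>
</disk>
"""

def mirror_volumes(disk_xml):
    """Return (top_volume, backing_volumes) of a disk's active copy job."""
    mirror = ET.fromstring(disk_xml).find('mirror')
    top = mirror.get('file')
    backing = [src.get('file')
               for src in mirror.findall('.//backingStore/source')]
    return top, backing

top, backing = mirror_volumes(DISK_XML)
print(top)      # destination top volume of the copy job
print(backing)  # base volume(s) that qemu-kvm keeps open
```

The base volume reported here is exactly the file qemu-img convert later fails to open for writing.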

Comment 10 Nir Soffer 2020-02-16 16:01:06 UTC
I reproduced the issue on Yosi's environment (storage-ge6-vdsm2).
This is the same issue reported in bug 1803092.

Steps to reproduce:


1. Change vdsm to stop before copying images:

Edit /usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/copy_data.py

                with self._dest.volume_operation():
                   
                    import time, os
                    while os.path.exists("/run/vdsm/break"):
                        self.log.info("---- WAITING ON /run/vdsm/break ----")
                        time.sleep(10)

                    self._operation = qemuimg.convert(
                        self._source.path,
                        self._dest.path,
                        srcFormat=src_format,
                        dstFormat=dst_format,
                        dstQcow2Compat=self._dest.qcow2_compat,
                        backing=self._dest.backing_path,
                        backingFormat=self._dest.backing_qemu_format,
                        preallocation=self._dest.preallocation,
                        unordered_writes=self._dest
                            .recommends_unordered_writes)
                    self._operation.run()

2. Create break file:

# touch /run/vdsm/break


3. Start watching vdsm log:

# tail -f /var/log/vdsm/vdsm.log | grep 'WAITING ON /run/vdsm/break'


4. Start live storage migration


5. Wait until vdsm blocks on the break file

2020-02-16 10:45:03,408-0500 INFO  (tasks/3) [storage.sdm.copy_data] ---- WAITING ON /run/vdsm/break ---- (copy_data:78)


6. Collect vm.xml and lslocks

# virsh -r list
 Id   Name   State
----------------------
 1    vm_a   running

# virsh -r dumpxml 1 > vm.xml
# lslocks > lslocks.out


7. Delete /run/vdsm/break

# rm /run/vdsm/break


In the vdsm log we can see that qemu-img failed:

vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/a62157c0-e7a0-4199-9cc1-e24bc6d9aa36/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b]?\n')


In vm.xml we see that the volume d26848d8-ab50-4e7a-ad8e-8b53e247075b is the base volume
in the destination storage:

      <mirror type='file' file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/83e965c3-c36d-45b7-84fc-35d8cc229170' format='qcow2' job='copy' ready='yes'>
        <format type='qcow2'/>
        <source file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/83e965c3-c36d-45b7-84fc-35d8cc229170' index='4'>
          <seclabel model='dac' relabel='no'/>
        </source>
        <backingStore type='file' index='5'>
          <format type='qcow2'/>
          <source file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'>
            <seclabel model='dac' relabel='no'/>
          </source>
          <backingStore/>
        </backingStore>
      </mirror>

In lslocks we see that qemu-kvm is holding read locks on this volume:

qemu-kvm        13267  POSIX 192K READ  0   100        100 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b
qemu-kvm        13267  POSIX 192K READ  0   201        201 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b
qemu-kvm        13267  POSIX 192K READ  0   203        203 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b


Note: I changed domain nfs_2 to NFS 4.2. With the default configuration (auto-negotiate)
we get truncated output from lslocks, so we cannot detect the volume being locked.
We must test NFS 4.2 anyway: anything older is not interesting because it does not
support sparseness, so you get horrible performance for storage operations.
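The 'Failed to get "write" lock' message is qemu's image-locking protection: the running qemu-kvm keeps the base volume open with shared locks, so qemu-img convert cannot acquire the exclusive lock it wants on the same file. qemu actually uses OFD byte-range locks; the sketch below uses flock as a simpler stand-in purely to illustrate the shared-vs-exclusive conflict, not qemu's real mechanism:

```python
import fcntl
import tempfile

# Scratch file standing in for the qcow2 base volume.
with tempfile.NamedTemporaryFile() as img:
    holder = open(img.name, "rb")       # plays the role of qemu-kvm
    fcntl.flock(holder, fcntl.LOCK_SH)  # holds a shared ("read") lock

    writer = open(img.name, "rb")       # plays the role of qemu-img convert
    try:
        # Non-blocking exclusive ("write") lock attempt on a second
        # open file description of the same file.
        fcntl.flock(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        blocked = False
    except BlockingIOError:
        blocked = True                  # the exclusive lock is refused
    finally:
        writer.close()
        holder.close()

print(blocked)  # → True
```

In other words: as long as the mirror job keeps the base volume open in qemu-kvm, any external qemu-img writing to that same file is expected to fail exactly as seen in the logs.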

Comment 11 Nir Soffer 2020-02-16 16:08:11 UTC
Created attachment 1663359 [details]
logs for comment 10

Comment 13 Avihai 2020-02-17 07:15:32 UTC
Hi Nir,

About the NFSv4.2 relation to this bug:
The issue reproduced in the regression Tier1 runs, which run on iSCSI/NFSv3/Gluster. It reproduced multiple times with the NFSv3 and Gluster storage flavors, and even with iSCSI, as I see in the original description.

If this occurred only on NFSv4.2, this bug would be less of a blocker, if at all.

Can you clarify the NFSv4.2 connection here (besides the known performance increase compared to older NFS versions)? Does the performance increase make it more probable to occur?

Comment 14 Nir Soffer 2020-02-17 11:35:03 UTC
(In reply to Avihai from comment #13)
> About the NFSv4.2 relation to this bug:
> The issue reproduced the regression Tier1 runs which run on
> ISCSI/NFSv3/gluster/ISCSI and it reproduced multiple times with NFSv3 and
> gluster storage flavors and even with ISCSI as I see in the original
> description.

I don't think you can reproduce this with iSCSI or FC. The issue is caused by
file locking on the destination storage, which is not used on block storage.

Why NFS 4.2 is important: we don't have enough info about the locks with NFS 3/4/4.1,
so we cannot tell who owns the locks that break the copy data job. With NFS 4.2 we
have detailed output in lslocks showing that qemu-kvm is taking these locks.

Comment 15 Michal Skrivanek 2020-03-04 12:20:58 UTC
Avihai, can you please clarify and/or retest on non-NFS? It could be that you've been seeing bug 1781637 instead, which has been fixed by now. It would be very useful to know whether this is NFS-only or not.

Comment 16 Avihai 2020-03-05 12:19:29 UTC
(In reply to Michal Skrivanek from comment #15)
> Avihai, can you please clarify and/or retest on non-NFS? It could be that
> you've been seeing bug 1781637 instead, and that have been fixed by now. It
> would be very useful to know if this is NFS-only or not

Evelina, as you were the reporter of this bug, can you please clarify and/or retest on non-NFS?

Comment 17 Evelina Shames 2020-03-09 07:37:17 UTC
(In reply to Avihai from comment #16)
> (In reply to Michal Skrivanek from comment #15)
> > Avihai, can you please clarify and/or retest on non-NFS? It could be that
> > you've been seeing bug 1781637 instead, and that have been fixed by now. It
> > would be very useful to know if this is NFS-only or not
> 
> Evelina as you were the reported of this bug, can you please clarify and/or
> retest on non-NFS?

Flow:
1. Create a VM with an iSCSI disk
2. Run the VM
3. Move the disk to another iSCSI domain

engine-4.4.0-0.24.master.el8ev
vdsm-4.40.5-1.el8ev.x86_64

failed with the following errors:

2020-03-09 09:24:40,497+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed in 'VmReplicateDiskStartVDS' method
2020-03-09 09:24:40,508+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_3 command VmReplicateDiskStartVDS failed: Drive replication error
2020-03-09 09:24:40,543+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand' return value 'StatusOnlyReturn [status=Status [code=55, message=Drive replication error]]'
2020-03-09 09:24:40,543+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] HostName = host_mixed_3
2020-03-09 09:24:40,543+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'VmReplicateDiskStartVDSCommand(HostName = host_mixed_3, VmReplicateDiskParameters:{hostId='76123112-f23b-42ef-977d-4d5b8a525165', vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', storagePoolId='62644225-5569-4502-bf3b-18929f5a2aa0', srcStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', targetStorageDomainId='f995b5ab-0b6e-4763-892d-143307a6e953', imageGroupId='223fa747-0cde-49ad-b28f-68906cf9ea11', imageId='cebef26d-e606-488c-af8b-e67028c05a0a'})' execution failed: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive replication error, code = 55
2020-03-09 09:24:40,543+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] FINISH, VmReplicateDiskStartVDSCommand, return: , log id: 34652c0d
2020-03-09 09:24:40,543+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed VmReplicateDiskStart (Disk '223fa747-0cde-49ad-b28f-68906cf9ea11' , VM 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8')
2020-03-09 09:24:40,544+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'LiveMigrateDisk' id: 'e076e024-466a-4e51-b619-ad42f047ffca' with children [26ce6beb-0658-4ee4-bc6f-326f8e225a81, e3811ab8-02b7-4212-b659-0274a94ec9b9] failed when attempting to perform the next operation, marking as 'ACTIVE'
2020-03-09 09:24:40,544+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] EngineException: Drive replication error (Failed with error replicaErr and code 55): org.ovirt.engine.core.common.errors.EngineException: EngineException: Drive replication error (Failed with error replicaErr and code 55)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.replicateDiskStart(LiveMigrateDiskCommand.java:533)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.performNextOperation(LiveMigrateDiskCommand.java:234)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2020-03-09 09:24:40,545+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'LiveMigrateDisk' id: 'e076e024-466a-4e51-b619-ad42f047ffca' child commands '[26ce6beb-0658-4ee4-bc6f-326f8e225a81, e3811ab8-02b7-4212-b659-0274a94ec9b9]' executions were completed, status 'FAILED'
2020-03-09 09:24:41,589+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.
2020-03-09 09:24:41,590+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed during live storage migration of disk '223fa747-0cde-49ad-b28f-68906cf9ea11' of vm 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', attempting to end replication before deleting the target disk
2020-03-09 09:24:41,612+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'VmReplicateDiskFinishVDSCommand(HostName = host_mixed_3, VmReplicateDiskParameters:{hostId='76123112-f23b-42ef-977d-4d5b8a525165', vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', storagePoolId='62644225-5569-4502-bf3b-18929f5a2aa0', srcStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', targetStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', imageGroupId='223fa747-0cde-49ad-b28f-68906cf9ea11', imageId='cebef26d-e606-488c-af8b-e67028c05a0a'})' execution failed: VDSGenericException: VDSErrorException: Failed in vdscommand to VmReplicateDiskFinishVDS, error = Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}}
2020-03-09 09:24:41,612+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] FINISH, VmReplicateDiskFinishVDSCommand, return: , log id: 2bfd86e2
2020-03-09 09:24:41,612+02 WARN  [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Replication is not in progress, proceeding with removing the target disk
2020-03-09 09:24:41,612+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Attempting to delete the target of disk '223fa747-0cde-49ad-b28f-68906cf9ea11' of vm 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8'
2020-03-09 09:24:42,602+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk latest-rhel-guest-image-8.1-infra to domain iscsi_1.

vdsm:
2020-03-09 09:24:40,487+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='1ef1e505-ccf3-4fa1-9131-499078464cc0') Unexpected error (task:874)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 881, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-167>", line 2, in prepareImage
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3205, in prepareImage
    allVols = dom.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 819, in getAllVolumes
    return self._manifest.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 720, in getAllVolumes
    vols, rems = self.getAllVolumesImages()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 707, in getAllVolumesImages
    allVols = getAllVolumes(self.sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 244, in getAllVolumes
    vols = _getVolsTree(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 206, in _getVolsTree
    lvs = lvm.getLV(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1121, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
vdsm.storage.exception.LogicalVolumeDoesNotExistError: Logical volume does not exist: ('f995b5ab-0b6e-4763-892d-143307a6e953/None',)
2020-03-09 09:24:40,489+0200 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH prepareImage error=Logical volume does not exist: ('f995b5ab-0b6e-4763-892d-143307a6e953/None',) (dispatcher:83)
2020-03-09 09:24:40,489+0200 ERROR (jsonrpc/2) [virt.vm] (vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8') Unable to start replication for vda to {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': 'f995b5ab-0b6e-4763-892d-143307a6e953', 'format': 'cow', 'cache': 'none', 'propagateErrors': 'off'} (vm:3975)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 3963, in diskReplicateStart
    replica['path'] = self.cif.prepareVolumePath(replica)
  File "/usr/lib/python3.6/site-packages/vdsm/clientIF.py", line 428, in prepareVolumePath
    raise vm.VolumeError(drive)
vdsm.virt.vm.VolumeError: Bad volume specification {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': 'f995b5ab-0b6e-4763-892d-143307a6e953', 'format': 'cow', 'cache': 'none', 'propagateErrors': 'off'}
2020-03-09 09:24:40,491+0200 INFO  (jsonrpc/2) [api.virt] FINISH diskReplicateStart return={'status': {'code': 55, 'message': 'Drive replication error'}} from=::ffff:10.35.162.10,46788, flow_id=573fb066-a9b6-4d3f-90b3-4e85c20c132c, vmId=eb7f8c10-56c1-4d25-82b8-4fb4331f81c8 (api:54)
2020-03-09 09:24:40,492+0200 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call VM.diskReplicateStart failed (error 55) in 0.01 seconds (__init__:312)
2020-03-09 09:24:41,591+0200 INFO  (jsonrpc/4) [api.virt] START diskReplicateFinish(srcDisk={'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}, dstDisk={'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}) from=::ffff:10.35.162.10,46788, flow_id=573fb066-a9b6-4d3f-90b3-4e85c20c132c, vmId=eb7f8c10-56c1-4d25-82b8-4fb4331f81c8 (api:48)
2020-03-09 09:24:41,591+0200 ERROR (jsonrpc/4) [api] FINISH diskReplicateFinish error=Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}} (api:131)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 543, in diskReplicateFinish
    return self.vm.diskReplicateFinish(srcDisk, dstDisk)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4011, in diskReplicateFinish
    srcDisk=srcDisk)
vdsm.common.exception.ReplicationNotInProgress: Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}}

Comment 18 Evelina Shames 2020-03-09 07:37:47 UTC
Created attachment 1668601 [details]
Logs

Comment 19 Michal Skrivanek 2020-03-16 07:41:51 UTC
(In reply to Nir Soffer from comment #14)
> I don't think you can reproduce this with iSCSI or FC. The issue is caused by
> locking on destination storage, which is not used on block storage.

This seems to reproduce on non-NFS as well.
see comment #17, that's with vdsm-4.40.5 where bug 1781637 is supposedly fixed.

Comment 20 Nir Soffer 2020-03-16 10:38:53 UTC
(In reply to Michal Skrivanek from comment #19)
> (In reply to Nir Soffer from comment #14)
> > I don't think you can reproduce this with iSCSI or FC. The issue is caused by
> > locking on destination storage, which is not used on block storage.
> 
> This seems to reproduce on non-NFS as well.
> see comment #17, that's with vdsm-4.40.5 where bug 1781637 is supposedly
> fixed.

Comment 17 describes a different issue that is not related to this bug.
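Nir's point about destination-storage locking can be shown with a minimal sketch. This is illustrative only, NOT vdsm's actual code: the function name and the temporary file standing in for the volume's lease/metadata file are assumptions. The idea is that file-based (e.g. NFS) domains can take a POSIX advisory lock on a per-volume file while preparing it, while block (iSCSI/FC) domains have no such file, so that locking path is skipped there.

```python
import fcntl
import tempfile

# Illustrative sketch only -- not vdsm's implementation. A file-based storage
# domain holds an exclusive advisory lock on a per-volume file while the
# volume is prepared; block storage has no per-volume file to lock.
def prepare_volume_with_lock():
    # Stand-in for the volume's lease/metadata file on a file-based domain.
    with tempfile.NamedTemporaryFile() as vol_meta:
        fd = vol_meta.fileno()
        fcntl.lockf(fd, fcntl.LOCK_EX)  # exclusive lock while the volume is prepared
        try:
            # ... prepare / replicate the volume here ...
            return "prepared under lock"
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release before handing the volume over

print(prepare_volume_with_lock())
```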

Comment 21 Michal Skrivanek 2020-04-03 09:15:47 UTC
should be retested once bug 1820068 is fixed

Comment 22 Lukas Svaty 2020-04-07 10:46:21 UTC
Moving to MODIFIED; waiting for bug 1820068.

Comment 23 Michal Skrivanek 2020-04-07 13:16:42 UTC
(In reply to Michal Skrivanek from comment #21)
> should be retested once bug 1820068 is fixed

or with a scratch build thereof, with "module_hotfixes=true" in the repo file

Comment 24 Lukas Svaty 2020-04-22 14:25:44 UTC
Moving to ON_QA; the fix should be in the last build.

Comment 25 Evelina Shames 2020-04-23 10:49:48 UTC
Saw it in 2 automation TCs on:
ovirt-engine-4.4.0-0.33.master.el8ev.noarch
vdsm-4.40.13-1.el8ev.x86_64

Attaching relevant logs.

Flow TC21907:
1. Create a VM with disks from all permutations (including a bootable disk)
2. Run the VM
3. Start a dd process writing to each of the VM disks
4. Move all VM disks to a different storage domain concurrently

Flow TC21798:
1. Create a VM with disks from all permutations (including a bootable disk)
2. Run the VM
3. Move all VM disks to a different storage domain concurrently
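The dd workload in step 3 of flow TC21907 can be approximated with a small shell sketch. Everything here is an assumption for illustration: the file names and sizes are arbitrary, and scratch files stand in for the guest's disk devices so the sketch is safe to run anywhere.

```shell
# Illustrative approximation of "start a dd process to each of the VM disks".
# disk1.img/disk2.img are stand-ins for the guest's /dev/vd* devices.
for dev in disk1.img disk2.img; do
    # conv=fsync flushes data to storage, keeping real I/O in flight while
    # the disks are being live-migrated (the point of the test step).
    dd if=/dev/zero of="$dev" bs=1M count=64 conv=fsync status=none &
done
wait
ls -l disk1.img disk2.img
```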

Comment 27 Tal Nisan 2020-05-04 09:15:34 UTC
Evelina, can you please attach the rpm versions of libvirt and qemu-kvm?

Comment 28 Evelina Shames 2020-05-04 09:23:05 UTC
(In reply to Tal Nisan from comment #27)
> Evelina, can you please attach the rpm versions of libvirt and qemu-kvm?

libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64

Comment 29 Michal Skrivanek 2020-05-15 08:38:16 UTC
Tal, what's going on with this one?

Comment 31 Lukas Svaty 2020-05-18 07:51:43 UTC
Hi,

any update here?

Comment 32 Tal Nisan 2020-05-18 11:35:30 UTC
(In reply to Michal Skrivanek from comment #29)
> Tal, what's going on with this one?

Fedor is looking into it

Comment 33 Eyal Shenitzky 2020-05-18 12:38:11 UTC
Evelina, can you please paste the exact error you received?

According to the engine log, the only error related to the disk migration is:

2020-04-21 18:43:34,758+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6) [disks_syncAction_492d65fc-1f32-416a] Exception: java.lang.NullPointerException
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand.completeGenerationInfo(CopyDataCommand.java:57)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand.executeCommand(CopyDataCommand.java:40)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1169)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1327)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2003)
	at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:140)
	at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:79)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1387)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:419)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runAction(Backend.java:442)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:424)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:630)
	at jdk.internal.reflect.GeneratedMethodAccessor455.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:79)
	at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:89)
	at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:102)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:228)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:430)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:160)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81)
	at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
	at org.wildfly.security.elytron-private.4.Final-redhat-00001//org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:627)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
	at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view2.runInternalAction(Unknown Source)
	at jdk.internal.reflect.GeneratedMethodAccessor454.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:410)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:134)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.InjectionPointPropagatingEnterpriseTargetBeanInstance.invoke(InjectionPointPropagatingEnterpriseTargetBeanInstance.java:68)
	at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:106)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2381)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2406)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2401)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand.copyVolumeData(CopyImageGroupVolumesDataCommand.java:135)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand.performNextOperation(CopyImageGroupVolumesDataCommand.java:109)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

Comment 34 Eyal Shenitzky 2020-05-18 12:40:11 UTC
(In reply to Eyal Shenitzky from comment #33)
> Evelina, can you please put the exact error you received?
> 
> According to the engine log the only error related to disk migration is - 
> 
> 2020-04-21 18:43:34,758+03 ERROR
> [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand]
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6)
> [disks_syncAction_492d65fc-1f32-416a] Exception:
> java.lang.NullPointerException
> 	[stack trace snipped; identical to the trace quoted in full in comment 33]

And an error related to a communication issue with the storage:
5734:2020-04-21 18:43:34,754+0300 DEBUG (jsonrpc/0) [storage.VolumeManifest] Failed to get volume info: Error while processing volume meta data: ("('/rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com:_nas01_ge__7__nfs__1/50fa1353-0b26-4361-983e-b8a47579ed47/images/21e3eeee-59fa-4569-98cf-57752525c465/a8c61a6a-2625-4f7b-af9c-b26392050cf8',): Connection timed out",) (volume:263)

Comment 35 Avihai 2020-05-26 07:44:56 UTC
The issue does not reproduce on the latest rhv-4.4.1-1:
engine 4.4.1-0.1.el8ev
vdsm-4.40.17-1.el8ev.x86_64
libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64


Reran the same TCs and checked the latest regression test suites.

Also, the related bug [1] is fixed, which probably resolved this issue as well.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1803092

Comment 36 Michal Skrivanek 2020-05-26 10:21:44 UTC
thank you, Avihai, closing

