Created attachment 1657566 [details]
logs

Description of problem:
Live storage migration fails with the following errors:

Engine:

2020-02-04 15:48:41,813+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-93) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
2020-02-04 15:48:42,828+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-49) [b2746547-7938-4236-9888-a50a734b2049] Command 'CopyImageGroupVolumesData' id: '8ad6e4b1-0f25-432e-80a4-bb493d555dd1' child commands '[137933c6-f8d2-4981-b703-9758b207110e]' executions were completed, status 'FAILED'
2020-02-04 15:48:43,845+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-63) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand' with failure.
2020-02-04 15:48:43,924+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-63) [b2746547-7938-4236-9888-a50a734b2049] Command 'LiveMigrateDisk' id: '41bf4f8a-d9a7-49e3-85e8-b62b5e2b7c90' child commands '[9fa26ec8-01c4-4535-84fa-57996a39600d, 7af45bec-7f1d-484e-89af-05bbbf4c0ae1, 8ad6e4b1-0f25-432e-80a4-bb493d555dd1]' executions were completed, status 'FAILED'
2020-02-04 15:48:45,032+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.
2020-02-04 15:48:45,033+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Failed during live storage migration of disk '1f14a95a-eac9-4cde-b48f-84e9d8bfe19c' of vm 'f41f6b88-f4c0-4649-a7df-3c5a69f60726', attempting to end replication before deleting the target disk
2020-02-04 15:48:45,035+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] START, VmReplicateDiskFinishVDSCommand(HostName = host_mixed_2, VmReplicateDiskParameters:{hostId='a0563ece-3dd8-48e6-8e34-8c704456cbf2', vmId='f41f6b88-f4c0-4649-a7df-3c5a69f60726', storagePoolId='970b8cda-8492-42cc-bad8-1c3557950dc6', srcStorageDomainId='abfd8f3e-d1e2-470e-80c4-5229f9f1fb87', targetStorageDomainId='abfd8f3e-d1e2-470e-80c4-5229f9f1fb87', imageGroupId='1f14a95a-eac9-4cde-b48f-84e9d8bfe19c', imageId='25b916f9-2e55-4781-baaa-6d4c564e825a'}), log id: 604ce349
2020-02-04 15:48:45,609+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] FINISH, VmReplicateDiskFinishVDSCommand, return: , log id: 604ce349
2020-02-04 15:48:45,609+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [b2746547-7938-4236-9888-a50a734b2049] Attempting to delete the target of disk '1f14a95a-eac9-4cde-b48f-84e9d8bfe19c' of vm 'f41f6b88-f4c0-4649-a7df-3c5a69f60726'
2020-02-04 15:48:46,324+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-74) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk latest-rhel-guest-image-8.1-infra to domain nfs_0.
vdsm:

2020-02-04 08:48:36,104-0500 ERROR (tasks/6) [root] Job '15da5400-1991-4226-97a1-ef8eea9db226' failed (jobs:223)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/copy_data.py", line 86, in _run
    self._operation.run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/qemuimg.py", line 344, in run
    for data in self._operation.watch():
  File "/usr/lib/python3.6/site-packages/vdsm/storage/operation.py", line 106, in watch
    self._finalize(b"", err)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/operation.py", line 179, in _finalize
    raise cmdutils.Error(self._cmd, rc, out, err)
vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/abfd8f3e-d1e2-470e-80c4-5229f9f1fb87/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd', '-O', 'qcow2', '-o', 'compat=1.1,backing_file=80c1e0ae-8037-4bb9-b3d3-4d707fc0d3fc,backing_fmt=qcow2', '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge12__nfs__0/8f0963cb-7d79-4c28-9862-2cd85f7a1954/images/1f14a95a-eac9-4cde-b48f-84e9d8bfe19c/a4dd9241-5fcc-4f4f-b204-ce65b95ba0dd]?\n')

Version-Release number of selected component (if applicable):
4.4.0-0.17.master.el7

How reproducible:
Most of the time. I noticed that it mainly reproduces with an nfs/gluster destination SD, but there were failures with iscsi as well.

Steps to Reproduce:
1. Create a new VM with a disk (try to create the disk on iscsi)
2. Start the VM
3. Move its disk to a gluster SD (or nfs)

Actual results:
The operation fails, and it takes a while until the generated snapshot is removed (there is no task in the UI for this operation).

Expected results:
The operation should succeed.

Additional info:
Logs are attached.
Nir, can you please have a look?
Moving to vdsm since this is clearly not an engine bug. This is most likely a libvirt bug; I will add more details later.
Evelina, to understand this bug better, we need to know who holds the lock breaking qemu-img convert. If you can reproduce this manually, please do this:

0. Enable libvirt logs using this configuration in /etc/libvirt/libvirtd.conf:

   log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
   log_outputs="1:file:/var/log/libvirt/libvirtd.log"

   Restart libvirtd to activate the log. Check that /var/log/libvirt/libvirtd.log was created.

1. Start logging lslocks output in one shell:

   while true; do
       date >> lslocks.log
       lslocks >> lslocks.log
       echo --- >> lslocks.log
       sleep 5
   done

2. Start logging libvirt xml in another shell.

   First find the vm id using:

   $ virsh -r list

   And run this script in a new shell:

   while true; do
       date >> vm-xml.log
       virsh -r dumpxml vm-id >> vm-xml.log
       echo --- >> vm-xml.log
   done

3. Start the live storage migration

4. When the live storage migration fails in vdsm, stop the scripts

Attach the collected logs to the bug (lslocks.log, vm-xml.log, vdsm.log, libvirtd.log).
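If it is more convenient to drive both collectors from a single place, here is a minimal Python sketch of the same two loops (illustrative only; it assumes Python 3, util-linux's lslocks and libvirt's virsh on the host, and VM_ID is a placeholder for the ID printed by `virsh -r list`):

import subprocess
import time
from datetime import datetime

VM_ID = "1"  # placeholder - replace with the ID printed by `virsh -r list`

def append(path, text):
    # Append a timestamped record, separated by "---" like the shell loops.
    with open(path, "a") as f:
        f.write(datetime.now().isoformat() + "\n")
        f.write(text.rstrip() + "\n---\n")

while True:
    append("lslocks.log", subprocess.check_output(["lslocks"]).decode())
    append("vm-xml.log",
           subprocess.check_output(["virsh", "-r", "dumpxml", VM_ID]).decode())
    time.sleep(5)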
Changing severity to urgent since this is a regression and a loss of basic functionality that requires shutting down the VM to move its disks to another storage domain.
Created attachment 1663280 [details]
logs

Logs attached.
Please note that I've reproduced this bug twice:
1. Manually *without* the scripts and libvirtd.log
2. Manually with the required steps (libvirtd.log and scripts)

Therefore, you might see this scenario twice in vdsm.log.
Let me know if you need any further assistance.
(In reply to Yosi Ben Shimon from comment #7)
> Created attachment 1663280 [details]

What we see in the logs: qemu-img fails to copy volume d26848d8-ab50-4e7a-ad8e-8b53e247075b from block storage domain a62157c0-e7a0-4199-9cc1-e24bc6d9aa36 to gluster domain 76e32f13-b6e9-45a5-a177-0b0f773a29e8:

vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/a62157c0-e7a0-4199-9cc1-e24bc6d9aa36/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b]?\n')

Looking in vm-xml.log, we see that this volume is the base volume on the destination storage during a mirror job:

<mirror type='file' file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/58055544-064f-46e1-a3a2-cbbe3d0171ac' format='qcow2' job='copy' ready='abort'>
  <format type='qcow2'/>
  <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/58055544-064f-46e1-a3a2-cbbe3d0171ac' index='7'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='file' index='8'>
    <format type='qcow2'/>
    <source file='/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge6__volume__0/76e32f13-b6e9-45a5-a177-0b0f773a29e8/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'>
      <seclabel model='dac' relabel='no'/>
    </source>
    <backingStore/>
  </backingStore>
</mirror>

Looking in lslocks.log, we don't see any locks on anything in /rhev/. Did you run it on the host the vm was running on? Please run this again on the same host running the vm. Let's use a smaller sleep value to make sure we don't miss the locks if the failure and cleanup are very quick.
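A minimal sketch of such a tighter loop, assuming Python 3 and util-linux's lslocks on the host the VM runs on (1-second interval, filtered to /rhev/ paths so short-lived locks are less likely to be missed):

import subprocess
import time
from datetime import datetime

with open("lslocks.log", "a") as log:
    while True:
        out = subprocess.check_output(["lslocks"]).decode()
        # Keep only locks on files under /rhev/ (the storage domain mounts).
        hits = [line for line in out.splitlines() if "/rhev/" in line]
        log.write(datetime.now().isoformat() + "\n")
        log.write("\n".join(hits) + "\n---\n")
        log.flush()
        time.sleep(1)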
I reproduced the issue on Yosi's environment (storage-ge6-vdsm2). This is the same issue reported in bug 1803092.

Steps to reproduce:

1. Change vdsm to stop before copying images.

   Edit /usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/copy_data.py:

   with self._dest.volume_operation():
       import time, os
       while os.path.exists("/run/vdsm/break"):
           self.log.info("---- WAITING ON /run/vdsm/break ----")
           time.sleep(10)

       self._operation = qemuimg.convert(
           self._source.path,
           self._dest.path,
           srcFormat=src_format,
           dstFormat=dst_format,
           dstQcow2Compat=self._dest.qcow2_compat,
           backing=self._dest.backing_path,
           backingFormat=self._dest.backing_qemu_format,
           preallocation=self._dest.preallocation,
           unordered_writes=self._dest.recommends_unordered_writes)
       self._operation.run()

2. Create the break file:

   # touch /run/vdsm/break

3. Start watching the vdsm log:

   # tail -f /var/log/vdsm/vdsm.log | grep 'WAITING ON /run/vdsm/break'

4. Start live storage migration

5. Wait until vdsm stops:

   2020-02-16 10:45:03,408-0500 INFO (tasks/3) [storage.sdm.copy_data] ---- WAITING ON /run/vdsm/break ---- (copy_data:78)

6. Collect vm.xml and lslocks:

   # virsh -r list
    Id   Name   State
   ----------------------
    1    vm_a   running

   # virsh -r dumpxml 1 > vm.xml
   # lslocks > lslocks.out

7. Delete the break file:

   # rm /run/vdsm/break

In the vdsm log we can see that qemu-img failed:

vdsm.common.cmdutils.Error: Command ['/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'qcow2', '/rhev/data-center/mnt/blockSD/a62157c0-e7a0-4199-9cc1-e24bc6d9aa36/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'] failed with rc=1 out=b'' err=bytearray(b'qemu-img: /rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b: error while converting qcow2: Failed to get "write" lock\nIs another process using the image [/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b]?\n')

In vm.xml we see that the volume d26848d8-ab50-4e7a-ad8e-8b53e247075b is the base volume in the destination storage:

<mirror type='file' file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/83e965c3-c36d-45b7-84fc-35d8cc229170' format='qcow2' job='copy' ready='yes'>
  <format type='qcow2'/>
  <source file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/83e965c3-c36d-45b7-84fc-35d8cc229170' index='4'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <backingStore type='file' index='5'>
    <format type='qcow2'/>
    <source file='/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b'>
      <seclabel model='dac' relabel='no'/>
    </source>
    <backingStore/>
  </backingStore>
</mirror>

In lslocks we see
that qemu-kvm is holding read locks on this volume:

qemu-kvm 13267 POSIX 192K READ 0 100 100 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b
qemu-kvm 13267 POSIX 192K READ 0 201 201 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b
qemu-kvm 13267 POSIX 192K READ 0 203 203 /rhev/data-center/mnt/yellow-vdsb.xxx:_Storage__NFS_storage__local__ge6__nfs__2/842093fc-3462-4988-a18d-d20ee8ac9026/images/299d48db-b5b7-4e7c-8db4-9d95edf048e9/d26848d8-ab50-4e7a-ad8e-8b53e247075b

Note: I changed domain nfs_2 to NFS 4.2. With the default configuration (auto negotiate) we get truncated output from lslocks, so we cannot detect the volume being locked, and we must also test NFS 4.2. Anything older is not interesting because it does not support sparseness, so you get horrible performance for storage operations.
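For what it's worth, the conflict can be probed directly: qemu's image locks are byte-range locks, and the START/END columns above (100, 201, 203) are the locked byte offsets. A minimal sketch, assuming Python 3 on Linux; the offset 100 is taken from the lslocks output above, not from any qemu API:

import fcntl
import os
import sys

# Usage: probe_lock.py /path/to/volume
path = sys.argv[1]

fd = os.open(path, os.O_RDWR)
try:
    # Try a non-blocking exclusive lock on byte 100, one of the offsets
    # lslocks showed as READ-locked by qemu-kvm. qemu-img convert needs a
    # conflicting exclusive ("write") lock and fails in the same situation.
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 100)
except OSError:
    print("byte 100 is locked by another process (here: qemu-kvm)")
else:
    fcntl.lockf(fd, fcntl.LOCK_UN, 1, 100)
    print("byte 100 is not locked")
finally:
    os.close(fd)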
Created attachment 1663359 [details]
logs for comment 10
Hi Nir,

About the NFSv4.2 relation to this bug:
The issue reproduced in the regression Tier1 runs, which run on iSCSI/NFSv3/gluster, and it reproduced multiple times with the NFSv3 and gluster storage flavors, and even with iSCSI as I see in the original description.
If this occurred only on NFSv4.2, this bug would be less of a blocker, if a blocker at all.

Can you clarify the NFSv4.2 connection here (besides the known performance increase compared to older NFS versions)? Does the performance increase make it more probable to occur?
(In reply to Avihai from comment #13)
> About the NFSv4.2 relation to this bug:
> The issue reproduced in the regression Tier1 runs, which run on
> iSCSI/NFSv3/gluster, and it reproduced multiple times with the NFSv3 and
> gluster storage flavors, and even with iSCSI as I see in the original
> description.

I don't think you can reproduce this with iSCSI or FC. The issue is caused by locking on destination storage, which is not used on block storage.

Why NFS 4.2 is important: we don't have enough info about the locks with NFS 3/4/4.1, so we cannot tell who owns the locks that break the copy data job. With NFS 4.2 we have detailed output in lslocks showing that qemu-kvm is taking these locks.
Avihai, can you please clarify and/or retest on non-NFS? It could be that you've been seeing bug 1781637 instead, and that has been fixed by now. It would be very useful to know if this is NFS-only or not.
(In reply to Michal Skrivanek from comment #15)
> Avihai, can you please clarify and/or retest on non-NFS? It could be that
> you've been seeing bug 1781637 instead, and that has been fixed by now. It
> would be very useful to know if this is NFS-only or not.

Evelina, as you were the reporter of this bug, can you please clarify and/or retest on non-NFS?
(In reply to Avihai from comment #16)
> Evelina, as you were the reporter of this bug, can you please clarify and/or
> retest on non-NFS?

Flow:
1. Create a VM with an iSCSI disk
2. Run the VM
3. Move the disk to another iSCSI domain

engine-4.4.0-0.24.master.el8ev
vdsm-4.40.5-1.el8ev.x86_64

Failed with the following errors:

2020-03-09 09:24:40,497+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed in 'VmReplicateDiskStartVDS' method
2020-03-09 09:24:40,508+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_3 command VmReplicateDiskStartVDS failed: Drive replication error
2020-03-09 09:24:40,543+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand' return value 'StatusOnlyReturn [status=Status [code=55, message=Drive replication error]]'
2020-03-09 09:24:40,543+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] HostName = host_mixed_3
2020-03-09 09:24:40,543+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'VmReplicateDiskStartVDSCommand(HostName = host_mixed_3, VmReplicateDiskParameters:{hostId='76123112-f23b-42ef-977d-4d5b8a525165', vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', storagePoolId='62644225-5569-4502-bf3b-18929f5a2aa0', srcStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', targetStorageDomainId='f995b5ab-0b6e-4763-892d-143307a6e953', imageGroupId='223fa747-0cde-49ad-b28f-68906cf9ea11', imageId='cebef26d-e606-488c-af8b-e67028c05a0a'})' execution failed: VDSGenericException: VDSErrorException: Failed to VmReplicateDiskStartVDS, error = Drive replication error, code = 55
2020-03-09 09:24:40,543+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskStartVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] FINISH, VmReplicateDiskStartVDSCommand, return: , log id: 34652c0d
2020-03-09 09:24:40,543+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed VmReplicateDiskStart (Disk '223fa747-0cde-49ad-b28f-68906cf9ea11' , VM 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8')
2020-03-09 09:24:40,544+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'LiveMigrateDisk' id: 'e076e024-466a-4e51-b619-ad42f047ffca' with children [26ce6beb-0658-4ee4-bc6f-326f8e225a81, e3811ab8-02b7-4212-b659-0274a94ec9b9] failed when attempting to perform the next operation, marking as 'ACTIVE'
2020-03-09 09:24:40,544+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] EngineException: Drive replication error (Failed with error replicaErr and code 55): org.ovirt.engine.core.common.errors.EngineException: EngineException: Drive replication error (Failed with error replicaErr and code 55)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.replicateDiskStart(LiveMigrateDiskCommand.java:533)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.performNextOperation(LiveMigrateDiskCommand.java:234)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
2020-03-09 09:24:40,545+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-2) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'LiveMigrateDisk' id: 'e076e024-466a-4e51-b619-ad42f047ffca' child commands '[26ce6beb-0658-4ee4-bc6f-326f8e225a81, e3811ab8-02b7-4212-b659-0274a94ec9b9]' executions were completed, status 'FAILED'
2020-03-09 09:24:41,589+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.
2020-03-09 09:24:41,590+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Failed during live storage migration of disk '223fa747-0cde-49ad-b28f-68906cf9ea11' of vm 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', attempting to end replication before deleting the target disk
2020-03-09 09:24:41,612+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Command 'VmReplicateDiskFinishVDSCommand(HostName = host_mixed_3, VmReplicateDiskParameters:{hostId='76123112-f23b-42ef-977d-4d5b8a525165', vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', storagePoolId='62644225-5569-4502-bf3b-18929f5a2aa0', srcStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', targetStorageDomainId='1d42be4a-ae65-4d91-b2d0-2ac70472c71e', imageGroupId='223fa747-0cde-49ad-b28f-68906cf9ea11', imageId='cebef26d-e606-488c-af8b-e67028c05a0a'})' execution failed: VDSGenericException: VDSErrorException: Failed in vdscommand to VmReplicateDiskFinishVDS, error = Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}}
2020-03-09 09:24:41,612+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] FINISH, VmReplicateDiskFinishVDSCommand, return: , log id: 2bfd86e2
2020-03-09 09:24:41,612+02 WARN [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Replication is not in progress, proceeding with removing the target disk
2020-03-09 09:24:41,612+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [573fb066-a9b6-4d3f-90b3-4e85c20c132c] Attempting to delete the target of disk '223fa747-0cde-49ad-b28f-68906cf9ea11' of vm 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8'
2020-03-09 09:24:42,602+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-3) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk latest-rhel-guest-image-8.1-infra to domain iscsi_1.
vdsm:

2020-03-09 09:24:40,487+0200 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='1ef1e505-ccf3-4fa1-9131-499078464cc0') Unexpected error (task:874)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 881, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-167>", line 2, in prepareImage
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3205, in prepareImage
    allVols = dom.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 819, in getAllVolumes
    return self._manifest.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 720, in getAllVolumes
    vols, rems = self.getAllVolumesImages()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 707, in getAllVolumesImages
    allVols = getAllVolumes(self.sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 244, in getAllVolumes
    vols = _getVolsTree(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 206, in _getVolsTree
    lvs = lvm.getLV(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1121, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
vdsm.storage.exception.LogicalVolumeDoesNotExistError: Logical volume does not exist: ('f995b5ab-0b6e-4763-892d-143307a6e953/None',)

2020-03-09 09:24:40,489+0200 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH prepareImage error=Logical volume does not exist: ('f995b5ab-0b6e-4763-892d-143307a6e953/None',) (dispatcher:83)
2020-03-09 09:24:40,489+0200 ERROR (jsonrpc/2) [virt.vm] (vmId='eb7f8c10-56c1-4d25-82b8-4fb4331f81c8') Unable to start replication for vda to {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': 'f995b5ab-0b6e-4763-892d-143307a6e953', 'format': 'cow', 'cache': 'none', 'propagateErrors': 'off'} (vm:3975)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 3963, in diskReplicateStart
    replica['path'] = self.cif.prepareVolumePath(replica)
  File "/usr/lib/python3.6/site-packages/vdsm/clientIF.py", line 428, in prepareVolumePath
    raise vm.VolumeError(drive)
vdsm.virt.vm.VolumeError: Bad volume specification {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': 'f995b5ab-0b6e-4763-892d-143307a6e953', 'format': 'cow', 'cache': 'none', 'propagateErrors': 'off'}

2020-03-09 09:24:40,491+0200 INFO (jsonrpc/2) [api.virt] FINISH diskReplicateStart return={'status': {'code': 55, 'message': 'Drive replication error'}} from=::ffff:10.35.162.10,46788, flow_id=573fb066-a9b6-4d3f-90b3-4e85c20c132c, vmId=eb7f8c10-56c1-4d25-82b8-4fb4331f81c8 (api:54)
2020-03-09 09:24:40,492+0200 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call VM.diskReplicateStart failed (error 55) in 0.01 seconds (__init__:312)
2020-03-09 09:24:41,591+0200 INFO (jsonrpc/4) [api.virt] START diskReplicateFinish(srcDisk={'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}, dstDisk={'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}) from=::ffff:10.35.162.10,46788, flow_id=573fb066-a9b6-4d3f-90b3-4e85c20c132c, vmId=eb7f8c10-56c1-4d25-82b8-4fb4331f81c8 (api:48)
2020-03-09 09:24:41,591+0200 ERROR (jsonrpc/4) [api] FINISH diskReplicateFinish error=Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}} (api:131)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 543, in diskReplicateFinish
    return self.vm.diskReplicateFinish(srcDisk, dstDisk)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4011, in diskReplicateFinish
    srcDisk=srcDisk)
vdsm.common.exception.ReplicationNotInProgress: Replication not in progress.: {'vmId': 'eb7f8c10-56c1-4d25-82b8-4fb4331f81c8', 'driveName': 'vda', 'srcDisk': {'imageID': '223fa747-0cde-49ad-b28f-68906cf9ea11', 'poolID': '62644225-5569-4502-bf3b-18929f5a2aa0', 'volumeID': 'cebef26d-e606-488c-af8b-e67028c05a0a', 'device': 'disk', 'domainID': '1d42be4a-ae65-4d91-b2d0-2ac70472c71e'}}
Created attachment 1668601 [details]
Logs
(In reply to Nir Soffer from comment #14)
> I don't think you can reproduce this with iSCSI or FC. The issue is caused by
> locking on destination storage, which is not used on block storage.

This seems to reproduce on non-NFS as well.
See comment #17; that's with vdsm-4.40.5, where bug 1781637 is supposedly fixed.
(In reply to Michal Skrivanek from comment #19)
> This seems to reproduce on non-NFS as well.
> See comment #17; that's with vdsm-4.40.5, where bug 1781637 is supposedly
> fixed.

Comment 17 describes another issue, which is not related to this bug.
should be retested once bug 1820068 is fixed
Moving to MODIFIED, waiting for bug 1820068.
(In reply to Michal Skrivanek from comment #21)
> should be retested once bug 1820068 is fixed

Or with a scratch build thereof, with "modules_hotfixes=true" in the repo file.
Moving to ON_QA, we should have the fix in the last build.
Saw it in 2 automation TCs on:
ovirt-engine-4.4.0-0.33.master.el8ev.noarch
vdsm-4.40.13-1.el8ev.x86_64

Attaching relevant logs.

Flow TC21907:
1. Create a VM with disks from all permutations (including a bootable disk)
2. Run the VM
3. Start running a DD process to each of the VM disks
4. Move all VM disks to a different storage domain concurrently

Flow TC21798:
1. Create a VM with disks from all permutations (including a bootable disk)
2. Run the VM
3. Move all VM disks to a different storage domain concurrently
Evelina, can you please attach the rpm versions of libvirt and qemu-kvm?
(In reply to Tal Nisan from comment #27)
> Evelina, can you please attach the rpm versions of libvirt and qemu-kvm?

libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
Tal, what's going on with this one?
Hi, any update here?
(In reply to Michal Skrivanek from comment #29)
> Tal, what's going on with this one?

Fedor is looking into it.
Evelina, can you please paste the exact error you received?

According to the engine log the only error related to disk migration is -

2020-04-21 18:43:34,758+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6) [disks_syncAction_492d65fc-1f32-416a] Exception: java.lang.NullPointerException
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand.completeGenerationInfo(CopyDataCommand.java:57)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand.executeCommand(CopyDataCommand.java:40)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1169)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1327)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2003)
    at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:140)
    at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:79)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1387)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:419)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runAction(Backend.java:442)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:424)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:630)
    at jdk.internal.reflect.GeneratedMethodAccessor455.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
    at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:79)
    at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:89)
    at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:102)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:228)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:430)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:160)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81)
    at org.jboss.as.weld.common.0.GA-redhat-00004//org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.0.GA-redhat-00004//org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
    at org.wildfly.security.elytron-private.4.Final-redhat-00001//org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:627)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.2.Final-redhat-00001//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
    at org.jboss.as.ee.0.GA-redhat-00004//org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view2.runInternalAction(Unknown Source)
    at jdk.internal.reflect.GeneratedMethodAccessor454.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:410)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:134)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.module.ejb.InjectionPointPropagatingEnterpriseTargetBeanInstance.invoke(InjectionPointPropagatingEnterpriseTargetBeanInstance.java:68)
    at org.jboss.weld.core.2.Final-redhat-00001//org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:106)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2381)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2406)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2401)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand.copyVolumeData(CopyImageGroupVolumesDataCommand.java:135)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand.performNextOperation(CopyImageGroupVolumesDataCommand.java:109)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
(In reply to Eyal Shenitzky from comment #33)
> Evelina, can you please paste the exact error you received?
>
> According to the engine log the only error related to disk migration is -
>
> 2020-04-21 18:43:34,758+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6) [disks_syncAction_492d65fc-1f32-416a] Exception: java.lang.NullPointerException
> [full stack trace snipped; see comment 33]

And some error related to a communication issue with the storage -

5734:2020-04-21 18:43:34,754+0300 DEBUG (jsonrpc/0) [storage.VolumeManifest] Failed to get volume info: Error while processing volume meta data: ("('/rhev/data-center/mnt/mantis-nfs-lif2.lab.eng.tlv2.redhat.com:_nas01_ge__7__nfs__1/50fa1353-0b26-4361-983e-b8a47579ed47/images/21e3eeee-59fa-4569-98cf-57752525c465/a8c61a6a-2625-4f7b-af9c-b26392050cf8',): Connection timed out",) (volume:263)
The issue does not reproduce on the latest rhv-4.4.1-1:

engine 4.4.1-0.1.el8ev
vdsm-4.40.17-1.el8ev.x86_64
libvirt-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64

Reran the same TCs and checked the latest regression test suites.
The related bug [1] is also fixed, which probably fixed this issue as well.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1803092
thank you, Avihai, closing