Created attachment 1506955 [details] engine and vdsm logs Description of problem: Fail 1st LSM(NFS->ISCSI) blocking its source SD(NFS) from SPM and wait for LSM to fail. After it fails unblock connection and wait until all hosts/SD are up/available and try an the same LSM(NFS->ISCSI) this time unblocked. LSM fails with: Engine: 2nd LSM 2018-11-18 14:55:38,024+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-14) [7a76d927-bc21-4384-9520-4b2d38137b15] EVENT_ID: USER_MOVED_DISK(2,008), User admin@inter nal-authz moving disk vm1_Disk2 to domain iscsi_0. LSM failed : 2018-11-18 14:56:01,495+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] Failed in 'HSMGetAllTasksStatusesVDS' method 2018-11-18 14:56:01,510+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM h ost_mixed_3 command HSMGetAllTasksStatusesVDS failed: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\' Logical Volume "f14f759c-42dd -4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']' 2018-11-18 14:56:01,511+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] SPMAsyncTask::PollTask: Polling task '1792442e-5dec-4fec-9bf4-ddad44957aa5' (Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'. 2018-11-18 14:56:01,516+02 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] BaseAsyncTask::logEndTaskFailure: Task '1792442e-5dec-4fec-9bf4-ddad44957aa5 ' (Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure: -- Result: 'cleanSuccess' -- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800c c5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550', -- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b0980 0cc5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550' 2018-11-18 14:56:01,517+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] CommandAsyncTask::endActionIfNecessary: All tasks of command 'c9048d09-d 79e-4652-8522-3456e2599c75' has ended -> executing 'endAction' 2018-11-18 14:56:01,517+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-68) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: 'c 9048d09-d79e-4652-8522-3456e2599c75'): calling endAction '. 2018-11-18 14:56:01,519+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3855) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAc tion 'CreateImagePlaceholder', 2018-11-18 14:56:01,532+02 ERROR [org.ovirt.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand] (EE-ManagedThreadFactory-engine-Thread-3855) [7a76d927-bc21-4384-9520-4b2d38137b15] Ending command 'org.ovir t.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand' with failure. VDSM host_mixed_3: 2018-11-18 14:55:51,951+0200 DEBUG (tasks/5) [storage.Misc.excCmd] FAILED: <err> = ' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\n '; <rc> = 5 (commands:89) 2018-11-18 14:55:51,952+0200 ERROR (tasks/5) [storage.Volume] Failed to create volume /rhev/data-center/mnt/blockSD/02b98d9e-82d7-468c-9456-d8fcb00159e6/images/0f98c097-f1fd-4dc9-bd99-c9417321eb20/f14f759c-42dd-4843-9b61-b09800cc5f58: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']' (volume:1208) 2018-11-18 14:55:51,952+0200 ERROR (tasks/5) [storage.Volume] Unexpected error (volume:1245) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1205, in create initialSize=initialSize) File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 507, in _create initialTags=(sc.TAG_VOL_UNINIT,)) File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 1161, in createLV raise se.CannotCreateLogicalVolume(vgName, lvName, err) CannotCreateLogicalVolume: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']' 2018-11-18 14:55:51,953+0200 ERROR (tasks/5) [storage.Image] Unexpected error (image:422) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/image.py", line 405, in _createTargetImage srcVolUUID=volParams['parent']) File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 816, in createVolume initialSize=initialSize) File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1205, in create initialSize=initialSize) File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 507, in _create initialTags=(sc.TAG_VOL_UNINIT,)) File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 1161, in createLV raise se.CannotCreateLogicalVolume(vgName, lvName, err) CannotCreateLogicalVolume: Cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b09800cc5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']' Version-Release number of selected component (if applicable): ovirt-engine- 4.3.0-0.0.master.20181008123014 VDSM - 4.30.1-25.gitce9e416.el7 How reproducible: Happened manually twice so far (100%) Steps to Reproduce: 1. Create VM with disks, OS and write data 2. LSM nfs disk to iscsi domain -> block from SPM host(iptables) of source(nfs) domain during host Non Operational - LSM - fails eventually . 3. Unblock(iptables) of source(nfs) domain - host operational again 4. Run same LSM again - fails CreateImagePlaceholder with : cannot create Logical Volume: 'vgname=02b98d9e-82d7-468c-9456-d8fcb00159e6 lvname=f14f759c-42dd-4843-9b61-b0980 0cc5f58 err=[\' Logical Volume "f14f759c-42dd-4843-9b61-b09800cc5f58" already exists in volume group "02b98d9e-82d7-468c-9456-d8fcb00159e6"\']', code = 550' Actual results: 2nd LSM failed Expected results: Additional info:
Trying to verify Bug 1565685 I encountered this bug. As the scenario of this bug and Bug 1565685 are the same I marked this bug as a blocker for Bug 1565685 verification.
This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.
As seen in the logs, there was an attempt to cleanup, but it failed because vdsm failed to respond because it had no access to the storage: 2018-11-18 14:49:58,449+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VmReplicateDiskFinishVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-76) [f87e17d7-aee9-4ed6-a914-a1241b7917c5] Command 'VmReplicateDiskFinishVDSCommand(HostName = host_mixed_1, VmReplicateDiskParameters:{hostId='640c82d8-d796-4056-91df-71de40894752', vmId='543a221d-896f-4889-b8ad-c979025fbb00', storagePoolId='0bbd940f-5c1a-4a04-b936-e76fe5ee71d8', srcStorageDomainId='1d1674d8-2d91-4067-bd6a-2008979532ad', targetStorageDomainId='1d1674d8-2d91-4067-bd6a-2008979532ad', imageGroupId='0f98c097-f1fd-4dc9-bd99-c9417321eb20', imageId='27b76597-af4a-449e-8097-731595c10922'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues There is not much we can do in these cases and manual cleanup has to be performed. Setting as duplicate of bugzilla 1611929 *** This bug has been marked as a duplicate of bug 1611929 ***