Created attachment 1250602 [details]
logs

Description of problem:
I created a VM with 2 disks, one on an iscsi storage domain and the second on NFS. After deactivating and detaching the iscsi storage domain, the VM becomes unregistered under the iscsi storage domain. I imported the storage domain back into the environment and registered the template:

POST to /ovirt-engine/api/storagedomains/19906b00-7f68-4af8-9672-cddbcb94f054/templates/1cec261a-c04d-4db1-92fa-ca87d72cfb93/register
body:
<action>
    <allow_partial_import>true</allow_partial_import>
    <async>false</async>
    <cluster>
        <name>golden_env_mixed_1</name>
    </cluster>
    <grace_period>
        <expiry>10</expiry>
    </grace_period>
    <reassign_bad_macs>true</reassign_bad_macs>
</action>

When trying to clone a VM from that template, the operation sometimes failed with the error:

- engine.log:
2017-02-15 13:10:30,237+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler10) [4247d907] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
2017-02-15 13:10:30,260+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler10) [4247d907] Command 'CopyImageGroupWithData' id: '31494d54-be60-45c5-9622-194d610f41e4' child commands '[ad2bf9ec-31da-4bdd-8a65-709e8e2a3c67, 07cefbd1-3f3e-4d8f-9fcb-5d6f3b3f9104]' executions were completed, status 'FAILED'
2017-02-15 13:10:31,281+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (DefaultQuartzScheduler9) [4247d907] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand' with failure.
2017-02-15 13:10:32,300+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [4b3ee393] Command 'CreateCloneOfTemplate' id: '918a847c-ef37-4c38-9a42-d645c80d98d5' child commands '[31494d54-be60-45c5-9622-194d610f41e4]' executions were completed, status 'FAILED'
2017-02-15 13:10:32,300+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [4b3ee393] Command 'CreateCloneOfTemplate' id: '918a847c-ef37-4c38-9a42-d645c80d98d5' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2017-02-15 13:10:33,017+02 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler1) [4247d907] Setting new tasks map. The map contains now 3 tasks
2017-02-15 13:10:34,345+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler10) [55e0b0b4] Command 'AddVmFromTemplate' id: '697a8c00-bbf9-4710-a481-641c1a259508' child commands '[918a847c-ef37-4c38-9a42-d645c80d98d5]' executions were completed, status 'FAILED'
2017-02-15 13:10:35,389+02 ERROR [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler10) [55e0b0b4] Ending command 'org.ovirt.engine.core.bll.AddVmFromTemplateCommand' with failure.
2017-02-15 13:10:35,402+02 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (DefaultQuartzScheduler10) [4b3ee393] Ending command 'org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand' with failure.
2017-02-15 13:10:35,584+02 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler10) [] Lock freed to object 'EngineLock:{exclusiveLocks='[vm_TestCase5200_REST_NFS_1512590643=<VM_NAME, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='[1cec261a-c04d-4db1-92fa-ca87d72cfb93=<TEMPLATE, ACTION_TYPE_FAILED_TEMPLATE_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5200_REST_NFS_1512590643>, 63fb7230-4cb4-41eb-875d-6204803114d0=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5200_REST_NFS_1512590643>]'}'
2017-02-15 13:10:35,601+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler10) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Correlation ID: vms_create_4f68ccc3-f0d9-448d, Job ID: 1976d1f0-485c-4d24-8616-abf28deb1afc, Call Stack: null, Custom Event ID: -1, Message: Failed to complete VM vm_TestCase5200_REST_NFS_1512590643 creation.

- vdsm.log (spm):
2017-02-15 13:10:27,009 INFO (tasks/6) [storage.SANLock] Acquiring Lease(name='e5509bb3-19a4-4359-bc95-628acbc696cf', path=u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__3/19906b00-7f68-4af8-9672-cddbcb94f054/images/ef063408-b7a7-4e5d-8367-f184cecec602/e5509bb3-19a4-4359-bc95-628acbc696cf.lease', offset=0) for host id 3 (clusterlock:343)
2017-02-15 13:10:27,010 ERROR (tasks/6) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_19906b00-7f68-4af8-9672-cddbcb94f054, name=e5509bb3-19a4-4359-bc95-628acbc696cf, mode=exclusive at 0x27bfd50> (guarded:96)
2017-02-15 13:10:27,010 ERROR (tasks/6) [root] Job 'dc77d74b-210c-4b42-8a73-d9c18c5029d7' failed (jobs:217)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run
    self._run()
  File "/usr/share/vdsm/storage/sdm/api/copy_data.py", line 64, in _run
    with guarded.context(self._source.locks + self._dest.locks):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
    six.reraise(*exc)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
    lock.acquire()
  File "/usr/share/vdsm/storage/volume.py", line 1392, in acquire
    dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
  File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease
    self._domainLock.acquire(hostId, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire
    "Cannot acquire %s" % (lease,), str(e))
AcquireLockFailure: Cannot obtain lock: u"id=19906b00-7f68-4af8-9672-cddbcb94f054, rc=28, out=Cannot acquire Lease(name='e5509bb3-19a4-4359-bc95-628acbc696cf', path=u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__3/19906b00-7f68-4af8-9672-cddbcb94f054/images/ef063408-b7a7-4e5d-8367-f184cecec602/e5509bb3-19a4-4359-bc95-628acbc696cf.lease', offset=0), err=(28, 'Sanlock resource not acquired', 'No space left on device')"

- vdsm.log (hsm):
2017-02-15 13:10:11,515 INFO (jsonrpc/4) [dispatcher] Run and protect: getVolumeInfo(sdUUID='ad78e663-d94d-4542-9402-4c7554263f61', spUUID='6708e4f5-b4c9-40de-84d1-43b9e8ef31a5', imgUUID='403bfc8b-7094-494f-ac86-fb19e444373a', volUUID='7f586784-ea88-4c1e-a599-67f0b80b1ab2', options=None) (logUtils:49)
2017-02-15 13:10:11,521 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='1aa7b758-b6ac-41f3-9666-7cedd05de3af') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3054, in getVolumeInfo
    volUUID=volUUID).getInfo()
  File "/usr/share/vdsm/storage/sd.py", line 748, in produceVolume
    volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 361, in __init__
    manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 61, in __init__
    volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 84, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/volume.py", line 106, in validate
    self.validateImagePath()
  File "/usr/share/vdsm/storage/fileVolume.py", line 80, in validateImagePath
    raise se.ImagePathError(imageDir)
ImagePathError: Image path does not exist or cannot be accessed/created: (u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__2/ad78e663-d94d-4542-9402-4c7554263f61/images/403bfc8b-7094-494f-ac86-fb19e444373a',)
2017-02-15 13:10:11,522 INFO (jsonrpc/4) [storage.TaskManager.Task] (Task='1aa7b758-b6ac-41f3-9666-7cedd05de3af') aborting: Task is aborted: 'Image path does not exist or cannot be accessed/created' - code 254 (task:1175)
2017-02-15 13:10:11,523 ERROR (jsonrpc/4) [storage.Dispatcher] {'status': {'message': "Image path does not exist or cannot be accessed/created: (u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__2/ad78e663-d94d-4542-9402-4c7554263f61/images/403bfc8b-7094-494f-ac86-fb19e444373a',)", 'code': 254}} (dispatcher:77)

Version-Release number of selected component (if applicable):
rhevm-4.1.1-0.1.el7

How reproducible:
50%

Steps to Reproduce:
1. Create a VM with 2 disks, one on an NFS SD and the second on an iscsi SD
2. Create a template from that VM
3. Deactivate the iscsi SD and detach it from the DC
4. Remove it without formatting it
5. Add the iscsi storage domain back to the environment and attach it to the DC
6. Register the unregistered template on that SD
7. Try to clone a VM from that template

Actual results:
Sometimes, the operation will fail with the errors above

Expected results:
The operation should succeed

Additional info:
It looks like the exception is that there is no space left on your device:
ddbcb94f054/images/ef063408-b7a7-4e5d-8367-f184cecec602/e5509bb3-19a4-4359-bc95-628acbc696cf.lease', offset=0), err=(28, 'Sanlock resource not acquired', 'No space left on device')"
Just to be sure, I've tried to reproduce this on my env and it seems to work, so it just looks like a no-space issue. Can you please try to reproduce this once again with a storage domain that has sufficient space?
Maor, I mentioned that it doesn't reproduce 100% of the time. I have an iscsi storage domain with 150GB of free space and an nfs storage domain with almost 2TB of free space.
Update: It seems this also happens when the template contains only one disk, so it has nothing to do with the disks being on 2 different storage domains. Attaching new logs with the following scenario:

Steps to Reproduce:
1. Create a VM with 1 disk
2. Create a template from that VM
3. Deactivate the SD (where the disk resides) and detach it from the DC
4. Remove it without formatting it
5. Add the storage domain back to the environment and attach it to the DC
6. Register the unregistered template on that SD
7. Try to clone a VM from that template
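As an aside, the register action used in step 6 is the same REST call quoted in the description. A short python-requests illustration of issuing it is below; this is only a sketch, the engine URL and credentials are placeholders, and only the endpoint and XML body come from the report:

import requests

ENGINE = "https://engine.example.com/ovirt-engine/api"   # placeholder
SD_ID = "19906b00-7f68-4af8-9672-cddbcb94f054"
TEMPLATE_ID = "1cec261a-c04d-4db1-92fa-ca87d72cfb93"

# XML body as quoted in the description above.
body = """
<action>
    <allow_partial_import>true</allow_partial_import>
    <async>false</async>
    <cluster><name>golden_env_mixed_1</name></cluster>
    <grace_period><expiry>10</expiry></grace_period>
    <reassign_bad_macs>true</reassign_bad_macs>
</action>
"""

resp = requests.post(
    "%s/storagedomains/%s/templates/%s/register" % (ENGINE, SD_ID, TEMPLATE_ID),
    data=body,
    headers={"Content-Type": "application/xml", "Accept": "application/xml"},
    auth=("admin@internal", "password"),   # placeholder credentials
    verify=False,                          # lab setup only
)
resp.raise_for_status()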
Created attachment 1250640 [details] new logs engine.log: 2017-02-15 17:41:36,800+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler10) [1e01c187] FINISH, GetHostJobsVDSCommand, return: {65b5a76b-c57c-4ae6-bb10-99a31cebd468= HostJobInfo:{id='65b5a76b-c57c-4ae6-bb10-99a31cebd468', type='storage', description='copy_data', status='failed', progress='null', error='VDSError:{code='AcquireLockFailure', message='Cannot obtain lock: u"id=d83fa ca9-974c-46a5-992c-8297eb63d3c4, rc=28, out=Cannot acquire Lease(name='48dfc33e-e17c-4bde-9389-4171eb6c0e5f', path=u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__3/d83faca9-974c-46a5-9 92c-8297eb63d3c4/images/f4446ba1-72f9-400c-8d23-a0a48fdcb363/48dfc33e-e17c-4bde-9389-4171eb6c0e5f.lease', offset=0), err=(28, 'Sanlock resource not acquired', 'No space left on device')"'}'}}, log id: 648001e2 2017-02-15 17:41:36,807+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler10) [1e01c187] Command CopyData id: 'a614ae1b-baef-4c0c-b5a9-2ff31e0d9bc7': job '65b5a76b-c57c-4ae6-bb10-99a31c ebd468' execution was completed with VDSM job status 'failed' 2017-02-15 17:41:36,812+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler10) [1e01c187] Command CopyData id: 'a614ae1b-baef-4c0c-b5a9-2ff31e0d9bc7': execution was completed, the comman d status is 'FAILED' 2017-02-15 17:41:37,822+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler8) [1e01c187] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure. 2017-02-15 17:41:38,848+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [1e01c187] Command 'CopyImageGroupWithData' id: '23668f9e-3f1c-42ec-8472-d4ef00e73f0b' child commands '[edd461a0-39e4-4838-b82d-785984cf25fe, a614ae1b-baef-4c0c-b5a9-2ff31e0d9bc7]' executions were completed, status 'FAILED' 2017-02-15 17:41:39,872+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (DefaultQuartzScheduler3) [1e01c187] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand' with failure. 2017-02-15 17:41:40,895+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [5abff62e] Command 'CreateCloneOfTemplate' id: '2bca104c-ddbf-41d2-9690-bcf4960b28af' child commands '[23668f9e-3f1c-42ec-8472-d4ef00e73f0b]' executions were completed, status 'FAILED' 2017-02-15 17:41:40,895+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [5abff62e] Command 'CreateCloneOfTemplate' id: '2bca104c-ddbf-41d2-9690-bcf4960b28af' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands. 2017-02-15 17:41:40,924+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [5ce0bede] Command 'ProcessOvfUpdateForStorageDomain' id: 'bc3b50ae-98d1-4c90-8201-b4d58a22a9f4' child commands '[b058ed5f-959a-4e30-b689-6fe0b12979ac, 3341751d-1b22-4670-a5f3-3e44aa7ce612]' executions were completed, status 'SUCCEEDED' 2017-02-15 17:41:41,955+02 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (DefaultQuartzScheduler3) [5ce0bede] Ending command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand' successfully. 
2017-02-15 17:41:41,998+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler3) [32825ff9] Command 'AddVmFromTemplate' id: 'e042d606-5f18-40f2-bbe1-54cd64646599' child commands '[2bca104c-ddbf-41d2-9690-bcf4960b28af]' executions were completed, status 'FAILED' 2017-02-15 17:41:43,044+02 ERROR [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler6) [32825ff9] Ending command 'org.ovirt.engine.core.bll.AddVmFromTemplateCommand' with failure. 2017-02-15 17:41:43,060+02 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (DefaultQuartzScheduler6) [5abff62e] Ending command 'org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand' with failure. 2017-02-15 17:41:43,259+02 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler6) [] Lock freed to object 'EngineLock:{exclusiveLocks='[vm_TestCase5297_REST_NFS_1517325312=<VM_NAME, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='[a48e19e3-4138-41d8-a4bf-b6734ed4c384=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5297_REST_NFS_1517325312>, 08445fa0-3631-47e6-a8bb-c34674d98f2a=<TEMPLATE, ACTION_TYPE_FAILED_TEMPLATE_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5297_REST_NFS_1517325312>]'}' 2017-02-15 17:41:43,278+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Correlation ID: vms_create_bd6648b9-7c67-44ff, Job ID: 22099f7c-4d9d-4f80-925b-88630e8a54fe, Call Stack: null, Custom Event ID: -1, Message: Failed to complete VM vm_TestCase5297_REST_NFS_1517325312 creation. vdsm.log: 2017-02-15 17:41:34,540 INFO (tasks/3) [storage.SANLock] Acquiring Lease(name='48dfc33e-e17c-4bde-9389-4171eb6c0e5f', path=u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__3/d83faca9-97 4c-46a5-992c-8297eb63d3c4/images/f4446ba1-72f9-400c-8d23-a0a48fdcb363/48dfc33e-e17c-4bde-9389-4171eb6c0e5f.lease', offset=0) for host id 3 (clusterlock:343) 2017-02-15 17:41:34,540 ERROR (tasks/3) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_d83faca9-974c-46a5-992c-8297eb63d3c4, name=48dfc33e-e17c-4bde-9389-4171eb6c0e5f, mode=exclusive at 0x22a1a50> (guarded:96) 2017-02-15 17:41:34,541 ERROR (tasks/3) [root] Job '65b5a76b-c57c-4ae6-bb10-99a31cebd468' failed (jobs:217) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run self._run() File "/usr/share/vdsm/storage/sdm/api/copy_data.py", line 64, in _run with guarded.context(self._source.locks + self._dest.locks): File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__ six.reraise(*exc) File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__ lock.acquire() File "/usr/share/vdsm/storage/volume.py", line 1392, in acquire dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id) File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease self._domainLock.acquire(hostId, lease) File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire "Cannot acquire %s" % (lease,), str(e)) AcquireLockFailure: Cannot obtain lock: u"id=d83faca9-974c-46a5-992c-8297eb63d3c4, rc=28, out=Cannot acquire Lease(name='48dfc33e-e17c-4bde-9389-4171eb6c0e5f', 
path=u'/rhev/data-center/mnt/10.35.110.11:_Storage__NFS_storage__local__ge5__nfs__3/d83faca9-974c-46a5-992c-8297eb63d3c4/images/f4446ba1-72f9-400c-8d23-a0a48fdcb363/48dfc33e-e17c-4bde-9389-4171eb6c0e5f.lease', offset=0), err=(28, 'Sanlock resource not acquired', 'No space left on device')"
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Raz, can you please attach the sanlock logs? They should probably be located in /var/log/sanlock.log.
Maor, Attached the sanlock.log from all the hosts (3) in the environment And the engine.log from the reproduction: 2017-02-21 18:48:02,472+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler10) [770e0669] Command CopyData id: 'fe5da503-c05e-49e5-b22d-61817a0f5a84': job 'bedd3167-d83a-414c-94d4-42d 7c9f44993' execution was completed with VDSM job status 'failed' 2017-02-21 18:48:02,477+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler10) [770e0669] Command CopyData id: 'fe5da503-c05e-49e5-b22d-61817a0f5a84': execution was completed, the com mand status is 'FAILED' 2017-02-21 18:48:03,491+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler4) [770e0669] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure. 2017-02-21 18:48:03,518+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [770e0669] Command 'CopyImageGroupWithData' id: '865d2eeb-9b4a-4ea4-a921-fe1a25b85997' child commands '[13c0cb28-a107-402e-a3af-dd0eaf9d9cdd, fe5da503-c05e-49e5-b22d-61817a0f5a84]' executions were completed, status 'FAILED' 2017-02-21 18:48:04,545+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (DefaultQuartzScheduler1) [770e0669] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.Cop yImageGroupWithDataCommand' with failure. 2017-02-21 18:48:05,568+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler7) [5cfd5ee8] Command 'CreateCloneOfTemplate' id: '50af7a07-c957-4e8f-acc0-d420b8cfb62 3' child commands '[865d2eeb-9b4a-4ea4-a921-fe1a25b85997]' executions were completed, status 'FAILED' 2017-02-21 18:48:05,568+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler7) [5cfd5ee8] Command 'CreateCloneOfTemplate' id: '50af7a07-c957-4e8f-acc0-d420b8cfb62 3' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands. 
2017-02-21 18:48:07,568+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Stage: Misc configuration 2017-02-21 18:48:07,573+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Stage: Transaction commit 2017-02-21 18:48:07,593+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Stage: Closing up 2017-02-21 18:48:07,598+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Stage: Pre-termination 2017-02-21 18:48:07,617+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Retrieving installation logs to: '/var/log/ovirt-engine/host-deploy/ovirt-host-mgmt-20170221184807-sto rage-ge5-vdsm1.qa.lab.tlv.redhat.com-42cfb430.log' 2017-02-21 18:48:07,681+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler7) [36a465db] Command 'AddVmFromTemplate' id: '02dc548f-c341-4ac6-9069-fb435486fc7f' c hild commands '[50af7a07-c957-4e8f-acc0-d420b8cfb623]' executions were completed, status 'FAILED' 2017-02-21 18:48:07,721+02 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [42cfb430] Stage: Termination 2017-02-21 18:48:07,781+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-7-thread-2) [42cfb430] EVENT_ID: HOST_AVAILABLE_UPDATES_FINISHED(885), Correlation ID: null, Call Sta ck: null, Custom Event ID: -1, Message: Check for available updates on host host_mixed_1 was completed successfully with message 'no updates found.'. 2017-02-21 18:48:08,722+02 ERROR [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler9) [36a465db] Ending command 'org.ovirt.engine.core.bll.AddVmFromTemplateCommand' with failure. 2017-02-21 18:48:08,738+02 ERROR [org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand] (DefaultQuartzScheduler9) [5cfd5ee8] Ending command 'org.ovirt.engine.core.bll.CreateCloneOfTemplateCommand' with failure . 2017-02-21 18:48:08,943+02 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (DefaultQuartzScheduler9) [] Lock freed to object 'EngineLock:{exclusiveLocks='[vm_TestCase5200_REST_ISCSI_2118474182=<VM_NAM E, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='[f82c1509-e3ae-4ad0-a2c3-720d00080a06=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5200_REST_ISCSI_2118474182>, 08c187dc-1945-4692-8 581-e45aa50d75af=<TEMPLATE, ACTION_TYPE_FAILED_TEMPLATE_IS_USED_FOR_CREATE_VM$VmName vm_TestCase5200_REST_ISCSI_2118474182>]'}' 2017-02-21 18:48:08,968+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler9) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Correlation ID: vms_create_ffbb612 5-6de3-4a8a, Job ID: e27c2727-5b43-4435-8841-5364cbc46956, Call Stack: null, Custom Event ID: -1, Message: Failed to complete VM vm_TestCase5200_REST_ISCSI_2118474182 creation.
Created attachment 1256201 [details] logs 21.2.17
Thanks for the logs Raz,

After investigating it, here are the conclusions for now:

1. The sanlock exception 'No space left on device' is misleading; we can also get it if the lease file does not exist.

2. There was a behavior change: the host now acquires a lease for the copy operation, so there is a dependency on the sanlock acquire operation. The host must first acquire a lease before the engine can call the copy of the volume, and the lease can be acquired only after repoStats returns acquired='True'.

Once we activate the storage domain, the host does not guarantee that the sanlock lockspace has been acquired yet, so we cannot use this storage domain from this host for any storage operation that requires leases until the 'acquired' flag is true for this storage domain.
(In reply to Maor from comment #10)
> Thanks for the logs Raz,
>
> After investigating it here are the following conclusions for now:
>
> 1. The sanlock exception 'No space left on device' is misleading, we can get
> it also if the lease file does not exists.
>
> 2. Since there was a behavior change and now the host acquires a lease
> for the copy operation, there is a dependency on the sanlock acquire
> operation.
> The host must first acquire a lease before the engine can call the
> copy of the volume.
> The lease can be acquired only after repoStats returns acquired='True'
>
> Once we activate the storage domain the host does not guarantee the
> sanlock was acquired yet, so we can not use this storage domain from
> this host to make any storage operations that requires leases until
> the 'acquired' flag is true for this storage domain.

This bug is mainly caused because the acquire had not completed before the creation of the disk took place; eventually, once the acquire succeeds, the operation should succeed. The appropriate fix is to maintain an in-memory table of the acquire flag (true or false) per host per domain in the monitoring phase. Therefore I think it should be targeted to 4.2.
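A minimal sketch of that engine-side idea, for illustration only (this is not engine code, and get_repo_stats() is a hypothetical helper returning the per-domain dictionaries that repoStats reports, each including the 'acquired' boolean):

# Conceptual sketch: keep an in-memory map of (host, storage domain) ->
# acquired flag, refreshed from the repoStats monitoring data, and consult
# it before scheduling a lease-based job on a host.

acquired_map = {}   # (host_id, sd_uuid) -> bool

def refresh(host_id, get_repo_stats):
    # get_repo_stats(host_id) is assumed to return {sd_uuid: stats_dict}.
    for sd_uuid, stats in get_repo_stats(host_id).items():
        acquired_map[(host_id, sd_uuid)] = bool(stats.get("acquired"))

def hosts_ready_for(sd_uuid, candidate_hosts):
    # Filter out hosts that have not yet acquired their sanlock host id on
    # this domain, so copy_data jobs are not dispatched to them prematurely.
    return [h for h in candidate_hosts if acquired_map.get((h, sd_uuid))]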
The same issue occurred multiple times in the latest tier2 automation run (ovirt-engine-4.1.1.8-0.1.el7.noarch), marking it as an automation blocker.
(In reply to Maor from comment #10)
> Thanks for the logs Raz,
>
> After investigating it here are the following conclusions for now:
>
> 1. The sanlock exception 'No space left on device' is misleading, we can get
> it also if the lease file does not exists.

We need a bug on sanlock for this issue.

> 2. Since there was a behavior change and now the host acquires a lease
> for the copy operation, there is a dependency on the sanlock acquire
> operation.
> The host must first acquire a lease before the engine can call the
> copy of the volume.
> The lease can be acquired only after repoStats returns acquired='True'
>
> Once we activate the storage domain the host does not guarantee the
> sanlock was acquired yet, so we can not use this storage domain from
> this host to make any storage operations that requires leases until
> the 'acquired' flag is true for this storage domain.

Would a simple sleep in the automation's code circumvent this?
(In reply to Allon Mureinik from comment #14)
> (In reply to Maor from comment #10)
> > Thanks for the logs Raz,
> >
> > After investigating it here are the following conclusions for now:
> >
> > 1. The sanlock exception 'No space left on device' is misleading, we can get
> > it also if the lease file does not exists.
> We need a bug on sanlock for this issue.

Added a new bug:
https://bugzilla.redhat.com/1442971 - misleading "No space left on device" message return from sanlock.

> > 2. Since there was a behavior change and now the host acquires a lease
> > for the copy operation, there is a dependency on the sanlock acquire
> > operation.
> > The host must first acquire a lease before the engine can call the
> > copy of the volume.
> > The lease can be acquired only after repoStats returns acquired='True'
> >
> > Once we activate the storage domain the host does not guarantee the
> > sanlock was acquired yet, so we can not use this storage domain from
> > this host to make any storage operations that requires leases until
> > the 'acquired' flag is true for this storage domain.
>
> Would a simple sleep in the automation's code circumvent this?

It can be a suitable temporary fix for now.
David, I know that you mentioned that 26 seconds might be too short for the lockspace to be ready to be used. How much time would you recommend sleeping before acquiring the lease (until we add a proper fix)?
The minimum time is about 21 seconds, so with no external delays, 26 should be good. But in cases we saw before, I think that there were i/o delays, and it's hard to predict in that case. In theory, given max i/o delays (10 sec), a clean add_lockspace with a free host_id could take up to 50 seconds (3 i/o ops at 10 sec each, plus the 20 second algorithm wait.)
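Based on these numbers, a possible automation-side workaround is to poll rather than sleep for a fixed time, with a deadline that covers the ~50 second worst case plus some margin. This is only a sketch; domain_monitor_acquired is a hypothetical predicate (for example, built on the 'acquired' field that repoStats reports for the domain):

import time

def wait_for_lockspace(domain_monitor_acquired, timeout=60, interval=5):
    # Poll until the host reports the domain lockspace as acquired, or
    # give up after 'timeout' seconds (covers the worst case above).
    deadline = time.time() + timeout
    while time.time() < deadline:
        if domain_monitor_acquired():
            return True
        time.sleep(interval)
    return False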
This error also appears on cold move operations: reproduced by ~50%. Steps to reproduce: 1. Create a VM with disk 2. Move the VM disk to different storage domain 2017-04-27 18:20:59,023+03 INFO [org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand] (DefaultQuartzScheduler8) [] Lock freed to object 'EngineLock:{exclusiveLocks='[56b0acfa-36f9-4e00-883b-73fdd6b0c7dc=<DISK, ACTION_TYPE_FAILED_DISK_IS_BEING_MIGRATED$DiskName 5988_disk_virtiocow_2718195908>]', sharedLocks='[7a8ad949-b33c-4a22-993a-3562f6a4dd46=<VM, ACTION_TYPE_FAILED_DISK_IS_BEING_MIGRATED$DiskName 5988_disk_virtiocow_2718195908>]'}' 2017-04-27 18:20:59,028+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler8) [] EVENT_ID: USER_MOVED_DISK_FINISHED_SUCCESS(2,010), Correlation ID: disks_syncAction_4f9ddf09-f0c7-48d8, Job ID: e56592e6-8f76-4db8-8383-117de9be2ed0, Call Stack: null, Custom Event ID: -1, Message: User admin@internal-authz finished moving disk 5988_disk_virtiocow_2718195908 to domain iscsi_2. 2017-04-27 18:21:00,033+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'a1969c0b-8c22-42d2-a21d-9e9d4a31c6d6:export_domain' report isn't an actual report 2017-04-27 18:21:00,033+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '7e06e771-287b-4119-82f3-ef7ba3bd515e:test_gluster_2' report isn't an actual report 2017-04-27 18:21:00,033+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '82d074d4-05b9-4be1-8e3f-8a01284d945a:test_gluster_1' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'a8463627-25ca-4488-bd1d-e6a2eac6d83b:test_gluster_0' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'b3925f20-df6c-4884-9b81-f6902de491c6:iscsi_1' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'd73f3d30-7c4d-4396-a74d-42e6f35b0925:iscsi_0' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '6e34c8c9-5060-479a-a12a-f768225a489f:nfs_2' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'fdfc4830-f8e8-481d-9f6f-9a7596bd7e01:nfs_1' report isn't an actual report 2017-04-27 18:21:00,034+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '18378f41-8261-4360-a017-3f28d8364ed4:nfs_0' report isn't an actual report 2017-04-27 18:21:00,035+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '5770c3a0-e767-40de-b495-d96d7a27e831:iscsi_2' report isn't an actual report 2017-04-27 18:21:05,243+03 INFO [org.ovirt.engine.core.bll.storage.disk.MoveDisksCommand] (default task-4) [disks_syncAction_23987d98-0e8d-4d73] Running command: MoveDisksCommand internal: false. 
Entities affected : ID: aec5ab78-da82-4268-ac49-59d2d4e4eef4 Type: DiskAction group CONFIGURE_DISK_STORAGE with role type USER 2017-04-27 18:21:05,259+03 INFO [org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand] (default task-4) [disks_syncAction_23987d98-0e8d-4d73] Lock Acquired to object 'EngineLock:{exclusiveLocks='[aec5ab78-da82-4268-ac49-59d2d4e4eef4=<DISK, ACTION_TYPE_FAILED_DISK_IS_BEING_MIGRATED$DiskName 5988_disk_virtio_scsicow_2718195980>]', sharedLocks='[7a8ad949-b33c-4a22-993a-3562f6a4dd46=<VM, ACTION_TYPE_FAILED_DISK_IS_BEING_MIGRATED$DiskName 5988_disk_virtio_scsicow_2718195980>]'}' 2017-04-27 18:21:05,318+03 INFO [org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand] (org.ovirt.thread.pool-6-thread-28) [disks_syncAction_23987d98-0e8d-4d73] Running command: MoveOrCopyDiskCommand internal: false. Entities affected : ID: aec5ab78-da82-4268-ac49-59d2d4e4eef4 Type: DiskAction group CONFIGURE_DISK_STORAGE with role type USER, ID: b3925f20-df6c-4884-9b81-f6902de491c6 Type: StorageAction group CREATE_DISK with role type USER 2017-04-27 18:21:05,335+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.MoveImageGroupCommand] (org.ovirt.thread.pool-6-thread-28) [disks_syncAction_23987d98-0e8d-4d73] Running command: MoveImageGroupCommand internal: true. Entities affected : ID: b3925f20-df6c-4884-9b81-f6902de491c6 Type: Storage 2017-04-27 18:21:05,351+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (org.ovirt.thread.pool-6-thread-28) [disks_syncAction_23987d98-0e8d-4d73] Running command: CopyImageGroupWithDataCommand internal: true. 2017-04-27 18:21:05,379+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CloneImageGroupVolumesStructureCommand] (org.ovirt.thread.pool-6-thread-28) [disks_syncAction_23987d98-0e8d-4d73] Running command: CloneImageGroupVolumesStructureCommand internal: true. 2017-04-27 18:21:05,409+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-28) [disks_syncAction_23987d98-0e8d-4d73] EVENT_ID: USER_MOVED_DISK(2,008), Correlation ID: disks_syncAction_23987d98-0e8d-4d73, Job ID: 55936346-0439-4aec-a0a9-99703f7026da, Call Stack: null, Custom Event ID: -1, Message: User admin@internal-authz moving disk 5988_disk_virtio_scsicow_2718195980 to domain iscsi_1. 
2017-04-27 18:21:06,044+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Command 'CopyImageGroupWithData' (id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7') waiting on child command id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef' type:'CloneImageGroupVolumesStructure' to complete 2017-04-27 18:21:06,046+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CloneImageGroupVolumesStructureCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Starting child command 1 of 1, image 'f60ebc6e-059f-496d-b9bc-4edbee283706' 2017-04-27 18:21:06,086+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] START, GetVolumeInfoVDSCommand(HostName = host_mixed_1, GetVolumeInfoVDSCommandParameters:{runAsync='true', hostId='2d690294-ccde-4b0d-9faf-94c174a555dc', storagePoolId='d5c26715-43c0-4b8b-9edb-e7f1d689153c', storageDomainId='6e34c8c9-5060-479a-a12a-f768225a489f', imageGroupId='aec5ab78-da82-4268-ac49-59d2d4e4eef4', imageId='f60ebc6e-059f-496d-b9bc-4edbee283706'}), log id: 5c476676 2017-04-27 18:21:07,138+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] FINISH, GetVolumeInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@78a3d6b0, log id: 5c476676 2017-04-27 18:21:07,155+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CreateVolumeContainerCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Running command: CreateVolumeContainerCommand internal: true. 2017-04-27 18:21:07,169+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] START, CreateSnapshotVDSCommand( CreateSnapshotVDSCommandParameters:{runAsync='true', storagePoolId='d5c26715-43c0-4b8b-9edb-e7f1d689153c', ignoreFailoverLimit='false', storageDomainId='b3925f20-df6c-4884-9b81-f6902de491c6', imageGroupId='aec5ab78-da82-4268-ac49-59d2d4e4eef4', imageSizeInBytes='1073741824', volumeFormat='COW', newImageId='f60ebc6e-059f-496d-b9bc-4edbee283706', imageType='Sparse', newImageDescription='null', imageInitialSizeInBytes='186182', imageId='00000000-0000-0000-0000-000000000000', sourceImageGroupId='00000000-0000-0000-0000-000000000000'}), log id: 485216f1 2017-04-27 18:21:07,170+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] -- executeIrsBrokerCommand: calling 'createVolume' with two new parameters: description and UUID 2017-04-27 18:21:08,219+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateSnapshotVDSCommand] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] FINISH, CreateSnapshotVDSCommand, return: f60ebc6e-059f-496d-b9bc-4edbee283706, log id: 485216f1 2017-04-27 18:21:08,224+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command '92bcd488-6a03-45c1-8a4f-e932f40e94bc' 2017-04-27 18:21:08,224+03 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] CommandMultiAsyncTasks::attachTask: Attaching task 'e889bbb0-645e-4445-8e49-06331eeb3865' to command '92bcd488-6a03-45c1-8a4f-e932f40e94bc'. 
2017-04-27 18:21:08,236+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Adding task 'e889bbb0-645e-4445-8e49-06331eeb3865' (Parent Command 'CreateVolumeContainer', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet.. 2017-04-27 18:21:08,244+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] BaseAsyncTask::startPollingTask: Starting to poll task 'e889bbb0-645e-4445-8e49-06331eeb3865'. 2017-04-27 18:21:08,254+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveImageGroup' (id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63') waiting on child command id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7' type:'CopyImageGroupWithData' to complete 2017-04-27 18:21:08,259+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveOrCopyDisk' (id: '2d553d2f-96b8-483d-a443-8f92f7afb4ff') waiting on child command id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63' type:'MoveImageGroup' to complete 2017-04-27 18:21:10,263+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler5) [disks_syncAction_23987d98-0e8d-4d73] Command 'CopyImageGroupWithData' (id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7') waiting on child command id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef' type:'CloneImageGroupVolumesStructure' to complete 2017-04-27 18:21:10,267+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler5) [disks_syncAction_23987d98-0e8d-4d73] Command 'CloneImageGroupVolumesStructure' (id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef') waiting on child command id: '92bcd488-6a03-45c1-8a4f-e932f40e94bc' type:'CreateVolumeContainer' to complete 2017-04-27 18:21:10,277+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler5) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveImageGroup' (id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63') waiting on child command id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7' type:'CopyImageGroupWithData' to complete 2017-04-27 18:21:10,282+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler5) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveOrCopyDisk' (id: '2d553d2f-96b8-483d-a443-8f92f7afb4ff') waiting on child command id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63' type:'MoveImageGroup' to complete 2017-04-27 18:21:10,703+03 INFO [org.ovirt.engine.core.bll.snapshots.CreateAllSnapshotsFromVmCommand] (default task-23) [snapshots_create_54db28a9-18a1-4e00] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[7a8ad949-b33c-4a22-993a-3562f6a4dd46=<VM, ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM$VmName vm_TestCase19652_while_sna_2718193411>]', sharedLocks='null'}' 2017-04-27 18:21:10,703+03 WARN [org.ovirt.engine.core.bll.snapshots.CreateAllSnapshotsFromVmCommand] (default task-23) [snapshots_create_54db28a9-18a1-4e00] Validation of action 'CreateAllSnapshotsFromVm' failed for user admin@internal-authz. 
Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_DISK_IS_BEING_MIGRATED,$DiskName 5988_disk_virtio_scsicow_2718195980 2017-04-27 18:21:10,709+03 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-23) [] Operation Failed: [Cannot create Snapshot. Disk 5988_disk_virtio_scsicow_2718195980 is being moved or copied.] 2017-04-27 18:21:12,358+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler9) [] Setting new tasks map. The map contains now 10 tasks 2017-04-27 18:21:14,287+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command 'CopyImageGroupWithData' (id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7') waiting on child command id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef' type:'CloneImageGroupVolumesStructure' to complete 2017-04-27 18:21:14,289+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command 'CloneImageGroupVolumesStructure' (id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef') waiting on child command id: '92bcd488-6a03-45c1-8a4f-e932f40e94bc' type:'CreateVolumeContainer' to complete 2017-04-27 18:21:14,295+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveImageGroup' (id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63') waiting on child command id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7' type:'CopyImageGroupWithData' to complete 2017-04-27 18:21:14,300+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveOrCopyDisk' (id: '2d553d2f-96b8-483d-a443-8f92f7afb4ff') waiting on child command id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63' type:'MoveImageGroup' to complete 2017-04-27 18:21:15,130+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'a1969c0b-8c22-42d2-a21d-9e9d4a31c6d6:export_domain' report isn't an actual report 2017-04-27 18:21:15,130+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '7e06e771-287b-4119-82f3-ef7ba3bd515e:test_gluster_2' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '82d074d4-05b9-4be1-8e3f-8a01284d945a:test_gluster_1' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'a8463627-25ca-4488-bd1d-e6a2eac6d83b:test_gluster_0' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'b3925f20-df6c-4884-9b81-f6902de491c6:iscsi_1' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'd73f3d30-7c4d-4396-a74d-42e6f35b0925:iscsi_0' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '6e34c8c9-5060-479a-a12a-f768225a489f:nfs_2' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain 'fdfc4830-f8e8-481d-9f6f-9a7596bd7e01:nfs_1' report isn't an actual report 2017-04-27 
18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '18378f41-8261-4360-a017-3f28d8364ed4:nfs_0' report isn't an actual report 2017-04-27 18:21:15,131+03 WARN [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (DefaultQuartzScheduler7) [] Domain '5770c3a0-e767-40de-b495-d96d7a27e831:iscsi_2' report isn't an actual report 2017-04-27 18:21:18,217+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler4) [] Polling and updating Async Tasks: 10 tasks, 1 tasks to poll now 2017-04-27 18:21:19,227+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler4) [] SPMAsyncTask::PollTask: Polling task 'e889bbb0-645e-4445-8e49-06331eeb3865' (Parent Command 'CreateVolumeContainer', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'. 2017-04-27 18:21:19,231+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler4) [] BaseAsyncTask::onTaskEndSuccess: Task 'e889bbb0-645e-4445-8e49-06331eeb3865' (Parent Command 'CreateVolumeContainer', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully. 2017-04-27 18:21:19,232+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler4) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '92bcd488-6a03-45c1-8a4f-e932f40e94bc' has ended -> executing 'endAction' 2017-04-27 18:21:19,232+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler4) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '92bcd488-6a03-45c1-8a4f-e932f40e94bc'): calling endAction '. 2017-04-27 18:21:19,232+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-11) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 'CreateVolumeContainer', 2017-04-27 18:21:19,239+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CreateVolumeContainerCommand] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CreateVolumeContainerCommand' successfully. 2017-04-27 18:21:19,248+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type 'CreateVolumeContainer' completed, handling the result. 2017-04-27 18:21:19,248+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type 'CreateVolumeContainer' succeeded, clearing tasks. 
2017-04-27 18:21:19,248+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] SPMAsyncTask::ClearAsyncTask: Attempting to clear task 'e889bbb0-645e-4445-8e49-06331eeb3865' 2017-04-27 18:21:19,249+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] START, SPMClearTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{runAsync='true', storagePoolId='d5c26715-43c0-4b8b-9edb-e7f1d689153c', ignoreFailoverLimit='false', taskId='e889bbb0-645e-4445-8e49-06331eeb3865'}), log id: 49ae97da 2017-04-27 18:21:19,249+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] START, HSMClearTaskVDSCommand(HostName = host_mixed_1, HSMTaskGuidBaseVDSCommandParameters:{runAsync='true', hostId='2d690294-ccde-4b0d-9faf-94c174a555dc', taskId='e889bbb0-645e-4445-8e49-06331eeb3865'}), log id: 7e4329dc 2017-04-27 18:21:20,268+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] FINISH, HSMClearTaskVDSCommand, log id: 7e4329dc 2017-04-27 18:21:20,268+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] FINISH, SPMClearTaskVDSCommand, log id: 49ae97da 2017-04-27 18:21:20,272+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] BaseAsyncTask::removeTaskFromDB: Removed task 'e889bbb0-645e-4445-8e49-06331eeb3865' from DataBase 2017-04-27 18:21:20,272+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-11) [disks_syncAction_23987d98-0e8d-4d73] CommandAsyncTask::HandleEndActionResult [within thread]: Removing CommandMultiAsyncTasks object for entity '92bcd488-6a03-45c1-8a4f-e932f40e94bc' 2017-04-27 18:21:22,308+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler2) [disks_syncAction_23987d98-0e8d-4d73] Command 'CopyImageGroupWithData' (id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7') waiting on child command id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef' type:'CloneImageGroupVolumesStructure' to complete 2017-04-27 18:21:22,311+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler2) [disks_syncAction_23987d98-0e8d-4d73] Command 'CloneImageGroupVolumesStructure' id: '6b1559da-8631-4cbf-b26e-d1aae8fa78ef' child commands '[92bcd488-6a03-45c1-8a4f-e932f40e94bc]' executions were completed, status 'SUCCEEDED' 2017-04-27 18:21:22,323+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler2) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveImageGroup' (id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63') waiting on child command id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7' type:'CopyImageGroupWithData' to complete 2017-04-27 18:21:22,327+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler2) [disks_syncAction_23987d98-0e8d-4d73] Command 'MoveOrCopyDisk' (id: '2d553d2f-96b8-483d-a443-8f92f7afb4ff') waiting on child command id: '26ecc5b8-4ae2-4944-9f92-53f55962ac63' type:'MoveImageGroup' to complete 2017-04-27 18:21:23,332+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CloneImageGroupVolumesStructureCommand] 
(DefaultQuartzScheduler8) [disks_syncAction_23987d98-0e8d-4d73] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CloneImageGroupVolumesStructureCommand' successfully. 2017-04-27 18:21:24,356+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Running command: CopyImageGroupVolumesDataCommand internal: true. 2017-04-27 18:21:25,371+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] Starting child command 1 of 1, image 'f60ebc6e-059f-496d-b9bc-4edbee283706' 2017-04-27 18:21:25,391+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] Running command: CopyDataCommand internal: true. 2017-04-27 18:21:25,394+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] START, GetVolumeInfoVDSCommand(HostName = host_mixed_3, GetVolumeInfoVDSCommandParameters:{runAsync='true', hostId='39610ce7-c2f3-4d11-b0be-2b00e91a0262', storagePoolId='d5c26715-43c0-4b8b-9edb-e7f1d689153c', storageDomainId='b3925f20-df6c-4884-9b81-f6902de491c6', imageGroupId='aec5ab78-da82-4268-ac49-59d2d4e4eef4', imageId='f60ebc6e-059f-496d-b9bc-4edbee283706'}), log id: 5f27d7c7 2017-04-27 18:21:26,618+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] FINISH, GetVolumeInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@33cff16e, log id: 5f27d7c7 2017-04-27 18:21:26,628+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CopyVolumeDataVDSCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] START, CopyVolumeDataVDSCommand(HostName = host_mixed_1, CopyVolumeDataVDSCommandParameters:{runAsync='true', hostId='2d690294-ccde-4b0d-9faf-94c174a555dc', storageDomainId='null', jobId='9e1eb880-ecdb-4497-b1e4-8dd4cb9cddea', srcInfo='VdsmImageLocationInfo [storageDomainId=6e34c8c9-5060-479a-a12a-f768225a489f, imageGroupId=aec5ab78-da82-4268-ac49-59d2d4e4eef4, imageId=f60ebc6e-059f-496d-b9bc-4edbee283706, generation=null]', dstInfo='VdsmImageLocationInfo [storageDomainId=b3925f20-df6c-4884-9b81-f6902de491c6, imageGroupId=aec5ab78-da82-4268-ac49-59d2d4e4eef4, imageId=f60ebc6e-059f-496d-b9bc-4edbee283706, generation=0]', collapse='false'}), log id: 2675af56 2017-04-27 18:21:26,628+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CopyVolumeDataVDSCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] -- executeVdsBrokerCommand: calling 'copyVolumeData' 2017-04-27 18:21:26,991+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CopyVolumeDataVDSCommand] (DefaultQuartzScheduler3) [disks_syncAction_23987d98-0e8d-4d73] FINISH, CopyVolumeDataVDSCommand, log id: 2675af56 2017-04-27 18:21:28,004+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command 'CopyImageGroupWithData' (id: '1923c6a9-0d4e-421c-aa20-bf9c87c215e7') waiting on child command id: '7a364022-650b-4a69-b632-b97fb9bf65c7' type:'CopyImageGroupVolumesData' to complete 2017-04-27 18:21:28,008+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] START, GetHostJobsVDSCommand(HostName = 
host_mixed_1, GetHostJobsVDSCommandParameters:{runAsync='true', hostId='2d690294-ccde-4b0d-9faf-94c174a555dc', type='storage', jobIds='[9e1eb880-ecdb-4497-b1e4-8dd4cb9cddea]'}), log id: 42e9cf83 2017-04-27 18:21:28,269+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] FINISH, GetHostJobsVDSCommand, return: {9e1eb880-ecdb-4497-b1e4-8dd4cb9cddea=HostJobInfo:{id='9e1eb880-ecdb-4497-b1e4-8dd4cb9cddea', type='storage', description='copy_data', status='failed', progress='null', error='VDSError:{code='AcquireLockFailure', message='Cannot obtain lock: "id=b3925f20-df6c-4884-9b81-f6902de491c6, rc=28, out=Cannot acquire Lease(name='f60ebc6e-059f-496d-b9bc-4edbee283706', path='/dev/b3925f20-df6c-4884-9b81-f6902de491c6/leases', offset=112197632), err=(28, 'Sanlock resource not acquired', 'No space left on device')"'}'}}, log id: 42e9cf83 2017-04-27 18:21:28,273+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command CopyData id: 'e73550b9-31c3-49d6-84a4-f94563a7706d': job '9e1eb880-ecdb-4497-b1e4-8dd4cb9cddea' execution was completed with VDSM job status 'failed' 2017-04-27 18:21:28,276+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler6) [disks_syncAction_23987d98-0e8d-4d73] Command CopyData id: 'e73550b9-31c3-49d6-84a4-f94563a7706d': execution was completed, the command status is 'FAILED'
Cold merge operations also fail with the same error.

engine.log:

2017-05-22 19:09:02,822+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [snapshots_delete_6d3ae6f3-2e39-487b] Command 'RemoveSnapshot' (id: '1b95e34b-fda2-4955-91f7-ade1b4fe6835') waiting on child command id: 'ca04ae0f-7270-4698-b321-e21397601a6d' type:'ColdMergeSnapshotSingleDisk' to complete
2017-05-22 19:09:02,826+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [snapshots_delete_6d3ae6f3-2e39-487b] Command 'ColdMergeSnapshotSingleDisk' (id: 'ca04ae0f-7270-4698-b321-e21397601a6d') waiting on child command id: 'ef70a801-39d4-4995-ad74-78c7782d7ca7' type:'PrepareMerge' to complete
2017-05-22 19:09:04,971+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler1) [] Polling and updating Async Tasks: 7 tasks, 5 tasks to poll now
2017-05-22 19:09:05,994+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler1) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2017-05-22 19:09:06,001+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler1) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_1 command HSMGetAllTasksStatusesVDS failed: Cannot obtain lock
2017-05-22 19:09:06,001+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler1) [] SPMAsyncTask::PollTask: Polling task 'b664f3cc-0548-440a-83a4-0f4a9c502904' (Parent Command 'PrepareMerge', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2017-05-22 19:09:06,005+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler1) [] BaseAsyncTask::logEndTaskFailure: Task 'b664f3cc-0548-440a-83a4-0f4a9c502904' (Parent Command 'PrepareMerge', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot obtain lock, code = 651',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot obtain lock, code = 651'
2017-05-22 19:09:06,006+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler1) [] CommandAsyncTask::endActionIfNecessary: All tasks of command 'ef70a801-39d4-4995-ad74-78c7782d7ca7' has ended -> executing 'endAction'

vdsm.log:

2017-05-22 19:09:00,332+0300 DEBUG (tasks/3) [storage.ResourceManager] Trying to release resource '00_storage.82cba7ce-8693-46b2-94ae-610993913b6c' (resourceManager:566)
2017-05-22 19:09:00,332+0300 DEBUG (tasks/3) [storage.ResourceManager] Released resource '00_storage.82cba7ce-8693-46b2-94ae-610993913b6c' (1 active users) (resourceManager:585)
2017-05-22 19:09:00,333+0300 ERROR (tasks/3) [storage.TaskManager.Task] (Task='b664f3cc-0548-440a-83a4-0f4a9c502904') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 333, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1884, in prepareMerge
    merge.prepare(subchainInfo)
  File "/usr/share/vdsm/storage/merge.py", line 181, in prepare
    with guarded.context(subchain.locks):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
    six.reraise(*exc)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
    lock.acquire()
  File "/usr/share/vdsm/storage/volume.py", line 1418, in acquire
    dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
  File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease
    self._domainLock.acquire(hostId, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire
    "Cannot acquire %s" % (lease,), str(e))
AcquireLockFailure: Cannot obtain lock: u"id=82cba7ce-8693-46b2-94ae-610993913b6c, rc=28, out=Cannot acquire Lease(name='56483875-9204-4b1b-9656-de2d1414ef51', path=u'/rhev/data-center/mnt/vserver-nas01.qa.lab.tlv.redhat.com:_nas01_ge__1__nfs__1/82cba7ce-8693-46b2-94ae-610993913b6c/images/8f58c5f8-05e7-48c1-8fd8-43c4087f0053/56483875-9204-4b1b-9656-de2d1414ef51.lease', offset=0), err=(28, 'Sanlock resource not acquired', 'No space left on device')"
2017-05-22 19:09:00,333+0300 DEBUG (tasks/3) [storage.TaskManager.Task] (Task='b664f3cc-0548-440a-83a4-0f4a9c502904') Task._run: b664f3cc-0548-440a-83a4-0f4a9c502904 () {} failed - stopping task (task:889)

** Logs attached.
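Side note on the tracebacks in this bug: every failure surfaces in guarded.py's __enter__, i.e. while the job is still acquiring its volume leases, before any merge or copy work has started. The snippet below is a minimal, illustrative sketch of that acquire-all-or-fail pattern in plain Python; it is not the actual vdsm.storage.guarded implementation, and the names are assumptions for illustration only.

from contextlib import contextmanager

@contextmanager
def lock_context(locks):
    # Acquire every lock up front. If any acquire() raises (for example
    # sanlock refusing a volume lease because this host has not acquired
    # a host id on the domain yet), release whatever was already taken
    # and re-raise, so the job fails before any data is touched.
    acquired = []
    try:
        for lock in locks:
            lock.acquire()
            acquired.append(lock)
        yield
    finally:
        for lock in reversed(acquired):
            lock.release()

A copy or merge job would then wrap its work in "with lock_context(source_locks + dest_locks): ...", which is why the jobs in these logs fail immediately at __enter__ rather than partway through a copy.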
Created attachment 1281970 [details] engine and vdsm.log cold merge
Before any job runs on a random HSM host, we need to make sure that this HSM has acquired a host id on the storage domain. There are different ways to handle this issue:

1. Through the Engine (monitoring) - Monitoring takes place every time a host is activated or a storage domain moves from the unknown/inactive state to active; we can change the monitoring to also take into account the acquired flag returned from repoStats. Regarding activation, there is a known issue described in BZ1043047. We can also add an in-memory matrix mapping each host's acquired flag to its storage domains, so that when a job is executed the engine filters in only hosts that have the acquired flag for that storage domain.

2. Through VDSM - Use the cluster lock module so that every lease acquire automatically waits for the host id if it has not been acquired yet. Acquire will wait on an event in the cluster lock, and the event will be set in hasHostId() and in acquireHostId() (see the sketch below).
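A rough sketch of the VDSM-side idea (option 2), assuming a threading.Event that tracks whether the host id has been acquired. This is illustrative only; names such as HostIdGate and wait_for_host_id are hypothetical and not the actual vdsm clusterlock API.

import threading

class HostIdGate(object):
    """Illustrative gate: volume lease acquisition waits until the host id
    for the domain has been acquired, instead of failing immediately with
    AcquireLockFailure."""

    def __init__(self):
        self._has_host_id = threading.Event()

    def host_id_acquired(self):
        # Would be signalled from the code paths that successfully add the
        # host to the sanlock lockspace (acquireHostId / hasHostId).
        self._has_host_id.set()

    def host_id_released(self):
        self._has_host_id.clear()

    def wait_for_host_id(self, timeout):
        # Called at the start of a volume lease acquire; blocks the job
        # until the host id is available or the timeout expires.
        if not self._has_host_id.wait(timeout):
            raise RuntimeError(
                "Timed out waiting for host id acquisition on this domain")

A lease acquire would then call wait_for_host_id() before asking sanlock for the resource, turning the current hard failure into a bounded wait.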
Raz, can you test the proposed solution on the vdsm side? https://gerrit.ovirt.org/77765
*** Bug 1459058 has been marked as a duplicate of this bug. ***
The issue also occurs during live storage migration.

vdsm.log:

2017-06-05 11:54:46,591+0300 INFO (jsonrpc/2) [vdsm.api] START sdm_copy_data(job_id='12fa9df4-9357-42de-9950-a83d3dd02533', source={'img_id': '2cc28ab4-7e16-4106-b8f2-00e49c64a29c', 'sd_id': '28df2ec8-431c-4f90-b18a-58d9d13882d7', 'endpoint_type': 'div', 'vol_id': 'c5a08ac5-20bc-4d44-9f4c-bc6a53d661d1'}, destination={'generation': 0, 'img_id': '2cc28ab4-7e16-4106-b8f2-00e49c64a29c', 'sd_id': '45f6b9b4-6b94-44db-a740-2494f05bffa4', 'endpoint_type': 'div', 'vol_id': 'c5a08ac5-20bc-4d44-9f4c-bc6a53d661d1'}) from=::ffff:10.35.69.13,33844, flow_id=disks_syncAction_ac2ff124-a7a0-47ba (api:46)
2017-06-05 11:54:46,592+0300 INFO (jsonrpc/2) [vdsm.api] FINISH sdm_copy_data return=None from=::ffff:10.35.69.13,33844, flow_id=disks_syncAction_ac2ff124-a7a0-47ba (api:52)
2017-06-05 11:54:46,592+0300 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call SDM.copy_data succeeded in 0.00 seconds (__init__:539)
2017-06-05 11:54:46,593+0300 INFO (tasks/9) [storage.ThreadPool.WorkerThread] START task b31cd91b-f35b-4f15-95b5-5de63808760f (cmd=<bound method Task.commit of <storage.task.Task instance at 0x7f71e4244440>>, args=None) (threadPool:208)
2017-06-05 11:54:46,593+0300 INFO (tasks/9) [root] Running job '12fa9df4-9357-42de-9950-a83d3dd02533'... (jobs:179)
2017-06-05 11:54:46,820+0300 INFO (tasks/9) [storage.LVM] Refreshing lvs: vg=45f6b9b4-6b94-44db-a740-2494f05bffa4 lvs=['leases'] (lvm:1291)
2017-06-05 11:54:46,821+0300 INFO (tasks/9) [storage.LVM] Refreshing LVs (vg=45f6b9b4-6b94-44db-a740-2494f05bffa4, lvs=['leases']) (lvm:1319)
2017-06-05 11:54:46,929+0300 INFO (tasks/9) [storage.SANLock] Acquiring Lease(name='c5a08ac5-20bc-4d44-9f4c-bc6a53d661d1', path='/dev/45f6b9b4-6b94-44db-a740-2494f05bffa4/leases', offset=113246208) for host id 1 (clusterlock:343)
2017-06-05 11:54:46,930+0300 ERROR (tasks/9) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_45f6b9b4-6b94-44db-a740-2494f05bffa4, name=c5a08ac5-20bc-4d44-9f4c-bc6a53d661d1, mode=exclusive at 0x3acaf10> (guarded:96)
2017-06-05 11:54:46,931+0300 ERROR (tasks/9) [root] Job '12fa9df4-9357-42de-9950-a83d3dd02533' failed (jobs:217)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 154, in run
    self._run()
  File "/usr/share/vdsm/storage/sdm/api/copy_data.py", line 64, in _run
    with guarded.context(self._source.locks + self._dest.locks):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
    six.reraise(*exc)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
    lock.acquire()
  File "/usr/share/vdsm/storage/volume.py", line 1418, in acquire
    dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
  File "/usr/share/vdsm/storage/sd.py", line 471, in acquireVolumeLease
    self._domainLock.acquire(hostId, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 362, in acquire
    "Cannot acquire %s" % (lease,), str(e))
AcquireLockFailure: Cannot obtain lock: "id=45f6b9b4-6b94-44db-a740-2494f05bffa4, rc=28, out=Cannot acquire Lease(name='c5a08ac5-20bc-4d44-9f4c-bc6a53d661d1', path='/dev/45f6b9b4-6b94-44db-a740-2494f05bffa4/leases', offset=113246208), err=(28, 'Sanlock resource not acquired', 'No space left on device')"
2017-06-05 11:54:46,932+0300 INFO (tasks/9) [root] Job '12fa9df4-9357-42de-9950-a83d3dd02533' will be deleted in 3600 seconds (jobs:245)

The move disk operation can fail with this error message in the engine:

2017-06-05 11:54:49,428+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler6) [disks_syncAction_ac2ff124-a7a0-47ba] Command CopyData id: 'c9803db7-9802-4a0d-80a3-a22fc42dd19f': execution was completed, the command status is 'FAILED'
2017-06-05 11:54:49,432+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [disks_syncAction_ac2ff124-a7a0-47ba] Command 'CopyImageGroupWithData' (id: 'e774b141-5221-488f-b621-853ca8a7e78d') waiting on child command id: 'd413e773-b740-4570-96ad-d0df0e5f311f' type:'CopyImageGroupVolumesData' to complete
2017-06-05 11:54:50,439+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler9) [disks_syncAction_ac2ff124-a7a0-47ba] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
We are currently testing the following solution in VDSM:
https://gerrit.ovirt.org/#/c/77765/

Once we get feedback on the automation test results, we will know whether to move the bug to POST or not.
Removing the needinfo for now. Please set it again when there's a patch we can test
(In reply to Raz Tamir from comment #26)
> Removing the needinfo for now.
> Please set it again when there's a patch we can test

Still the same patch:
(In reply to Maor from comment #25)
> We are currently testing the following solution in VDSM:
> https://gerrit.ovirt.org/#/c/77765/
>
> Once we will get feedback on the automation tests' result we will know
> whether to move the bug to post or not.

Not sure what removing the needinfo means.
Was this tested and failed? Some other feedback?
(In reply to Allon Mureinik from comment #27)
> (In reply to Raz Tamir from comment #26)
> > Removing the needinfo for now.
> > Please set it again when there's a patch we can test
>
> Still the same patch:
> (In reply to Maor from comment #25)
> > We are currently testing the following solution in VDSM:
> > https://gerrit.ovirt.org/#/c/77765/
> >
> > Once we will get feedback on the automation tests' result we will know
> > whether to move the bug to post or not.
>
> Not sure what removing the needinfo means.
> Was this tested and failed? Some other feedback?

Last time master was broken because of the persistent VM change. We did not see the errors related to this bug, but some errors may have been masked by the persistent VM failures.

The patches are ready now for testing again:
https://gerrit.ovirt.org/77765

(Note: you must check out this patch, not cherry-pick it.)
Execution of the cold move disk flow finished - we are still seeing the same error in the logs.

engine.log:

2017-06-28 14:06:02,177+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CopyVolumeDataVDSCommand] (DefaultQuartzScheduler2) [disks_syncAction_dbf04eaf-c0d0-461b] FINISH, CopyVolumeDataVDSCommand, log id: 315b1146
2017-06-28 14:06:02,187+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler2) [disks_syncAction_dbf04eaf-c0d0-461b] START, GetHostJobsVDSCommand(HostName = host_mixed_1, GetHostJobsVDSCommandParameters:{runAsync='true', hostId='ba644554-9f4b-4bc9-98c3-3a7ea9b9b6ff', type='storage', jobIds='[76e45363-e748-4ef3-acd3-5a0aaf38085f]'}), log id: 1bb99606
2017-06-28 14:06:02,190+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler2) [disks_syncAction_dbf04eaf-c0d0-461b] FINISH, GetHostJobsVDSCommand, return: {76e45363-e748-4ef3-acd3-5a0aaf38085f=HostJobInfo:{id='76e45363-e748-4ef3-acd3-5a0aaf38085f', type='storage', description='copy_data', status='running', progress='null', error='null'}}, log id: 1bb99606
2017-06-28 14:06:02,193+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler2) [disks_syncAction_dbf04eaf-c0d0-461b] Command CopyData id: 'bfc5a129-c8f3-485a-ae5f-8da8c6cbdfb7': waiting for job '76e45363-e748-4ef3-acd3-5a0aaf38085f' on host 'host_mixed_1' (id: 'ba644554-9f4b-4bc9-98c3-3a7ea9b9b6ff') to complete
2017-06-28 14:06:03,198+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'CopyImageGroupWithData' (id: '8a5db46c-2514-4b59-bf55-2e916420d97e') waiting on child command id: '91692eae-33b1-4089-936a-ff62b8eaeb8e' type:'CopyImageGroupVolumesData' to complete
2017-06-28 14:06:04,201+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler9) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'CopyImageGroupVolumesData' (id: '91692eae-33b1-4089-936a-ff62b8eaeb8e') waiting on child command id: 'bfc5a129-c8f3-485a-ae5f-8da8c6cbdfb7' type:'CopyData' to complete
2017-06-28 14:06:04,205+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler9) [disks_syncAction_dbf04eaf-c0d0-461b] START, GetHostJobsVDSCommand(HostName = host_mixed_1, GetHostJobsVDSCommandParameters:{runAsync='true', hostId='ba644554-9f4b-4bc9-98c3-3a7ea9b9b6ff', type='storage', jobIds='[76e45363-e748-4ef3-acd3-5a0aaf38085f]'}), log id: dcbc7f0
2017-06-28 14:06:04,208+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (DefaultQuartzScheduler9) [disks_syncAction_dbf04eaf-c0d0-461b] FINISH, GetHostJobsVDSCommand, return: {76e45363-e748-4ef3-acd3-5a0aaf38085f=HostJobInfo:{id='76e45363-e748-4ef3-acd3-5a0aaf38085f', type='storage', description='copy_data', status='failed', progress='null', error='VDSError:{code='AcquireLockFailure', message='Cannot obtain lock: "id=34cacfb6-fc56-4e66-8db4-7dcdb387c310, rc=28, out=Cannot acquire Lease(name='57562b48-e0dc-4373-b974-e87fbfea8889', path='/dev/34cacfb6-fc56-4e66-8db4-7dcdb387c310/leases', offset=109051904), err=(28, 'Sanlock resource not acquired', 'No space left on device')"'}'}}, log id: dcbc7f0
2017-06-28 14:06:04,210+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler9) [disks_syncAction_dbf04eaf-c0d0-461b] Command CopyData id: 'bfc5a129-c8f3-485a-ae5f-8da8c6cbdfb7': job '76e45363-e748-4ef3-acd3-5a0aaf38085f' execution was completed with VDSM job status 'failed'
2017-06-28 14:06:04,213+03 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (DefaultQuartzScheduler9) [disks_syncAction_dbf04eaf-c0d0-461b] Command CopyData id: 'bfc5a129-c8f3-485a-ae5f-8da8c6cbdfb7': execution was completed, the command status is 'FAILED'
2017-06-28 14:06:05,219+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (DefaultQuartzScheduler7) [disks_syncAction_dbf04eaf-c0d0-461b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
2017-06-28 14:06:06,229+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler10) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'CopyImageGroupVolumesData' id: '91692eae-33b1-4089-936a-ff62b8eaeb8e' child commands '[bfc5a129-c8f3-485a-ae5f-8da8c6cbdfb7]' executions were completed, status 'FAILED'
2017-06-28 14:06:07,236+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (DefaultQuartzScheduler8) [disks_syncAction_dbf04eaf-c0d0-461b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand' with failure.
2017-06-28 14:06:07,244+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (DefaultQuartzScheduler8) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'CopyImageGroupWithData' id: '8a5db46c-2514-4b59-bf55-2e916420d97e' child commands '[891f957d-ac2d-4694-8155-b93c99158cf1, 91692eae-33b1-4089-936a-ff62b8eaeb8e]' executions were completed, status 'FAILED'
2017-06-28 14:06:08,260+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (DefaultQuartzScheduler10) [disks_syncAction_dbf04eaf-c0d0-461b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand' with failure.
2017-06-28 14:06:09,287+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler1) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'MoveImageGroup' id: '548fc410-fe4b-4cd3-a15f-e684bf402532' child commands '[8a5db46c-2514-4b59-bf55-2e916420d97e]' executions were completed, status 'FAILED'
2017-06-28 14:06:09,287+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler1) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'MoveImageGroup' id: '548fc410-fe4b-4cd3-a15f-e684bf402532' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2017-06-28 14:06:10,301+03 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler4) [disks_syncAction_dbf04eaf-c0d0-461b] Command 'MoveOrCopyDisk' id: '06a9e571-c636-41bc-bebd-74e6c9e4efbe' child commands '[548fc410-fe4b-4cd3-a15f-e684bf402532]' executions were completed, status 'FAILED'
2017-06-28 14:06:11,037+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (DefaultQuartzScheduler1) [7e93839a] Setting new tasks map. The map contains now 2 tasks
2017-06-28 14:06:11,314+03 ERROR [org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand] (DefaultQuartzScheduler8) [disks_syncAction_dbf04eaf-c0d0-461b] Ending command 'org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand' with failure.
2017-06-28 14:06:11,321+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.MoveImageGroupCommand] (DefaultQuartzScheduler8) [disks_syncAction_dbf04eaf-c0d0-461b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.MoveImageGroupCommand' with failure.

vdsm.log:

/dev/34cacfb6-fc56-4e66-8db4-7dcdb387c310/leases', offset=109051904) for host id 1 (clusterlock:342)
2017-06-28 14:06:02,605+0300 ERROR (tasks/4) [storage.guarded] Error acquiring lock <VolumeLease ns=04_lease_34cacfb6-fc56-4e66-8db4-7dcdb387c310, name=57562b48-e0dc-4373-b974-e87fbfea8889, mode=exclusive at 0x22fef50> (guarded:96)
2017-06-28 14:06:02,606+0300 ERROR (tasks/4) [root] Job u'76e45363-e748-4ef3-acd3-5a0aaf38085f' failed (jobs:221)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 157, in run
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdm/api/copy_data.py", line 61, in _run
    with guarded.context(self._source.locks + self._dest.locks):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 102, in __enter__
    six.reraise(*exc)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/guarded.py", line 93, in __enter__
    lock.acquire()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1519, in acquire
    dom.acquireVolumeLease(self._host_id, self._img_id, self._vol_id)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 478, in acquireVolumeLease
    self._domainLock.acquire(hostId, lease)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 361, in acquire
    "Cannot acquire %s" % (lease,), str(e))
AcquireLockFailure: Cannot obtain lock: "id=34cacfb6-fc56-4e66-8db4-7dcdb387c310, rc=28, out=Cannot acquire Lease(name='57562b48-e0dc-4373-b974-e87fbfea8889', path='/dev/34cacfb6-fc56-4e66-8db4-7dcdb387c310/leases', offset=109051904), err=(28, 'Sanlock resource not acquired', 'No space left on device')"
Created attachment 1292653 [details] logs with patch
Raz, can you share the output of rpm -q vdsm
The environment was already reprovisioned, so I don't have this info anymore.
The line numbers in the traceback do not match the code. It looks like this test was using only the first patch, or master. Please make sure you run version d0e95ceef2.
(In reply to Nir Soffer from comment #31)
> Raz, can you share the output of
> rpm -q vdsm

Found it - vdsm-4.20.1-77.git6045110.el7.x86_64

Is That the right version?
(In reply to Raz Tamir from comment #34)
> (In reply to Nir Soffer from comment #31)
> > Raz, can you share the output of
> > rpm -q vdsm
>
> Found it - vdsm-4.20.1-77.git6045110.el7.x86_64
>
> Is That the right version?

NO, this is a patch from master that does not include the fix:
https://gerrit.ovirt.org/78386

The patch we need to test is:
https://gerrit.ovirt.org/77765

You must check out this patch, not cherry-pick it; you need the 4 patches in this series.

The version should be:
vdsm-4.20.1-91.gitd0e95ce.el7.x86_64
I'm removing the test infrastructure patch from the bug, since the test infrastructure has changed too much since 4.1 and we cannot backport the tests.
Nir, can you please add the URL of the bug that blocks this one from being merged downstream?
We want to resolve bug 1469175 first.
Raising severity: in our last tier 2 run, 80% of the failures were caused by this bug.
I think that marking it as urgent is pointless since the bug is blocked on BZ1469175 so there is not much we can do other than check BZ1469175 from time to time.
(In reply to Maor from comment #41)
> I think that marking it as urgent is pointless since the bug is blocked on
> BZ1469175 so there is not much we can do other than check BZ1469175 from
> time to time.

Bug 1469175 is marked as MODIFIED, so we can go ahead and fix this one now.
(In reply to Allon Mureinik from comment #42)
> (In reply to Maor from comment #41)
> > I think that marking it as urgent is pointless since the bug is blocked on
> > BZ1469175 so there is not much we can do other than check BZ1469175 from
> > time to time.
>
> Bug 1469175 is marked as MODIFIED, so we can go ahead and fix this one now.

Rebased the backport patches, please review:
https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:ovirt-4.1+topic:wait-for-host-id
No trace of this bug in our automation - tier 1/2/3 on latest build ovirt-engine-4.1.5.2-0.1.el7.noarch