Created attachment 1676374 [details]
engine and vdsm logs

Description of problem:
Creating a VM from a template on an iSCSI SD fails in blockVolume.py, line 151 (validate) with VolumeDoesNotExist. The issue reproduces manually via the web admin UI and via automation.

VDSM log:
2020-04-05 13:17:47,348+0300 ERROR (tasks/8) [storage.TaskManager.Task] (Task='6ff359ed-e782-4d33-9dc9-1e4aa9289cb7') Unexpected error (task:880)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 149, in validate
    lvm.getLV(self.sdUUID, self.volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1249, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
vdsm.storage.exception.LogicalVolumeDoesNotExistError: Logical volume does not exist: ('b838bbe0-de83-47da-9b33-48facb0a6781/10563300-077f-49e1-96ca-788e7ab005bc',)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 350, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 1900, in deleteVolume
    vol = dom.produceVolume(imgUUID, volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 969, in produceVolume
    volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 812, in __init__
    self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 62, in __init__
    volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 84, in __init__
    self.validate()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 151, in validate
    raise se.VolumeDoesNotExist(self.volUUID)
vdsm.storage.exception.VolumeDoesNotExist: Volume does not exist: ('10563300-077f-49e1-96ca-788e7ab005bc',)

Engine log:
2020-04-05 13:17:47,079+03 INFO [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-56) [] Lock freed to object 'EngineLock:{exclusiveLocks='[VM_from_template_iscsi2=VM_NAME]', sharedLocks='[410ba0c4-a364-47dc-b533-dc9c241825a4=DISK, 74ee5bf9-3366-4154-9833-2e9fa9ab85d9=TEMPLATE]'}'
2020-04-05 13:17:47,090+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-56) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Failed to complete VM VM_from_template_iscsi2 creation.
2020-04-05 13:17:53,701+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39) [] Polling and updating Async Tasks: 1 tasks, 1 tasks to poll now
2020-04-05 13:17:53,708+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2020-04-05 13:17:53,709+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39) [] SPMAsyncTask::PollTask: Polling task '6ff359ed-e782-4d33-9dc9-1e4aa9289cb7' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2020-04-05 13:17:53,709+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39) [] BaseAsyncTask::logEndTaskFailure: Task '6ff359ed-e782-4d33-9dc9-1e4aa9289cb7' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('10563300-077f-49e1-96ca-788e7ab005bc',) abortedcode=201, code = 201',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('10563300-077f-49e1-96ca-788e7ab005bc',) abortedcode=201, code = 201'
2020-04-05 13:17:53,709+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-39) [] SPMAsyncTask::ClearAsyncTask: Attempting to clear task '6ff359ed-e782-4d33-9dc9-1e4aa9289cb7'

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.0-0.31.master.el8ev.noarch
vdsm-4.40.11-1.el8ev.x86_64
libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
qemu-img-4.2.0-17.module+el8.2.0+6131+4e715f3b.x86_64

How reproducible:
100% (on one iSCSI SD, 'iscsi_2')

Steps to Reproduce:
1. Import a template from glance and populate/copy it to the iSCSI SD.
2. Create a VM from this template (named 'latest-rhel-guest-image-8.2-infra'); see the SDK sketch below.

Actual results:
Creating a VM from this template fails.

Expected results:
The VM is created successfully from the template.

Additional info:
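For reference, a minimal reproduction sketch using the oVirt Python SDK (ovirtsdk4). The engine URL, credentials and cluster name are placeholders for this environment; only the VM and template names are taken from the report above:

# Reproduction sketch - create a VM from the template copied to the iSCSI SD.
# Placeholders: engine URL, credentials, cluster name.
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder engine FQDN
    username='admin@internal',
    password='password',                                # placeholder
    insecure=True,
)

vms_service = connection.system_service().vms_service()

# Step 2 from the reproduction: create the VM from the imported template.
vm = vms_service.add(
    types.Vm(
        name='VM_from_template_iscsi2',
        cluster=types.Cluster(name='my_cluster'),  # placeholder cluster name
        template=types.Template(name='latest-rhel-guest-image-8.2-infra'),
    ),
)

print('Created VM:', vm.id)
connection.close()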
Facing the same issue in 3 other automation test cases - same execution (tier1), same engine and VDSM logs. In all 3 test cases, the VM creation failed while trying to create the disk on the iSCSI domain, also from a template.

Looks like the last operation before the VDSM exception was:
2020-04-04 20:40:35,027+0300 DEBUG (tasks/3) [storage.LVM] lvs reloaded (lvm:675)

Attaching logs from one of them as they are all the same.
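A small diagnostic sketch (my addition, not from the attached logs): it checks directly on the SPM host whether the LV that vdsm reports as missing actually exists in the storage domain VG. The VG/LV names come from the traceback above, and the standard lvs binary is assumed to be on the PATH:

# Check whether the "missing" LV is really absent from the storage domain VG.
import subprocess

SD_UUID = 'b838bbe0-de83-47da-9b33-48facb0a6781'   # storage domain VG (from the traceback)
LV_UUID = '10563300-077f-49e1-96ca-788e7ab005bc'   # volume reported as missing

result = subprocess.run(
    ['lvs', '--noheadings', '-o', 'lv_name,lv_tags', '%s/%s' % (SD_UUID, LV_UUID)],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

if result.returncode == 0:
    print('LV exists:', result.stdout.strip())
else:
    # This is the condition vdsm's lvm.getLV() turns into LogicalVolumeDoesNotExistError.
    print('LV not found:', result.stderr.strip())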
Created attachment 1676375 [details] TestCase5062
Benny, can you please have a look?
low reproducibility -> lowering severity, removing Regression keyword
I have seen the same issue, with the same errors below, in the "copy_disk" test case (tier1). The same errors also appear in the runs on the gluster and NFS SDs.

From engine.log:
2020-04-04 14:53:32,943+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2020-04-04 14:53:32,944+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2020-04-04 14:53:32,944+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] SPMAsyncTask::PollTask: Polling task 'b1c80197-7f8e-4b34-b3cf-9a1477ee2ad8' (Parent Command 'CreateSnapshotFromTemplate', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2020-04-04 14:53:32,948+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] BaseAsyncTask::logEndTaskFailure: Task 'b1c80197-7f8e-4b34-b3cf-9a1477ee2ad8' (Parent Command 'CreateSnapshotFromTemplate', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('10563300-077f-49e1-96ca-788e7ab005bc',) abortedcode=201, code = 201',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('10563300-077f-49e1-96ca-788e7ab005bc',) abortedcode=201, code = 201'
2020-04-04 14:53:32,949+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '4efbc57c-5f72-4086-8469-8dbe6f53eccb' has ended -> executing 'endAction'
2020-04-04 14:53:32,949+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '4efbc57c-5f72-4086-8469-8dbe6f53eccb'): calling endAction '.
2020-04-04 14:53:32,950+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] SPMAsyncTask::PollTask: Polling task 'd7f25ee3-8532-4145-9183-8da04624c98c' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2020-04-04 14:53:32,950+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-100) [] BaseAsyncTask::logEndTaskFailure: Task 'd7f25ee3-8532-4145-9183-8da04624c98c' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('a0a2c25e-24a7-4422-800d-a4586daf9fe9',) abortedcode=201, code = 201',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Volume does not exist: ('a0a2c25e-24a7-4422-800d-a4586daf9fe9',) abortedcode=201, code = 201'
2020-04-04 14:53:37,762+03 ERROR [org.ovirt.engine.core.bll.AddVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [vms_create_0bea2d58-e9d1-4eee] Ending command 'org.ovirt.engine.core.bll.AddVmCommand' with failure.
2020-04-04 14:53:37,766+03 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotFromTemplateCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [vms_create_0bea2d58-e9d1-4eee] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotFromTemplateCommand' with failure.
2020-04-04 14:53:37,775+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] Running command: RemoveImageCommand internal: true. Entities affected : ID: b838bbe0-de83-47da-9b33-48facb0a6781 Type: Storage
2020-04-04 14:53:37,816+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] START, DeleteImageGroupVDSCommand( DeleteImageGroupVDSCommandParameters:{storagePoolId='24718638-81b8-4442-ac3a-048076b87b58', ignoreFailoverLimit='false', storageDomainId='b838bbe0-de83-47da-9b33-48facb0a6781', imageGroupId='90d8466e-51c4-49a3-a054-c4b594a3395b', postZeros='false', discard='false', forceDelete='false'}), log id: 234a2b16
2020-04-04 14:53:37,974+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] EVENT_ID: IRS_BROKER_COMMAND_FAILURE(10,803), VDSM command DeleteImageGroupVDS failed: Image does not exist in domain: 'image=90d8466e-51c4-49a3-a054-c4b594a3395b, domain=b838bbe0-de83-47da-9b33-48facb0a6781'
2020-04-04 14:53:37,974+03 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] Command 'DeleteImageGroupVDSCommand( DeleteImageGroupVDSCommandParameters:{storagePoolId='24718638-81b8-4442-ac3a-048076b87b58', ignoreFailoverLimit='false', storageDomainId='b838bbe0-de83-47da-9b33-48facb0a6781', imageGroupId='90d8466e-51c4-49a3-a054-c4b594a3395b', postZeros='false', discard='false', forceDelete='false'})' execution failed: IRSGenericException: IRSErrorException: Image does not exist in domain: 'image=90d8466e-51c4-49a3-a054-c4b594a3395b, domain=b838bbe0-de83-47da-9b33-48facb0a6781'
2020-04-04 14:53:37,974+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.DeleteImageGroupVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] FINISH, DeleteImageGroupVDSCommand, return: , log id: 234a2b16
2020-04-04 14:53:37,974+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] Disk '90d8466e-51c4-49a3-a054-c4b594a3395b' doesn't exist on storage domain 'b838bbe0-de83-47da-9b33-48facb0a6781', rolling forward
2020-04-04 14:53:37,984+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [2e493205] Removed task 'efc89b15-a313-4a93-83e2-9c8d046d6c1a' from DataBase
2020-04-04 14:53:38,169+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Failed to complete VM copy_disk_vm_iscsi creation.

From vdsm.log:
2020-04-04 14:53:38,260+0300 DEBUG (tasks/9) [storage.LVM] lvs reloaded (lvm:675)
2020-04-04 14:53:38,261+0300 DEBUG (tasks/9) [storage.ResourcesFactories] Image 90d8466e-51c4-49a3-a054-c4b594a3395b does not exist in domain b838bbe0-de83-47da-9b33-48facb0a6781 (resourceFactories:125)
2020-04-04 14:53:38,261+0300 DEBUG (tasks/9) [storage.ResourceManager] Resource '01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b' is free. Now locking as 'exclusive' (1 active user) (resourceManager:533)
2020-04-04 14:53:38,261+0300 DEBUG (tasks/9) [storage.ResourceManager.Request] (ResName='01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b', ReqID='ecef33f4-08d7-4851-ae9e-7bb4bf6bef52') Granted request (resourceManager:223)
2020-04-04 14:53:38,290+0300 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = b'1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB, 1000 KiB) copied, 0.0108459 s, 94.4 MB/s\n'; <rc> = 0 (commands:224)
2020-04-04 14:53:38,398+0300 DEBUG (tasks/9) [storage.LVM] lvs reloaded (lvm:675)
2020-04-04 14:53:38,398+0300 DEBUG (tasks/9) [storage.ResourceManager] Trying to release resource '01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b' (resourceManager:548)
2020-04-04 14:53:38,398+0300 DEBUG (tasks/9) [storage.ResourceManager] Released resource '01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b' (0 active users) (resourceManager:566)
2020-04-04 14:53:38,398+0300 DEBUG (tasks/9) [storage.ResourceManager] Resource '01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b' is free, finding out if anyone is waiting for it. (resourceManager:572)
2020-04-04 14:53:38,398+0300 DEBUG (tasks/9) [storage.ResourceManager] No one is waiting for resource '01_img_b838bbe0-de83-47da-9b33-48facb0a6781.90d8466e-51c4-49a3-a054-c4b594a3395b', Clearing records. (resourceManager:580)
2020-04-04 14:53:38,398+0300 ERROR (tasks/9) [storage.TaskManager.Task] (Task='3d4fd75d-5f4d-418a-9126-98064b60797f') Unexpected error (task:880)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 149, in validate
    lvm.getLV(self.sdUUID, self.volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1249, in getLV
    raise se.LogicalVolumeDoesNotExistError("%s/%s" % (vgName, lvName))
vdsm.storage.exception.LogicalVolumeDoesNotExistError: Logical volume does not exist: ('b838bbe0-de83-47da-9b33-48facb0a6781/18cf9fad-dfa1-4d00-8045-a79edbe313fb',)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 350, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sp.py", line 1900, in deleteVolume
    vol = dom.produceVolume(imgUUID, volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 969, in produceVolume
    volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 812, in __init__
    self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 62, in __init__
    volUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 84, in __init__
    self.validate()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockVolume.py", line 151, in validate
    raise se.VolumeDoesNotExist(self.volUUID)
vdsm.storage.exception.VolumeDoesNotExist: Volume does not exist: ('18cf9fad-dfa1-4d00-8045-a79edbe313fb',)

Logs attached.
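To make the failure mode easier to follow, here is a simplified reconstruction of the call chain from the traceback above. This is not the actual vdsm code - the stubs only illustrate why the deleteVolume cleanup path ends in VolumeDoesNotExist once the LV is missing from the VG:

# Simplified reconstruction of the failing cleanup path (illustration only).

class LogicalVolumeDoesNotExistError(Exception):
    pass

class VolumeDoesNotExist(Exception):
    pass

def get_lv(vg_name, lv_name):
    # Stand-in for vdsm.storage.lvm.getLV(): in this bug the LV is not in the VG,
    # so the lookup always fails.
    raise LogicalVolumeDoesNotExistError('%s/%s' % (vg_name, lv_name))

def validate(sd_uuid, vol_uuid):
    # Stand-in for BlockVolume.validate(): an LVM lookup failure is translated
    # into the storage-level VolumeDoesNotExist error the engine reports.
    try:
        get_lv(sd_uuid, vol_uuid)
    except LogicalVolumeDoesNotExistError:
        raise VolumeDoesNotExist(vol_uuid)

def delete_volume(sd_uuid, img_uuid, vol_uuid):
    # Stand-in for the SPM cleanup path: the volume must be produced (and
    # validated) before deletion, so a missing LV aborts the whole cleanup task.
    validate(sd_uuid, vol_uuid)
    print('would delete', img_uuid, vol_uuid)

try:
    delete_volume('b838bbe0-de83-47da-9b33-48facb0a6781',
                  '90d8466e-51c4-49a3-a054-c4b594a3395b',
                  '18cf9fad-dfa1-4d00-8045-a79edbe313fb')
except VolumeDoesNotExist as e:
    print('Volume does not exist:', e)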
Created attachment 1679408 [details] TestCaseCopyAttachedDisk_ test_same_domain_same_alias[nfs]
Created attachment 1679409 [details] TestCaseCopyAttachedDisk_ test_same_domain_same_alias[glusterfs]
Created attachment 1679411 [details] TestCaseCopyAttachedDisk_ test_same_domain_same_alias[iscsi]
Still occurring in the latest engine build?
(In reply to Benny Zlotnik from comment #9)
> Still occurring in the latest engine build?

The issue was not seen again on several Tier1 runs on rhv-4.4.0-31/ovirt-engine 4.4.0-33.

Could this have been fixed by other iSCSI fixes that were merged in rhv-4.4.0-31/ovirt-engine 4.4.0-33?
(In reply to Avihai from comment #10)
> (In reply to Benny Zlotnik from comment #9)
> > Still occurring in the latest engine build?
>
> The issue was not seen again on several Tier1 runs on
> rhv-4.4.0-31/ovirt-engine 4.4.0-33.
>
> Could this have been fixed by other iSCSI fixes that were merged in
> rhv-4.4.0-31/ovirt-engine 4.4.0-33?

I believe so. Personally I've never seen this, but it looks like something that could have been caused by the template copying issues which I hope are fixed now as part of:
https://bugzilla.redhat.com/show_bug.cgi?id=1820182
https://bugzilla.redhat.com/show_bug.cgi?id=1819125
Verified on engine-4.4.0-0.33.master.el8ev and vdsm-4.40.13-1.el8ev.x86_64 with the following steps:
1. Import a template from glance and populate/copy it to the iSCSI SD.
2. Create a VM from this template -> the VM was created successfully.
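For completeness, a verification sketch in the same spirit as the reproduction sketch above (oVirt Python SDK; the engine URL and credentials are placeholders and the VM name is the one from the original report): after creating the VM from the template, poll until it leaves the image-locked state, i.e. until the disk copy onto the iSCSI SD has finished:

# Wait until the new VM's disk copy from the template has completed.
import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url='https://engine.example.com/ovirt-engine/api',  # placeholder
    username='admin@internal',
    password='password',                                # placeholder
    insecure=True,
)

vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=VM_from_template_iscsi2')[0]
vm_service = vms_service.vm_service(vm.id)

# The VM stays IMAGE_LOCKED while its disk is being copied from the template.
while vm_service.get().status == types.VmStatus.IMAGE_LOCKED:
    time.sleep(5)

print('Final VM status:', vm_service.get().status)
connection.close()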
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.