Created attachment 1491972 [details]
logs

Description of problem:
Volume creation failed with the following errors when creating a live snapshot after live storage migration:

Engine:
2018-09-20 05:38:08,956+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_2 command HSMGetAllTasksStatusesVDS failed: Error creating a new volume: ('Volume creation 982d8647-c9b2-41e8-8274-1fc7afcc0ccc failed: Failed reload: 982d8647-c9b2-41e8-8274-1fc7afcc0ccc',)
2018-09-20 05:38:08,963+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] BaseAsyncTask::logEndTaskFailure: Task '24ed00c0-2af2-45e5-9c23-ff27060f5ee7' (Parent Command 'AddImageFromScratch', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Error creating a new volume: ('Volume creation 982d8647-c9b2-41e8-8274-1fc7afcc0ccc failed: Failed reload: 982d8647-c9b2-41e8-8274-1fc7afcc0ccc',), code = 205',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Error creating a new volume: ('Volume creation 982d8647-c9b2-41e8-8274-1fc7afcc0ccc failed: Failed reload: 982d8647-c9b2-41e8-8274-1fc7afcc0ccc',), code = 205'
2018-09-20 05:38:22,163+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [snapshots_create_2b65d9fe-3de7-4584] Command 'CreateSnapshotForVm' id: '19100aa8-67bd-4ce9-aba5-ddab3393eee2' child commands '[769ff171-b4b0-46f9-9618-e2d6ac46054d, 8f6404d3-1bad-4711-917d-c3a8dec675f4, 1a549d3b-ee7f-4871-aee3-50ad6113b993]' executions were completed, status 'FAILED'
2018-09-20 05:38:23,200+03 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [snapshots_create_2b65d9fe-3de7-4584] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand' with failure.

VDSM:
2018-09-20 05:37:58,652+0300 ERROR (tasks/9) [storage.Volume] Unexpected error (volume:1249)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1218, in create
    apparentSize = int(dom.getVSize(imgUUID, volUUID) /
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 717, in getVSize
    return self._manifest.getVSize(imgUUID, volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 453, in getVSize
    size = lvm.getLV(self.sdUUID, volUUID).size
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 82, in __getattr__
    raise AttributeError("Failed reload: %s" % self.name)
AttributeError: Failed reload: 982d8647-c9b2-41e8-8274-1fc7afcc0ccc

2018-09-20 05:38:23,254+0300 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='fd02f906-4c21-4f61-acc3-e7756ab9f51f') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in revertTask
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2305, in revertTask
    return self.taskMng.revertTask(taskID=taskID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/taskManager.py", line 165, in revertTask
    t = self.__getTask(taskID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/taskManager.py", line 85, in __getTask
    raise se.UnknownTask(taskID)
UnknownTask: Task id unknown: ('eebf168e-4a6e-4b58-9871-4a16b06a4b2e',)

Version-Release number of selected component (if applicable):
vdsm-4.20.40-1.el7ev.x86_64
ovirt-engine-4.2.7-0.1.el7ev.noarch

How reproducible:
Many times, at least 9 test cases.

Steps to Reproduce:
1. Create and run a VM
2. Migrate the VM's disk to another SD (of the same type)
3. Create a live snapshot

Actual results:
Snapshot creation fails.

Expected results:
Snapshot creation should succeed.

Additional info:
Relevant logs are attached.
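For context on the first traceback above: vdsm/storage/lvm.py line 82 raises AttributeError("Failed reload: ...") from __getattr__ when an LV could not be reloaded into the LVM cache, so any later attribute access (here .size in getVSize) blows up. A minimal self-contained sketch of that pattern; the class and function names here are illustrative stand-ins, not VDSM's actual API:

```python
class UnreadableLV(object):
    """Stub standing in for an LV that could not be reloaded.

    Mirrors the pattern in vdsm/storage/lvm.py: any attribute access
    (e.g. .size) raises AttributeError("Failed reload: <name>").
    """

    def __init__(self, name):
        # Normal assignment goes through __setattr__, so __getattr__
        # is never triggered for 'name' itself.
        self.name = name

    def __getattr__(self, attrName):
        # Fires only for attributes that were never loaded.
        raise AttributeError("Failed reload: %s" % self.name)


def get_vsize(lv):
    # Equivalent of the blockSD.getVSize step in the traceback:
    # touching .size on an unreadable LV surfaces the error.
    return lv.size
```

This explains why the error appears at volume-creation time even though the LV itself was created: the failure happens when the freshly created LV is read back from the cache.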
Happened also to me in the same Tier3 run on an additional 7 TC's. These TC's also failed during disk/volume creation:
1) TestCase6048.test_consecutive_snapshot_deletion_of_snapshots[iscsi]
2) TestCase6050.test_snapshot_merge_during_snapshot_merge[iscsi]
3) TestCase6062.test_live_merge_during_lsm[iscsi]
4) TestCase12215.test_snapshot_deletion_of_all_snapshots[iscsi]
5) TestCase18912.test_snapshot_deletion_of_all_snapshots[iscsi]

These 2 TC's failed during volume deletion during snapshot merge/preview:
6) TestCase18904.test_consecutive_snapshot_deletion_of_snapshots[iscsi]
7) TestCase18899.test_basic_snapshot_deletion[iscsi]

Logs are attached for:
1) the volume creation issue I see in TestCase6048
2) the volume deletion issue during preview in TestCase18904
3) the volume deletion issue during cold merge in TestCase18899

In all the tests, including the original description from Evelina's comment, it starts with this engine error:

2018-10-04 19:32:52,722+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2018-10-04 19:32:52,727+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_2 command HSMGetAllTasksStatusesVDS failed: Error creating a new volume: (u'Volume creation 6db2da3a-bcc0-4496-ad51-5c17cdb31a1d failed: Failed reload: 6db2da3a-bcc0-4496-ad51-5c17cdb31a1d',)

And the VDSM logs all have "AttributeError: Failed reload:" issues, either when creating a new volume (code = 205) or when deleting a volume/DestroyImage (code = 100).
Created attachment 1491996 [details]
create/delete volume additional TC's logs

Attached are engine and VDSM logs. There are 3 directories for the 2 types of issues:
1) for the create volume issue -> directory "create_volume"
2) for the destroy/delete volume issue, two directories for 2 TC's: "preview_TestCase18899" and "delete_snapshot_TestCase18904"
Checked again in the same Tier3 run on the new build ovirt-engine-4.2.7.3-0.1.el7ev.noarch, and the issue does not reproduce. All the failed TC's passed on this run [1], and the ERROR seen here was not seen in the engine logs.

[1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier3/88/
Created attachment 1503279 [details] 4.2.7 Logs (2018-11-07)
The bug reproduces in Tier1 (TestCase6004, test_storage_migration) on 4.2.7_4.3:

From vdsm.log - hypervisor-lynx22_vdsm.log:

2018-11-06 21:17:14,389+0200 INFO (tasks/1) [storage.LVM] Deactivating lvs: vg=bbb3d1af-bae6-45cb-a6fc-956408328a61 lvs=[u'bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307'] (lvm:1310)
2018-11-06 21:17:14,965+0200 WARN (tasks/1) [storage.LVM] bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307 can't be reloaded, please check your storage connections. (lvm:81)
2018-11-06 21:17:14,965+0200 ERROR (tasks/1) [storage.Volume] Unexpected error (volume:1249)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1218, in create
    apparentSize = int(dom.getVSize(imgUUID, volUUID) /
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 717, in getVSize
    return self._manifest.getVSize(imgUUID, volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 453, in getVSize
    size = lvm.getLV(self.sdUUID, volUUID).size
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 82, in __getattr__
    raise AttributeError("Failed reload: %s" % self.name)
AttributeError: Failed reload: bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307

2018-11-06 21:17:14,967+0200 ERROR (tasks/1) [storage.TaskManager.Task] (Task='4285558a-9305-411f-adcf-9ba322a14a2f') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1943, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 820, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1251, in create
    (volUUID, e))
VolumeCreationError: Error creating a new volume: (u'Volume creation bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307 failed: Failed reload: bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307',)

2018-11-06 21:17:14,967+0200 DEBUG (tasks/1) [storage.TaskManager.Task] (Task='4285558a-9305-411f-adcf-9ba322a14a2f') aborting: Task is aborted: "Error creating a new volume: (u'Volume creation bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307 failed: Failed reload: bc5c0b5e-a94c-4a5f-a6ba-00d5420ee307',)" - code 205 (task:928)
It looks like the LVM metadata is corrupted:

2018-11-06 21:17:14,954+0200 DEBUG (tasks/1) [storage.Misc.excCmd] FAILED: <err> = ' Metadata on /dev/mapper/3600a09803830447a4f244c4657595035 at 857600 has wrong VG name "}\n\n3ca717a4-01cb-4502-84de-c8adbb69ff52 {\nid = "rc5PFT-7AMW-e5LU-rBvo-HIRk-2LOu-LEoFEV"\nstatus = ["READ", "WRITE", "VISIBLE"]\nfl" expected bbb3d1af-bae6-45cb-a6fc-956408328a61.\n Metadata on /dev/mapper/3600a09803830447a4f244c4657595035 at 857600 has wrong VG name "}\n\n3ca717a4-01cb-4502-84de-c8adbb69ff52 {\nid = "rc5PFT-7AMW-e5LU-rBvo-HIRk-2LOu-LEoFEV"\nstatus = ["READ", "WRITE", "VISIBLE"]\nfl" expected bbb3d1af-bae6-45cb-a6fc-956408328a61.\n Not repairing metadata for VG bbb3d1af-bae6-45cb-a6fc-956408328a61.\n Recovery of volume group "bbb3d1af-bae6-45cb-a6fc-956408328a61" failed.\n Cannot process volume group bbb3d1af-bae6-45cb-a6fc-956408328a61\n'; <rc> = 5 (commands:86)

Could be a similar issue as:
https://bugzilla.redhat.com/show_bug.cgi?id=1553133

Nir, do you think that the following bug is also related? (Separate lvm activation from other lvm commands)
https://bugzilla.redhat.com/show_bug.cgi?id=1639360
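To illustrate the mismatch LVM is reporting above: the metadata area on the PV starts with a VG name that differs from the VG the PV is expected to belong to, so LVM refuses to process and repair the VG. A toy checker along those lines, purely as an illustration of the mismatch and not LVM's real metadata parser:

```python
import re


def check_vg_name(metadata_text, expected_vg):
    """Toy check for the mismatch LVM reports: in an LVM metadata
    area, the first identifier before '{' is the VG name. If it
    differs from the VG this PV is expected to belong to, LVM
    refuses to process the VG. Illustrative only, not LVM's parser.

    Returns (matches, found_vg_name).
    """
    # VG names/UUIDs are word characters and hyphens, followed by '{'.
    m = re.search(r"([\w-]+)\s*\{", metadata_text)
    if m is None:
        return (False, None)
    found = m.group(1)
    return (found == expected_vg, found)
```

Run against the snippet from the log, this flags that the metadata names VG 3ca717a4-01cb-4502-84de-c8adbb69ff52 while bbb3d1af-bae6-45cb-a6fc-956408328a61 was expected, which is exactly the "wrong VG name ... expected ..." complaint.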
Raising severity as we encounter this in snapshot related test scenarios in our automation. Do we have any new information following comment #6?
(In reply to Fred Rolland from comment #6)
> It looks like the LVM metadata is corrupted:
> 
> 2018-11-06 21:17:14,954+0200 DEBUG (tasks/1) [storage.Misc.excCmd] FAILED:
> <err> = ' Metadata on /dev/mapper/3600a09803830447a4f244c4657595035 at
> 857600 has wrong VG name "}\n\n3ca717a4-01cb-4502-84de-c8adbb69ff52 {\nid =
> "rc5PFT-7AMW-e5LU-rBvo-HIRk-2LOu-LEoFEV"\nstatus = ["READ", "WRITE",
> "VISIBLE"]\nfl" expected bbb3d1af-bae6-45cb-a6fc-956408328a61.\n ...
> Could be a similar issue as:
> https://bugzilla.redhat.com/show_bug.cgi?id=1553133

Yes, this looks like the same issue.

> Nir, do you think that the following bug is also related? (Separate lvm
> activation from other lvm commands)
> https://bugzilla.redhat.com/show_bug.cgi?id=1639360

I don't think it is related.

For bug 1553133 we could reproduce only once, and then we failed to reproduce several times.

What is the minimal test code reproducing this? Do you have a single test running in a clean environment that reproduces this issue?
(In reply to Elad from comment #7)
> Raising severity as we encounter this in snapshot related test scenarios in
> our automation.
> Do we have any new information following comment #6?

Not at the moment. Does the failure occur on the same storage machine (NetApp?) in all the failures?
So far yes, only on NetApp.
Occurs with other storages (Kaminario). See https://bugzilla.redhat.com/show_bug.cgi?id=1553133#c61
Happened again in a Tier2 execution on a 4.3 env. The test cases are storage_streaming_api (download image) - TestCase18268 & TestCase19265, both on an FCP SD.

Version-Release number:
ovirt-engine-4.3.0-0.6.alpha2.el7.noarch
vdsm-4.30.4-1.el7ev.x86_64

From the engine log:

2018-12-12 09:12:56,090+02 INFO [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (default task-50) [disks_create_1ee96e6f-0f69-4b90] Running command: AddDiskCommand internal: false. Entities affected : ID: ad097cbe-eff1-4c67-ac1d-b59fb3109fcd Type: StorageAction group CREATE_DISK with role type USER
2018-12-12 09:12:56,129+02 INFO [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (default task-50) [disks_create_1ee96e6f-0f69-4b90] Running command: AddImageFromScratchCommand internal: true. Entities affected : ID: ad097cbe-eff1-4c67-ac1d-b59fb3109fcd Type: Storage
2018-12-12 09:12:56,173+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateImageVDSCommand] (default task-50) [disks_create_1ee96e6f-0f69-4b90] START, CreateImageVDSCommand( CreateImageVDSCommandParameters:{storagePoolId='ef05d454-fd4f-11e8-b2b0-001a4a168bfc', ignoreFailoverLimit='false', storageDomainId='ad097cbe-eff1-4c67-ac1d-b59fb3109fcd', imageGroupId='4d75d747-a86a-4db9-9939-69ce405d103b', imageSizeInBytes='1073741824', volumeFormat='RAW', newImageId='e9a32dc8-76ba-4d71-8a85-7c522a2564fb', imageType='Preallocated', newImageDescription='{"DiskAlias":"disk_virtioraw_1209125485","DiskDescription":""}', imageInitialSizeInBytes='0'}), log id: 3c56dea
2018-12-12 09:12:56,174+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateImageVDSCommand] (default task-50) [disks_create_1ee96e6f-0f69-4b90] -- executeIrsBrokerCommand: calling 'createVolume' with two new parameters: description and UUID
2018-12-12 09:12:56,391+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateImageVDSCommand] (default task-50) [disks_create_1ee96e6f-0f69-4b90] FINISH, CreateImageVDSCommand, return: e9a32dc8-76ba-4d71-8a85-7c522a2564fb, log id: 3c56dea
2018-12-12 09:12:56,404+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (default task-50) [disks_create_1ee96e6f-0f69-4b90] CommandAsyncTask::Adding CommandMultiAsyncTasks object for command 'bca3e47c-b4e6-4513-b66c-eecfd9bf6e08'
2018-12-12 09:12:56,404+02 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (default task-50) [disks_create_1ee96e6f-0f69-4b90] CommandMultiAsyncTasks::attachTask: Attaching task '566b2774-6d81-43c7-9122-a23f36774184' to command 'bca3e47c-b4e6-4513-b66c-eecfd9bf6e08'.
2018-12-12 09:12:56,429+02 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (default task-50) [disks_create_1ee96e6f-0f69-4b90] Adding task '566b2774-6d81-43c7-9122-a23f36774184' (Parent Command 'AddImageFromScratch', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
2018-12-12 09:12:56,447+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-78) [disks_create_bc00dabe-4a7c-4cd1] Command 'AddDisk' (id: 'bbabe6b5-ebb7-41b8-b1fc-957c42655bfa') waiting on child command id: 'a89f74c7-724a-4b6a-bfa6-37391e3c26db' type:'AddImageFromScratch' to complete
2018-12-12 09:12:56,454+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (default task-50) [disks_create_1ee96e6f-0f69-4b90] BaseAsyncTask::startPollingTask: Starting to poll task '566b2774-6d81-43c7-9122-a23f36774184'.
2018-12-12 09:12:56,496+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-50) [disks_create_1ee96e6f-0f69-4b90] EVENT_ID: USER_ADD_DISK(2,020), Add-Disk operation of 'disk_virtioraw_1209125485' was initiated by admin@internal-authz.
2018-12-12 09:12:58,451+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-76) [disks_create_1ee96e6f-0f69-4b90] Command 'AddDisk' (id: 'de3c90ac-047f-4785-a9ca-1c1081e5cf64') waiting on child command id: 'bca3e47c-b4e6-4513-b66c-eecfd9bf6e08' type:'AddImageFromScratch' to complete
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] Command [id=bca3e47c-b4e6-4513-b66c-eecfd9bf6e08]: Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] BaseAsyncTask::onTaskEndSuccess: Task '8baa8f3e-b5bc-4515-baae-3a945edc2203' (Parent Command 'AddImageFromScratch', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type 'AddImageFromScratch' completed, handling the result.
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] CommandAsyncTask::HandleEndActionResult [within thread]: endAction for action type 'AddImageFromScratch' succeeded, clearing tasks.
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] SPMAsyncTask::ClearAsyncTask: Attempting to clear task '566b2774-6d81-43c7-9122-a23f36774184'
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] START, SPMClearTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='ef05d454-fd4f-11e8-b2b0-001a4a168bfc', ignoreFailoverLimit='false', taskId='566b2774-6d81-43c7-9122-a23f36774184'}), log id: 255e1ce5
2018-12-12 09:13:00,468+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] CommandAsyncTask::endActionIfNecessary: All tasks of command 'a89f74c7-724a-4b6a-bfa6-37391e3c26db' has ended -> executing 'endAction'
2018-12-12 09:13:00,509+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] START, HSMClearTaskVDSCommand(HostName = host_mixed_1, HSMTaskGuidBaseVDSCommandParameters:{hostId='6cd3bf70-c103-42b5-8ca3-1a269545959b', taskId='566b2774-6d81-43c7-9122-a23f36774184'}), log id: 6898b681
2018-12-12 09:13:00,545+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] FINISH, HSMClearTaskVDSCommand, return: , log id: 6898b681
2018-12-12 09:13:00,545+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] FINISH, SPMClearTaskVDSCommand, return: , log id: 255e1ce5
2018-12-12 09:13:00,553+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] BaseAsyncTask::removeTaskFromDB: Removed task '566b2774-6d81-43c7-9122-a23f36774184' from DataBase
2018-12-12 09:13:00,553+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10208) [disks_create_1ee96e6f-0f69-4b90] CommandAsyncTask::HandleEndActionResult [within thread]: Removing CommandMultiAsyncTasks object for entity 'bca3e47c-b4e6-4513-b66c-eecfd9bf6e08'
2018-12-12 09:13:02,472+02 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [disks_create_1ee96e6f-0f69-4b90] Command 'AddDisk' id: 'de3c90ac-047f-4785-a9ca-1c1081e5cf64' child commands '[bca3e47c-b4e6-4513-b66c-eecfd9bf6e08]' executions were completed, status 'FAILED'
2018-12-12 09:13:06,378+02 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [disks_create_1ee96e6f-0f69-4b90] Ending command 'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2018-12-12 09:13:06,386+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-5) [disks_create_1ee96e6f-0f69-4b90] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand' with failure.

From the VDSM log:

2018-12-12 09:12:58,714+0200 WARN (tasks/0) [storage.LVM] e9a32dc8-76ba-4d71-8a85-7c522a2564fb can't be reloaded, please check your storage connections. (lvm:81)
2018-12-12 09:12:58,714+0200 ERROR (tasks/0) [storage.Volume] Unexpected error (volume:1247)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1216, in create
    apparentSize = int(dom.getVSize(imgUUID, volUUID) /
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 714, in getVSize
    return self._manifest.getVSize(imgUUID, volUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockSD.py", line 453, in getVSize
    size = lvm.getLV(self.sdUUID, volUUID).size
  File "/usr/lib/python2.7/site-packages/vdsm/storage/lvm.py", line 82, in __getattr__
    raise AttributeError("Failed reload: %s" % self.name)
AttributeError: Failed reload: e9a32dc8-76ba-4d71-8a85-7c522a2564fb
2018-12-12 09:12:58,715+0200 DEBUG (tasks/0) [storage.ResourceManager] Trying to release resource '01_img_ad097cbe-eff1-4c67-ac1d-b59fb3109fcd.4d75d747-a86a-4db9-9939-69ce405d103b' (resourceManager:568)
2018-12-12 09:12:58,716+0200 DEBUG (tasks/0) [storage.ResourceManager] Released resource '01_img_ad097cbe-eff1-4c67-ac1d-b59fb3109fcd.4d75d747-a86a-4db9-9939-69ce405d103b' (0 active users) (resourceManager:586)
2018-12-12 09:12:58,716+0200 DEBUG (tasks/0) [storage.ResourceManager] Resource '01_img_ad097cbe-eff1-4c67-ac1d-b59fb3109fcd.4d75d747-a86a-4db9-9939-69ce405d103b' is free, finding out if anyone is waiting for it. (resourceManager:592)
2018-12-12 09:12:58,716+0200 DEBUG (tasks/0) [storage.ResourceManager] No one is waiting for resource '01_img_ad097cbe-eff1-4c67-ac1d-b59fb3109fcd.4d75d747-a86a-4db9-9939-69ce405d103b', Clearing records. (resourceManager:600)
2018-12-12 09:12:58,717+0200 ERROR (tasks/0) [storage.TaskManager.Task] (Task='566b2774-6d81-43c7-9122-a23f36774184') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1940, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 817, in createVolume
    initialSize=initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1249, in create
    (volUUID, e))
VolumeCreationError: Error creating a new volume: (u'Volume creation e9a32dc8-76ba-4d71-8a85-7c522a2564fb failed: Failed reload: e9a32dc8-76ba-4d71-8a85-7c522a2564fb',)
2018-12-12 09:12:58,717+0200 DEBUG (tasks/0) [storage.TaskManager.Task] (Task='566b2774-6d81-43c7-9122-a23f36774184') Task._run: 566b2774-6d81-43c7-9122-a23f36774184 () {} failed - stopping task (task:894)
2018-12-12 09:12:58,718+0200 DEBUG (tasks/0) [storage.TaskManager.Task] (Task='566b2774-6d81-43c7-9122-a23f36774184') stopping in state running (force False) (task:1256)
2018-12-12 09:12:58,718+0200 DEBUG (tasks/0) [storage.TaskManager.Task] (Task='566b2774-6d81-43c7-9122-a23f36774184') ref 1 aborting True (task:1002)
2018-12-12 09:12:58,718+0200 DEBUG (tasks/0) [storage.TaskManager.Task] (Task='566b2774-6d81-43c7-9122-a23f36774184') aborting: Task is aborted: "Error creating a new volume: (u'Volume creation e9a32dc8-76ba-4d71-8a85-7c522a2564fb failed: Failed reload: e9a32dc8-76ba-4d71-8a85-7c522a2564fb',)" - code 205 (task:928)

Attaching two logs for the two test cases.
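Since the "can't be reloaded" warning suggests a transient reload failure rather than a missing LV, one hypothetical mitigation would be retrying the reload a few times before failing the whole volume-creation task. This is purely a sketch for discussion, not what VDSM currently does; `reload_lv` stands in for any callable with the failure mode seen in these logs:

```python
import time


def get_lv_size_with_retry(reload_lv, lv_name, retries=3, delay=0.0):
    """Hypothetical mitigation sketch: retry a flaky LV reload before
    failing volume creation.

    `reload_lv` is any callable that returns an object with a .size
    attribute, or raises AttributeError("Failed reload: <name>") -
    the failure mode seen in the logs above. After `retries` failed
    attempts the last error is re-raised, reproducing today's
    behavior when the failure is not transient.
    """
    last_err = None
    for _attempt in range(retries):
        try:
            return reload_lv(lv_name).size
        except AttributeError as err:
            last_err = err
            time.sleep(delay)  # back off before the next reload
    raise last_err
```

Whether a retry would actually help depends on the root cause; if the LVM metadata itself is corrupted (as comment #6 suggests), every reload attempt would fail the same way.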
Created attachment 1513976 [details] TestCase18268
Created attachment 1513977 [details] TestCase19265
Seems like the same issue as bug 1553133; postponing to an async release.
*** Bug 1666207 has been marked as a duplicate of this bug. ***
Targeting 4.2.8-2 when it becomes available.
If this still needs fixes, can you please re-target to 4.2.8-3?
Created attachment 1535638 [details] TC6004_rhv-4.2.8-7_logs
Created attachment 1539161 [details] ovirt-engine-4.2.8.3-0.1_Tier1_TestCase6038_glusterfs_logs
Another encounter of this issue, with the delete volume scenario, on ovirt-engine-4.3.3.6-0.1.el7.noarch / vdsm-4.30.13-1.el7ev.x86_64. Occurred in Tier1 automation TestCase6038. Logs attached.

Engine:
2019-05-04 21:55:33,280+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command HSMGetAllTasksStatusesVDS failed: Cannot delete volume which has children (non-ethical): ['sdUUID: ecd720bf-6e98-42d7-99b6-b2f59e12435d', u'imgUUID: b1a530b6-fdd7-4424-8945-1749f25156b6', u'volUUID: a858870b-f9e1-48a2-95b9-01337e5ba4f5']
2019-05-04 21:55:33,280+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] SPMAsyncTask::PollTask: Polling task 'dcb6363c-e7b0-4ad3-bd34-6a139cd34d4d' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'running'.
2019-05-04 21:55:33,280+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] SPMAsyncTask::PollTask: Polling task '3f8954bd-093f-4dff-a69e-266965267a33' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2019-05-04 21:55:33,280+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-75) [] BaseAsyncTask::logEndTaskFailure: Task '3f8954bd-093f-4dff-a69e-266965267a33' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): ['sdUUID: ecd720bf-6e98-42d7-99b6-b2f59e12435d', u'imgUUID: b1a530b6-fdd7-4424-8945-1749f25156b6', u'volUUID: a858870b-f9e1-48a2-95b9-01337e5ba4f5'], code = 100',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): ['sdUUID: ecd720bf-6e98-42d7-99b6-b2f59e12435d', u'imgUUID: b1a530b6-fdd7-4424-8945-1749f25156b6', u'volUUID: a858870b-f9e1-48a2-95b9-01337e5ba4f5'], code = 100'

VDSM:
2019-05-04 21:55:30,237+0300 ERROR (tasks/7) [storage.TaskManager.Task] (Task='3f8954bd-093f-4dff-a69e-266965267a33') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1967, in deleteVolume
    vol.delete(postZero=postZero, force=force, discard=discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/blockVolume.py", line 588, in delete
    self.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1239, in validateDelete
    self._manifest.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 509, in validateDelete
    raise se.VolumeImageHasChildren(self)
VolumeImageHasChildren: Cannot delete volume which has children (non-ethical): ['sdUUID: ecd720bf-6e98-42d7-99b6-b2f59e12435d', u'imgUUID: b1a530b6-fdd7-4424-8945-1749f25156b6', u'volUUID: a858870b-f9e1-48a2-95b9-01337e5ba4f5']
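For reference, the check that fires in the traceback above is straightforward: validateDelete() refuses to delete a volume that is still recorded as the parent of another volume in the image's chain. A simplified stand-in for that logic (illustrative only, not VDSM's actual code; a plain RuntimeError replaces se.VolumeImageHasChildren):

```python
def validate_delete(vol_uuid, volumes):
    """Simplified stand-in for volume.validateDelete().

    `volumes` maps volUUID -> parent volUUID (None for a base
    volume). Deleting a volume that is still some child's parent
    must be refused - the 'Cannot delete volume which has children'
    error seen above. Illustrative only, not VDSM's implementation.
    """
    children = sorted(v for v, parent in volumes.items()
                      if parent == vol_uuid)
    if children:
        raise RuntimeError(
            "Cannot delete volume which has children: %s" % children)
```

In the failed tests the chain apparently still listed a child for the volume being destroyed, which suggests an earlier merge step did not complete (or was not reflected in metadata) before DestroyImage ran.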
Created attachment 1563759 [details] TestCase6038_ovirt-engine-4.3.3.6-0.1_rep_delete_volume_scenario
If we have automated tests reproducing this issue, we should run them with the patch for bug 1693075. It may reveal important LVM warnings which are hidden in the default log level.
(In reply to Nir Soffer from comment #28)
> If we have automated tests reproducing this issue, we should run them with
> the patch for bug 1693075. It may reveal important LVM warnings which are
> hidden in the default log level.

Unfortunately, this bug (which includes both the create and delete volume failures) is very random and does not reproduce with a specific test case. We encounter the failures sporadically when running Tier1/2/3; in some builds we do not see them at all, and then suddenly we see them again in other builds. Please push the patch as planned, and hopefully we will catch it when we run 4.3.5. We do not have the capacity to run temporary patches at the moment.
Another encounter of this issue with the delete volume scenario (Tier1 TestCase6038). Strangely, I cannot see any ERROR in the VDSM logs. Nir, did something change? Can you take a peek?

Builds:
ovirt-engine-4.3.4.3-0.1.el7.noarch
vdsm-4.30.17-1.el7ev.x86_64
libvirt-4.5.0-10.el7_6.12.x86_64
qemu-img-rhev-2.12.0-33.el7.x86_64
sanlock-3.6.0-1.el7.x86_64

Engine:
2019-06-22 09:07:28,853+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2019-06-22 09:07:28,860+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command HSMGetAllTasksStatusesVDS failed: Cannot delete volume which has children (non-ethical): [u'sdUUID: 77d0c50b-bf2f-4103-a1e3-b70024f9b734', u'imgUUID: 53d22544-f178-4587-a9a8-e35bb3634184', u'volUUID: 843c8df0-1ec8-4ef4-bcb8-8af174ca50cb']
2019-06-22 09:07:28,866+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-63) [] BaseAsyncTask::logEndTaskFailure: Task '4a181bc1-f240-48be-8ba9-36bc19d4c52b' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): [u'sdUUID: 77d0c50b-bf2f-4103-a1e3-b70024f9b734', u'imgUUID: 53d22544-f178-4587-a9a8-e35bb3634184', u'volUUID: 843c8df0-1ec8-4ef4-bcb8-8af174ca50cb'], code = 100',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): [u'sdUUID: 77d0c50b-bf2f-4103-a1e3-b70024f9b734', u'imgUUID: 53d22544-f178-4587-a9a8-e35bb3634184', u'volUUID: 843c8df0-1ec8-4ef4-bcb8-8af174ca50cb'], code = 100'
2019-06-22 09:07:33,891+03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [snapshots_delete_0ee04ba3-dcb5-4c18] Merging of snapshot 'b29d3af3-acca-4b4c-874d-f24c4d9f23b9' images '060c78fe-ba3d-44f0-9b8a-5788cc451b0e'..'843c8df0-1ec8-4ef4-bcb8-8af174ca50cb' failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.
2019-06-22 09:07:33,893+03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [snapshots_delete_0ee04ba3-dcb5-4c18] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' with failure.
2019-06-22 09:07:55,885+03 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-18) [snapshots_delete_0ee04ba3-dcb5-4c18] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' with failure.
2019-06-22 09:07:55,916+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-18) [snapshots_delete_0ee04ba3-dcb5-4c18] EVENT_ID: USER_REMOVE_SNAPSHOT_FINISHED_FAILURE(357), Failed to delete snapshot 'snapshot_6038_nfs_1' for VM 'vm_TestCase6038_2208503035'.
Created attachment 1583625 [details] TestCase6038_ovirt-engine-4.3.4.3-0.1_rep_delete_volume_scenario
Another issue, which looks like a create snapshot failure - not sure, needs Nir's ack. Automation Tier1 TestCase6052.

Builds:
ovirt-engine-4.3.4.3-0.1.el7.noarch
vdsm-4.30.17-1.el7ev.ppc64le
libvirt-4.5.0-10.el7_6.12.ppc64le

The logs look more detailed than before.

Engine:
2019-06-22 01:13:21,017+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (default task-60) [snapshots_create_f3de4fca-23ee-460a] START, CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{storagePoolId='99a910aa-f9d9-4d21-ba13-b966e02f5920', ignoreFailoverLimit='false', storageDomainId='0c5ae784-9441-42d8-85e1-4f378a15a1fa', imageGroupId='73c56871-dcd1-4c05-bde4-9d1799186ff1', imageSizeInBytes='10737418240', volumeFormat='COW', newImageId='49b8220f-4ed8-4805-8f7e-79cb4b0f6f6c', imageType='Sparse', newImageDescription='', imageInitialSizeInBytes='0', imageId='54204868-cffb-4c7f-8e95-a9d3f6e2a104', sourceImageGroupId='73c56871-dcd1-4c05-bde4-9d1799186ff1'}), log id: 286616ce
2019-06-22 01:13:21,239+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-60) [snapshots_create_f3de4fca-23ee-460a] EVENT_ID: IRS_BROKER_COMMAND_FAILURE(10,803), VDSM command CreateVolumeVDS failed: Can't persist task: (u'708997a5-84e2-42f9-8021-7a2ce2139852: cannot access/create taskdir /rhev/data-center/99a910aa-f9d9-4d21-ba13-b966e02f5920/mastersd/master/tasks/708997a5-84e2-42f9-8021-7a2ce2139852: [Errno 11] Resource temporarily unavailable',)
2019-06-22 01:13:21,240+03 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (default task-60) [snapshots_create_f3de4fca-23ee-460a] Command 'CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{storagePoolId='99a910aa-f9d9-4d21-ba13-b966e02f5920', ignoreFailoverLimit='false', storageDomainId='0c5ae784-9441-42d8-85e1-4f378a15a1fa', imageGroupId='73c56871-dcd1-4c05-bde4-9d1799186ff1', imageSizeInBytes='10737418240', volumeFormat='COW', newImageId='49b8220f-4ed8-4805-8f7e-79cb4b0f6f6c', imageType='Sparse', newImageDescription='', imageInitialSizeInBytes='0', imageId='54204868-cffb-4c7f-8e95-a9d3f6e2a104', sourceImageGroupId='73c56871-dcd1-4c05-bde4-9d1799186ff1'})' execution failed: IRSGenericException: IRSErrorException: Can't persist task: (u'708997a5-84e2-42f9-8021-7a2ce2139852: cannot access/create taskdir /rhev/data-center/99a910aa-f9d9-4d21-ba13-b966e02f5920/mastersd/master/tasks/708997a5-84e2-42f9-8021-7a2ce2139852: [Errno 11] Resource temporarily unavailable',)
2019-06-22 01:13:21,240+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (default task-60) [snapshots_create_f3de4fca-23ee-460a] FINISH, CreateVolumeVDSCommand, return: 00000000-0000-0000-0000-000000000000, log id: 286616ce
2019-06-22 01:13:21,240+03 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotCommand] (default task-60) [snapshots_create_f3de4fca-23ee-460a] Failed creating snapshot from image id '54204868-cffb-4c7f-8e95-a9d3f6e2a104'
2019-06-22 01:13:37,601+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2019-06-22 01:13:37,606+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command HSMGetAllTasksStatusesVDS failed: Logical volume does not exist: (u'0c5ae784-9441-42d8-85e1-4f378a15a1fa/c02be00e-dc30-4ad7-880f-6cc70099b018',)
2019-06-22 01:13:37,606+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2019-06-22 01:13:37,607+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2019-06-22 01:13:37,607+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] SPMAsyncTask::PollTask: Polling task 'ae0b383b-b069-47b9-8c13-d81d834fca27' (Parent Command 'CreateSnapshot', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2019-06-22 01:13:37,610+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] BaseAsyncTask::logEndTaskFailure: Task 'ae0b383b-b069-47b9-8c13-d81d834fca27' (Parent Command 'CreateSnapshot', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSTaskResultNotSuccessException: TaskState contained successful return code, but a non-success result ('cleanSuccess').',
-- Exception: 'VDSGenericException: VDSTaskResultNotSuccessException: TaskState contained successful return code, but a non-success result ('cleanSuccess').'
2019-06-22 01:13:37,610+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '4d9789f3-8ffc-4de2-b159-493c0972715a' has ended -> executing 'endAction'
2019-06-22 01:13:37,610+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '4d9789f3-8ffc-4de2-b159-493c0972715a'): calling endAction '.
2019-06-22 01:13:37,611+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] SPMAsyncTask::PollTask: Polling task '6937cdb3-2b12-4fd8-a485-70ad53ce2284' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2019-06-22 01:13:37,611+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-66) [] BaseAsyncTask::logEndTaskFailure: Task '6937cdb3-2b12-4fd8-a485-70ad53ce2284' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Logical volume does not exist: (u'0c5ae784-9441-42d8-85e1-4f378a15a1fa/2c42218a-16f2-428a-a8cf-1662f89c0270',), code = 610',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Logical volume does not exist: (u'0c5ae784-9441-42d8-85e1-4f378a15a1fa/2c42218a-16f2-428a-a8cf-1662f89c0270',), code = 610'

VDSM:
2019-06-22 01:13:21,117+0300 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='708997a5-84e2-42f9-8021-7a2ce2139852') Unexpected error (task:1117)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1115, in setPersistence
    getProcPool().fileUtils.createdir(taskDir)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/outOfProcess.py", line 169, in createdir
    self._iop.mkdir(tmpPath)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 510, in mkdir
    self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 448, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 11] Resource temporarily unavailable
2019-06-22 01:13:21,117+0300 INFO (jsonrpc/0) [vdsm.api] FINISH createVolume error=Can't persist task: (u'708997a5-84e2-42f9-8021-7a2ce2139852: cannot access/create taskdir /rhev/data-center/99a910aa-f9d9-4d21-ba13-b966e02f5920/mastersd/master/tasks/708997a5-84e2-42f9-8021-7a2ce2139852: [Errno 11] Resource temporarily unavailable',) from=::ffff:10.46.16.72,47188, flow_id=snapshots_create_f3de4fca-23ee-460a, task_id=708997a5-84e2-42f9-8021-7a2ce2139852 (api:52)
2019-06-22 01:13:21,117+0300 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='708997a5-84e2-42f9-8021-7a2ce2139852') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in createVolume
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1459, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID, initialSize)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 923, in _spmSchedule
    name, func, *args)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/taskManager.py", line 74, in scheduleJob
    task.setPersistence(store, cleanPolicy=TaskCleanType.manual)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1119, in setPersistence
    " %s: %s" % (self, taskDir, e))
TaskPersistError: Can't persist task: (u'708997a5-84e2-42f9-8021-7a2ce2139852: cannot access/create taskdir /rhev/data-center/99a910aa-f9d9-4d21-ba13-b966e02f5920/mastersd/master/tasks/708997a5-84e2-42f9-8021-7a2ce2139852: [Errno 11] Resource temporarily unavailable',)
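The failure above is OS-level: creating the task directory under the master domain mount returned EAGAIN (Errno 11). A minimal diagnostic sketch to probe that directory from the SPM host, using only the standard library (the path is taken from the log above and the helper name is mine, not VDSM's):

```python
import errno
import os

# Path taken from the log above; adjust the pool UUID for your environment.
TASKS_DIR = ("/rhev/data-center/99a910aa-f9d9-4d21-ba13-b966e02f5920"
             "/mastersd/master/tasks")

def probe_taskdir(base, task_id):
    """Try to create a task directory the way VDSM's setPersistence
    ultimately does, reporting EAGAIN separately from other failures."""
    path = os.path.join(base, task_id)
    try:
        os.makedirs(path)
        return "ok"
    except OSError as e:
        if e.errno == errno.EAGAIN:
            return "EAGAIN: storage mount temporarily unavailable"
        return "failed: %s" % e
```

If this probe returns EAGAIN repeatedly, the problem is in the storage mount itself (e.g. an unresponsive NFS server), not in the VDSM task code.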
Created attachment 1583626 [details] TestCase6052_ovirt-engine-4.3.4.3-0.1_rep_create_volume_scenario
(In reply to Avihai from comment #32)
> Another what looks like create snapshot issue - Not sure, Needs Nir ack.
> Automation Tier1 TestCase6052.

This does not look related; the issue is:

> OSError: [Errno 11] Resource temporarily unavailable

meaning there is some problem accessing the file-based storage through ioprocess.
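For anyone reading the traceback: ioprocess executes filesystem calls in a helper process and the client re-raises the (errcode, errstr) reply as a local OSError, which is why errno 11 from the storage mount surfaces inside VDSM. A minimal sketch of that re-raise pattern (hypothetical function name, not the real ioprocess API):

```python
import errno

def reraise_ioprocess_error(errcode, errstr):
    """Mimic how an ioprocess client turns the helper process's
    (errcode, errstr) reply into a local OSError."""
    if errcode != 0:
        raise OSError(errcode, errstr)

try:
    reraise_ioprocess_error(errno.EAGAIN, "Resource temporarily unavailable")
except OSError as e:
    # Same errno 11 seen in the VDSM traceback above
    assert e.errno == errno.EAGAIN
```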
This bug looks like a mix of random, unrelated errors. I don't think we can do anything with such a bug. Please file a separate bug for every error, with clear reproduction steps for each one. The only issue we should handle here is the one reported in comment 0.
Just saw the delete volume scenario issue again in rhv-4.3.9-6 / engine 4.3.9.3-0.1.el7 / vdsm-4.30.42-1.el7ev.x86_64. As requested, I opened new bug 1814318 on this issue.

VDSM:
2020-03-17 02:57:36,194+0200 ERROR (tasks/5) [storage.TaskManager.Task] (Task='4d812043-4937-4ef7-b994-8518697ca36c') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1967, in deleteVolume
    vol.delete(postZero=postZero, force=force, discard=discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 563, in delete
    self.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1292, in validateDelete
    self._manifest.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 545, in validateDelete
    raise se.VolumeImageHasChildren(self)
VolumeImageHasChildren: Cannot delete volume which has children (non-ethical): [u'sdUUID: 4f3fbe9e-e4a4-45e7-9b62-27fafe535028', u'imgUUID: 1f99972b-560a-4477-833c-f0261fb28f9f', u'volUUID: 61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4']
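For context, the check that raises in the traceback above is a parent/child guard: a volume refuses deletion while any other volume in the image still names it as its parent. A rough stand-alone sketch of that logic (not the actual VDSM code; real exception and lookup live in vdsm.storage):

```python
class VolumeImageHasChildren(Exception):
    """Stand-in for vdsm.storage.exception.VolumeImageHasChildren."""

def validate_delete(vol_uuid, volumes):
    """Refuse deletion while any volume in the image chain has this
    volume as its parent; `volumes` maps volUUID -> parent volUUID."""
    children = [v for v, parent in volumes.items() if parent == vol_uuid]
    if children:
        raise VolumeImageHasChildren(
            "Cannot delete volume which has children: %s" % children)

# Toy chain: base <- snap <- leaf
chain = {"base": None, "snap": "base", "leaf": "snap"}
validate_delete("leaf", chain)      # ok: nothing points at the leaf
try:
    validate_delete("snap", chain)  # refused: 'leaf' still points at it
except VolumeImageHasChildren:
    pass
```

So hitting this error during live merge suggests the engine issued DestroyImage while the merged child volume still referenced its parent, i.e. an ordering problem between the merge pivot and the delete.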