Created attachment 1670839 [details]
engine and vdsm logs

Description of problem:

This issue was also documented many times in bug 1637405 and is hard to reproduce. As decided in bug 1637405, each issue seen there will be handled in a new bug.

Here, live merge fails: deletion of snapshot 'snapshot_6038_nfs_1' for VM 'vm_TestCase6038_1702443173' was initiated and fails with the following errors.

VDSM log:

2020-03-17 02:57:36,194+0200 ERROR (tasks/5) [storage.TaskManager.Task] (Task='4d812043-4937-4ef7-b994-8518697ca36c') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1967, in deleteVolume
    vol.delete(postZero=postZero, force=force, discard=discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 563, in delete
    self.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 1292, in validateDelete
    self._manifest.validateDelete()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 545, in validateDelete
    raise se.VolumeImageHasChildren(self)
VolumeImageHasChildren: Cannot delete volume which has children (non-ethical): [u'sdUUID: 4f3fbe9e-e4a4-45e7-9b62-27fafe535028', u'imgUUID: 1f99972b-560a-4477-833c-f0261fb28f9f', u'volUUID: 61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4']
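The traceback ends in validateDelete(): the volume chosen for deletion still has at least one child volume referencing it. A minimal sketch of that invariant (illustrative names only, not the literal vdsm code):

# Hedged sketch: illustrates the invariant behind VolumeImageHasChildren,
# not the actual vdsm implementation.
class VolumeImageHasChildren(Exception):
    """A volume that still has child volumes must not be deleted."""

def validate_delete(vol_uuid, parent_of):
    # parent_of maps volUUID -> parent volUUID (PUUID). Deleting a volume is
    # only allowed when no other volume in the image points at it as parent.
    children = [v for v, p in parent_of.items() if p == vol_uuid]
    if children:
        raise VolumeImageHasChildren(
            "Cannot delete volume which has children: %s" % children)

# Example with the volUUID from the error; '<leaf-volUUID>' is a placeholder.
try:
    validate_delete("61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4",
                    {"<leaf-volUUID>": "61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4"})
except VolumeImageHasChildren as exc:
    print(exc)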
Engine log:

2020-03-17 02:57:40,883+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2020-03-17 02:57:40,890+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command HSMGetAllTasksStatusesVDS failed: Cannot delete volume which has children (non-ethical): [u'sdUUID: 4f3fbe9e-e4a4-45e7-9b62-27fafe535028', u'imgUUID: 1f99972b-560a-4477-833c-f0261fb28f9f', u'volUUID: 61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4']
2020-03-17 02:57:40,891+02 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] SPMAsyncTask::PollTask: Polling task '4d812043-4937-4ef7-b994-8518697ca36c' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2020-03-17 02:57:40,891+02 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] BaseAsyncTask::logEndTaskFailure: Task '4d812043-4937-4ef7-b994-8518697ca36c' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): [u'sdUUID: 4f3fbe9e-e4a4-45e7-9b62-27fafe535028', u'imgUUID: 1f99972b-560a-4477-833c-f0261fb28f9f', u'volUUID: 61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4'], code = 100',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot delete volume which has children (non-ethical): [u'sdUUID: 4f3fbe9e-e4a4-45e7-9b62-27fafe535028', u'imgUUID: 1f99972b-560a-4477-833c-f0261fb28f9f', u'volUUID: 61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4'], code = 100'
2020-03-17 02:57:40,892+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '1d5065d1-04cd-4e40-9e89-6c6a7731d74f' has ended -> executing 'endAction'
2020-03-17 02:57:40,892+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '1d5065d1-04cd-4e40-9e89-6c6a7731d74f'): calling endAction '.
2020-03-17 02:57:40,892+02 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-9702) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 'DestroyImage',
2020-03-17 02:57:40,894+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.DestroyImageCommand] (EE-ManagedThreadFactory-engine-Thread-9702) [snapshots_delete_10e51810-3a91-4615] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.DestroyImageCommand' with failure.

Version-Release number of selected component (if applicable):
rhv-4.3.9-6 / engine 4.3.9.3-0.1.el7 / vdsm-4.30.42-1.el7ev.x86_64

How reproducible:
Hard to reproduce; it happens once in every several automation runs, but it is there, so do not disregard it before investigating.
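To see which volume still references the one DestroyImage tried to remove, the image directory on the NFS domain can be scanned for metadata that lists it as parent. This is a hedged sketch: the mount-point placeholder is not taken from the attached logs, and it assumes the usual file-domain layout (<mount>/<sdUUID>/images/<imgUUID>/<volUUID>.meta) with a PUUID key in each volume's metadata.

# Hedged diagnostic sketch: find child volumes of the parent that
# DestroyImage tried to delete. Run on a host with the NFS domain mounted;
# SD_MOUNT is a placeholder and must be adjusted.
import glob
import os

SD_MOUNT = "/rhev/data-center/mnt/<server:_export>"  # placeholder
SD_UUID = "4f3fbe9e-e4a4-45e7-9b62-27fafe535028"
IMG_UUID = "1f99972b-560a-4477-833c-f0261fb28f9f"
PARENT = "61dd1b0f-d1d3-4bd9-b15d-13da7aa927c4"

pattern = os.path.join(SD_MOUNT, SD_UUID, "images", IMG_UUID, "*.meta")
for meta in glob.glob(pattern):
    with open(meta) as f:
        if any(line.strip() == "PUUID=" + PARENT for line in f):
            print("child volume still present:", os.path.basename(meta)[:-5])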
Steps to Reproduce (with timestamps from the jenkins automation log):

02:44:31 2020-03-17 02:44:31,730 INFO Test Setup 3: Creating VM vm_TestCase6038_1702443173
02:44:31 2020-03-17 02:44:31,881 INFO Test Setup 4: [class] Description: clone vm from a pre-defined template latest-rhel-guest-image-8.2-infra with {'vol_sparse': True, 'name': 'vm_TestCase6038_1702443173', 'clone': False, 'cluster': 'golden_env_mixed_1', 'storagedomain': 'nfs_2', 'display_type': 'spice', 'timeout': 900, 'virtio_scsi': True, 'vol_format': 'cow', 'os_type': 'RHEL6x64', 'type': 'desktop', 'wait': True}
02:44:53 2020-03-17 02:44:53,719 INFO Test Setup 5: Starting VM vm_TestCase6038_1702443173
02:44:53 2020-03-17 02:44:53,722 INFO Test Setup 6: [class] Start VM vm_TestCase6038_1702443173 with {'wait_for_ip': True, 'pause': False, 'use_cloud_init': False, 'timeout': 600, 'wait_for_status': 'up'}
02:49:05 2020-03-17 02:49:05,089 INFO Test Setup 7: Creating disks with filesystem and attach to VM vm_TestCase6038_1702443173
02:50:16 2020-03-17 02:50:16,116 INFO 299: storage/rhevmtests.storage.storage_remove_snapshots.test_live_merge.TestCase6038.test_basic_snapshot_deletion[nfs]
02:50:16 2020-03-17 02:50:16,116 INFO STORAGE: NFS
02:50:16 2020-03-17 02:50:16,415 INFO Test Step 8: Creating files on vm's 'vm_TestCase6038_1702443173' disks
02:50:16 2020-03-17 02:50:16,472 INFO Test Step 9: Creating file /mount-point_1702493084/test_file_0
02:50:18 2020-03-17 02:50:18,533 INFO Test Step 10: Creating file /mount-point_1702494507/test_file_0
02:50:20 2020-03-17 02:50:20,664 INFO Test Step 11: Creating file /mount-point_1702495905/test_file_0
02:50:22 2020-03-17 02:50:22,869 INFO Test Step 12: Creating file /mount-point_1702501315/test_file_0
02:50:25 2020-03-17 02:50:25,793 INFO Test Step 13: Creating snapshot of vm vm_TestCase6038_1702443173
02:50:31 2020-03-17 02:50:31,873 INFO Test Step 14: Before snapshot: 6 volumes
02:50:32 2020-03-17 02:50:31,990 INFO Test Step 15: Stop vm vm_TestCase6038_1702443173 with {'async': 'true'}
02:50:44 2020-03-17 02:50:44,345 INFO Test Step 16: Adding new snapshot to vm vm_TestCase6038_1702443173
02:50:44 2020-03-17 02:50:44,347 INFO Test Step 17: Add snapshot to VM vm_TestCase6038_1702443173 with {'description': 'snapshot_6038_nfs_0', 'wait': True}
02:51:02 2020-03-17 02:51:02,882 INFO Test Step 18: Start VMs with {'vm_list': ['vm_TestCase6038_1702443173'], 'wait_for_ip': True, 'wait_for_status': 'up', 'max_workers': 2}
02:52:48 2020-03-17 02:52:48,457 INFO Test Step 19: After snapshot: 11 volumes
02:52:48 2020-03-17 02:52:48,505 INFO Test Step 20: Creating files on vm's 'vm_TestCase6038_1702443173' disks
02:52:48 2020-03-17 02:52:48,565 INFO Test Step 21: Creating file /mount-point_1702493084/test_file_1
02:52:51 2020-03-17 02:52:51,066 INFO Test Step 22: Creating file /mount-point_1702494507/test_file_1
02:52:53 2020-03-17 02:52:53,179 INFO Test Step 23: Creating file /mount-point_1702495905/test_file_1
02:52:55 2020-03-17 02:52:55,320 INFO Test Step 24: Creating file /mount-point_1702501315/test_file_1
02:52:58 2020-03-17 02:52:58,122 INFO Test Step 25: Creating snapshot of vm vm_TestCase6038_1702443173
02:53:04 2020-03-17 02:53:04,228 INFO Test Step 26: Before snapshot: 11 volumes
02:53:04 2020-03-17 02:53:04,357 INFO Test Step 27: Stop vm vm_TestCase6038_1702443173 with {'async': 'true'}
02:53:16 2020-03-17 02:53:16,717 INFO Test Step 28: Adding new snapshot to vm vm_TestCase6038_1702443173
02:53:16 2020-03-17 02:53:16,720 INFO Test Step 29: Add snapshot to VM vm_TestCase6038_1702443173 with {'description': 'snapshot_6038_nfs_1', 'wait': True}
02:53:36 2020-03-17 02:53:36,722 INFO Test Step 30: Start VMs with {'vm_list': ['vm_TestCase6038_1702443173'], 'wait_for_ip': True, 'wait_for_status': 'up', 'max_workers': 2}
02:55:01 2020-03-17 02:55:01,243 INFO Test Step 31: After snapshot: 16 volumes
02:55:01 2020-03-17 02:55:01,296 INFO Test Step 32: Creating files on vm's 'vm_TestCase6038_1702443173' disks
02:55:01 2020-03-17 02:55:01,353 INFO Test Step 33: Creating file /mount-point_1702493084/test_file_2
02:55:03 2020-03-17 02:55:03,551 INFO Test Step 34: Creating file /mount-point_1702494507/test_file_2
02:55:05 2020-03-17 02:55:05,657 INFO Test Step 35: Creating file /mount-point_1702495905/test_file_2
02:55:07 2020-03-17 02:55:07,764 INFO Test Step 36: Creating file /mount-point_1702501315/test_file_2
02:55:10 2020-03-17 02:55:10,550 INFO Test Step 37: Creating snapshot of vm vm_TestCase6038_1702443173
02:55:16 2020-03-17 02:55:16,687 INFO Test Step 38: Before snapshot: 16 volumes
02:55:16 2020-03-17 02:55:16,813 INFO Test Step 39: Stop vm vm_TestCase6038_1702443173 with {'async': 'true'}
02:55:29 2020-03-17 02:55:29,140 INFO Test Step 40: Adding new snapshot to vm vm_TestCase6038_1702443173
02:55:29 2020-03-17 02:55:29,143 INFO Test Step 41: Add snapshot to VM vm_TestCase6038_1702443173 with {'description': 'snapshot_6038_nfs_2', 'wait': True}
02:55:50 2020-03-17 02:55:50,304 INFO Test Step 42: Start VMs with {'vm_list': ['vm_TestCase6038_1702443173'], 'wait_for_ip': True, 'wait_for_status': 'up', 'max_workers': 2}
02:57:15 2020-03-17 02:57:14,958 INFO Test Step 43: After snapshot: 21 volumes
02:57:22 2020-03-17 02:57:22,687 INFO Test Step 44: Before live merge: 21 volumes
02:57:22 2020-03-17 02:57:22,813 INFO Test Step 45: Removing snapshot 'snapshot_6038_nfs_1' of vm vm_TestCase6038_1702443173
02:57:22 2020-03-17 02:57:22,816 INFO Test Step 46: Remove vm vm_TestCase6038_1702443173 snapshot with {'description': 'snapshot_6038_nfs_1', 'timeout': 2400, 'wait': True}
03:37:25 2020-03-17 03:37:25,350 ERROR Result: FAILED

Actual results:
Deletion/removal of snapshot_6038_nfs_1 fails.

Expected results:
Deletion/removal of snapshot_6038_nfs_1 succeeds.

Additional info:
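For reference, a minimal sketch of the failing step (removing the middle snapshot while the VM is up, which triggers the live merge) driven through the oVirt Python SDK (ovirtsdk4) instead of the internal automation framework; the URL and credentials are placeholders, only the VM and snapshot names come from the run above:

# Hedged sketch: trigger the same live-merge step with ovirtsdk4.
# Connection parameters are placeholders.
import ovirtsdk4 as sdk

connection = sdk.Connection(
    url="https://engine.example.com/ovirt-engine/api",  # placeholder
    username="admin@internal",                          # placeholder
    password="password",                                # placeholder
    insecure=True,
)
try:
    vms_service = connection.system_service().vms_service()
    vm = vms_service.list(search="name=vm_TestCase6038_1702443173")[0]
    snapshots_service = vms_service.vm_service(vm.id).snapshots_service()

    # Remove the middle snapshot while the VM is running; on the failing
    # runs this removal never completes and the engine logs the errors above.
    snap = next(s for s in snapshots_service.list()
                if s.description == "snapshot_6038_nfs_1")
    snapshots_service.snapshot_service(snap.id).remove()
finally:
    connection.close()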
This looks like a duplicate of bug 1802277. Vojta, can you confirm?
Duplicates BZ #1802277

*** This bug has been marked as a duplicate of bug 1802277 ***