+++ This bug is a downstream clone. The original bug is: +++
+++ bug 1727025 +++
======================================================================

Description of problem:
Live merge during continuous IO seems to succeed; however, an NPE can be seen in the logs:

ERROR [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-1480) [] [within thread]: endAction for action type DestroyImage threw an exception.: java.lang.NullPointerException
    at org.ovirt.engine.core.bll.tasks.CoCoAsyncTaskHelper.endAction(CoCoAsyncTaskHelper.java:358) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandCoordinatorImpl.endAction(CommandCoordinatorImpl.java:348) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.endCommandAction(CommandAsyncTask.java:150) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.lambda$endActionIfNecessary$0(CommandAsyncTask.java:103) [bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:96) [utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_191]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_191]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_191]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_191]
    at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_191]
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent-1.0.jar:]
    at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)

Version-Release number of selected component (if applicable):
ovirt-engine: 4.4.0-SNAPSHOT

How reproducible:
NFS storage

Steps to Reproduce:
1. Create a VM with a disk + OS
2. Add 4 disks to the VM
3. Create a live snapshot
4. Perform dd from the OS disk to one of the other disks
5. Live merge the snapshot

Actual results:
Live merge succeeds, but there are exceptions in the engine log

Expected results:
No exceptions

Additional info:

(Originally by Fedor Gavrilov)
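The stack trace above points at CoCoAsyncTaskHelper.endAction dereferencing something that is no longer there. A minimal sketch of the suspected pattern (class and method names here are hypothetical, not the actual oVirt code): the end-action thread looks up the parent command by ID, but the command entity has already been cleaned up concurrently, so the lookup returns null and the unconditional dereference throws the NPE.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reconstruction of the suspected race: the endAction path
// fetches the parent command by ID and dereferences it without a null check.
public class EndActionNpeSketch {

    static final Map<String, Runnable> commands = new ConcurrentHashMap<>();

    // Mirrors the failing pattern: lookup, then unconditional dereference.
    static void endAction(String commandId) {
        Runnable command = commands.get(commandId); // null if already cleaned up
        command.run(); // NPE when the command was removed concurrently
    }

    public static void main(String[] args) {
        commands.put("4d1745ef", () -> {});
        endAction("4d1745ef"); // fine while the command still exists

        commands.remove("4d1745ef"); // e.g. cleared by a concurrent cleanup path
        try {
            endAction("4d1745ef");
        } catch (NullPointerException npe) {
            System.out.println("NPE, as logged by CommandAsyncTask [within thread]");
        }
    }
}
```

Note this only illustrates the symptom; whether the real CoCoAsyncTaskHelper.java:358 dereference is a command lookup or some other field is not confirmed by the trace alone.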
I see the same issue at 4.3.5-6 (ovirt-engine 4.3.5.4 / vdsm-4.30.24-2.el7ev.x86_64) in live disk move operations (no IO is done) on FCP SD.

Tasks to remove a host/deactivate a host/stop the SPM get stuck for hours, and operations that depend on cleared tasks (such as stopping the SPM, moving a host to maintenance, etc.) fail, failing many automation TCs -> raising priority.

The tests fail because the SPM is not stopped while a task is still listed (with status 'finished'):

Not stopping SPM on vds host_mixed_2, pool id 43238d81-6835-4217-8b81-951b564655d2 as there are uncleared tasks Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished'

The task for moving a live disk (TestCase11619) finished but was not removed from the DB for hours, from 2019-07-11 22:55 until 2019-07-12 06:29:44. The issue occurred here:

2019-07-11 22:55:28,516+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-44) [disks_syncAction_f770e3d9-8595-41d3] EVENT_ID: USER_MOVED_DISK_FINISHED_SUCCESS(2,010), User admin@internal-authz finished moving disk vm_thin_disk_image_11619_Disk1 to domain fcp_2.
2019-07-11 22:55:31,893+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] Polling and updating Async Tasks: 1 tasks, 1 tasks to poll now
2019-07-11 22:55:31,899+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] SPMAsyncTask::PollTask: Polling task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'.
2019-07-11 22:55:31,899+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] BaseAsyncTask::onTaskEndSuccess: Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
2019-07-11 22:55:31,900+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] CommandAsyncTask::endActionIfNecessary: All tasks of command '4d1745ef-44af-4e7d-abb9-67edd1748a58' has ended -> executing 'endAction'
2019-07-11 22:55:31,900+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: '4d1745ef-44af-4e7d-abb9-67edd1748a58'): calling endAction '.
2019-07-11 22:55:31,901+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3678) [] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 'DestroyImage',
2019-07-11 22:55:31,901+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3678) [] [within thread]: endAction for action type DestroyImage threw an exception.: java.lang.NullPointerException
    at org.ovirt.engine.core.bll.tasks.CoCoAsyncTaskHelper.endAction(CoCoAsyncTaskHelper.java:358) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandCoordinatorImpl.endAction(CommandCoordinatorImpl.java:348) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.endCommandAction(CommandAsyncTask.java:150) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.lambda$endActionIfNecessary$0(CommandAsyncTask.java:103) [bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:96) [utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_222-ea]
    at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_222-ea]
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
2019-07-11 22:55:31,902+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3678) [] CommandAsyncTask::HandleEndActionResult: endAction for action type 'DestroyImage' threw an unrecoverable RuntimeException the task will be cleared.
2019-07-11 22:55:31,908+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3678) [] BaseAsyncTask::removeTaskFromDB: Removed task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' from DataBase

22:55:35 2019-07-11 22:55:34,943 INFO Test Step 9: Moving disk a1b9bda5-243e-4358-a98a-a398df1d507a to storage domain fcp_2

[root@hosted-engine-06 ovirt-engine]# zgrep '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' engine.log*
engine.log-20190712.gz:2019-07-11 22:55:22,740+03 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (EE-ManagedThreadFactory-commandCoordinator-Thread-5) [disks_syncAction_f770e3d9-8595-41d3] CommandMultiAsyncTasks::attachTask: Attaching task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' to command '4d1745ef-44af-4e7d-abb9-67edd1748a58'.
engine.log-20190712.gz:2019-07-11 22:55:22,757+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-commandCoordinator-Thread-5) [disks_syncAction_f770e3d9-8595-41d3] Adding task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
engine.log-20190712.gz:2019-07-11 22:55:22,774+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-commandCoordinator-Thread-5) [disks_syncAction_f770e3d9-8595-41d3] BaseAsyncTask::startPollingTask: Starting to poll task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'.
engine.log-20190712.gz:2019-07-11 22:55:22,774+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-commandCoordinator-Thread-5) [disks_syncAction_f770e3d9-8595-41d3] BaseAsyncTask::startPollingTask: Starting to poll task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'.
engine.log-20190712.gz:2019-07-11 22:55:31,899+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] SPMAsyncTask::PollTask: Polling task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'.
engine.log-20190712.gz:2019-07-11 22:55:31,899+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] BaseAsyncTask::onTaskEndSuccess: Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
engine.log-20190712.gz:2019-07-11 22:55:31,908+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-3678) [] BaseAsyncTask::removeTaskFromDB: Removed task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' from DataBase
engine.log-20190712.gz:2019-07-12 01:52:40,445+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-21) [hosts_syncAction_81f17357-1567-4103] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_2', pool id '43238d81-6835-4217-8b81-951b564655d2' as there are uncleared tasks 'Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished''
engine.log-20190712.gz:2019-07-12 01:52:40,453+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-21) [hosts_syncAction_81f17357-1567-4103] EVENT_ID: VDS_ALERT_NOT_STOPPING_SPM_UNCLEARED_TASKS(9,030), Not stopping SPM on vds host_mixed_2, pool id 43238d81-6835-4217-8b81-951b564655d2 as there are uncleared tasks Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished'
engine.log-20190712.gz:2019-07-12 01:52:45,859+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-21) [hosts_syncAction_2019249e-285b-459e] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_2', pool id '43238d81-6835-4217-8b81-951b564655d2' as there are uncleared tasks 'Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished''
engine.log-20190712.gz:2019-07-12 01:52:45,867+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-21) [hosts_syncAction_2019249e-285b-459e] EVENT_ID: VDS_ALERT_NOT_STOPPING_SPM_UNCLEARED_TASKS(9,030), Not stopping SPM on vds host_mixed_2, pool id 43238d81-6835-4217-8b81-951b564655d2 as there are uncleared tasks Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished'
engine.log-20190712.gz:2019-07-12 01:53:08,156+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default task-21) [hosts_syncAction_abb918c3-1807-405a] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_2', pool id '43238d81-6835-4217-8b81-951b564655d2' as there are uncleared tasks 'Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished''
engine.log-20190712.gz:2019-07-12 01:53:08,163+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-21) [hosts_syncAction_abb918c3-1807-405a] EVENT_ID: VDS_ALERT_NOT_STOPPING_SPM_UNCLEARED_TASKS(9,030), Not stopping SPM on vds host_mixed_2, pool id 43238d81-6835-4217-8b81-951b564655d2 as there are uncleared tasks Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7', status 'finished'
engine.log-20190713.gz:2019-07-12 06:28:37,917+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engine-Thread-11) [44c69927] Adding task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
engine.log-20190713.gz:2019-07-12 06:28:38,215+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-11) [44c69927] BaseAsyncTask::startPollingTask: Starting to poll task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'.
engine.log-20190713.gz:2019-07-12 06:28:44,463+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:28:54,466+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-49) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:04,468+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-37) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:14,471+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-94) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:24,472+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:34,474+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-97) [44c69927] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:44,476+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:44,476+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:44,863+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] Task id '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190713.gz:2019-07-12 06:29:44,863+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] SPMAsyncTask::PollTask: Polling task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'.
engine.log-20190713.gz:2019-07-12 06:29:44,863+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] BaseAsyncTask::onTaskEndSuccess: Task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
engine.log-20190713.gz:2019-07-12 06:29:44,863+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] SPMAsyncTask::ClearAsyncTask: Attempting to clear task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'
engine.log-20190713.gz:2019-07-12 06:29:44,864+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] START, SPMClearTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='43238d81-6835-4217-8b81-951b564655d2', ignoreFailoverLimit='false', taskId='5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'}), log id: 5e495ff3
engine.log-20190713.gz:2019-07-12 06:29:44,865+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] START, HSMClearTaskVDSCommand(HostName = host_mixed_2, HSMTaskGuidBaseVDSCommandParameters:{hostId='f31df623-c3c9-420f-a78d-a3a9a4114b42', taskId='5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7'}), log id: c6f0343
engine.log-20190713.gz:2019-07-12 06:29:44,911+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-13) [] BaseAsyncTask::removeTaskFromDB: Removed task '5684fb07-4ff8-4d1f-b12f-8c2dc3afe0c7' from DataBase

Scenario and time stamps:

22:51:03 2019-07-11 22:51:03,714 INFO Test Setup 1: [class] Create new vm with nic nic1, disk and OS with {'diskActive': True, 'wipe_after_delete': 'false', 'slim': False, 'cluster': 'golden_env_mixed_1', 'linked': 'true', 'network': 'ovirtmgmt', 'provisioned_size': 6442450944, 'cpu_socket': 1, 'nicType': 'virtio', 'cpu_cores': 1, 'plugged': 'true', 'start': 'false', 'vmDescription': '', 'memory': 1073741824, 'vmName': 'vm_thin_disk_image_11619', 'template': 'Blank', 'volumeFormat': 'cow', 'display_type': 'spice', 'installation': False, 'storageDomainName': 'fcp_1', 'attempt': 60, 'diskInterface': 'virtio', 'bootable': True, 'type': 'desktop', 'copy_permissions': False, 'interval': 60, 'volumeType': True, 'diskType': 'data', 'useAgent': True, 'async': False, 'os_type': 'RHEL6x64'}
22:51:03 2019-07-11 22:51:03,718 INFO Test Setup 2: [class] Description vm_thin_disk_image_11619: add new vm (without starting it) with {'name': 'vm_thin_disk_image_11619', 'cpu_socket': 1, 'copy_permissions': False, 'cpu_cores': 1, 'cluster': 'golden_env_mixed_1', 'display_type': 'spice', 'template': 'Blank', 'memory': 1073741824, 'async': False, 'os_type': 'RHEL6x64', 'type': 'desktop', 'wait': True}
22:51:04 2019-07-11 22:51:04,356 INFO Test Setup 3: [class] Add vNIC name nic1 to VM vm_thin_disk_image_11619 with {'interface': 'virtio', 'vnic_profile': 'ovirtmgmt', 'linked': 'true', 'network': 'ovirtmgmt', 'plugged': 'true'}
22:51:26 2019-07-11 22:51:25,927 INFO Test Setup 4: [class] Create new vm with nic nic1, disk and OS with {'diskActive': True, 'wipe_after_delete': 'false', 'slim': False, 'cluster': 'golden_env_mixed_1', 'linked': 'true', 'network': 'ovirtmgmt', 'provisioned_size': 6442450944, 'cpu_socket': 1, 'nicType': 'virtio', 'cpu_cores': 1, 'plugged': 'true', 'start': 'false', 'vmDescription': '', 'memory': 1073741824, 'vmName': 'vm_prealloc_disk_image_11619', 'template': 'Blank', 'volumeFormat': 'raw', 'display_type': 'spice', 'installation': False, 'storageDomainName': 'fcp_1', 'attempt': 60, 'diskInterface': 'virtio', 'bootable': True, 'type': 'desktop', 'copy_permissions': False, 'interval': 60, 'volumeType': False, 'diskType': 'data', 'useAgent': True, 'async': False, 'os_type': 'RHEL6x64'}
22:51:26 2019-07-11 22:51:25,931 INFO Test Setup 5: [class] Description vm_prealloc_disk_image_11619: add new vm (without starting it) with {'name': 'vm_prealloc_disk_image_11619', 'cpu_socket': 1, 'copy_permissions': False, 'cpu_cores': 1, 'cluster': 'golden_env_mixed_1', 'display_type': 'spice', 'template': 'Blank', 'memory': 1073741824, 'async': False, 'os_type': 'RHEL6x64', 'type': 'desktop', 'wait': True}
22:51:26 2019-07-11 22:51:26,661 INFO Test Setup 6: [class] Add vNIC name nic1 to VM vm_prealloc_disk_image_11619 with {'interface': 'virtio', 'vnic_profile': 'ovirtmgmt', 'linked': 'true', 'network': 'ovirtmgmt', 'plugged': 'true'}
22:51:38 2019-07-11 22:51:38,918 INFO 014: storage/rhevmtests.storage.storage_disk_image_format.test_disk_image_format.TestCase11619.test_live_move_disk[fcp]
22:51:38 2019-07-11 22:51:38,918 INFO Start a live disk migration
22:51:39 2019-07-11 22:51:38,918 INFO * Thin provisioned disk should remain the same
22:51:39 2019-07-11 22:51:38,918 INFO * Preallocated disk should change to thin provisioned
22:51:39 2019-07-11 22:51:38,918 INFO STORAGE: FCP
22:51:39 2019-07-11 22:51:38,921 INFO Test Step 7: Start VMs with {'max_workers': 2, 'wait_for_ip': False, 'wait_for_status': 'up', 'vm_list': ['vm_prealloc_disk_image_11619', 'vm_thin_disk_image_11619']}
22:53:00 2019-07-11 22:53:00,305 INFO Test Step 8: Moving disk c09c2b16-f47f-4624-abcb-153da8076fbd to storage domain fcp_2
22:55:35 2019-07-11 22:55:34,943 INFO Test Step 9: Moving disk a1b9bda5-243e-4358-a98a-a398df1d507a to storage domain fcp_2

(Originally by Avihai Efrat)
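The "pre-polling period" entries above follow a simple time gate: after a task is (re-)added, polls are skipped until 60 seconds have elapsed, which is why the re-added task is only polled and cleared at 06:29:44. A sketch of that gating (hypothetical names; the real logic lives in SPMAsyncTask):

```java
// Hypothetical sketch of the pre-polling gate seen in the log above:
// a task added at time T is not polled until T + PRE_POLLING_PERIOD.
public class PrePollingGate {

    static final long PRE_POLLING_PERIOD_MILLIS = 60_000; // "Pre-polling period is 60000 millis"

    final long addedAtMillis;

    PrePollingGate(long addedAtMillis) {
        this.addedAtMillis = addedAtMillis;
    }

    // "is in pre-polling period and should not be polled" while this is false
    boolean shouldPoll(long nowMillis) {
        return nowMillis - addedAtMillis >= PRE_POLLING_PERIOD_MILLIS;
    }

    public static void main(String[] args) {
        PrePollingGate task = new PrePollingGate(0);
        System.out.println(task.shouldPoll(10_000));  // false: still in pre-polling period
        System.out.println(task.shouldPoll(64_000));  // true: period has passed, poll now
    }
}
```

This explains the ~60 s gap between "Adding task ... polling hasn't started yet" and the final successful poll, but not the preceding hours during which the task sat in the DB without being re-added.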
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP. (Originally by pm-rhel)
Created attachment 1590489 [details] engine and vdsm logs 4.3.5-6(ovirt engine 4.3.5.4) (Originally by Avihai Efrat)
Impact:
Cannot deactivate the SD, as the host has active tasks.
Cannot move the host to maintenance/stop the SPM, as the host has active tasks.

Another hit of the same issue on Tier3 runs (a different setup than the previous failure).

The issue was found in the engine one more time; this time an SD could not be deactivated, as the host/SPM had uncleared tasks due to the following, leaving a leftover that caused creation of an SD on the same LUN to fail.

The task started at 2019-07-13 11:35:11 and was cleaned from the DB at 2019-07-13 14:45:09 (~3h later!):

2019-07-13 11:35:17,493+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (EE-ManagedThreadFactory-engine-Thread-34495) [] [within thread]: endAction for action type DestroyImage threw an exception.: java.lang.NullPointerException
    at org.ovirt.engine.core.bll.tasks.CoCoAsyncTaskHelper.endAction(CoCoAsyncTaskHelper.java:358) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandCoordinatorImpl.endAction(CommandCoordinatorImpl.java:348) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.endCommandAction(CommandAsyncTask.java:150) [bll.jar:]
    at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.lambda$endActionIfNecessary$0(CommandAsyncTask.java:103) [bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:96) [utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_222-ea]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_222-ea]
    at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_222-ea]
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]

Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' flow in the engine log:
[root@hosted-engine-04 ovirt-engine]# zgrep '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' engine.log*

Started at 2019-07-13 11:35:11 and cleaned from the DB at 2019-07-13 14:45:09 (~3h later!):

engine.log-20190714.gz:2019-07-13 11:35:11,009+03 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (EE-ManagedThreadFactory-commandCoordinator-Thread-4) [snapshots_delete_350a15e1-140b-40c8] CommandMultiAsyncTasks::attachTask: Attaching task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' to command '1117605c-a708-4f79-9b7e-a2bc36269a0c'.
engine.log-20190714.gz:2019-07-13 11:35:11,018+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-commandCoordinator-Thread-4) [snapshots_delete_350a15e1-140b-40c8] Adding task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
engine.log-20190714.gz:2019-07-13 11:35:11,025+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-commandCoordinator-Thread-4) [snapshots_delete_350a15e1-140b-40c8] BaseAsyncTask::startPollingTask: Starting to poll task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'.
engine.log-20190714.gz:2019-07-13 11:35:11,025+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-commandCoordinator-Thread-4) [snapshots_delete_350a15e1-140b-40c8] BaseAsyncTask::startPollingTask: Starting to poll task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'.
engine.log-20190714.gz:2019-07-13 11:35:17,491+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-64) [] SPMAsyncTask::PollTask: Polling task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'.
engine.log-20190714.gz:2019-07-13 11:35:17,491+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-64) [] BaseAsyncTask::onTaskEndSuccess: Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'DestroyImage', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
engine.log-20190714.gz:2019-07-13 11:35:17,515+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-34495) [] BaseAsyncTask::removeTaskFromDB: Removed task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' from DataBase
engine.log-20190714.gz:2019-07-13 11:36:03,592+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [41f555ad] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_1', pool id '449cbed6-2319-4cc4-8c6e-8eea85bfc222' as there are uncleared tasks 'Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9', status 'finished''
engine.log-20190714.gz:2019-07-13 11:36:03,596+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [41f555ad] EVENT_ID: VDS_ALERT_NOT_STOPPING_SPM_UNCLEARED_TASKS(9,030), Not stopping SPM on vds host_mixed_1, pool id 449cbed6-2319-4cc4-8c6e-8eea85bfc222 as there are uncleared tasks Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9', status 'finished'
engine.log-20190714.gz:2019-07-13 11:36:06,428+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (EE-ManagedThreadFactory-engine-Thread-34526) [e493569] SpmStopVDSCommand::Not stopping SPM on vds 'host_mixed_1', pool id '449cbed6-2319-4cc4-8c6e-8eea85bfc222' as there are uncleared tasks 'Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9', status 'finished''
engine.log-20190714.gz:2019-07-13 11:36:06,433+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-34526) [e493569] EVENT_ID: VDS_ALERT_NOT_STOPPING_SPM_UNCLEARED_TASKS(9,030), Not stopping SPM on vds host_mixed_1, pool id 449cbed6-2319-4cc4-8c6e-8eea85bfc222 as there are uncleared tasks Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9', status 'finished'
engine.log-20190714.gz:2019-07-13 14:44:01,657+03 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engine-Thread-10) [6c60c302] Adding task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters'), polling hasn't started yet..
engine.log-20190714.gz:2019-07-13 14:44:01,772+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engine-Thread-10) [6c60c302] BaseAsyncTask::startPollingTask: Starting to poll task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'.
engine.log-20190714.gz:2019-07-13 14:44:07,658+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-36) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:44:17,664+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-96) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:44:27,668+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-72) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:44:37,673+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-46) [452a4857] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:44:47,677+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-3) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:44:57,681+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-21) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' is in pre-polling period and should not be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:45:07,685+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:45:07,689+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:45:09,194+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] Task id '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' has passed pre-polling period time and should be polled. Pre-polling period is 60000 millis.
engine.log-20190714.gz:2019-07-13 14:45:09,194+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] SPMAsyncTask::PollTask: Polling task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'success'.
engine.log-20190714.gz:2019-07-13 14:45:09,194+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] BaseAsyncTask::onTaskEndSuccess: Task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended successfully.
engine.log-20190714.gz:2019-07-13 14:45:09,194+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] SPMAsyncTask::ClearAsyncTask: Attempting to clear task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'
engine.log-20190714.gz:2019-07-13 14:45:09,195+03 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] START, SPMClearTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='449cbed6-2319-4cc4-8c6e-8eea85bfc222', ignoreFailoverLimit='false', taskId='4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'}), log id: 614994cf
engine.log-20190714.gz:2019-07-13 14:45:09,196+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] START, HSMClearTaskVDSCommand(HostName = host_mixed_1, HSMTaskGuidBaseVDSCommandParameters:{hostId='6b0e3e6a-f290-4fa0-ba07-de8da47f48ae', taskId='4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9'}), log id: 649fbebb
engine.log-20190714.gz:2019-07-13 14:45:09,231+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-81) [] BaseAsyncTask::removeTaskFromDB: Removed task '4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9' from DataBase

(Originally by Avihai Efrat)
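The SpmStopVDSCommand refusals above come down to a guard: SPM stop is skipped while any task is still present, and a task in status 'finished' still counts as uncleared until it is actually removed. A sketch of that guard (hypothetical names, not the real SpmStopVDSCommand code):

```java
import java.util.Map;

// Hypothetical sketch of the guard behind
// "Not stopping SPM ... as there are uncleared tasks".
public class SpmStopGuard {

    // task id -> status; note that status 'finished' is not the same as
    // 'cleared' - the task blocks SPM stop until removed from the store.
    static boolean canStopSpm(Map<String, String> tasks) {
        return tasks.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> tasks = new java.util.HashMap<>();
        tasks.put("4bcfc7c3-f8a4-444e-8468-6f51c6c1b4b9", "finished");
        System.out.println(canStopSpm(tasks)); // false: 'finished' still blocks SPM stop
        tasks.clear();
        System.out.println(canStopSpm(tasks)); // true: no uncleared tasks remain
    }
}
```

This is why the failed endAction matters operationally: the NPE leaves the task lingering in 'finished', and everything gated on "no uncleared tasks" (SPM stop, host maintenance, SD deactivation) is blocked until the task is eventually re-added and cleared hours later.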
Adding TC scenario and timestamps:

11:30:30 2019-07-13 11:30:30,116 INFO Test Setup 1: Create data-center dc_type_1311303006 with cluster cl_type_1311303006 and host host_mixed_1
11:30:30 2019-07-13 11:30:30,116 INFO Test Step 2: Add data-center dc_type_1311303006
11:30:30 2019-07-13 11:30:30,243 INFO Test Step 3: Add cluster cl_type_1311303006
11:30:30 2019-07-13 11:30:30,246 INFO Test Setup 4: [class] Add cluster with {'version': '4.2', 'data_center': 'dc_type_1311303006', 'cpu': 'Intel Westmere Family', 'name': 'cl_type_1311303006'}
11:30:30 2019-07-13 11:30:30,448 INFO Test Step 5: Move host host_mixed_1 to cluster cl_type_1311303006
11:30:30 2019-07-13 11:30:30,451 INFO Test Setup 6: [class] Switch host host_mixed_1 to different cluster cl_type_1311303006 with {'activate': True}
11:30:30 2019-07-13 11:30:30,503 INFO Test Setup 7: [class] Deactivate the host host_mixed_1. with {'expected_status': 'maintenance'}
11:30:41 2019-07-13 11:30:40,943 INFO Test Setup 8: [class] Update properties of host host_mixed_1 (provided in parameters) with {'cluster': 'cl_type_1311303006'}
11:30:43 2019-07-13 11:30:43,238 INFO Test Setup 9: [class] Activate host host_mixed_1 (set status to UP) with {'wait': True}
11:30:53 2019-07-13 11:30:53,443 INFO Test Setup 10: Move host host_mixed_1 into the newly created cluster
11:30:53 2019-07-13 11:30:53,445 INFO Test Setup 11: [class] Switch host host_mixed_1 to different cluster cl_type_1311303006 with {'activate': True}
11:30:53 2019-07-13 11:30:53,499 INFO Test Setup 12: [class] Deactivate the host host_mixed_1. with {'expected_status': 'maintenance'}
11:31:04 2019-07-13 11:31:03,958 INFO Test Setup 13: [class] Update properties of host host_mixed_1 (provided in parameters) with {'cluster': 'cl_type_1311303006'}
11:31:04 2019-07-13 11:31:04,244 INFO Test Setup 14: [class] Activate host host_mixed_1 (set status to UP) with {'wait': True}
11:31:14 2019-07-13 11:31:14,473 INFO Test Setup 15: Create new storage domain sd_TestCase26087_1311311447
11:31:14 2019-07-13 11:31:14,476 INFO Test Setup 16: [class] Add storage domain to given data-center with {'index': 0, 'storage_type': 'iscsi', 'storage_domain': 'sd_TestCase26087_1311311447', 'data_center': 'dc_type_1311303006'}
11:32:32 2019-07-13 11:32:32,753 INFO Test Setup 17: Creating VM vm_TestCase26087_1311323275
11:32:32 2019-07-13 11:32:32,837 INFO Test Setup 18: [class] Create new vm with nic nic1, disk and OS with {'diskActive': True, 'wipe_after_delete': 'false', 'slim': False, 'cluster': 'cl_type_1311303006', 'linked': 'true', 'network': 'ovirtmgmt', 'provisioned_size': 6442450944, 'cpu_socket': 1, 'nicType': 'virtio', 'cpu_cores': 1, 'plugged': 'true', 'start': 'false', 'vmDescription': '', 'memory': 1073741824, 'vmName': 'vm_TestCase26087_1311323275', 'template': 'Blank', 'volumeFormat': 'cow', 'display_type': 'spice', 'installation': False, 'storageDomainName': 'sd_TestCase26087_1311311447', 'attempt': 60, 'diskInterface': 'virtio', 'bootable': True, 'type': 'desktop', 'copy_permissions': False, 'interval': 60, 'volumeType': True, 'diskType': 'data', 'useAgent': True, 'async': False, 'os_type': 'RHEL6x64'}
11:32:32 2019-07-13 11:32:32,840 INFO Test Setup 19: [class] Description vm_TestCase26087_1311323275: add new vm (without starting it) with {'name': 'vm_TestCase26087_1311323275', 'cpu_socket': 1, 'copy_permissions': False, 'cpu_cores': 1, 'cluster': 'cl_type_1311303006', 'display_type': 'spice', 'template': 'Blank', 'memory': 1073741824, 'async': False, 'os_type': 'RHEL6x64', 'type': 'desktop', 'wait': True}
11:32:33 2019-07-13 11:32:33,512 INFO Test Setup 20: [class] Add vNIC name nic1 to VM vm_TestCase26087_1311323275 with {'interface': 'virtio', 'vnic_profile': 'ovirtmgmt', 'linked': 'true', 'network': 'ovirtmgmt', 'plugged': 'true'}
11:32:45 2019-07-13 11:32:45,023 INFO 191: storage/rhevmtests.storage.storage_remove_snapshots.test_live_merge.TestCase26087.test_merge_after_cluster_upgrade[iscsi]
11:32:45 2019-07-13 11:32:45,023 INFO Live merge after cluster upgrade from 4.2 to latest version
11:32:45 2019-07-13 11:32:45,024 INFO STORAGE: ISCSI
11:32:45 2019-07-13 11:32:45,024 INFO Test Setup 21: Starting VM vm_TestCase26087_1311323275 with cloud-init as run once
11:32:45 2019-07-13 11:32:45,026 INFO Test Step 22: Run VM vm_TestCase26087_1311323275 once with {'use_cloud_init': True, 'pause': False, 'wait_for_state': 'up', 'use_sysprep': False}
11:34:13 2019-07-13 11:34:06,098 INFO Test Setup 23: Upgrading cluster cl_type_1311303006 from version 4.2 to version 4.3
11:34:13 2019-07-13 11:34:06,102 INFO Test Step 24: Update cluster cl_type_1311303006 with {'version': '4.3'}
11:34:14 2019-07-13 11:34:06,591 INFO Test Step 25: Add snapshot to VM vm_TestCase26087_1311323275 with {'persist_memory': False, 'description': 'snap_test_merge_after_cluste_1311340658', 'wait': True}
11:34:30 2019-07-13 11:34:30,519 INFO Test Setup 26: Removing snapshot snap_test_merge_after_cluste_1311340658 for VM vm_TestCase26087_1311323275
11:34:30 2019-07-13 11:34:30,521 INFO Test Step 27: Remove vm vm_TestCase26087_1311323275 snapshot with {'description': 'snap_test_merge_after_cluste_1311340658', 'timeout': 2400, 'wait': True}
11:35:17 2019-07-13 11:35:17,888 INFO Result: PASSED
11:35:17 2019-07-13 11:35:17,892 INFO Test Teardown 28: Power off VM vm_TestCase26087_1311323275
11:35:18 2019-07-13 11:35:18,435 INFO Test Teardown 29: [class] Stop vm vm_TestCase26087_1311323275 with {'async': 'true'}
11:35:21 2019-07-13 11:35:21,526 INFO Test Teardown 30: Remove VM vm_TestCase26087_1311323275
11:35:21 2019-07-13 11:35:21,780 INFO Test Teardown 31: [class] Remove VM vm_TestCase26087_1311323275 with {'wait': False}
11:35:27 2019-07-13 11:35:27,838 INFO Test Teardown 32: Clean data-center dc_type_1311303006 and remove it
11:35:28 2019-07-13 11:35:28,014 INFO Test Step 33: Deactivate storage-domain sd_TestCase26087_1311311447
11:40:34 E

(Originally by Avihai Efrat)
*** Bug 1733931 has been marked as a duplicate of this bug. *** (Originally by Dana Elfassy)
Please provide a clear reproducer to verify this bug; without one I cannot provide qa_ack for it. Is the following still the verification scenario?

How reproducible: NFS storage

Steps to Reproduce:
1. Create VM with disk + OS
2. Add 4 disks to VM
3. Create live snapshot
4. Perform dd from the OS disk to one of the other disks => can you give the exact dd CLI you used here?
5. Live merge the snapshot

(Originally by Avihai Efrat)
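Since the exact step-4 command was never posted in this report, the sketch below shows a plausible shape for it; the paths in the comment are pure assumptions, and the runnable part substitutes temp files for a second disk so it can execute anywhere:

```shell
# Hypothetical shape of the step-4 command (paths are assumptions, not from the bug):
#   dd if=/root/1GB.bin of=/mnt_dir/1GB_target.bin bs=1M
# Runnable stand-in using temp files instead of a mounted second disk:
src=$(mktemp); dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=1M count=4 2>/dev/null   # create a 4 MiB source file
dd if="$src" of="$dst" bs=1M 2>/dev/null                 # the sustained copy under test
cmp -s "$src" "$dst" && echo "copy matches source"
rm -f "$src" "$dst"
```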
Tested on:
ovirt-engine-4.3.6.5-0.1.el7.noarch
vdsm-4.30.30-1.el7ev.x86_64

Verification steps:
1. Create a VM with a disk (on NFS) and with an OS (RHEL 8) from a template
2. Add 3 more disks (NFS): 1 disk of 10 GiB, 3 disks of 5 GiB
3. Start the VM
4. Create a bash file on the root of the OS disk containing:
   #!/bin/bash
   while true
   do
       dd if=1GB.bin of=/mnt_dir/1GB_target.bin
   done
5. Create a 1 GiB file on the OS disk using:
   # dd if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock
6. Create an FS on the 10 GiB disk and mount it
7. Execute the dd bash file
8. Take a live snapshot (with memory)
9. Live merge the snapshot

* Tried steps 8 & 9 multiple times during the dd operation; all completed without any errors or exceptions. Also tried steps 8 & 9 while interrupting the bash script and restarting it during the same snapshot creation and/or merge.
Live migrated the target disk (as per comment #1) to another NFS domain while the dd was running; this worked without errors.
Checked that all the tasks were removed from the DB after some time; all tasks were deleted successfully.
After stopping and removing the VM, I tried to move both the SD and the host to maintenance (as per comment #1); all worked as expected.

Moving to VERIFIED
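The IO workload from steps 4-7 above can be combined into one self-contained script. This is only a sketch: the loop is bounded (the original ran forever) and the file is scaled down from 1 GiB to 16 MiB so it terminates quickly; all file and directory names are illustrative:

```shell
#!/bin/sh
# Bounded sketch of the continuous-IO workload (verification steps 4-7).
# Original: 1 GiB source file copied in an infinite loop to a mounted disk;
# here: 16 MiB source, 3 iterations, temp dir standing in for the mount point.
workdir=$(mktemp -d)
mkdir -p "$workdir/mnt_dir"
# Step 5 equivalent: create the source file from random data.
dd if=/dev/urandom of="$workdir/1GB.bin" bs=1M count=16 2>/dev/null
# Steps 4 + 7 equivalent: repeatedly copy the file to the target directory.
i=0
while [ "$i" -lt 3 ]; do
    dd if="$workdir/1GB.bin" of="$workdir/mnt_dir/1GB_target.bin" bs=1M 2>/dev/null
    i=$((i + 1))
done
cmp -s "$workdir/1GB.bin" "$workdir/mnt_dir/1GB_target.bin" && echo "workload done"
rm -rf "$workdir"
```

During the real test, the snapshot and live merge (steps 8 and 9) were performed while this loop was running.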
INFO: Bug status (VERIFIED) wasn't changed, but the following should be fixed: [Tag 'ovirt-engine-4.3.5.6' doesn't contain patch 'https://gerrit.ovirt.org/103068'] gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.3.5.6 For more info please contact: rhv-devops
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:3010
The bz was verified following these steps:

rhv-release-4.4.0-23-001.noarch
vdsm-4.40.5-1.el8ev.x86_64

1. Create a VM with a disk (on NFS) and with an OS (RHEL 8) from a template
2. Add 3 more disks (NFS): 1 disk of 10 GiB (OS), 3 disks of 5 GiB
3. Start the VM
4. Create a bash file on the root of the OS disk containing:
   #!/bin/bash
   while true
   do
       dd if=1GB.bin of=/mnt/1GB_target.bin
   done
5. Create a 1 GiB file on the OS disk using:
   # dd if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock
6. Create an FS on the 10 GiB disk (mkfs.ext4) and mount it on /mnt/
7. Take a live snapshot (with memory)
8. Execute the dd bash file
9. Live merge the snapshot

No errors were seen in the engine log besides this one:

2020-03-09 09:26:08,624+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-56) [] VM 'e78caf37-ddba-499d-9290-eb24f8d2ab84' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='9fa051e3-a635-4917-ae03-365d4228a6af', vmId='e78caf37-ddba-499d-9290-eb24f8d2ab84'}', device='usb', type='CONTROLLER', specParams='[index=0, model=qemu-xhci]', address='', managed='true', plugged='false', readOnly='false', deviceAlias='usb', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}'

One remark: the live merge process takes a long time, approximately 10 minutes, and I don't see why it takes so long.

START: 2020-03-09 09:25:59,861+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-57) [] EVENT_ID: USER_REMOVE_SNAPSHOT(342), Snapshot 'live_snap' deletion for VM 'bz_1727025' was initiated by admin@internal-authz.
END: 2020-03-09 09:35:24,916+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-22) [6b95d438-6b9e-4b4c-b471-81b02ed4977a] EVENT_ID: USER_REMOVE_SNAPSHOT_FINISHED_SUCCESS(356), Snapshot 'live_snap' deletion for VM 'bz_1727025' has been completed.

Attaching vdsm + engine logs (storage-ge-09).

Please note: the vdsm log is flooded with the errors in [1]. They are related to another BZ [2], and we assume they don't affect the functionality of what was tested.

[1] libvirt.libvirtError: Requested operation is not valid: cgroup CPU controller is not mounted
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1810416
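The duration estimate can be checked directly against the START/END audit-log timestamps quoted above; a quick calculation (assuming GNU date is available) gives the actual merge time:

```shell
# Compute live-merge duration from the two AuditLog timestamps above.
start=$(date -d "2020-03-09 09:25:59" +%s)
end=$(date -d "2020-03-09 09:35:24" +%s)
echo "merge took $(( end - start )) seconds ($(( (end - start) / 60 )) min)"
# -> merge took 565 seconds (9 min)
```

So the merge took just under 9.5 minutes, consistent with the ~10-minute observation.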
Created attachment 1668604 [details] verification logs engine + vdsm