Description of problem: An exception is thrown during HSMClearTaskVDSCommand: Error in the engine - 2017-06-21 09:30:17,609+03 ERROR [org.ovirt.engine.core.bll.job.JobRepositoryImpl] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] Failed to save step '846f016e-6a8f-4ecf-9a06-4f7375129b91', 'FINALIZING': CallableStatementCallback; SQL [{call insertstep(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job" Detail: Key (job_id)=(c6fc638d-34c2-4d62-9844-74dd2238e8a9) is not present in table "job". Where: SQL statement "INSERT INTO step ( step_id, parent_step_id, job_id, step_type, description, step_number, status, progress, start_time, end_time, correlation_id, external_id, external_system_type, is_external ) VALUES ( v_step_id, v_parent_step_id, v_job_id, v_step_type, v_description, v_step_number, v_status, v_progress, v_start_time, v_end_time, v_correlation_id, v_external_id, v_external_system_type, v_is_external )" PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job" Detail: Key (job_id)=(c6fc638d-34c2-4d62-9844-74dd2238e8a9) is not present in table "job". Where: SQL statement "INSERT INTO step ( step_id, parent_step_id, job_id, step_type, description, step_number, status, progress, start_time, end_time, correlation_id, external_id, external_system_type, is_external ) VALUES ( v_step_id, v_parent_step_id, v_job_id, v_step_type, v_description, v_step_number, v_status, v_progress, v_start_time, v_end_time, v_correlation_id, v_external_id, v_external_system_type, v_is_external )" PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement 2017-06-21 09:30:17,610+03 INFO [org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand' successfully. 2017-06-21 09:30:17,613+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] [within thread]: endAction for action type RemoveImage threw an exception.: java.lang.NullPointerException at org.ovirt.engine.core.bll.job.ExecutionHandler.endJob(ExecutionHandler.java:520) [bll.jar:] at org.ovirt.engine.core.bll.job.ExecutionHandler.endTaskJobIfNeeded(ExecutionHandler.java:851) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.endStepsAndJobIfNeeded(CommandBase.java:590) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:563) [bll.jar:] at org.ovirt.engine.core.bll.tasks.DecoratedCommand.endAction(DecoratedCommand.java:17) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CoCoAsyncTaskHelper.endAction(CoCoAsyncTaskHelper.java:337) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandCoordinatorImpl.endAction(CommandCoordinatorImpl.java:340) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.endCommandAction(CommandAsyncTask.java:154) [bll.jar:] at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.lambda$endActionIfNecessary$0(CommandAsyncTask.java:106) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:84) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_131] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131] 2017-06-21 09:30:17,614+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] CommandAsyncTask::HandleEndActionResult: endAction for action type 'RemoveImage' threw an unrecoverable RuntimeException the task will be cleared. 2017-06-21 09:30:17,617+03 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] BaseAsyncTask::removeTaskFromDB: Removed task '50e8f07a-26ff-4733-b836-f20f2fb34061' from DataBase 2017-06-21 09:30:17,617+03 INFO [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] CommandAsyncTask::HandleEndActionRuntimeException: Removing CommandMultiAsyncTasks object for entity '5fe84ab4-d124-4fb9-8ec0-1db553c9ab8e' Version-Release number of selected component (if applicable): vdsm-4.19.19-1.el7ev.x86_64 ovirt-engine-4.1.3.4-0.1.el7.noarch How reproducible: ? Steps to Reproduce: Occurred during automation run (failed to reproduced it localy)- 1. In environment with template which has disk on 2 domains (copies) 2. Create a VM with disk from template as thin on the first domain 3. Enter the second domain to maintenance 4. Run the VM 5. Try to migrate the VM to the second domain. Actual results: An exception was thrown Expected results: We should fail to migrate the disk Additional info: Engine log attached
Fred, didn't you recently handle something like this?
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
No, first time I see this issue. Eyal, can you provide logs ?
The engine logs are attached
(In reply to Eyal Shenitzky from comment #4) > The engine logs are attached It is marked as "deleted" ...
Created attachment 1291715 [details] engine logs
Eyal, can you provide the automation code for that scenario ? Thanks
The automation code is not a simple python script, It is a large scale project that I can't provide on this platform
Tested on ovirt-engine-4.1.4-0.2.el7.noarch All automation tests (mentioned in comment #9) passed