Bug 1464348 - SQL exception occur during HSMClearTaskVDSCommand
Summary: SQL exception occur during HSMClearTaskVDSCommand
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.3.4
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.1.4
: 4.1.4
Assignee: Fred Rolland
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-06-23 07:42 UTC by Eyal Shenitzky
Modified: 2017-07-28 14:20 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-07-28 14:20:38 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments (Terms of Use)
engine log (deleted)
2017-06-23 07:42 UTC, Eyal Shenitzky
no flags Details
engine logs (3.86 MB, application/x-gzip)
2017-06-25 12:44 UTC, Eyal Shenitzky
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 78775 0 master MERGED engine: Detach RemoveImage from MoveDisk command 2017-06-28 12:45:48 UTC
oVirt gerrit 78783 0 ovirt-engine-4.1 MERGED engine: Detach RemoveImage from MoveDisk command 2017-06-29 09:27:03 UTC

Description Eyal Shenitzky 2017-06-23 07:42:46 UTC
Description of problem:

An exception is thrown during HSMClearTaskVDSCommand:

Error in the engine - 

2017-06-21 09:30:17,609+03 ERROR [org.ovirt.engine.core.bll.job.JobRepositoryImpl] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] Failed to save step '846f016e-6a8f-4ecf-9a06-4f7375129b91', 'FINALIZING': CallableStatementCallback; SQL [{call insertstep(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
  Detail: Key (job_id)=(c6fc638d-34c2-4d62-9844-74dd2238e8a9) is not present in table "job".
  Where: SQL statement "INSERT INTO step (
        step_id,
        parent_step_id,
        job_id,
        step_type,
        description,
        step_number,
        status,
        progress,
        start_time,
        end_time,
        correlation_id,
        external_id,
        external_system_type,
        is_external
        )
    VALUES (
        v_step_id,
        v_parent_step_id,
        v_job_id,
        v_step_type,
        v_description,
        v_step_number,
        v_status,
        v_progress,
        v_start_time,
        v_end_time,
        v_correlation_id,
        v_external_id,
        v_external_system_type,
        v_is_external
        )"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
  Detail: Key (job_id)=(c6fc638d-34c2-4d62-9844-74dd2238e8a9) is not present in table "job".
  Where: SQL statement "INSERT INTO step (
        step_id,
        parent_step_id,
        job_id,
        step_type,
        description,
        step_number,
        status,
        progress,
        start_time,
        end_time,
        correlation_id,
        external_id,
        external_system_type,
        is_external
        )
    VALUES (
        v_step_id,
        v_parent_step_id,
        v_job_id,
        v_step_type,
        v_description,
        v_step_number,
        v_status,
        v_progress,
        v_start_time,
        v_end_time,
        v_correlation_id,
        v_external_id,
        v_external_system_type,
        v_is_external
        )"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement
2017-06-21 09:30:17,610+03 INFO  [org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.RemoveImageCommand' successfully.
2017-06-21 09:30:17,613+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] [within thread]: endAction for action type RemoveImage threw an exception.: java.lang.NullPointerException
	at org.ovirt.engine.core.bll.job.ExecutionHandler.endJob(ExecutionHandler.java:520) [bll.jar:]
	at org.ovirt.engine.core.bll.job.ExecutionHandler.endTaskJobIfNeeded(ExecutionHandler.java:851) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endStepsAndJobIfNeeded(CommandBase.java:590) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:563) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.DecoratedCommand.endAction(DecoratedCommand.java:17) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CoCoAsyncTaskHelper.endAction(CoCoAsyncTaskHelper.java:337) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandCoordinatorImpl.endAction(CommandCoordinatorImpl.java:340) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.endCommandAction(CommandAsyncTask.java:154) [bll.jar:]
	at org.ovirt.engine.core.bll.tasks.CommandAsyncTask.lambda$endActionIfNecessary$0(CommandAsyncTask.java:106) [bll.jar:]
	at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:84) [utils.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_131]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]

2017-06-21 09:30:17,614+03 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] CommandAsyncTask::HandleEndActionResult: endAction for action type 'RemoveImage' threw an unrecoverable RuntimeException the task will be cleared.
2017-06-21 09:30:17,617+03 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] BaseAsyncTask::removeTaskFromDB: Removed task '50e8f07a-26ff-4733-b836-f20f2fb34061' from DataBase
2017-06-21 09:30:17,617+03 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-38) [disks_syncAction_376ef93d-3f7a-4149] CommandAsyncTask::HandleEndActionRuntimeException: Removing CommandMultiAsyncTasks object for entity '5fe84ab4-d124-4fb9-8ec0-1db553c9ab8e'


Version-Release number of selected component (if applicable):
vdsm-4.19.19-1.el7ev.x86_64
ovirt-engine-4.1.3.4-0.1.el7.noarch

How reproducible:
?

Steps to Reproduce:
Occurred during automation run (failed to reproduced it localy)- 

1. In environment with template which has disk on 2 domains (copies)
2. Create a VM with disk from template as thin on the first domain
3. Enter the second domain to maintenance
4. Run the VM
5. Try to migrate the VM to the second domain. 

Actual results:
An exception was thrown

Expected results:
We should fail to migrate the disk

Additional info:
Engine log attached

Comment 1 Allon Mureinik 2017-06-25 08:47:07 UTC
Fred, didn't you recently handle something like this?

Comment 2 Red Hat Bugzilla Rules Engine 2017-06-25 08:47:12 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Fred Rolland 2017-06-25 09:19:35 UTC
No, first time I see this issue.
Eyal, can you provide logs ?

Comment 4 Eyal Shenitzky 2017-06-25 11:23:49 UTC
The engine logs are attached

Comment 5 Fred Rolland 2017-06-25 12:05:14 UTC
(In reply to Eyal Shenitzky from comment #4)
> The engine logs are attached

It is marked as "deleted" ...

Comment 6 Eyal Shenitzky 2017-06-25 12:44:54 UTC
Created attachment 1291715 [details]
engine logs

Comment 7 Fred Rolland 2017-06-27 12:21:53 UTC
Eyal, can you provide the automation code for that scenario ?
Thanks

Comment 8 Eyal Shenitzky 2017-06-27 13:05:44 UTC
The automation code is not a simple python script,
It is a large scale project that I can't provide on this platform

Comment 10 Raz Tamir 2017-07-12 07:54:14 UTC
Tested on ovirt-engine-4.1.4-0.2.el7.noarch
All automation tests (mentioned in comment #9) passed


Note You need to log in before you can comment on or make changes to this bug.