Bug 1005527 - Template removal leaves record in async_tasks table
Template removal leaves record in async_tasks table
Status: CLOSED CURRENTRELEASE
Product: oVirt
Classification: Community
Component: ovirt-engine-core (Show other bugs)
3.3
x86_64 Linux
unspecified Severity high
: ---
: 3.3.2
Assigned To: Yair Zaslavsky
infra
:
Depends On: 990543
Blocks:
  Show dependency treegraph
 
Reported: 2013-09-08 02:17 EDT by Yair Zaslavsky
Modified: 2015-03-04 19:19 EST (History)
13 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 990543
Environment:
Last Closed: 2013-12-19 21:01:16 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 17582 None None None Never
oVirt gerrit 18959 None None None Never

  None (edit)
Description Yair Zaslavsky 2013-09-08 02:17:12 EDT
+++ This bug was initially created as a clone of Bug #990543 +++

Description of problem:
After template was removed there is still record in async_tasks table.

# psql engine postgres -c "select task_id, vdsm_task_id, task_params_class from async_tasks"
               task_id                |             vdsm_task_id             |                         task_params_class                          
--------------------------------------+--------------------------------------+--------------------------------------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 | org.ovirt.engine.core.common.action.ImagesContainterParametersBase
(1 row)



Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.9.master.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Remove template
2.
3.

Actual results:
Task is left in DB

Expected results:
Task is removed when finished

Additional info:
Backend logs are attached
I don't know if this is a regression

--- Additional comment from Jakub Libosvar on 2013-07-31 08:38:30 EDT ---

Reproduced on rhevm-3.3.0-0.11.master.el6ev.noarch

--- Additional comment from Yair Zaslavsky on 2013-08-01 04:49:02 EDT ---

Hi,
Is there a way you can provide the results of the following SQL query just so we're sure that this comes indeed from this flow (I assume this was found during automation runs, right?)

select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;

And provide the relevant logs?

--- Additional comment from Jakub Libosvar on 2013-08-01 05:00:59 EDT ---

I still have the record from the time I reported this bug:
# psql engine postgres -c "select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;"               task_id                |             vdsm_task_id             | action_type | task_type |           root_command_id            
--------------------------------------+--------------------------------------+-------------+-----------+--------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 |         203 |         0 | 6ccda64d-b573-43e7-b164-3ced29332510
(1 row)


Task 6ccda64d-b573-43e7-b164-3ced29332510 corresponds with task from attached log.

--- Additional comment from Barak on 2013-08-01 07:48:14 EDT ---

was the task finished in VDSM (SPM) ?

--- Additional comment from Yair Zaslavsky on 2013-08-01 07:56:31 EDT ---

Barak - there there is no vdsm task ID.

--- Additional comment from Jakub Libosvar on 2013-08-01 08:03:13 EDT ---

(In reply to Yair Zaslavsky from comment #5)
> Barak - there there is no vdsm task ID.

Actually there is according engine logs:
2013-07-31 14:27:54,392 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-4-thread-54) CommandMultiAsyncTasks::AttachTask: Attaching task 3b9e5882-0ade-4d8e-8d22-b59234927344 to command 6ccda64d-b573-43e7-b164-3ced29332510.

in vdsm:
Thread-8118::DEBUG::2013-07-31 14:27:52,533::task::579::TaskManager.Task::(_updateState) Task=`3b9e5882-0ade-4d8e-8d22-b59234927344`::moving from state init -> state preparing
Thread-8118::INFO::2013-07-31 14:27:52,533::logUtils::44::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='3381147f-a75f-4f61-ad5e-ff435571ec92', spUUID='b72030a0-767e-4346-a6f0-99c91f1278dc', imgUUID='3a79397d-ac77-4bfb-baab-01ce3d888c14', postZero='false', force='false')

Thread-8126::INFO::2013-07-31 14:28:01,555::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='3b9e5882-0ade-4d8e-8d22-b59234927344', spUUID=None, options=None)
Thread-8126::DEBUG::2013-07-31 14:28:01,555::taskManager::161::TaskManager::(clearTask) Entry. taskID: 3b9e5882-0ade-4d8e-8d22-b59234927344
Thread-8126::DEBUG::2013-07-31 14:28:01,556::taskManager::166::TaskManager::(clearTask) Return.
Thread-8126::INFO::2013-07-31 14:28:01,557::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None

(In reply to Barak from comment #4)
> was the task finished in VDSM (SPM) ?

According vdsm logs yes.

--- Additional comment from Jakub Libosvar on 2013-08-01 08:04:10 EDT ---

Attaching vdsm log containing relevant task

--- Additional comment from Yair Zaslavsky on 2013-08-01 08:07:34 EDT ---

Managed to reproduce upstream.

Reproduction steps:
1. Create VM with 2 images
2. Create template
3. Remove template - two records remain.

--- Additional comment from Yair Zaslavsky on 2013-08-01 09:46:57 EDT ---

Here is my analysis:

The command flow is as follows:

RemoveVmTemplate -> RemoveAllVmTemplateImageTemplates ->  X times of RemoveTemplateSnapshot

RemoveAllVMTemplateImageTemplate call is done inside a new transaction which is bad - as in RemoveTemplateSnapshot there are VDSM calls.

executeAction for each child command that directly creates a task is as follows -a. Guid taskId = persistAsyncTaskPlaceHolder(VdcActionType.RemoveVmTemplate)

This methods creates a new placeholder - it checks if the task already exists in the tasks map for the command, and if not - creates it and saves to do db.
(The transaction scope of this is determined by the caller of this method, so in the current case - it is done inside the big transaction, let's call it transaction A).

b. When SPM call returns the "vdsm task id" -

   getReturnValue().getInternalVdsmTaskIdList().add(
                    createTask(taskId,
                            vdsReturnValue.getCreationInfo(),
                            VdcActionType.RemoveVmTemplate,
                            VdcObjectType.Storage,
                            getParameters().getStorageDomainId()));

createTask method tries to fetch the task from DB, and if is not found, creates a new task, and save it - this is done in a new transaction (as many child commands are non transactive). Let's mark this transaction as transaction B (bare in mind - A is still open).

Since we work with READ UNCOMITTED isolation level, transaction B is not aware to changes made by A, so the task added to the db by transaction A is not recognized, and a new task is aded.

What I suggest - 
a. As overall solution - we have to go over places of the code and fix child commands to be non transactive (I'll open a separate bug for this specific flow).
b. A new version for persistPlaceHolder that will perform the persistence in a new transaction (let's call it C) should be introduced as an intermid solution.
This way, transaction B will be ware to changes introduced by the already committed transaction B.

--- Additional comment from Yair Zaslavsky on 2013-08-01 10:20:01 EDT ---

Actually, a better solution for b is to use the command transactivity to ensure the c
For a command that is transactive, the record will be inserted in a new transaction (RequiresNew scope) and for a command that is non transactive, the "Required" scope will be used (as if persistePlaceHolder is called from insertPlaceHolders - all place holders should be interested in a single transaction).

--- Additional comment from Yair Zaslavsky on 2013-08-01 10:21:08 EDT ---

(In reply to Yair Zaslavsky from comment #10)
> Actually, a better solution for b is to use the command transactivity to
> ensure the c

Disregard "to ensure the c" - bad wording/typo.
> For a command that is transactive, the record will be inserted in a new
> transaction (RequiresNew scope) and for a command that is non transactive,
> the "Required" scope will be used (as if persistePlaceHolder is called from
> insertPlaceHolders - all place holders should be interested in a single
> transaction).

--- Additional comment from Arik on 2013-08-13 02:34:04 EDT ---



--- Additional comment from Barak Dagan on 2013-09-02 14:09:05 EDT ---

Verified on is12
Comment 1 Itamar Heim 2013-12-19 21:01:16 EST
this should have been fixed in 3.3 by now.

Note You need to log in before you can comment on or make changes to this bug.