Bug 990543

Summary: Template removal leaves record in async_tasks table
Product: Red Hat Enterprise Virtualization Manager Reporter: Jakub Libosvar <jlibosva>
Component: ovirt-engineAssignee: Yair Zaslavsky <yzaslavs>
Status: CLOSED CURRENTRELEASE QA Contact: Barak Dagan <bdagan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.3.0CC: acathrow, bazulay, bdagan, cdhouch, eedri, iheim, jlibosva, lpeer, oourfali, oramraz, pstehlik, Rhev-m-bugs, yeylon, yzaslavs
Target Milestone: ---   
Target Release: 3.3.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: infra
Fixed In Version: is8 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1005527 (view as bug list) Environment:
Last Closed: 2014-01-21 22:17:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1005527    
Attachments:
Description Flags
backend log
none
vdsm log none

Description Jakub Libosvar 2013-07-31 12:33:38 UTC
Created attachment 781087 [details]
backend log

Description of problem:
After template was removed there is still record in async_tasks table.

# psql engine postgres -c "select task_id, vdsm_task_id, task_params_class from async_tasks"
               task_id                |             vdsm_task_id             |                         task_params_class                          
--------------------------------------+--------------------------------------+--------------------------------------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 | org.ovirt.engine.core.common.action.ImagesContainterParametersBase
(1 row)



Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.9.master.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Remove template
2.
3.

Actual results:
Task is left in DB

Expected results:
Task is removed when finished

Additional info:
Backend logs are attached
I don't know if this is a regression

Comment 1 Jakub Libosvar 2013-07-31 12:38:30 UTC
Reproduced on rhevm-3.3.0-0.11.master.el6ev.noarch

Comment 2 Yair Zaslavsky 2013-08-01 08:49:02 UTC
Hi,
Is there a way you can provide the results of the following SQL query just so we're sure that this comes indeed from this flow (I assume this was found during automation runs, right?)

select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;

And provide the relevant logs?

Comment 3 Jakub Libosvar 2013-08-01 09:00:59 UTC
I still have the record from the time I reported this bug:
# psql engine postgres -c "select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;"               task_id                |             vdsm_task_id             | action_type | task_type |           root_command_id            
--------------------------------------+--------------------------------------+-------------+-----------+--------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 |         203 |         0 | 6ccda64d-b573-43e7-b164-3ced29332510
(1 row)


Task 6ccda64d-b573-43e7-b164-3ced29332510 corresponds with task from attached log.

Comment 4 Barak 2013-08-01 11:48:14 UTC
was the task finished in VDSM (SPM) ?

Comment 5 Yair Zaslavsky 2013-08-01 11:56:31 UTC
Barak - there there is no vdsm task ID.

Comment 6 Jakub Libosvar 2013-08-01 12:03:13 UTC
(In reply to Yair Zaslavsky from comment #5)
> Barak - there there is no vdsm task ID.

Actually there is according engine logs:
2013-07-31 14:27:54,392 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-4-thread-54) CommandMultiAsyncTasks::AttachTask: Attaching task 3b9e5882-0ade-4d8e-8d22-b59234927344 to command 6ccda64d-b573-43e7-b164-3ced29332510.

in vdsm:
Thread-8118::DEBUG::2013-07-31 14:27:52,533::task::579::TaskManager.Task::(_updateState) Task=`3b9e5882-0ade-4d8e-8d22-b59234927344`::moving from state init -> state preparing
Thread-8118::INFO::2013-07-31 14:27:52,533::logUtils::44::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='3381147f-a75f-4f61-ad5e-ff435571ec92', spUUID='b72030a0-767e-4346-a6f0-99c91f1278dc', imgUUID='3a79397d-ac77-4bfb-baab-01ce3d888c14', postZero='false', force='false')

Thread-8126::INFO::2013-07-31 14:28:01,555::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='3b9e5882-0ade-4d8e-8d22-b59234927344', spUUID=None, options=None)
Thread-8126::DEBUG::2013-07-31 14:28:01,555::taskManager::161::TaskManager::(clearTask) Entry. taskID: 3b9e5882-0ade-4d8e-8d22-b59234927344
Thread-8126::DEBUG::2013-07-31 14:28:01,556::taskManager::166::TaskManager::(clearTask) Return.
Thread-8126::INFO::2013-07-31 14:28:01,557::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None

(In reply to Barak from comment #4)
> was the task finished in VDSM (SPM) ?

According vdsm logs yes.

Comment 7 Jakub Libosvar 2013-08-01 12:04:10 UTC
Created attachment 781549 [details]
vdsm log

Attaching vdsm log containing relevant task

Comment 8 Yair Zaslavsky 2013-08-01 12:07:34 UTC
Managed to reproduce upstream.

Reproduction steps:
1. Create VM with 2 images
2. Create template
3. Remove template - two records remain.

Comment 9 Yair Zaslavsky 2013-08-01 13:46:57 UTC
Here is my analysis:

The command flow is as follows:

RemoveVmTemplate -> RemoveAllVmTemplateImageTemplates ->  X times of RemoveTemplateSnapshot

RemoveAllVMTemplateImageTemplate call is done inside a new transaction which is bad - as in RemoveTemplateSnapshot there are VDSM calls.

executeAction for each child command that directly creates a task is as follows -a. Guid taskId = persistAsyncTaskPlaceHolder(VdcActionType.RemoveVmTemplate)

This methods creates a new placeholder - it checks if the task already exists in the tasks map for the command, and if not - creates it and saves to do db.
(The transaction scope of this is determined by the caller of this method, so in the current case - it is done inside the big transaction, let's call it transaction A).

b. When SPM call returns the "vdsm task id" -

   getReturnValue().getInternalVdsmTaskIdList().add(
                    createTask(taskId,
                            vdsReturnValue.getCreationInfo(),
                            VdcActionType.RemoveVmTemplate,
                            VdcObjectType.Storage,
                            getParameters().getStorageDomainId()));

createTask method tries to fetch the task from DB, and if is not found, creates a new task, and save it - this is done in a new transaction (as many child commands are non transactive). Let's mark this transaction as transaction B (bare in mind - A is still open).

Since we work with READ UNCOMITTED isolation level, transaction B is not aware to changes made by A, so the task added to the db by transaction A is not recognized, and a new task is aded.

What I suggest - 
a. As overall solution - we have to go over places of the code and fix child commands to be non transactive (I'll open a separate bug for this specific flow).
b. A new version for persistPlaceHolder that will perform the persistence in a new transaction (let's call it C) should be introduced as an intermid solution.
This way, transaction B will be ware to changes introduced by the already committed transaction B.

Comment 10 Yair Zaslavsky 2013-08-01 14:20:01 UTC
Actually, a better solution for b is to use the command transactivity to ensure the c
For a command that is transactive, the record will be inserted in a new transaction (RequiresNew scope) and for a command that is non transactive, the "Required" scope will be used (as if persistePlaceHolder is called from insertPlaceHolders - all place holders should be interested in a single transaction).

Comment 11 Yair Zaslavsky 2013-08-01 14:21:08 UTC
(In reply to Yair Zaslavsky from comment #10)
> Actually, a better solution for b is to use the command transactivity to
> ensure the c

Disregard "to ensure the c" - bad wording/typo.
> For a command that is transactive, the record will be inserted in a new
> transaction (RequiresNew scope) and for a command that is non transactive,
> the "Required" scope will be used (as if persistePlaceHolder is called from
> insertPlaceHolders - all place holders should be interested in a single
> transaction).

Comment 12 Arik 2013-08-13 06:34:04 UTC
*** Bug 995616 has been marked as a duplicate of this bug. ***

Comment 13 Barak Dagan 2013-09-02 18:09:05 UTC
Verified on is12

Comment 14 Itamar Heim 2014-01-21 22:17:57 UTC
Closing - RHEV 3.3 Released

Comment 15 Itamar Heim 2014-01-21 22:24:26 UTC
Closing - RHEV 3.3 Released