Bug 990543 - Template removal leaves record in async_tasks table
Template removal leaves record in async_tasks table
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.3.0
x86_64 Linux
unspecified Severity high
: ---
: 3.3.0
Assigned To: Yair Zaslavsky
Barak Dagan
infra
:
: 995616 (view as bug list)
Depends On:
Blocks: 1005527
  Show dependency treegraph
 
Reported: 2013-07-31 08:33 EDT by Jakub Libosvar
Modified: 2016-02-10 14:36 EST (History)
14 users (show)

See Also:
Fixed In Version: is8
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1005527 (view as bug list)
Environment:
Last Closed: 2014-01-21 17:17:57 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
backend log (7.12 KB, text/x-log)
2013-07-31 08:33 EDT, Jakub Libosvar
no flags Details
vdsm log (72.65 KB, application/x-xz)
2013-08-01 08:04 EDT, Jakub Libosvar
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 17582 None None None Never
oVirt gerrit 18959 None None None Never

  None (edit)
Description Jakub Libosvar 2013-07-31 08:33:38 EDT
Created attachment 781087 [details]
backend log

Description of problem:
After template was removed there is still record in async_tasks table.

# psql engine postgres -c "select task_id, vdsm_task_id, task_params_class from async_tasks"
               task_id                |             vdsm_task_id             |                         task_params_class                          
--------------------------------------+--------------------------------------+--------------------------------------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 | org.ovirt.engine.core.common.action.ImagesContainterParametersBase
(1 row)



Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.9.master.el6ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Remove template
2.
3.

Actual results:
Task is left in DB

Expected results:
Task is removed when finished

Additional info:
Backend logs are attached
I don't know if this is a regression
Comment 1 Jakub Libosvar 2013-07-31 08:38:30 EDT
Reproduced on rhevm-3.3.0-0.11.master.el6ev.noarch
Comment 2 Yair Zaslavsky 2013-08-01 04:49:02 EDT
Hi,
Is there a way you can provide the results of the following SQL query just so we're sure that this comes indeed from this flow (I assume this was found during automation runs, right?)

select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;

And provide the relevant logs?
Comment 3 Jakub Libosvar 2013-08-01 05:00:59 EDT
I still have the record from the time I reported this bug:
# psql engine postgres -c "select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;"               task_id                |             vdsm_task_id             | action_type | task_type |           root_command_id            
--------------------------------------+--------------------------------------+-------------+-----------+--------------------------------------
 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 |         203 |         0 | 6ccda64d-b573-43e7-b164-3ced29332510
(1 row)


Task 6ccda64d-b573-43e7-b164-3ced29332510 corresponds with task from attached log.
Comment 4 Barak 2013-08-01 07:48:14 EDT
was the task finished in VDSM (SPM) ?
Comment 5 Yair Zaslavsky 2013-08-01 07:56:31 EDT
Barak - there there is no vdsm task ID.
Comment 6 Jakub Libosvar 2013-08-01 08:03:13 EDT
(In reply to Yair Zaslavsky from comment #5)
> Barak - there there is no vdsm task ID.

Actually there is according engine logs:
2013-07-31 14:27:54,392 INFO  [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-4-thread-54) CommandMultiAsyncTasks::AttachTask: Attaching task 3b9e5882-0ade-4d8e-8d22-b59234927344 to command 6ccda64d-b573-43e7-b164-3ced29332510.

in vdsm:
Thread-8118::DEBUG::2013-07-31 14:27:52,533::task::579::TaskManager.Task::(_updateState) Task=`3b9e5882-0ade-4d8e-8d22-b59234927344`::moving from state init -> state preparing
Thread-8118::INFO::2013-07-31 14:27:52,533::logUtils::44::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='3381147f-a75f-4f61-ad5e-ff435571ec92', spUUID='b72030a0-767e-4346-a6f0-99c91f1278dc', imgUUID='3a79397d-ac77-4bfb-baab-01ce3d888c14', postZero='false', force='false')

Thread-8126::INFO::2013-07-31 14:28:01,555::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='3b9e5882-0ade-4d8e-8d22-b59234927344', spUUID=None, options=None)
Thread-8126::DEBUG::2013-07-31 14:28:01,555::taskManager::161::TaskManager::(clearTask) Entry. taskID: 3b9e5882-0ade-4d8e-8d22-b59234927344
Thread-8126::DEBUG::2013-07-31 14:28:01,556::taskManager::166::TaskManager::(clearTask) Return.
Thread-8126::INFO::2013-07-31 14:28:01,557::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None

(In reply to Barak from comment #4)
> was the task finished in VDSM (SPM) ?

According vdsm logs yes.
Comment 7 Jakub Libosvar 2013-08-01 08:04:10 EDT
Created attachment 781549 [details]
vdsm log

Attaching vdsm log containing relevant task
Comment 8 Yair Zaslavsky 2013-08-01 08:07:34 EDT
Managed to reproduce upstream.

Reproduction steps:
1. Create VM with 2 images
2. Create template
3. Remove template - two records remain.
Comment 9 Yair Zaslavsky 2013-08-01 09:46:57 EDT
Here is my analysis:

The command flow is as follows:

RemoveVmTemplate -> RemoveAllVmTemplateImageTemplates ->  X times of RemoveTemplateSnapshot

RemoveAllVMTemplateImageTemplate call is done inside a new transaction which is bad - as in RemoveTemplateSnapshot there are VDSM calls.

executeAction for each child command that directly creates a task is as follows -a. Guid taskId = persistAsyncTaskPlaceHolder(VdcActionType.RemoveVmTemplate)

This methods creates a new placeholder - it checks if the task already exists in the tasks map for the command, and if not - creates it and saves to do db.
(The transaction scope of this is determined by the caller of this method, so in the current case - it is done inside the big transaction, let's call it transaction A).

b. When SPM call returns the "vdsm task id" -

   getReturnValue().getInternalVdsmTaskIdList().add(
                    createTask(taskId,
                            vdsReturnValue.getCreationInfo(),
                            VdcActionType.RemoveVmTemplate,
                            VdcObjectType.Storage,
                            getParameters().getStorageDomainId()));

createTask method tries to fetch the task from DB, and if is not found, creates a new task, and save it - this is done in a new transaction (as many child commands are non transactive). Let's mark this transaction as transaction B (bare in mind - A is still open).

Since we work with READ UNCOMITTED isolation level, transaction B is not aware to changes made by A, so the task added to the db by transaction A is not recognized, and a new task is aded.

What I suggest - 
a. As overall solution - we have to go over places of the code and fix child commands to be non transactive (I'll open a separate bug for this specific flow).
b. A new version for persistPlaceHolder that will perform the persistence in a new transaction (let's call it C) should be introduced as an intermid solution.
This way, transaction B will be ware to changes introduced by the already committed transaction B.
Comment 10 Yair Zaslavsky 2013-08-01 10:20:01 EDT
Actually, a better solution for b is to use the command transactivity to ensure the c
For a command that is transactive, the record will be inserted in a new transaction (RequiresNew scope) and for a command that is non transactive, the "Required" scope will be used (as if persistePlaceHolder is called from insertPlaceHolders - all place holders should be interested in a single transaction).
Comment 11 Yair Zaslavsky 2013-08-01 10:21:08 EDT
(In reply to Yair Zaslavsky from comment #10)
> Actually, a better solution for b is to use the command transactivity to
> ensure the c

Disregard "to ensure the c" - bad wording/typo.
> For a command that is transactive, the record will be inserted in a new
> transaction (RequiresNew scope) and for a command that is non transactive,
> the "Required" scope will be used (as if persistePlaceHolder is called from
> insertPlaceHolders - all place holders should be interested in a single
> transaction).
Comment 12 Arik 2013-08-13 02:34:04 EDT
*** Bug 995616 has been marked as a duplicate of this bug. ***
Comment 13 Barak Dagan 2013-09-02 14:09:05 EDT
Verified on is12
Comment 14 Itamar Heim 2014-01-21 17:17:57 EST
Closing - RHEV 3.3 Released
Comment 15 Itamar Heim 2014-01-21 17:24:26 EST
Closing - RHEV 3.3 Released

Note You need to log in before you can comment on or make changes to this bug.