Bug 990543
Summary: | Template removal leaves record in async_tasks table | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Jakub Libosvar <jlibosva> | ||||||
Component: | ovirt-engine | Assignee: | Yair Zaslavsky <yzaslavs> | ||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Barak Dagan <bdagan> | ||||||
Severity: | high | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 3.3.0 | CC: | acathrow, bazulay, bdagan, cdhouch, eedri, iheim, jlibosva, lpeer, oourfali, oramraz, pstehlik, Rhev-m-bugs, yeylon, yzaslavs | ||||||
Target Milestone: | --- | ||||||||
Target Release: | 3.3.0 | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | infra | ||||||||
Fixed In Version: | is8 | Doc Type: | Bug Fix | ||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | |||||||||
: | 1005527 (view as bug list) | Environment: | |||||||
Last Closed: | 2014-01-21 22:17:57 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Bug Depends On: | |||||||||
Bug Blocks: | 1005527 | ||||||||
Attachments: |
|
Reproduced on rhevm-3.3.0-0.11.master.el6ev.noarch Hi, Is there a way you can provide the results of the following SQL query just so we're sure that this comes indeed from this flow (I assume this was found during automation runs, right?) select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks; And provide the relevant logs? I still have the record from the time I reported this bug: # psql engine postgres -c "select task_id, vdsm_task_id, action_type, task_type, root_command_id from async_tasks;" task_id | vdsm_task_id | action_type | task_type | root_command_id --------------------------------------+--------------------------------------+-------------+-----------+-------------------------------------- 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 | 203 | 0 | 6ccda64d-b573-43e7-b164-3ced29332510 (1 row) Task 6ccda64d-b573-43e7-b164-3ced29332510 corresponds with task from attached log. was the task finished in VDSM (SPM) ? Barak - there there is no vdsm task ID. (In reply to Yair Zaslavsky from comment #5) > Barak - there there is no vdsm task ID. Actually there is according engine logs: 2013-07-31 14:27:54,392 INFO [org.ovirt.engine.core.bll.CommandMultiAsyncTasks] (pool-4-thread-54) CommandMultiAsyncTasks::AttachTask: Attaching task 3b9e5882-0ade-4d8e-8d22-b59234927344 to command 6ccda64d-b573-43e7-b164-3ced29332510. in vdsm: Thread-8118::DEBUG::2013-07-31 14:27:52,533::task::579::TaskManager.Task::(_updateState) Task=`3b9e5882-0ade-4d8e-8d22-b59234927344`::moving from state init -> state preparing Thread-8118::INFO::2013-07-31 14:27:52,533::logUtils::44::dispatcher::(wrapper) Run and protect: deleteImage(sdUUID='3381147f-a75f-4f61-ad5e-ff435571ec92', spUUID='b72030a0-767e-4346-a6f0-99c91f1278dc', imgUUID='3a79397d-ac77-4bfb-baab-01ce3d888c14', postZero='false', force='false') Thread-8126::INFO::2013-07-31 14:28:01,555::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='3b9e5882-0ade-4d8e-8d22-b59234927344', spUUID=None, options=None) Thread-8126::DEBUG::2013-07-31 14:28:01,555::taskManager::161::TaskManager::(clearTask) Entry. taskID: 3b9e5882-0ade-4d8e-8d22-b59234927344 Thread-8126::DEBUG::2013-07-31 14:28:01,556::taskManager::166::TaskManager::(clearTask) Return. Thread-8126::INFO::2013-07-31 14:28:01,557::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None (In reply to Barak from comment #4) > was the task finished in VDSM (SPM) ? According vdsm logs yes. Created attachment 781549 [details]
vdsm log
Attaching vdsm log containing relevant task
Managed to reproduce upstream. Reproduction steps: 1. Create VM with 2 images 2. Create template 3. Remove template - two records remain. Here is my analysis: The command flow is as follows: RemoveVmTemplate -> RemoveAllVmTemplateImageTemplates -> X times of RemoveTemplateSnapshot RemoveAllVMTemplateImageTemplate call is done inside a new transaction which is bad - as in RemoveTemplateSnapshot there are VDSM calls. executeAction for each child command that directly creates a task is as follows -a. Guid taskId = persistAsyncTaskPlaceHolder(VdcActionType.RemoveVmTemplate) This methods creates a new placeholder - it checks if the task already exists in the tasks map for the command, and if not - creates it and saves to do db. (The transaction scope of this is determined by the caller of this method, so in the current case - it is done inside the big transaction, let's call it transaction A). b. When SPM call returns the "vdsm task id" - getReturnValue().getInternalVdsmTaskIdList().add( createTask(taskId, vdsReturnValue.getCreationInfo(), VdcActionType.RemoveVmTemplate, VdcObjectType.Storage, getParameters().getStorageDomainId())); createTask method tries to fetch the task from DB, and if is not found, creates a new task, and save it - this is done in a new transaction (as many child commands are non transactive). Let's mark this transaction as transaction B (bare in mind - A is still open). Since we work with READ UNCOMITTED isolation level, transaction B is not aware to changes made by A, so the task added to the db by transaction A is not recognized, and a new task is aded. What I suggest - a. As overall solution - we have to go over places of the code and fix child commands to be non transactive (I'll open a separate bug for this specific flow). b. A new version for persistPlaceHolder that will perform the persistence in a new transaction (let's call it C) should be introduced as an intermid solution. This way, transaction B will be ware to changes introduced by the already committed transaction B. Actually, a better solution for b is to use the command transactivity to ensure the c For a command that is transactive, the record will be inserted in a new transaction (RequiresNew scope) and for a command that is non transactive, the "Required" scope will be used (as if persistePlaceHolder is called from insertPlaceHolders - all place holders should be interested in a single transaction). (In reply to Yair Zaslavsky from comment #10) > Actually, a better solution for b is to use the command transactivity to > ensure the c Disregard "to ensure the c" - bad wording/typo. > For a command that is transactive, the record will be inserted in a new > transaction (RequiresNew scope) and for a command that is non transactive, > the "Required" scope will be used (as if persistePlaceHolder is called from > insertPlaceHolders - all place holders should be interested in a single > transaction). *** Bug 995616 has been marked as a duplicate of this bug. *** Verified on is12 Closing - RHEV 3.3 Released Closing - RHEV 3.3 Released |
Created attachment 781087 [details] backend log Description of problem: After template was removed there is still record in async_tasks table. # psql engine postgres -c "select task_id, vdsm_task_id, task_params_class from async_tasks" task_id | vdsm_task_id | task_params_class --------------------------------------+--------------------------------------+-------------------------------------------------------------------- 143bf357-d24a-4187-bfe2-381301a9ca96 | 00000000-0000-0000-0000-000000000000 | org.ovirt.engine.core.common.action.ImagesContainterParametersBase (1 row) Version-Release number of selected component (if applicable): rhevm-3.3.0-0.9.master.el6ev.noarch How reproducible: Always Steps to Reproduce: 1. Remove template 2. 3. Actual results: Task is left in DB Expected results: Task is removed when finished Additional info: Backend logs are attached I don't know if this is a regression