Bug 1103687

Summary: psql table "async_task" isn't cleared after creation of multiple templates and restarting vdsmd service
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ori Gofen <ogofen>
Component: ovirt-engine
Assignee: Liron Aravot <laravot>
Status: CLOSED CURRENTRELEASE
QA Contact: Ori Gofen <ogofen>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.4.0
CC: acanan, amureini, eedri, gklein, iheim, lpeer, ofrenkel, ogofen, oourfali, rbalakri, Rhev-m-bugs, rnori, scohen, tnisan, yeylon
Target Milestone: ---
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version: vt1.3
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1142923, 1156165
Attachments:
Description                      Flags
async_tasks                      none
vdsm+engine logs as requested    none
vdsm+engine logs                 none

Description Ori Gofen 2014-06-02 11:17:25 UTC
Created attachment 901402 [details]
async_tasks

Description of problem:
When creating a large number of templates, the task info is kept in the psql table.
If the procedure is interrupted by a vdsmd restart, the operation fails, but the tasks remain in the table (see image).

engine=# SELECT task_id FROM async_tasks;
               task_id                
--------------------------------------
 0a52b2ac-a060-4ccf-8975-822832f2aa69
 05d95bee-0afd-40dc-80f3-6bdab279a0a1
 a20acce1-02fd-4bef-ac3a-598b2349f441
 fffa8ef2-d89e-4f29-9a3b-28933d6daeac
 7ca2550a-9e91-416b-adcb-8af2ca4eb648
 8c08e5b2-71cf-43b8-a672-4d8b44829242
 be0fc12b-a4c4-4c0e-bdb0-2b5358698bb4
(7 rows)


Version-Release number of selected component (if applicable):
rhevm-3.4.0-0.21.el6ev.noarch
vdsm-4.14.7-3.el6ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create 7 VMs with 2 disks each
2. Create templates from all VMs at the same time
3. Restart the vdsm daemon

Actual results:
The operation fails; async_tasks are not cleared.

Expected results:
The operation should fail and async_tasks should be cleared.

Additional info:

Comment 1 Omer Frenkel 2014-06-03 07:48:50 UTC
Please attach engine.log ...

Comment 2 Ori Gofen 2014-06-05 11:11:20 UTC
Created attachment 902496 [details]
vdsm+engine logs as requested

Please pay attention to the tasks' start times; some of the uncleared tasks are from older attempts.

engine=# SELECT action_type,task_id,started_at  FROM async_tasks;
 action_type |               task_id                |     started_at         
-------------+--------------------------------------+----------------------------
         211 | 2cf0e4c6-9e08-4892-ae63-99a8bc96a570 | 2014-06-05 14:00:05.694+03
         211 | 53db65ed-3e93-4fe2-9b06-ec64df17fa4a | 2014-06-05 14:00:05.847+03
         211 | 73e9cc84-2ce4-4b6d-bf54-3a75485c173f | 2014-06-05 14:00:09.31+03
         211 | 7f59fb97-c596-4311-90bd-1e322e885b1d | 2014-06-05 14:00:07.362+03
         211 | 3e0ce088-b2d2-463f-a936-dbf021753c29 | 2014-06-05 14:00:11.167+03
         211 | 0248ece2-db44-41d7-aff6-9e2187ee86cc | 2014-06-05 14:00:13.402+03
         211 | 0a52b2ac-a060-4ccf-8975-822832f2aa69 | 2014-06-02 11:30:15.506+03
         211 | 05d95bee-0afd-40dc-80f3-6bdab279a0a1 | 2014-06-02 11:30:15.505+03
         211 | a20acce1-02fd-4bef-ac3a-598b2349f441 | 2014-06-02 11:30:15.593+03
         211 | fffa8ef2-d89e-4f29-9a3b-28933d6daeac | 2014-06-02 11:30:15.739+03
         211 | 7ca2550a-9e91-416b-adcb-8af2ca4eb648 | 2014-06-02 11:30:16.31+03
         211 | 8c08e5b2-71cf-43b8-a672-4d8b44829242 | 2014-06-02 11:30:16.972+03
         211 | be0fc12b-a4c4-4c0e-bdb0-2b5358698bb4 | 2014-06-02 11:30:18.432+03

Comment 3 Arik 2014-06-17 14:51:05 UTC
The tasks are for remove-image operations on non-existing images:
AddVmTemplate fails => trying to end with failure CreateImageTemplate => call RemoveImage => DeleteImageGroupVDSCommand returns an error that the image doesn't exist
IIUC, in this case there is no task in VDSM for the RemoveImage, thus the task in the engine will not be removed.

Comment 4 Allon Mureinik 2014-06-18 06:38:42 UTC
(In reply to Arik from comment #3)
> IIUC, in this case there is no task in VDSM for the RemoveImage, thus the
> task in the engine will not be removed.
If this is true, it's either a misuse of the existing infra or a bug in the said infra - in any event, it should be fixed.

Comment 5 Liron Aravot 2014-06-19 14:18:58 UTC
The problem here is that a task placeholder is persisted in RemoveImage, while certain errors returned by VDSM during task creation, such as ImageDoesNotExistInDomainError, are still treated as success even though no task has been created.
In that case, the task placeholder is never cleared from the async_tasks table.

We need to clear the placeholders at the end of each execution regardless of its success (each flow should decide whether it succeeded or not) to avoid this issue in other flows. Separately, we could look into removing the placeholder from RemoveImage altogether (placeholders are less useful when only one task is created), although we might use it for other benefits, so I prefer to leave it there for the time being.
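
The cleanup idea described in this comment can be sketched roughly as follows. This is only an illustration in Python, not the engine's actual Java code: `AsyncTaskTable`, `run_command`, `create_vdsm_task`, and `ImageDoesNotExistError` are hypothetical names invented for the sketch, and the in-memory dictionary merely stands in for the async_tasks table.

```python
import uuid


class ImageDoesNotExistError(Exception):
    """Hypothetical stand-in for a VDSM error such as ImageDoesNotExistInDomainError."""


class AsyncTaskTable:
    """Hypothetical in-memory stand-in for the async_tasks table."""

    def __init__(self):
        # task_id -> vdsm_task_id (None while the row is only a placeholder)
        self.rows = {}

    def add_placeholder(self):
        task_id = str(uuid.uuid4())
        self.rows[task_id] = None
        return task_id

    def attach_vdsm_task(self, task_id, vdsm_task_id):
        self.rows[task_id] = vdsm_task_id

    def clear_placeholders(self, task_ids):
        # Remove only rows that never received a real VDSM task.
        for task_id in task_ids:
            if self.rows.get(task_id) is None:
                self.rows.pop(task_id, None)


def run_command(table, create_vdsm_task):
    """Persist a placeholder, try to create the VDSM task, and always clean up."""
    placeholder = table.add_placeholder()
    try:
        table.attach_vdsm_task(placeholder, create_vdsm_task())
        return True
    except ImageDoesNotExistError:
        # The flow treats this as success although no VDSM task exists.
        return True
    except Exception:
        return False
    finally:
        # Runs whatever the outcome, so an unused placeholder is never left behind.
        table.clear_placeholders([placeholder])
```

With this shape, a row that did get a VDSM task stays in the table for normal task tracking, while a placeholder that never got one is removed even when the flow reports success.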

Comment 6 Allon Mureinik 2014-06-19 14:20:51 UTC
Oved/Ravi - the provided patch handles this on an infra level - your feedback would be appreciated.

Comment 7 Ori Gofen 2014-07-29 07:38:12 UTC
Is this fixed on oVirt beta.2 as well,
or should we verify only on the downstream build?

Comment 8 Eyal Edri 2014-07-29 07:45:36 UTC
As long as it's not a downstream-only fix (and it isn't, as you can see the fix was done on upstream oVirt), you can continue to verify on upstream beta.
The downstream build was very initial and doesn't contain all components.
It was done mostly for build purposes.

Comment 9 Ori Gofen 2014-07-29 09:30:52 UTC
Created attachment 922091 [details]
vdsm+engine logs

bug reproduced on beta.2

engine=# SELECT task_id,action_type,status,vdsm_task_id from async_tasks;
               task_id                | action_type | status |             vdsm_task_id             
--------------------------------------+-------------+--------+--------------------------------------
 7b55facf-f955-487c-975f-b6a64a98ec80 |         201 |      2 | 00000000-0000-0000-0000-000000000000
 23df9612-b7f6-45a3-b21f-0ea82c26bce7 |         201 |      2 | 00000000-0000-0000-0000-000000000000
 8b1ea5f2-38ea-4135-8e11-15c75adaf521 |         201 |      2 | 00000000-0000-0000-0000-000000000000
(3 rows)
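
All three rows above carry an all-zero vdsm_task_id. Assuming (the report does not state this explicitly) that such a value marks a row with no associated VDSM task, rows like these could be picked out of a query result with a small helper. The function name `find_placeholder_rows` and the tuple layout are hypothetical; the layout simply mirrors the column order of the query above.

```python
import uuid

# The all-zero UUID, "00000000-0000-0000-0000-000000000000".
NIL_UUID = str(uuid.UUID(int=0))


def find_placeholder_rows(rows):
    """Return the task_id of every row whose vdsm_task_id is the all-zero UUID.

    Each row is expected as (task_id, action_type, status, vdsm_task_id).
    """
    return [
        task_id
        for task_id, _action_type, _status, vdsm_task_id in rows
        if vdsm_task_id == NIL_UUID
    ]


# Rows copied from the query output above.
rows = [
    ("7b55facf-f955-487c-975f-b6a64a98ec80", 201, 2, NIL_UUID),
    ("23df9612-b7f6-45a3-b21f-0ea82c26bce7", 201, 2, NIL_UUID),
    ("8b1ea5f2-38ea-4135-8e11-15c75adaf521", 201, 2, NIL_UUID),
]
leftover = find_placeholder_rows(rows)
```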

Comment 10 Ori Gofen 2014-08-03 15:49:18 UTC
After further investigation, the psql async_tasks table isn't cleared due to different bugs, which do not affect this one.

We opened BZ #1126204 and BZ #1126205 to track the current behavior.

Moving this bug to be verified on beta.2.

Comment 11 Allon Mureinik 2015-02-16 19:13:26 UTC
RHEV-M 3.5.0 has been released, closing this bug.
