Bug 1139678

Summary: placeholders of child commands aren't cleared when failing during the CDA phase
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ori Gofen <ogofen>
Component: ovirt-engine
Assignee: Ravi Nori <rnori>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Mucha <mmucha>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: acanan, amureini, ecohen, gklein, iheim, laravot, lpeer, ogofen, oourfali, pstehlik, rbalakri, Rhev-m-bugs, rnori, sherold, tnisan, yeylon
Target Milestone: ---
Keywords: CodeChange
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version: org.ovirt.engine-root-3.5.0-13
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-02-17 17:13:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1156162    
Attachments:
Description Flags
vdsm+engine logs none

Description Ori Gofen 2014-09-09 12:25:32 UTC
Description of problem:
This bug is opened separately from the UI bug BZ #1139628. The steps to reproduce this one are similar but not the same: in this scenario the broken volume is a snapshot volume.

This bug deals with the backend errors caused by an unsuccessful attempt to create a template.

First and foremost, the operation creates a task that continues to "live" in the system without ending in either success or failure.

The async_tasks table isn't cleared:

engine=# SELECT task_id,action_type,task_type,started_at,command_id from async_tasks;
               task_id                | action_type | task_type |         started_at         |              command_id              
--------------------------------------+-------------+-----------+----------------------------+--------------------------------------
 fc184e28-2c7e-43dd-be65-9779b98443f7 |         201 |         1 | 2014-09-09 11:43:51.611+03 | 4900e5a0-5f85-4b4c-9245-65249f32a4e5
 95f3a6a0-47e1-4dce-b5b9-9cd9ff78e719 |         201 |         1 | 2014-09-09 11:43:51.625+03 | 6758136a-6720-4507-b4ce-be7067f849a8
 759b1706-c01b-4154-8057-0f809f9079d9 |         201 |         1 | 2014-09-09 11:43:51.654+03 | dda74871-3d0c-43ef-9f8a-9c0143d426fb
 79ebbaee-ae7d-4d28-9fad-059a4caaff3a |         201 |         1 | 2014-09-09 11:43:51.64+03  | 56f50b19-7c50-468b-8e08-cf6a3a902002
 64a06721-372d-4931-b81e-47417b94e180 |         201 |         1 | 2014-09-09 11:46:40.765+03 | 24528134-3bd2-46ef-bf12-eaaab0d75d8e
 72d55895-58b1-491f-96e5-c3c4cb0ef86a |         201 |         1 | 2014-09-09 11:30:46.082+03 | fcc2ae33-f964-419c-9e15-f1f17e2dd17c
 17f09e59-1172-4023-881b-7176a27d4ff1 |         201 |         1 | 2014-09-09 11:46:40.824+03 | 11fd25f7-5972-4a6c-bf12-ec6c7ca63337
 616f3d6c-db18-4644-8b98-545acd4c3c73 |         201 |         1 | 2014-09-09 11:46:40.84+03  | 1bb4aab2-7353-4320-80c2-0ccd1691de9e
 47c5abe2-899b-49f6-b8e0-735e7e5b3d1d |         201 |         1 | 2014-09-09 11:46:40.854+03 | 6910c5fa-db60-4107-851f-3deda58531f3
(9 rows)


Version-Release number of selected component (if applicable):
rhev3.5 vt2.2

How reproducible:
100%

Steps to Reproduce:
Setup: have a VM with a broken snapshot volume.

1. Create a template from this VM.

Actual results:
Zombie tasks: the task entries remain in the async_tasks table.

Expected results:
No task should continue "living" after an unsuccessful operation; the async_tasks SQL table should be cleared.
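
The expected result can be checked with a short script. The following is only a sketch: an in-memory SQLite database stands in for the engine's PostgreSQL database (an assumption made so the example is self-contained), the table holds just the five columns selected in the query above, and `leftover_tasks` is a hypothetical helper name. The two sample rows are taken from the output above.

```python
import sqlite3

# In-memory SQLite stands in for the engine's PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE async_tasks ("
    "task_id TEXT PRIMARY KEY, action_type INTEGER, task_type INTEGER, "
    "started_at TEXT, command_id TEXT)"
)

# Two of the leftover rows from this report, used as sample data.
conn.executemany(
    "INSERT INTO async_tasks VALUES (?, ?, ?, ?, ?)",
    [
        ("fc184e28-2c7e-43dd-be65-9779b98443f7", 201, 1,
         "2014-09-09 11:43:51.611+03", "4900e5a0-5f85-4b4c-9245-65249f32a4e5"),
        ("95f3a6a0-47e1-4dce-b5b9-9cd9ff78e719", 201, 1,
         "2014-09-09 11:43:51.625+03", "6758136a-6720-4507-b4ce-be7067f849a8"),
    ],
)


def leftover_tasks(connection):
    """Return (task_id, command_id) for every row still in async_tasks."""
    return connection.execute(
        "SELECT task_id, command_id FROM async_tasks ORDER BY started_at"
    ).fetchall()


# Any row returned here after a failed operation is a leftover task.
leftovers = leftover_tasks(conn)
for task_id, command_id in leftovers:
    print(f"leftover task {task_id} (command {command_id})")
```

With the behaviour reported here the list is non-empty after the failed template creation; with the expected behaviour it would be empty.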

Additional info:

Comment 1 Ori Gofen 2014-09-09 12:26:09 UTC
Created attachment 935655 [details]
vdsm+engine logs

Comment 2 Allon Mureinik 2014-09-09 21:06:33 UTC
> Setup:have a vm+broken snapshot volume
How do you produce this?

Comment 3 Liron Aravot 2014-09-10 08:16:04 UTC
The issue is that the child commands' task placeholders are inserted before the CDA (canDoAction) phase, but they are cleared only on failure in the execution phase and not on failure in the CDA.

IMO the solution here is to insert the placeholders just before the execute phase; there's no need to insert them before the CDA.

Moving to infra.
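
The ordering problem described in this comment can be illustrated with a toy model. This is a sketch only, not engine code: `TaskTable`, `run_command`, and the `placeholders_before_cda` flag are hypothetical names, and a plain set stands in for the async_tasks table. It shows the proposal from this comment, not necessarily the fix that was eventually merged.

```python
class TaskTable:
    """Toy stand-in for the async_tasks table (hypothetical, not engine code)."""

    def __init__(self):
        self.rows = set()


def run_command(table, child_ids, validate, execute, placeholders_before_cda):
    """Run one parent command and return True on success."""
    if placeholders_before_cda:
        # Reported behaviour: placeholders are inserted before validation.
        table.rows.update(child_ids)
    if not validate():
        # CDA failure: this path never clears the placeholders.
        return False
    if not placeholders_before_cda:
        # Proposed behaviour: insert placeholders just before execution.
        table.rows.update(child_ids)
    try:
        execute()
    except Exception:
        # Execution failure: placeholders are cleared in both orderings.
        table.rows.difference_update(child_ids)
        return False
    return True


def failing_execute():
    raise RuntimeError("simulated execution failure")


children = {"child-1", "child-2"}

# CDA fails with the reported ordering: placeholders are left behind.
buggy = TaskTable()
run_command(buggy, children, lambda: False, lambda: None, True)

# CDA fails with the proposed ordering: nothing was inserted.
proposed = TaskTable()
run_command(proposed, children, lambda: False, lambda: None, False)

# Execution fails with the reported ordering: cleanup already works.
exec_failure = TaskTable()
run_command(exec_failure, children, lambda: True, failing_execute, True)

print(sorted(buggy.rows), sorted(proposed.rows), sorted(exec_failure.rows))
```

Only the first case leaves rows in the table, which matches the zombie entries seen in async_tasks after a CDA failure.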

Comment 4 Ori Gofen 2014-09-10 08:52:04 UTC
(In reply to Allon Mureinik from comment #2)
> > Setup:have a vm+broken snapshot volume
> How do you produce this?


To produce a corrupted volume chain, I used to start multiple live merge sessions and then restart the vdsmd service (as I did in BZ #1124498). Since the live merge operation is now blocked, I haven't found an exact way to reproduce a corrupted chain. I do have a broken volume chain on my setup, though, and I'm currently working on figuring out what caused it.

For you as a developer, reproducing a situation that oVirt interprets as a broken chain is easy: all you need to do is change the name of one of the volumes, then create a template from this VM.

** Please note that this bug is not intended to address the broken chain issue; instead, it refers to the zombie tasks that are created by the "Create Template" operation on a VM that holds this corrupted data. **

Comment 5 Liron Aravot 2014-09-10 09:01:28 UTC
After talking with Oved, raising the priority, as having the placeholders may affect other flows of the system (like putting the current host, which is the SPM, into maintenance).

Comment 6 Eyal Edri 2014-09-28 11:29:40 UTC
This bug was moved to MODIFIED before the vt4 build date, thus moving to ON_QA.
If you believe this bug isn't in vt4, please report to rhev-integ.

Comment 7 Eyal Edri 2015-02-17 17:13:36 UTC
RHEV 3.5.0 was released. Closing.