Bug 1139678 - placeholders of child commands aren't cleared when failing during the CDA phase
Summary: placeholders of child commands aren't cleared when failing during the CDA phase
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Ravi Nori
QA Contact: Martin Mucha
URL:
Whiteboard: infra
Depends On:
Blocks: rhev3.5beta3
 
Reported: 2014-09-09 12:25 UTC by Ori Gofen
Modified: 2016-05-26 01:49 UTC

Fixed In Version: org.ovirt.engine-root-3.5.0-13
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:13:36 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
vdsm+engine logs (1.40 MB, application/octet-stream)
2014-09-09 12:26 UTC, Ori Gofen


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 32790 0 master MERGED engine: placeholders of child commands aren't cleared when failing during the CDA phase Never
oVirt gerrit 32880 0 ovirt-engine-3.5 MERGED engine: placeholders of child commands aren't cleared when failing during the CDA phase Never

Description Ori Gofen 2014-09-09 12:25:32 UTC
Description of problem:
This bug is opened separately from the UI bug BZ #1139628. The steps to reproduce this one are similar but not the same: in this scenario the broken volume is a snapshot volume.

This bug deals with the backend errors caused by an unsuccessful attempt to create a template.

First and foremost, the operation creates a task which continues to "live" in the system without ending either successfully or unsuccessfully.

The async_tasks table isn't cleared:

engine=# SELECT task_id,action_type,task_type,started_at,command_id from async_tasks;
               task_id                | action_type | task_type |         started_at         |              command_id              
--------------------------------------+-------------+-----------+----------------------------+--------------------------------------
 fc184e28-2c7e-43dd-be65-9779b98443f7 |         201 |         1 | 2014-09-09 11:43:51.611+03 | 4900e5a0-5f85-4b4c-9245-65249f32a4e5
 95f3a6a0-47e1-4dce-b5b9-9cd9ff78e719 |         201 |         1 | 2014-09-09 11:43:51.625+03 | 6758136a-6720-4507-b4ce-be7067f849a8
 759b1706-c01b-4154-8057-0f809f9079d9 |         201 |         1 | 2014-09-09 11:43:51.654+03 | dda74871-3d0c-43ef-9f8a-9c0143d426fb
 79ebbaee-ae7d-4d28-9fad-059a4caaff3a |         201 |         1 | 2014-09-09 11:43:51.64+03  | 56f50b19-7c50-468b-8e08-cf6a3a902002
 64a06721-372d-4931-b81e-47417b94e180 |         201 |         1 | 2014-09-09 11:46:40.765+03 | 24528134-3bd2-46ef-bf12-eaaab0d75d8e
 72d55895-58b1-491f-96e5-c3c4cb0ef86a |         201 |         1 | 2014-09-09 11:30:46.082+03 | fcc2ae33-f964-419c-9e15-f1f17e2dd17c
 17f09e59-1172-4023-881b-7176a27d4ff1 |         201 |         1 | 2014-09-09 11:46:40.824+03 | 11fd25f7-5972-4a6c-bf12-ec6c7ca63337
 616f3d6c-db18-4644-8b98-545acd4c3c73 |         201 |         1 | 2014-09-09 11:46:40.84+03  | 1bb4aab2-7353-4320-80c2-0ccd1691de9e
 47c5abe2-899b-49f6-b8e0-735e7e5b3d1d |         201 |         1 | 2014-09-09 11:46:40.854+03 | 6910c5fa-db60-4107-851f-3deda58531f3
(9 rows)


Version-Release number of selected component (if applicable):
rhev3.5 vt2.2

How reproducible:
100%

Steps to Reproduce:
Setup: have a VM with a broken snapshot volume

1. Create a template out of this VM

Actual results:
Zombie tasks: the tasks remain in the async_tasks table and never finish.

Expected results:
No task should continue "living" after an unsuccessful operation; the async_tasks SQL table should be cleared.
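
One way to check the expected result is to query the table after the failed operation and confirm that no rows are left. The following is a minimal sketch only. It assumes an in-memory SQLite database as a stand-in for the engine's PostgreSQL database, and the helper name leftover_tasks is hypothetical; only the table and column names are taken from the query in the description.

```python
import sqlite3


def leftover_tasks(conn):
    """Return (task_id, command_id) for every row still present in async_tasks."""
    cur = conn.execute(
        "SELECT task_id, command_id FROM async_tasks ORDER BY started_at"
    )
    return cur.fetchall()


# Stand-in database (assumption): in-memory SQLite, not the real engine DB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE async_tasks ("
    "task_id TEXT PRIMARY KEY, action_type INTEGER, task_type INTEGER, "
    "started_at TEXT, command_id TEXT)"
)
# One row copied from the output in the description, to simulate the leftover state.
conn.execute(
    "INSERT INTO async_tasks VALUES (?, ?, ?, ?, ?)",
    (
        "fc184e28-2c7e-43dd-be65-9779b98443f7",
        201,
        1,
        "2014-09-09 11:43:51.611+03",
        "4900e5a0-5f85-4b4c-9245-65249f32a4e5",
    ),
)

stale = leftover_tasks(conn)
if stale:
    print(f"{len(stale)} task(s) still in async_tasks after the failed operation")
```

With the expected behavior, leftover_tasks would return an empty list once the failed operation has ended.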

Additional info:

Comment 1 Ori Gofen 2014-09-09 12:26:09 UTC
Created attachment 935655
vdsm+engine logs

Comment 2 Allon Mureinik 2014-09-09 21:06:33 UTC
> Setup:have a vm+broken snapshot volume
How do you produce this?

Comment 3 Liron Aravot 2014-09-10 08:16:04 UTC
The issue is that the task placeholders of the child commands are inserted before the CDA phase, but they are cleared only on a failure in the execution phase and not on a failure in the CDA phase.

IMO the solution here is to insert the placeholders just before the execute phase; there's no need to insert them before the CDA.

Moving to infra.
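
The ordering problem can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not ovirt-engine code, and PlaceholderStore, run_command and all parameter names are hypothetical. It models the option proposed in this comment (inserting placeholders after the CDA check), which is not necessarily what the merged patch does.

```python
class PlaceholderStore:
    """In-memory stand-in for the task placeholder rows (hypothetical)."""

    def __init__(self):
        self.rows = set()

    def insert(self, task_ids):
        self.rows.update(task_ids)

    def clear(self, task_ids):
        self.rows.difference_update(task_ids)


def run_command(store, child_ids, cda_passes, execute_ok=True,
                placeholders_before_cda=True):
    """Run a toy parent command and return True if it executed successfully."""
    if placeholders_before_cda:
        # Behavior described in this bug: placeholders exist before the CDA check.
        store.insert(child_ids)

    if not cda_passes:
        # CDA failure: this path performs no cleanup, so rows may be left behind.
        return False

    if not placeholders_before_cda:
        # Proposed ordering: insert only once the CDA check has passed.
        store.insert(child_ids)

    if not execute_ok:
        # Execution failure: placeholders are cleared on this path.
        store.clear(child_ids)
        return False

    return True


children = ["child-1", "child-2"]

buggy = PlaceholderStore()
run_command(buggy, children, cda_passes=False, placeholders_before_cda=True)

proposed = PlaceholderStore()
run_command(proposed, children, cda_passes=False, placeholders_before_cda=False)

print("leftover with early insert:", sorted(buggy.rows))
print("leftover with late insert:", sorted(proposed.rows))
```

In the toy model the early insert leaves both placeholders behind after a CDA failure, while the late insert leaves none. An alternative with the same effect would be to keep the early insert and clear the placeholders on the CDA failure path as well.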

Comment 4 Ori Gofen 2014-09-10 08:52:04 UTC
(In reply to Allon Mureinik from comment #2)
> > Setup:have a vm+broken snapshot volume
> How do you produce this?


To produce a corrupted volume chain, I used to start multiple live merge sessions and then restart the vdsmd service (like I did in BZ #1124498). Since the live merge operation is now blocked, I haven't found an exact way to reproduce a corrupted chain. I do have a broken volume chain on my setup, though (I'm currently working on figuring out what caused it).

For you as a developer, reproducing a situation that oVirt interprets as a broken chain is easy: all you need to do is change the name of one of the volumes, then create a template out of this VM.

** Please note that this bug is not intended to address the broken chain issue; instead, it refers to the zombie tasks that are created by the "Create Template" operation on a VM that holds this corrupted data **

Comment 5 Liron Aravot 2014-09-10 09:01:28 UTC
After talking with Oved, raising the priority, as leftover placeholders may affect other flows in the system (such as moving the host that is currently the SPM to maintenance).

Comment 6 Eyal Edri 2014-09-28 11:29:40 UTC
This bug was moved to MODIFIED before the vt4 build date, thus moving to ON_QA.
If you believe this bug isn't in vt4, please report to rhev-integ.

Comment 7 Eyal Edri 2015-02-17 17:13:36 UTC
RHEV 3.5.0 was released. Closing.

