Bug 1107758

Summary: Restarting VDSM during LSM will cause zombie task in db
Product: Red Hat Enterprise Virtualization Manager
Reporter: Raz Tamir <ratamir>
Component: ovirt-engine
Assignee: Daniel Erez <derez>
Status: CLOSED CURRENTRELEASE
QA Contact: Raz Tamir <ratamir>
Severity: high
Priority: high
Version: 3.4.0
CC: amureini, derez, gklein, iheim, lpeer, ratamir, rbalakri, Rhev-m-bugs, scohen, tnisan, yeylon
Target Milestone: ---
Keywords: Regression
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Doc Type: Bug Fix
Story Points: ---
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: Storage
Cloudforms Team: ---
Bug Blocks: 1142923, 1156165
Attachments:
vdsm and engine logs (flags: none)

Description Raz Tamir 2014-06-10 14:29:06 UTC
Description of problem:
If VDSM is restarted during live storage migration (LSM), a task (LiveMigrateVmDisksParameters) remains in the db and is never deleted:

engine=# select task_id, action_type, status, task_type, vdsm_task_id, action_params_class from async_tasks;
               task_id                | action_type | status | task_type |             vdsm_task_id             |                       action_params_class                        
--------------------------------------+-------------+--------+-----------+--------------------------------------+------------------------------------------------------------------
 b5658a22-dc22-40e1-8275-ab29289939e3 |        1011 |      2 |         3 | 7114d94d-8d33-442e-8fb8-c43a841d58d2 | org.ovirt.engine.core.common.action.LiveMigrateVmDisksParameters 
 9d6eee72-8b15-4b2d-bf45-e9e5691d728b |         211 |      2 |         5 | 00000000-0000-0000-0000-000000000000 | org.ovirt.engine.core.common.action.RemoveImageParameters        
(2 rows)

In VDSM there are no running tasks:
[root@aqua-vds4 ~]# vdsClient -s 0 getAllTasks

[root@aqua-vds5 ~]# vdsClient -s 0 getAllTasks

As a result, the Auto-generated snapshot and the VM's disk both remain locked until someone intervenes manually.
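
The mismatch above (a row in async_tasks whose vdsm_task_id no host reports) can be checked with a small script. The following is only a sketch under stated assumptions: it uses an in-memory sqlite3 table as a stand-in for the engine's PostgreSQL async_tasks table, the helper names build_demo_db and find_orphaned_tasks are hypothetical, and the list of VDSM task IDs is assumed to have been collected separately from each host (for example from the getAllTasks output). Treating the all-zero UUID as "no VDSM task assigned" is also an assumption, not documented engine behavior.

```python
import sqlite3

# Assumption: an all-zero vdsm_task_id means no VDSM task is attached to the row.
NIL_UUID = "00000000-0000-0000-0000-000000000000"


def build_demo_db():
    """Create an in-memory stand-in for async_tasks, filled with the rows from this report."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE async_tasks ("
        "task_id TEXT, action_type INTEGER, status INTEGER, "
        "task_type INTEGER, vdsm_task_id TEXT, action_params_class TEXT)"
    )
    conn.executemany(
        "INSERT INTO async_tasks VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("b5658a22-dc22-40e1-8275-ab29289939e3", 1011, 2, 3,
             "7114d94d-8d33-442e-8fb8-c43a841d58d2",
             "org.ovirt.engine.core.common.action.LiveMigrateVmDisksParameters"),
            ("9d6eee72-8b15-4b2d-bf45-e9e5691d728b", 211, 2, 5,
             NIL_UUID,
             "org.ovirt.engine.core.common.action.RemoveImageParameters"),
        ],
    )
    return conn


def find_orphaned_tasks(conn, vdsm_task_ids):
    """Return (task_id, vdsm_task_id, action_params_class) rows that no host knows about."""
    known = set(vdsm_task_ids)
    rows = conn.execute(
        "SELECT task_id, vdsm_task_id, action_params_class FROM async_tasks"
    ).fetchall()
    # Skip rows without a VDSM task; keep rows whose VDSM task is missing on every host.
    return [row for row in rows if row[1] != NIL_UUID and row[1] not in known]


# Both hosts returned an empty task list in this report, so the known set is empty.
demo_conn = build_demo_db()
for task_id, vdsm_task_id, params_class in find_orphaned_tasks(demo_conn, []):
    print(task_id, vdsm_task_id, params_class)
```

With the data from this report, the only row flagged is the LiveMigrateVmDisksParameters task, which matches the leftover task described above.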

Version-Release number of selected component (if applicable):
vdsm-4.14.7-1.el6ev.x86_64
rhevm-3.4.0-0.20.el6ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Start LSM of a VM disk
2. Restart VDSM during the LSM

Actual results:
Explained above

Expected results:
The operation should fail gracefully.

Additional info:

Comment 3 Daniel Erez 2014-06-18 15:29:53 UTC
Hi Raz,

* Can you please attach the relevant vdsm/engine logs?
* At which step of LSM was VDSM restarted?

Comment 4 Raz Tamir 2014-06-19 06:39:27 UTC
Created attachment 910269 [details]
vdsm and engine logs

Comment 5 Raz Tamir 2014-06-19 06:41:16 UTC
Hi Daniel,
I restarted VDSM right after the engine reported: "User admin moving disk <disk_alias> to domain <domain_name>."

Comment 6 Allon Mureinik 2014-08-19 08:34:50 UTC
Daniel, this seems a lot like bug 1122639. Could that fix solve this one too?

Comment 7 Daniel Erez 2014-09-01 14:34:26 UTC
(In reply to Allon Mureinik from comment #6)
> Daniel, this seems a lot like bug 1122639. Could that fix solve this one too?

Yes, seems like it. Couldn't reproduce on the latest build (on either dev or QE environments). Moving to ON_QA.

Comment 8 Raz Tamir 2014-09-01 14:50:53 UTC
Verified.

Comment 9 Allon Mureinik 2015-02-16 19:11:44 UTC
RHEV-M 3.5.0 has been released, closing this bug.