Bug 1170378

Summary: [BLOCKED] RHEV export fails leaving VM and disks locked and job unfinished.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Gordon Watson <gwatson>
Component: ovirt-engine
Assignee: Liron Aravot <laravot>
Status: CLOSED WONTFIX
QA Contact: Aharon Canan <acanan>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.4.1
CC: amureini, ecohen, gklein, iheim, ishaby, lkuchlan, lpeer, lsurette, mkalinin, rbalakri, Rhev-m-bugs, tnisan, yeylon, ylavi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1184489 (view as bug list) Environment:
Last Closed: 2015-10-22 08:44:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1185830    
Bug Blocks: 1184489    
Attachments:
Description: logs+image, Flags: none

Description Gordon Watson 2014-12-03 22:54:57 UTC
Description of problem:

If a RHEV export fails while it is trying to write the OVF file, then the VM and disks can be left in a locked state and the associated job is not set to 'FINISHED'.

The crux of this is that after an export fails, manual intervention is required before the VM can be used again.


Version-Release number of selected component (if applicable):

RHEV 3.4.1
RHEL 6.5 hosts with vdsm-4.14.7-3
  

How reproducible:

Observed only once. However, it can be triggered deliberately by renaming the 'master/vms' directory in the export domain.


Steps to Reproduce:
1. Rename the '..../master/vms' directory in the export domain.
2. Export a VM.
3. It should fail with "Cannot found VMs directory".


Actual results:

Manual intervention is required to unlock the VM and its disk images, and to mark the job as finished in the database.

Expected results:

No manual intervention should be required. When the export fails, the entities in question should be unlocked and the job marked as completed.
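
The expected behavior can be illustrated with a minimal sketch. This is not the engine's code: `Entity`, `Job`, `export_vm`, `OvfWriteError` and the status strings are hypothetical names chosen for illustration, and ending the job as "FAILED" is an assumption (the report only asks that the job be completed). The point is that the unlock and the job update happen on every exit path, including a failure while writing the OVF.

```python
from dataclasses import dataclass


class OvfWriteError(Exception):
    """Hypothetical error raised when the OVF file cannot be written."""


@dataclass
class Entity:
    """Stand-in for a VM or a disk image that can be locked."""
    name: str
    locked: bool = False


@dataclass
class Job:
    """Stand-in for the job record tracked in the database."""
    status: str = "STARTED"


def export_vm(vm, disks, job, write_ovf):
    """Export a VM, always releasing locks and closing the job."""
    entities = [vm, *disks]
    for entity in entities:
        entity.locked = True
    try:
        # Copying the disk images would happen here; omitted in this sketch.
        write_ovf(vm)
        job.status = "FINISHED"
    except OvfWriteError:
        # Assumption: a failed export ends the job in a terminal state.
        job.status = "FAILED"
        raise
    finally:
        # Runs on success and on failure, so nothing stays locked.
        for entity in entities:
            entity.locked = False
```

With a `write_ovf` callable that raises `OvfWriteError`, the caller still sees the error, but the VM and disks end up unlocked and the job is no longer left in its started state.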


Additional info:

Comment 6 Tal Nisan 2015-01-21 10:37:29 UTC
Given the change in bug 1167297, the entities remaining in a locked state issue should indeed be resolved

Comment 7 Allon Mureinik 2015-01-21 11:38:27 UTC
(In reply to Tal Nisan from comment #6)
> Given the change in bug 1167297, the entities remaining in a locked state
> issue should indeed be resolved
Moving to ON_QA to verify based on this comment.

Comment 10 lkuchlan 2015-01-25 10:51:42 UTC
Created attachment 983944
logs+image

Tested using RHEVM 3.5 vt13.8
The issue still reproduces when following the steps above.

Comment 11 Allon Mureinik 2015-03-24 07:05:20 UTC
This is an edge case of an edge case - exporting all the images succeeds (potentially many GBs), and just writing the OVF (several KBs, at most) fails. So far, we've only seen this with a corrupted domain.

In the rare case this happens, there is a relatively simple workaround:
1. Fix the domain structure (e.g., create the missing dir)
2. Remove the exported images
3. Unlock the VM with the unlocker tool
4. Re-export
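
As a rough illustration of step 1 only, the sketch below recreates a missing 'master/vms' directory under a given export domain root. The function name `ensure_vms_dir` and the `domain_root` parameter are hypothetical, the real domain path is not given in this report, and ownership and permissions on the directory are not handled here. Steps 2 to 4 are not covered.

```python
from pathlib import Path


def ensure_vms_dir(domain_root: Path) -> bool:
    """Create <domain_root>/master/vms if it is missing.

    Returns True if the directory had to be created, False if it
    already existed. `domain_root` is a placeholder for the export
    domain's top-level directory.
    """
    vms_dir = domain_root / "master" / "vms"
    if vms_dir.is_dir():
        return False
    # parents=True also recreates 'master' if it is missing.
    vms_dir.mkdir(parents=True)
    return True
```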

In 3.5's architecture, we don't have a good solution in the engine, but 3.6's reworking of the command should supply us with an opportunity to fix this.
Removing the 3.5.z flag and the triaged flag, so we can discuss.