Bug 1127294

Summary: Live Merge: Resolve unknown merge status in vdsm after host crash
Product: [Retired] oVirt Reporter: Adam Litke <alitke>
Component: vdsm Assignee: Adam Litke <alitke>
Status: CLOSED CURRENTRELEASE QA Contact: Kevin Alon Goldblatt <kgoldbla>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.5 CC: acanan, amureini, bazulay, bugs, ecohen, gklein, iheim, mgoldboi, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: storage
Fixed In Version: v4.16.4 Doc Type: Bug Fix
Last Closed: 2014-10-17 12:42:19 UTC Type: Bug
oVirt Team: Storage
Bug Depends On:    
Bug Blocks: 1073943    

Description Adam Litke 2014-08-06 14:39:58 UTC
Description of problem: When a host crashes during a live merge operation and the VM crashes with it, vdsm needs to take special steps so that the merge can be recovered using only shared storage. This involves two changes:

1. For active layer merges, mark the leaf volume as ILLEGAL before pivoting.  This allows engine to determine that this leaf has been merged and can be discarded.

2. Allow a live merge to be resubmitted on a new host using blank volume UUIDs.  This tells vdsm to skip the libvirt part of the merge but to rerun the synchronization operations.
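
The two changes above can be sketched roughly as follows. This is a minimal illustration of the ordering only, not vdsm code: `FakeStorage`, `live_merge`, `leaf_was_merged`, and the log strings are hypothetical names invented for this example, and the all-zero `BLANK_UUID` value is an assumption about what a "blank" UUID looks like.

```python
# Hypothetical sketch of the proposed flow; none of these names are vdsm's API.
BLANK_UUID = "00000000-0000-0000-0000-000000000000"  # assumed "blank" sentinel
LEGAL = "LEGAL"
ILLEGAL = "ILLEGAL"


class FakeStorage:
    """Stand-in for volume metadata kept on shared storage."""

    def __init__(self, legality):
        self.legality = dict(legality)  # volume UUID -> LEGAL / ILLEGAL

    def set_legality(self, vol_uuid, value):
        self.legality[vol_uuid] = value


def live_merge(storage, base_uuid, top_uuid, is_active_layer):
    """Return the ordered list of steps the merge would perform."""
    steps = []
    resubmitted = base_uuid == BLANK_UUID and top_uuid == BLANK_UUID
    if not resubmitted:
        steps.append("libvirt: block commit")
        if is_active_layer:
            # Change 1: record the outcome on shared storage *before* the
            # pivot, so it survives a host crash.
            storage.set_legality(top_uuid, ILLEGAL)
            steps.append("mark leaf ILLEGAL")
            steps.append("libvirt: pivot")
    # Change 2: with blank UUIDs the libvirt part is skipped entirely, but
    # the synchronization operations still run.
    steps.append("synchronize metadata")
    return steps


def leaf_was_merged(storage, leaf_uuid):
    """Engine-side check: an ILLEGAL leaf has been merged and can be discarded."""
    return storage.legality.get(leaf_uuid) == ILLEGAL
```

In this sketch, a normal active layer merge yields the steps block commit, mark ILLEGAL, pivot, synchronize, while a resubmission with blank UUIDs yields only the synchronize step. Because the flag lives on shared storage, `leaf_was_merged` can be answered from any host after the crash.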

How reproducible: Always


Steps to Reproduce:
1. Start live merge operation
2. Panic host in the middle of the operation
3. Reboot host
4. Re-run VM

Actual results: Engine cannot re-run the VM because the status of the previous live merge operation cannot be determined.


Expected results: Engine re-runs the VM, and the metadata is synchronized with the result of the initial live merge operation.

Comment 1 Sandro Bonazzola 2014-10-17 12:42:19 UTC
oVirt 3.5 has been released and should include the fix for this issue.