1323629 – [Docs][VMM][Tech] Document Live Merge recovery mechanism

Summary: [Docs][VMM][Tech] Document Live Merge recovery mechanism

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	Documentation
Sub Component:
Version:	4.0.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	unspecified
Target Milestone:	ovirt-4.1.3
Target Release:	---
Assignee:	Emma Heftman
QA Contact:	Byron Gravenorst
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-04-04 09:54 UTC by Ala Hino
Modified:	2019-05-07 12:55 UTC (History)
CC List:	17 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: Failures during live merge. Consequence: A volume may be deleted from the chain without the engine noticing, i.e. the volume still appears in the engine, but its state is ILLEGAL and it cannot be removed. Fix: Allow the user to delete the ILLEGAL volume by attempting the live merge again. Result: The ILLEGAL volume is removed.
Clone Of:
Environment:
Last Closed:	2017-07-18 11:47:53 UTC
oVirt Team:	Docs
Target Upstream Version:
Embargoed:
Flags:	eheftman: needinfo-

Description Ala Hino 2016-04-04 09:54:28 UTC

Description of problem:

A live merge operation can fail for several reasons. For example:
- The user starts a live merge but vdsm fails to get the command
- The user starts a live merge but the engine fails to get the vdsm response
- The engine goes down while the live merge is running
- vdsm goes down while the live merge is running

These failures leave the volume in the ILLEGAL state.
To recover from these failures, we ask the user to try the live merge again. The recovery mechanism will:
- Start a new merge job (i.e. send the merge command to vdsm) if the old one didn't start
- Sync the database if the merge succeeded at storage
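
The two recovery outcomes above can be sketched roughly as follows. This is only an illustration, assuming the decision can be made by comparing the volume chain recorded in the engine database with the chain actually present on storage. The names `RecoveryAction` and `choose_recovery_action` are hypothetical and are not engine or vdsm APIs, and the sketch does not cover a merge that is still in progress.

```python
from enum import Enum


class RecoveryAction(Enum):
    START_MERGE = "start a new merge job"
    SYNC_DATABASE = "sync the engine database"


def choose_recovery_action(engine_chain, storage_chain, volume_id):
    """Pick a recovery step for a retried live merge (hypothetical helper).

    engine_chain:  volume IDs the engine database still lists for the disk
    storage_chain: volume IDs actually present in the chain on storage
    volume_id:     the volume the earlier merge was supposed to remove
    """
    if volume_id not in engine_chain:
        raise ValueError("engine no longer tracks this volume; nothing to recover")
    if volume_id in storage_chain:
        # Assumption: the volume is still on storage, so the old merge
        # did not complete and a new merge job has to be sent to vdsm.
        return RecoveryAction.START_MERGE
    # The volume is gone from storage but still listed by the engine:
    # the merge succeeded at storage, only the database is stale.
    return RecoveryAction.SYNC_DATABASE
```

For example, if the engine lists `["A", "B", "C"]` while storage reports `["A", "C"]`, the helper returns `RecoveryAction.SYNC_DATABASE` for volume `"B"`.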

Comment 1 Allon Mureinik 2016-04-07 12:36:20 UTC

Not sure what this BZ is about - fixing something in the flow, or documenting the behavior?

Comment 2 Ala Hino 2016-04-18 11:39:28 UTC

Fixing something in the flow

Comment 3 Ala Hino 2016-04-18 11:47:34 UTC

(In reply to Ala Hino from comment #2)
> Fixing something in the flow

My bad - this is for doc purposes

Comment 4 Allon Mureinik 2016-04-18 12:45:32 UTC

(In reply to Ala Hino from comment #3)
> (In reply to Ala Hino from comment #2)
> > Fixing something in the flow
> 
> My bad - this is for doc purposes
Ack. Moving to the proper team and resetting the assignee to the default assignee. 

Ala, please provide the relevant information so one of the technical writers can pick it up and update the documentation correctly.

Comment 5 Michal Skrivanek 2016-04-20 11:33:59 UTC

virt is not the correct team;-)

Comment 6 Allon Mureinik 2016-04-20 12:43:23 UTC

(In reply to Michal Skrivanek from comment #5)
> virt is not the correct team;-)
PgDn instead of PgUp ;-) my apologies.

Comment 7 Ala Hino 2016-05-04 08:36:23 UTC

In addition to the Doc Text, the following info might be helpful:

The guiding principle here is simple. Assume you have a chain A<-B<-C, where A is the base and C is the active volume, and assume you want to remove snapshot A, which means you'll be pushing data from B into A, and eventually removing B (yes, this can be counter-intuitive at first, but remember that removing A actually means that you are willing to lose the ability to revert back in time to the A snapshot). At the moment you begin this merge operation, A will be illegal, as it will no longer represent a consistent point in time, but the chain, as a whole, is still completely intact. In other words, you can safely run a VM on it, but cannot revert back to A's state. Under this logic, it does not matter how many times you stop and restart the live merge process: it can pick up where it left off (more-or-less, sans some overhead) and complete eventually. This is also the guideline to the field in 3.6.z. Have a failed live merge? Fix the underlying problem (e.g., failed host or inaccessible storage device) and retry.
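
To illustrate why retrying is safe, here is a toy model of the A<-B<-C chain. It is a sketch under simplifying assumptions, not how vdsm or qemu implement the merge: volumes are plain dictionaries of block number to data, and `Volume`, `guest_view`, and `live_merge` are hypothetical names. The `fail_after` parameter exists only to simulate an interruption, and the sketch does not model what the engine does with the illegal flag once the merge completes.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    blocks: dict = field(default_factory=dict)  # block number -> data
    legal: bool = True


def guest_view(chain):
    """Data the VM sees: for each block, the topmost volume holding it wins."""
    view = {}
    for volume in chain:  # base first, active volume last
        view.update(volume.blocks)
    return view


def live_merge(chain, base_index, fail_after=None):
    """Push the child's blocks into chain[base_index], then drop the child.

    fail_after simulates an interruption after that many copied blocks.
    """
    base, child = chain[base_index], chain[base_index + 1]
    # Once the merge begins, the base no longer represents a consistent
    # point in time, so it is flagged illegal.
    base.legal = False
    for copied, (block, data) in enumerate(sorted(child.blocks.items())):
        if fail_after is not None and copied >= fail_after:
            raise RuntimeError("merge interrupted")
        base.blocks[block] = data  # re-copying a block is harmless
    chain.pop(base_index + 1)  # the child is removed only after a full copy


chain = [
    Volume("A", {0: "a0", 1: "a1"}),
    Volume("B", {1: "b1", 2: "b2"}),
    Volume("C", {2: "c2"}),
]
before = guest_view(chain)

try:
    live_merge(chain, 0, fail_after=1)  # first attempt is interrupted
except RuntimeError:
    pass
after_failure = guest_view(chain)

live_merge(chain, 0)  # retry after fixing the underlying problem
after_retry = guest_view(chain)
```

In this model `before`, `after_failure`, and `after_retry` are identical, which is the point of the comment above: an interrupted merge leaves A illegal but the chain as a whole still serves the same data, and a retry simply completes the copy.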

Comment 8 Yaniv Lavi 2016-05-09 11:07:28 UTC

oVirt 4.0 Alpha has been released, moving to oVirt 4.0 Beta target.

Comment 12 Ala Hino 2016-06-07 09:49:14 UTC

*** Bug 1342681 has been marked as a duplicate of this bug. ***

Comment 22 Lucy Bopf 2017-05-25 04:32:06 UTC

Moving back to NEW to be reassigned as resources allow.

Comment 23 Lucy Bopf 2017-06-20 05:00:51 UTC

Assigning to Emma for review.

Emma, looks like we just need to add the note suggested in comment 20.

We don't really refer to 'Live Merge' as a concept, because it's an underlying operation when you delete a snapshot on a running virtual machine.

I think the following two locations would be the most suitable for this information:

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/technical_reference/#Snapshot_Deletion
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/virtual_machine_management_guide/#Deleting_a_snapshot

Comment 27 Emma Heftman 2017-07-18 11:47:53 UTC

The updated documentation is available on the Customer Portal:

Technical Guide: 
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html-single/technical_reference/#Snapshot_Deletion

Virtual Machine Management Guide:
https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/virtual_machine_management_guide/sect-snapshots
