Red Hat Bugzilla – Bug 1024811
[engine] Failure during live snapshot leaves vm configured to use new volume on next start
Last modified: 2016-02-10 11:52:25 EST
Created attachment 817457 [details]
engine and vdsm logs
Description of problem:
If a live snapshot fails while configuring the vm to use the new volume (after the volume itself was created successfully), the vm remains configured to use the new volume the next time it is started.
Also, the following message is displayed in the engine log:
2013-10-30 14:41:20,364 WARN [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (pool-4-thread-50) Wasnt able to live snapshot due to error: VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed (Failed with error SNAPSHOT_FAILED and code 48). VM will still be configured to the new created snapshot
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a live snapshot of a vm and have it fail after volume creation (e.g. after volume is created, block connection from host to storage on the host running the vm)
Actual results:
VM is configured to run with the new volume on next start.
Expected results:
If the vm could not be switched to the new volume after creation, the snapshot process should be considered "failed" and the snapshot deleted on the next vm start. The vm configuration should keep using the old volume.
Liron, is this related to the recent changes you've been doing around that area?
Right now, in case of failure in the live snapshot verb, the only treatment we have is a message to the user that the new volumes were created and that on the next vm restart it will start writing to them.
Of course this is not optimal, but handling this failure in a "smarter" way (e.g. inspecting the error returned from the live snapshot execution and acting accordingly) requires a few changes in the engine, and the scenario of having it fail is also very rare, so IMO that's not 3.3 material.
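To illustrate the "smarter" handling being discussed, here is a minimal, hypothetical Java sketch (not the actual oVirt engine code; class and field names are invented for illustration). The idea: when the SnapshotVDS call fails after the new volume was created, keep the vm pointed at the old volume and schedule the orphaned new volume for deletion, instead of leaving the vm configured to use it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of live-snapshot failure recovery.
// Names (VmConfig, createLiveSnapshot, etc.) are assumptions, not oVirt APIs.
public class LiveSnapshotRecovery {

    static class VmConfig {
        String activeVolumeId;
        // Volumes the engine should clean up on the next vm start.
        final List<String> volumesToDeleteOnNextStart = new ArrayList<>();

        VmConfig(String activeVolumeId) {
            this.activeVolumeId = activeVolumeId;
        }
    }

    /**
     * Attempt a live snapshot. The new volume is assumed to already exist.
     * If switching the running vm to the new volume fails (the SnapshotVDS
     * step), revert: keep the old volume active and mark the new volume
     * for deletion, so the vm does not silently start on it next boot.
     */
    static boolean createLiveSnapshot(VmConfig vm, String newVolumeId,
                                      boolean snapshotVdsSucceeded) {
        if (snapshotVdsSucceeded) {
            // Success: the running vm now writes to the new volume.
            vm.activeVolumeId = newVolumeId;
            return true;
        }
        // Failure (e.g. SNAPSHOT_FAILED, code 48): do NOT adopt the new
        // volume; schedule it for cleanup instead.
        vm.volumesToDeleteOnNextStart.add(newVolumeId);
        return false;
    }

    public static void main(String[] args) {
        VmConfig vm = new VmConfig("vol-old");
        boolean ok = createLiveSnapshot(vm, "vol-new", false);
        System.out.println(ok + " " + vm.activeVolumeId + " "
                + vm.volumesToDeleteOnNextStart);
    }
}
```

With this behavior, the failed-snapshot case described above would leave the vm on the old volume and queue "vol-new" for deletion rather than making it the active disk.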
Regardless, Fede is working on a patch to provide that "smarter" handling there -
IMO the severity can be reduced and it can be postponed.
*** This bug has been marked as a duplicate of bug 1018867 ***