Bug 1287066

Summary: [Cinder] Live preview fails to wake up a VM from hibernation
Product: [oVirt] ovirt-engine
Reporter: Ori Gofen <ogofen>
Component: BLL.Virt
Assignee: Daniel Erez <derez>
Status: CLOSED CURRENTRELEASE
QA Contact: Natalie Gavrielov <ngavrilo>
Severity: high
Docs Contact:
Priority: high
Version: 3.6.0.2
CC: acanan, amureini, bugs, michal.skrivanek, sbonazzo, tnisan, ylavi
Target Milestone: ovirt-3.6.2
Flags: rule-engine: ovirt-3.6.z+, ylavi: planning_ack+, amureini: devel_ack+, acanan: testing_ack+
Target Release: 3.6.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 11:17:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Log none

Description Ori Gofen 2015-12-01 12:27:20 UTC
Description of problem:
VMs with Cinder storage provider disks fail to perform a live preview (previewing a snapshot with RAM enabled). The following ERROR is thrown in the engine log:

"2015-12-01 12:09:31,629 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-99) [51d5c2a] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm_cinder_2 is down with error. Exit message: Wake up from hibernation failed:'_srcDomXML'."

After the failure, the machine rolled back and successfully booted from its HD device.
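
The quoted key name at the end of the exit message ('_srcDomXML') looks like the string form of a Python KeyError. The following is a minimal sketch of how such a message could arise, assuming (this is not confirmed by the log) that the restore path looks up a `_srcDomXML` entry that is absent from the VM parameters. The function name and the parameter dictionary are hypothetical and are not taken from vdsm.

```python
# Hypothetical illustration only: the function and the params dict are
# assumptions for this sketch, not vdsm's actual code.
def wake_up_from_hibernation(params):
    """Simulate a restore path that requires a saved domain XML."""
    try:
        src_dom_xml = params['_srcDomXML']  # raises KeyError if the key is absent
    except KeyError as e:
        # str(KeyError('x')) is "'x'", so the key name appears in quotes.
        return "Wake up from hibernation failed:{}".format(e)
    return "restored from {} bytes of domain XML".format(len(src_dom_xml))


print(wake_up_from_hibernation({}))
print(wake_up_from_hibernation({'_srcDomXML': '<domain/>'}))
```

With an empty dictionary the first call returns a message ending in the quoted key name, matching the shape of the exit message in the engine log.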

Version-Release number of selected component (if applicable):
rhevm-3.6.1-0.2.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Preview a live snapshot on a VM with Ceph storage

Actual results:
Live preview fails to wake up a VM from hibernation

Expected results:
Live preview should be supported

Additional info:

Comment 1 Ori Gofen 2015-12-01 12:29:11 UTC
Created attachment 1100862
Log

Comment 2 Allon Mureinik 2015-12-01 16:08:46 UTC
Taking to storage to research. If we discover that Cinder is inconsequential here, we may move it to the virt experts to research.

Comment 3 Daniel Erez 2015-12-01 19:15:03 UTC
Also reproduced the issue with a VM that has no disks. This seems to be an old issue dating back to the introduction of memory snapshots. Proposed a fix to vdsm.

Comment 4 Daniel Erez 2015-12-02 08:05:41 UTC
Hi Michal,

Should we support previewing a memory snapshot without disks?
Currently, when previewing such a snapshot, the VM fails to start due to an exception in vdsm (as mentioned in the description).

Thanks.

Comment 5 Michal Skrivanek 2015-12-02 11:12:00 UTC
yeah, we do want to support that. It was not exercised too often (apparently), but there's no reason we shouldn't fix it.

However, taking a snapshot with RAM when the VM has direct LUNs or Cinder... I don't think that, as a general case, we should silently succeed when we don't ensure the LUN/Cinder is snapshotted consistently as well. We should not allow resume without restoring the complete state of directly attached storage (as opposed to network-aware mounts like NFS).
Thoughts?

Comment 6 Sandro Bonazzola 2015-12-23 13:43:01 UTC
oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA

Comment 7 Allon Mureinik 2016-01-04 13:59:39 UTC
(In reply to Michal Skrivanek from comment #5)
> yeah, we do want to support that. It was not exercised too often
> (apparently), but there's no reason we shouldn't fix it.
> 
> However taking snapshot with RAM when the VM has direct LUNs or cinder...I
> don't think as a general case we should silently succeed when we don't
> ensure LUN/cinder is snapshot consistently as well. 
For Cinder we do, of course.

Comment 8 Natalie Gavrielov 2016-02-02 12:55:07 UTC
Verified, rhevm-3.6.2.6-0.1.el6.noarch.