Bug 1412646

Summary: Restore/Preview of RAM snapshot failed
Product: [oVirt] vdsm
Reporter: Raz Tamir <ratamir>
Component: General
Assignee: Milan Zamazal <mzamazal>
Status: CLOSED CURRENTRELEASE
QA Contact: Raz Tamir <ratamir>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.19.1
CC: ahadas, bugs, derez, gklein, mzamazal, ratamir, tjelinek, tnisan
Target Milestone: ovirt-4.1.0-beta
Keywords: Automation, Regression
Target Release: ---
Flags: rule-engine: ovirt-4.1+
       rule-engine: blocker+
       rule-engine: planning_ack+
       ahadas: devel_ack+
       rule-engine: testing_ack+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-15 14:53:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments: engine and vdsm logs (flags: none)

Description Raz Tamir 2017-01-12 14:05:54 UTC
Created attachment 1239940 [details]
engine and vdsm logs

Description of problem:
When trying to restore a VM to a snapshot with memory, the operation succeeds but the memory state is not restored.

In the engine.log:
2017-01-12 11:43:07,973+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-5) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm_TestCase5134_REST_ISCSI_1211373723 is down with error. Exit message: Wake up from hibernation failed:'MutableDomainDescriptor' object has no attribute '_devices'.
2017-01-12 11:43:07,974+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-5) [] add VM 'c1c3c2af-456b-485a-b0b0-0a7037426c51'(vm_TestCase5134_REST_ISCSI_1211373723) to rerun treatment
2017-01-12 11:43:07,981+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-5) [] Rerun VM 'c1c3c2af-456b-485a-b0b0-0a7037426c51'. Called from VDS 'host_mixed_3'
2017-01-12 11:43:07,995+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-26) [] Correlation ID: vms_syncAction_d01938e6-1199-45e0, Job ID: 504bda6d-67fe-41ee-af6d-ffaf4f669f08, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM vm_TestCase5134_REST_ISCSI_1211373723 on Host host_mixed_3.


In vdsm.log:
2017-01-12 11:43:07,078 INFO  (vm/c1c3c2af) [dispatcher] Run and protect: prepareImage, Return response: {'info': {'domainID': '13790281-69c2-4a63-9ec4-0781d4cb33d3', 'volType': 'path', 'leaseOffset': 119537664, 'path': '/rhev/data-center/mnt/blockSD/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/eeaea5de-35d4-47ff-b213-0c85cf91930c', 'volumeID': 'eeaea5de-35d4-47ff-b213-0c85cf91930c', 'leasePath': '/dev/13790281-69c2-4a63-9ec4-0781d4cb33d3/leases', 'imageID': '1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d'}, 'path': '/rhev/data-center/ad32aa2c-fda4-4eab-83b2-a78568d48bd1/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/eeaea5de-35d4-47ff-b213-0c85cf91930c', 'imgVolumesInfo': [{'domainID': '13790281-69c2-4a63-9ec4-0781d4cb33d3', 'volType': 'path', 'leaseOffset': 109051904, 'path': '/rhev/data-center/mnt/blockSD/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/08e1b37f-eff4-4bac-a567-53e90bee5c75', 'volumeID': '08e1b37f-eff4-4bac-a567-53e90bee5c75', 'leasePath': '/dev/13790281-69c2-4a63-9ec4-0781d4cb33d3/leases', 'imageID': '1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d'}, {'domainID': '13790281-69c2-4a63-9ec4-0781d4cb33d3', 'volType': 'path', 'leaseOffset': 119537664, 'path': '/rhev/data-center/mnt/blockSD/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/eeaea5de-35d4-47ff-b213-0c85cf91930c', 'volumeID': 'eeaea5de-35d4-47ff-b213-0c85cf91930c', 'leasePath': '/dev/13790281-69c2-4a63-9ec4-0781d4cb33d3/leases', 'imageID': '1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d'}, {'domainID': '13790281-69c2-4a63-9ec4-0781d4cb33d3', 'volType': 'path', 'leaseOffset': 114294784, 'path': '/rhev/data-center/mnt/blockSD/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/d96b059a-f631-4ed6-a334-1271fc546b65', 'volumeID': 'd96b059a-f631-4ed6-a334-1271fc546b65', 'leasePath': '/dev/13790281-69c2-4a63-9ec4-0781d4cb33d3/leases', 'imageID': '1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d'}, {'domainID': '13790281-69c2-4a63-9ec4-0781d4cb33d3', 'volType': 'path', 'leaseOffset': 117440512, 'path': '/rhev/data-center/mnt/blockSD/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/a042979c-83da-4442-849b-cc7de904237d', 'volumeID': 'a042979c-83da-4442-849b-cc7de904237d', 'leasePath': '/dev/13790281-69c2-4a63-9ec4-0781d4cb33d3/leases', 'imageID': '1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d'}]} (logUtils:52)
2017-01-12 11:43:07,078 INFO  (vm/c1c3c2af) [vds] prepared volume path: /rhev/data-center/ad32aa2c-fda4-4eab-83b2-a78568d48bd1/13790281-69c2-4a63-9ec4-0781d4cb33d3/images/1720e3e1-b1a9-44ad-a18b-cf6c6f292e0d/eeaea5de-35d4-47ff-b213-0c85cf91930c (clientIF:374)
2017-01-12 11:43:07,104 ERROR (vm/c1c3c2af) [virt.vm] (vmId='c1c3c2af-456b-485a-b0b0-0a7037426c51') The vm start process failed (vm:616)
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 552, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1953, in _run
    srcDomXML = self._correctDiskVolumes(srcDomXML)
  File "/usr/share/vdsm/virt/vm.py", line 2028, in _correctDiskVolumes
    for element in domain.get_device_elements('disk'):
  File "/usr/share/vdsm/virt/domain_descriptor.py", line 41, in get_device_elements
    return vmxml.find_all(self._devices, tagName)
AttributeError: 'MutableDomainDescriptor' object has no attribute '_devices'
2017-01-12 11:43:07,105 INFO  (vm/c1c3c2af) [virt.vm] (vmId='c1c3c2af-456b-485a-b0b0-0a7037426c51') Changed state to Down: 'MutableDomainDescriptor' object has no attribute '_devices' (code=1) (vm:1198)
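
For context, the traceback shows a classic refactoring hazard: an inherited method still reads an attribute that the subclass's __init__ no longer sets. The following minimal Python sketch is hypothetical -- the class internals are assumed for illustration, not taken from the actual vdsm code -- but it reproduces the same AttributeError:

import xml.etree.ElementTree as ET

class DomainDescriptor(object):
    """Immutable descriptor: parses the XML once and caches <devices>."""
    def __init__(self, xml_str):
        self._dom = ET.fromstring(xml_str)
        self._devices = self._dom.find('devices')  # cached only here

    def get_device_elements(self, tag_name):
        # Still reads the attribute cached by __init__ above.
        return self._devices.findall(tag_name)

class MutableDomainDescriptor(DomainDescriptor):
    """Hypothetical refactored subclass: the XML can change, so the
    cache was dropped -- and with it the _devices attribute."""
    def __init__(self, xml_str):
        self._dom = ET.fromstring(xml_str)  # parent __init__ not called

    @property
    def devices(self):
        return self._dom.find('devices')  # recomputed on every access

MutableDomainDescriptor(
    '<domain><devices><disk/><disk/></devices></domain>'
).get_device_elements('disk')
# AttributeError: 'MutableDomainDescriptor' object has no attribute '_devices'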



Version-Release number of selected component (if applicable):
ovirt-engine-4.1.0-0.4.master.20170111000229.git9ce0636.el7.centos.noarch
vdsm-4.19.1-24.git7747cad.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with 1 disk and OS installed
2. Run the VM and start a cat process --> $ cat &
3. Create a snapshot (VM running) with memory state
4. Power off the VM
5. Preview the snapshot
6. Start the VM and check for the process --> $ pgrep cat
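
For anyone scripting this reproducer, below is a rough sketch using the ovirtsdk4 Python SDK. The connection details, VM name, and polling helper are placeholders, and the SDK calls (persist_memorystate, preview_snapshot, restore_memory) should be double-checked against the SDK version in use:

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

# Placeholder connection details -- adjust for your environment.
connection = sdk.Connection(url='https://engine.example.com/ovirt-engine/api',
                            username='admin@internal', password='secret',
                            ca_file='ca.pem')
vms_service = connection.system_service().vms_service()
vm = vms_service.list(search='name=snapshot_test')[0]
vm_service = vms_service.vm_service(vm.id)
snapshots_service = vm_service.snapshots_service()

def wait_for(condition, timeout=300):
    # Crude polling helper, for illustration only.
    deadline = time.time() + timeout
    while not condition():
        if time.time() > deadline:
            raise RuntimeError('timed out')
        time.sleep(5)

# Step 3: live snapshot including memory state ('cat &' already running).
snap = snapshots_service.add(
    types.Snapshot(description='ram-snap', persist_memorystate=True))
wait_for(lambda: snapshots_service.snapshot_service(snap.id).get()
         .snapshot_status == types.SnapshotStatus.OK)

# Steps 4-6: power off, preview the snapshot with memory, start again.
vm_service.stop()
wait_for(lambda: vm_service.get().status == types.VmStatus.DOWN)
vm_service.preview_snapshot(snapshot=types.Snapshot(id=snap.id),
                            restore_memory=True)
vm_service.start()
# Inside the guest, 'pgrep cat' should now find the process from step 2.
connection.close()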


Actual results:
The process doesn't exist

Expected results:
The cat process from step 2 is still running, since the VM's memory state should be restored from the snapshot.

Additional info:
This is a regression from 4.0.6.3-0.1.el7ev

Comment 1 Yaniv Kaul 2017-01-12 15:13:55 UTC
virt issue?

Comment 2 Tal Nisan 2017-01-12 16:00:25 UTC
Daniel, any chance it's related to your latest change in the snapshots mechanism?

Whether it's virt or storage, it needs to be targeted to 4.1 beta, so I'm targeting it.

Comment 3 Daniel Erez 2017-01-15 13:25:14 UTC
@Raz - is it reproducible without an installed OS, or without step 2 ("start a cat process --> $ cat &")? I couldn't reproduce the issue on a local env.

Comment 4 Raz Tamir 2017-01-15 15:54:17 UTC
Daniel,
The snapshot operation itself works fine, so there is nothing to gain from checking this scenario without an OS; the issue relies on comparing the processes that are running when taking the RAM snapshot with those running after restoring/previewing the snapshot.

Comment 5 Daniel Erez 2017-01-15 18:09:04 UTC
> Daniel,
> The snapshot operation itself works fine, so there is nothing to gain from
> checking this scenario without an OS; the issue relies on comparing the
> processes that are running when taking the RAM snapshot with those running
> after restoring/previewing the snapshot.

@Arik - what do you think? familiar with this issue?

Comment 6 Arik 2017-01-16 08:21:37 UTC
(In reply to Daniel Erez from comment #5) 
> @Arik - what do you think? familiar with this issue?

Seems like a regression in VDSM, related to refactoring done in virt-related code.

Comment 7 Milan Zamazal 2017-01-16 08:54:54 UTC
This has already been fixed in master (commit f83f3a0) and ovirt-4.1 (commit e19ff7a).
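
The fix itself isn't quoted here, but continuing the hypothetical sketch from the traceback above, one way to remove this class of regression is to look the devices up on demand through a property shared by both descriptors, so no inherited method depends on a cached attribute:

import xml.etree.ElementTree as ET

class DomainDescriptor(object):
    def __init__(self, xml_str):
        self._dom = ET.fromstring(xml_str)

    @property
    def _devices(self):
        # Looked up on demand, so a subclass that mutates or re-parses
        # self._dom never holds a stale (or missing) cache.
        return self._dom.find('devices')

    def get_device_elements(self, tag_name):
        return self._devices.findall(tag_name)

class MutableDomainDescriptor(DomainDescriptor):
    pass  # inherits a _devices that always reflects the current XML

print(len(MutableDomainDescriptor(
    '<domain><devices><disk/><disk/></devices></domain>'
).get_device_elements('disk')))  # -> 2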

Comment 8 Tal Nisan 2017-01-16 12:31:32 UTC
Milan, please verify this fix; I'd like to make sure it's verified before the build.

Comment 9 Milan Zamazal 2017-01-16 14:16:55 UTC
I tried the initial Steps to Reproduce with Engine master and with the current Vdsm ovirt-4.1 branch. It works for me, so there should no longer be a problem on the Vdsm side.

Comment 10 Raz Tamir 2017-01-23 15:35:31 UTC
Executed all our tier 1 and tier 2 tests on ovirt-engine-4.1.0.3-0.0.master.20170122091652.gitc6fc2c2.el7.centos and all passed.
Moving to verified.