Bug 1567617 - Failure to resume VM, Error: Wake up from hibernation failed:'type'.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.2.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ovirt-4.2.3
Target Release: 4.2.3.2
Assignee: Francesco Romani
QA Contact: Israel Pinto
URL:
Whiteboard:
Duplicates: 1567773
Depends On:
Blocks:
Reported: 2018-04-15 14:28 UTC by Israel Pinto
Modified: 2018-05-10 06:29 UTC (History)
CC List: 3 users

Fixed In Version: ovirt-engine-4.2.3.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-10 06:29:49 UTC
oVirt Team: Virt
michal.skrivanek: ovirt-4.2?
ykaul: blocker+
ipinto: planning_ack?
rule-engine: devel_ack+
mavital: testing_ack+


Attachments
engine,vdsm logs (1.27 MB, application/zip)
2018-04-15 14:28 UTC, Israel Pinto


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 90312 master MERGED virt: domxml_preproc: CDRoms may lack driver attrs 2018-04-16 12:57:52 UTC
oVirt gerrit 90313 master MERGED virt: domxml_preproc: minimal update for floppies 2018-04-16 16:22:10 UTC
oVirt gerrit 90341 master MERGED tests: fix usage of old XML API 2018-04-16 14:31:43 UTC
oVirt gerrit 90353 ovirt-4.2 MERGED virt: domxml_preproc: CDRoms may lack driver attrs 2018-04-17 10:10:17 UTC
oVirt gerrit 90354 ovirt-4.2 MERGED virt: domxml_preproc: minimal update for floppies 2018-04-17 13:14:56 UTC

Description Israel Pinto 2018-04-15 14:28:46 UTC
Created attachment 1422194
engine,vdsm logs

Description of problem:
Failed to resume VM after suspension 

Version-Release number of selected component (if applicable):
Engine version:4.2.3-0.1.el7
Host:RHEL - 7.5 - 8.el7
Kernel Version:3.10.0 - 861.el7.x86_64
KVM Version:2.10.0 - 21.el7_5.2
LIBVIRT Version:libvirt-3.9.0-14.el7_5.2
VDSM Version:vdsm-4.20.25-1.el7ev


How reproducible:
100 %

Steps to Reproduce:
1. Start VM
2. Suspend VM
3. Resume VM

Actual results:
VM failed to start.

Additional info:
Engine log:
2018-04-15 17:16:53,276+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-0) [] VM 'd6a8fd3f-6bdf-4030-9e31-ae193543e1c6'(suspend_resume_vm) moved from 'RestoringState' --> 'Down'
2018-04-15 17:16:53,387+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-0) [] EVENT_ID: VM_DOWN_ERROR(119), VM suspend_resume_vm is down with error. Exit message: Wake up from hibernation failed:'type'.

VDSM log:
2018-04-15 17:16:52,477+0300 ERROR (vm/d6a8fd3f) [virt.vm] (vmId='d6a8fd3f-6bdf-4030-9e31-ae193543e1c6') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2811, in _run
    hooks.before_vm_start(self._buildDomainXML(), self._custom)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2246, in _buildDomainXML
    self, dom, self._devices[hwclass.DISK])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 197, in update_disks_xml_from_objs
    dev_elem, disk_obj, vm.log, replace_attribs=True)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmdevices/storagexml.py", line 302, in update_disk_element_from_object
    old_drive_format = driver.attrib['type']
KeyError: 'type'
2018-04-15 17:16:52,478+0300 INFO  (vm/d6a8fd3f) [virt.vm] (vmId='d6a8fd3f-6bdf-4030-9e31-ae193543e1c6') Changed state to Down: 'type' (code=1) (vm:1683)
2018-04-15 17:16:52,478+0300 DEBUG (vm/d6a8fd3f) [virt.metadata.Descriptor] values: {'minGuaranteedMemoryMb': 1024, 'clusterVersion': '4.2', 'startTime': 1523801666.38, 'destroy_on_reboot': False, 'resumeBehavior': 'auto_resume', 'launchPaused': 'false', 'memGuaranteedSize': 1024} (metadata:596)
2018-04-15 17:16:52,478+0300 DEBUG (vm/d6a8fd3f) [virt.metadata.Descriptor] values updated: {'guestAgentAPIVersion': 3, 'clusterVersion': '4.2', 'exitMessage': "Wake up from hibernation failed:'type'", 'resumeBehavior': 'auto_resume', 'exitReason': 1, 'memGuaranteedSize': 1024, 'minGuaranteedMemoryMb': 1024, 'startTime': 1523801770.477857, 'destroy_on_reboot': False, 'launchPaused': 'false', 'exitCode': 1} (metadata:601)

Comment 1 Red Hat Bugzilla Rules Engine 2018-04-16 05:08:32 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 2 Yaniv Kaul 2018-04-16 06:33:36 UTC
This is regularly tested in OST.
Please try to understand how it was not caught there.

Comment 3 Francesco Romani 2018-04-16 07:22:30 UTC
(In reply to Israel Pinto from comment #0)
> Created attachment 1422194 [details]
> engine,vdsm logs
> 
> Description of problem:
> Failed to resume VM after suspension 
> 
> Version-Release number of selected component (if applicable):
> Engine version:4.2.3-0.1.el7
> Host:RHEL - 7.5 - 8.el7
> Kernel Version:3.10.0 - 861.el7.x86_64
> KVM Version:2.10.0 - 21.el7_5.2
> LIBVIRT Version:libvirt-3.9.0-14.el7_5.2
> VDSM Version:vdsm-4.20.25-1.el7ev

Looks like libvirt 3.9.0 returns XML that is valid and legal but omits data that Vdsm used to find there and still expects. Two examples, both for cdroms:

1. <source file="" startup="optional"> became
   <source startup="optional"> (fixed in Ia4dac75678b58b2f11f3cced2a88a78b17a76488)

2. <driver name="qemu" type="raw" error_policy="report"> became
   <driver error_policy='report'>,
   which is fine (from libvirt's POV) because name=qemu and type=raw are QEMU's defaults, AFAIR

It escaped OST because the OST/CI workers are still on EL7.4, IIUC. I myself did only very limited testing on EL7.5, which contributed to the bug; I'll update my env and do more tests ASAP.

The fix for this specific bz is simple, but we need more testing on 7.5; even a simple OST run would be enough.
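
For illustration only (this is not the merged Vdsm patch): a minimal sketch of how the trimmed <driver> element from example 2 produces the KeyError seen in the traceback, and one tolerant way to read the attribute instead. The function names and the ASSUMED_DRIVER_DEFAULTS table are hypothetical, and the default values rest on the AFAIR note above that name=qemu and type=raw are what QEMU assumes.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults (assumption): taken from the note above that
# name=qemu and type=raw are QEMU's defaults when libvirt omits them.
ASSUMED_DRIVER_DEFAULTS = {"name": "qemu", "type": "raw"}

# The two <driver> shapes quoted in example 2, wrapped in a minimal <disk>.
OLD_XML = ('<disk device="cdrom">'
           '<driver name="qemu" type="raw" error_policy="report"/>'
           '</disk>')
NEW_XML = ('<disk device="cdrom">'
           '<driver error_policy="report"/>'
           '</disk>')


def strict_drive_format(disk_elem):
    """Same kind of lookup as the failing line in the traceback."""
    driver = disk_elem.find("driver")
    return driver.attrib["type"]  # KeyError if 'type' is omitted


def tolerant_drive_format(disk_elem):
    """Fall back to the assumed default when the attribute is missing."""
    driver = disk_elem.find("driver")
    if driver is None:
        return ASSUMED_DRIVER_DEFAULTS["type"]
    return driver.attrib.get("type", ASSUMED_DRIVER_DEFAULTS["type"])


if __name__ == "__main__":
    print(strict_drive_format(ET.fromstring(OLD_XML)))
    try:
        strict_drive_format(ET.fromstring(NEW_XML))
    except KeyError as exc:
        # str(exc) is the quoted key, matching the exit message suffix.
        print("strict lookup failed:", exc)
    print(tolerant_drive_format(ET.fromstring(NEW_XML)))
```

Note that str() of a KeyError is just the quoted key, which would explain why the exit message ends in the otherwise cryptic 'type'.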

Comment 4 Michal Skrivanek 2018-04-17 05:04:34 UTC
*** Bug 1567773 has been marked as a duplicate of this bug. ***

Comment 5 Israel Pinto 2018-04-22 14:53:48 UTC
Verified with:
Engine version: 4.2.3.2-0.1.el7

Steps to Reproduce:
1. Start VM
2. Suspend VM
3. Resume VM

PASS

Comment 6 Sandro Bonazzola 2018-05-10 06:29:49 UTC
This bugzilla is included in the oVirt 4.2.3 release, published on May 4th, 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

