Created attachment 1364482 [details]
engine and vdsm logs

Description of problem:
Starting a VM that was restored from a snapshot with memory fails.

engine.log:

2017-12-07 23:17:15,155+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (default task-3) [vms_syncAction_9e520e8a-0d11-4b6f] VM {pitReinjection=false, memGuaranteedSize=1024, smpThreadsPerCore=1, cpuType=Conroe, vmId=0139b240-d208-4d7b-8f0b-e8981ce26988, acpiEnable=true, vmType=kvm, smp=1, smpCoresPerSocket=1, emulatedMachine=pc-i440fx-rhel7.3.0, smartcardEnable=false, guestNumaNodes=[{memory=1024, cpus=0, nodeIndex=0}], transparentHugePages=true, displayNetwork=ovirtmgmt, vmName=vm_TestCase5138_0723125633, hiberVolHandle=650155db-bb01-4234-a974-724658ea8365,93512097-d821-43d7-806d-1b7cb44091b4,48dd6512-a488-4ba1-ae00-db5de01b6c9a,b101fce4-7f77-4e4d-83eb-b1674e8ea2ca,ef009c18-0c8a-4888-8b0c-d23f61c97c0c,f30b55cb-9377-4634-8e8d-fbe480d07e66, maxVCpus=16, kvmEnable=true, devices=[{address={bus=0x00, domain=0x0000, function=0x0, slot=0x02, type=pci}, type=video, specParams={vgamem=16384, heads=1, vram=8192, ram=65536}, device=qxl, deviceId=ad4b78b4-ecc0-466b-90ef-b25d1e0c8506}, {type=graphics, specParams={fileTransferEnable=true, spiceSecureChannels=smain,sinputs,scursor,splayback,srecord,sdisplay,ssmartcard,susbredir, copyPasteEnable=true}, device=spice, deviceId=6b4563ca-b43b-439e-a797-9552469d5fb5}, {iface=ide, shared=false, path=, address={bus=1, controller=0, unit=0, type=drive, target=0}, readonly=true, index=2, type=disk, specParams={}, device=cdrom, deviceId=9de754c3-c9cd-4018-a307-425ae9a2bda0}, {discard=false, shared=false, address={bus=0x00, domain=0x0000, function=0x0, slot=0x07, type=pci}, imageID=e0908c0b-69ce-4e43-b4d0-fd4e4bddb3ef, format=cow, index=0, optional=false, type=disk, deviceId=e0908c0b-69ce-4e43-b4d0-fd4e4bddb3ef, domainID=650155db-bb01-4234-a974-724658ea8365, propagateErrors=off, iface=virtio, readonly=false, bootOrder=1, poolID=93512097-d821-43d7-806d-1b7cb44091b4, volumeID=8e3d6391-5987-435d-b4f2-5bb1c1892e38, diskType=file, specParams={}, device=disk}, {filter=vdsm-no-mac-spoofing, nicModel=pv, address={bus=0x00, domain=0x0000, function=0x0, slot=0x03, type=pci}, filterParameters=[], type=interface, specParams={inbound={}, outbound={}}, device=bridge, linkActive=true, deviceId=29a6d2f8-4a8c-4ce2-a7db-069540ab0ab3, macAddr=00:1a:4a:16:25:9e, network=ovirtmgmt}, {address={bus=0x00, domain=0x0000, function=0x0, slot=0x04, type=pci}, type=sound, specParams={}, device=ich6, deviceId=887c8033-c85c-4102-bcb3-07f0d494bbaf}, {address={bus=0x00, domain=0x0000, function=0x2, slot=0x01, type=pci}, index=0, model=piix3-uhci, type=controller, specParams={}, device=usb, deviceId=2e7f5d47-f728-4df8-be10-faf51e7c1102}, {address={bus=0x00, domain=0x0000, function=0x0, slot=0x08, type=pci}, type=balloon, specParams={model=virtio}, device=memballoon, deviceId=92744e14-a7b6-42cd-b0f3-a65e606a430f}, {address={bus=0x00, domain=0x0000, function=0x0, slot=0x05, type=pci}, index=0, model=virtio-scsi, type=controller, specParams={}, device=scsi, deviceId=9deba246-6f03-4e39-8213-6265d6520399}, {address={bus=0x00, domain=0x0000, function=0x0, slot=0x06, type=pci}, type=controller, specParams={}, device=virtio-serial, deviceId=efedebcc-37fd-4771-82c2-a40d97e39567}, {model=virtio, type=rng, specParams={source=urandom}, device=virtio, deviceId=0c7b8ac9-e134-4b24-8141-e9365f4414f9}], custom={device_6c492117-858e-4f15-a319-801339270d03device_9ae8e066-6afd-4575-a525-816f5444a9cbdevice_be9e350c-6556-4c00-9373-9d4528cb0b71=VmDevice:{id='VmDeviceId:{deviceId='be9e350c-6556-4c00-9373-9d4528cb0b71', vmId='0139b240-d208-4d7b-8f0b-e8981ce26988'}', device='ide', type='CONTROLLER', specParams='[index=0]', address='{type=pci, slot=0x01, bus=0x00, domain=0x0000, function=0x1}', managed='false', plugged='true', readOnly='false', deviceAlias='ide', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}, device_6c492117-858e-4f15-a319-801339270d03device_9ae8e066-6afd-4575-a525-816f5444a9cbdevice_be9e350c-6556-4c00-9373-9d4528cb0b71device_5c00be2a-429b-4c97-953e-23dad0207fbd=VmDevice:{id='VmDeviceId:{deviceId='5c00be2a-429b-4c97-953e-23dad0207fbd', vmId='0139b240-d208-4d7b-8f0b-e8981ce26988'}', device='spicevmc', type='CHANNEL', specParams='[]', address='{type=virtio-serial, bus=0, controller=0, port=3}', managed='false', plugged='true', readOnly='false', deviceAlias='channel2', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}, device_6c492117-858e-4f15-a319-801339270d03device_9ae8e066-6afd-4575-a525-816f5444a9cb=VmDevice:{id='VmDeviceId:{deviceId='9ae8e066-6afd-4575-a525-816f5444a9cb', vmId='0139b240-d208-4d7b-8f0b-e8981ce26988'}', device='unix', type='CHANNEL', specParams='[]', address='{type=virtio-serial, bus=0, controller=0, port=2}', managed='false', plugged='true', readOnly='false', deviceAlias='channel1', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}, device_6c492117-858e-4f15-a319-801339270d03=VmDevice:{id='VmDeviceId:{deviceId='6c492117-858e-4f15-a319-801339270d03', vmId='0139b240-d208-4d7b-8f0b-e8981ce26988'}', device='unix', type='CHANNEL', specParams='[]', address='{type=virtio-serial, bus=0, controller=0, port=1}', managed='false', plugged='true', readOnly='false', deviceAlias='channel0', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}}, timeOffset=0, nice=0, maxMemSize=4096, maxMemSlots=16, bootMenuEnable=false, memSize=1024, agentChannelName=ovirt-guest-agent.0}
2017-12-07 23:17:15,506+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateBrokerVDSCommand] (default task-3) [vms_syncAction_9e520e8a-0d11-4b6f] FINISH, CreateBrokerVDSCommand, log id: 3de9145
2017-12-07 23:17:15,510+02 INFO [org.ovirt.engine.core.vdsbroker.CreateVDSCommand] (default task-3) [vms_syncAction_9e520e8a-0d11-4b6f] FINISH, CreateVDSCommand, return: RestoringState, log id: 3ade8afd
2017-12-07 23:17:15,510+02 INFO [org.ovirt.engine.core.bll.RunVmCommand] (default task-3) [vms_syncAction_9e520e8a-0d11-4b6f] Lock freed to object 'EngineLock:{exclusiveLocks='[0139b240-d208-4d7b-8f0b-e8981ce26988=VM]', sharedLocks=''}'
2017-12-07 23:17:15,517+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-7) [] VM '0139b240-d208-4d7b-8f0b-e8981ce26988' was reported as Down on VDS '801e4f72-1885-4fee-8cc2-afc31a6d6068'(host_mixed_2)
2017-12-07 23:17:15,520+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-7) [] START, DestroyVDSCommand(HostName = host_mixed_2, DestroyVmVDSCommandParameters:{hostId='801e4f72-1885-4fee-8cc2-afc31a6d6068', vmId='0139b240-d208-4d7b-8f0b-e8981ce26988', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 1ed2dac4
2017-12-07 23:17:15,524+02 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-3) [vms_syncAction_9e520e8a-0d11-4b6f] EVENT_ID: USER_STARTED_VM(153), VM vm_TestCase5138_0723125633 was started by admin@internal-authz (Host: host_mixed_2).
2017-12-07 23:17:15,621+02 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engineScheduled-Thread-28) [] Setting new tasks map. The map contains now 2 tasks
2017-12-07 23:17:17,554+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-7) [] FINISH, DestroyVDSCommand, log id: 1ed2dac4
2017-12-07 23:17:17,555+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-7) [] VM '0139b240-d208-4d7b-8f0b-e8981ce26988'(vm_TestCase5138_0723125633) moved from 'RestoringState' --> 'Down'
2017-12-07 23:17:17,574+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-7) [] EVENT_ID: VM_DOWN_ERROR(119), VM vm_TestCase5138_0723125633 is down with error. Exit message: Wake up from hibernation failed:'vmName'.
2017-12-07 23:17:17,575+02 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-7) [] add VM '0139b240-d208-4d7b-8f0b-e8981ce26988'(vm_TestCase5138_0723125633) to rerun treatment
2017-12-07 23:17:17,582+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-7) [] Rerun VM '0139b240-d208-4d7b-8f0b-e8981ce26988'. Called from VDS 'host_mixed_2'
2017-12-07 23:17:17,603+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-15422) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM vm_TestCase5138_0723125633 on Host host_mixed_2.

vdsm.log:

2017-12-07 23:17:15,501+0200 INFO (jsonrpc/0) [api.virt] FINISH create return={'status': {'message': 'Done', 'code': 0}, 'vmList': {'status': 'Restoring state', 'vmName': 'vm_TestCase5138_0723125633', 'vmId': '0139b240-d208-4d7b-8f0b-e8981ce26988', 'devices': [], 'guestDiskMapping': {}, 'clientIp': '', 'statusTime': '4314659320', 'arch': 'x86_64'}} from=::ffff:10.35.161.183,58774, flow_id=vms_syncAction_9e520e8a-0d11-4b6f (api:52)
2017-12-07 23:17:15,503+0200 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.create succeeded in 0.34 seconds (__init__:573)
2017-12-07 23:17:15,505+0200 INFO (vm/0139b240) [virt.vm] (vmId='0139b240-d208-4d7b-8f0b-e8981ce26988') VM wrapper has started (vm:2605)
2017-12-07 23:17:15,507+0200 INFO (vm/0139b240) [virt.vm] (vmId='0139b240-d208-4d7b-8f0b-e8981ce26988') Enabling drive monitoring (drivemonitor:53)
2017-12-07 23:17:15,508+0200 WARN (vm/0139b240) [virt.vm] (vmId='0139b240-d208-4d7b-8f0b-e8981ce26988') No balloon device present (vm:715)
2017-12-07 23:17:15,509+0200 ERROR (vm/0139b240) [virt.vm] (vmId='0139b240-d208-4d7b-8f0b-e8981ce26988') The vm start process failed (vm:911)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 840, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2652, in _run
    hooks.before_vm_start(self._buildDomainXML(), self._custom)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2166, in _buildDomainXML
    return self._make_domain_xml()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2172, in _make_domain_xml
    libvirtxml.Domain(self.conf, self.log, self.arch)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/libvirtxml.py", line 73, in __init__
    self.dom.appendChildWithArgs('name', text=self.conf['vmName'])
KeyError: 'vmName'
2017-12-07 23:17:15,509+0200 INFO (vm/0139b240) [virt.vm] (vmId='0139b240-d208-4d7b-8f0b-e8981ce26988') Changed state to Down: 'vmName' (code=1) (vm:1630)

Version-Release number of selected component (if applicable):
vdsm-4.20.9-1.el7ev.x86_64
rhvm-4.2.0-0.6.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a live snapshot with memory
2. Restore to that snapshot (preview + commit)
3. Start the VM

Actual results:
RAM snapshot restore is not working as expected; the VM fails to start.

Expected results:
The VM starts successfully from the restored memory snapshot.

Additional info:
Regression - bug 1516405 was verified
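For illustration only, here is a minimal stand-alone sketch of the failure mode visible in the vdsm traceback above: the domain XML builder reads conf['vmName'], and when the engine does not send vmconf for the restored VM that key is absent, so the lookup raises KeyError('vmName') before libvirt is ever called. The helper name below is hypothetical; this is not the actual vdsm code.

    # Hypothetical, simplified stand-in for the failing lookup in
    # vdsm/virt/libvirtxml.py -- illustrative only, not the real vdsm code.
    def build_name_element(conf):
        # Same failing access as in the traceback: conf has no 'vmName' key
        # when the engine sends no vmconf for the restored VM.
        return '<name>%s</name>' % conf['vmName']

    # conf as received for the restored-from-memory VM -- no 'vmName':
    restored_conf = {'vmId': '0139b240-d208-4d7b-8f0b-e8981ce26988'}

    try:
        build_name_element(restored_conf)
    except KeyError as exc:
        # Surfaces to the engine as "Wake up from hibernation failed:'vmName'"
        print('Wake up from hibernation failed:%r' % exc.args[0])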
Works for me.

In which version was the snapshot created?
(In reply to Michal Skrivanek from comment #1)
Ah, the debug logging is not set up correctly, never mind the previous question.

Raz, can you please make sure the debug log level is enabled for api.virt as well? Despite all the storage debug noise we're missing the crucial logs for many other calls.
(In reply to Michal Skrivanek from comment #2)
> (In reply to Michal Skrivanek from comment #1)
> Ah, the debug logging is not set up correctly, never mind the previous
> question.
The snapshot was created in the same version as the engine and vdsm.
>
> Raz, can you please make sure the debug log level is enabled for api.virt as
> well? Despite all the storage debug noise we're missing the crucial logs for
> many other calls.
Sure. Do you want me to reproduce with DEBUG level for virt?
Seems the assumption in https://gerrit.ovirt.org/#/c/84266/ is wrong. The hiberVolHandle volume doesn't contain vmconf; it just contains the libvirt XML (obtained from libvirt on hibernation in both 4.1 and 4.2).

Arik, please double-check that the code in executeVdsBrokerCommand checks the compatibility version of that snapshot and not the current VM's compatibility version. I.e., if the snapshot was taken in 4.1 and the VM is now in a 4.2 cluster, on preview there should be a warning/note that the VM is going to be restored as a 4.1-compat VM (with cluster level override set to 4.1) - please make sure that in that case the vmconf is sent.

Once that's correct, the vdsm side can safely assume the snapshot is either:
- created in 4.1 - the engine sends vmconf and the code works as is. The _srcDomXML is used only for the libvirt create call.
- created in 4.2 - the engine doesn't send vmconf, so vdsm needs to replace (create) "xml" with _srcDomXML, as the current way to differentiate old from new (if "xml" in params or self.conf) relies on it. See the sketch below.
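To make the distinction concrete, a rough sketch of the decision described above, under the assumption that vdsm has the engine-sent create parameters, the stored VM conf, and the libvirt XML read from the hiberVolHandle volume at hand. The function and parameter names are made up for illustration; this is not the actual vdsm patch or API.

    # Illustrative sketch only -- hypothetical names, not the actual vdsm code.
    # 'params' is what the engine sent on VM.create, 'conf' is the stored VM
    # conf, and 'src_dom_xml' is the libvirt XML saved at hibernation time.
    def resolve_domain_xml(params, conf, src_dom_xml):
        if 'xml' in params or 'xml' in conf:
            # 4.2-style create: the engine already provided the domain XML.
            return params.get('xml') or conf['xml']
        if 'vmName' in conf:
            # 4.1-compat restore: the engine sent vmconf, so the existing
            # build-XML-from-conf path keeps working unchanged.
            return None  # meaning: build the domain XML from conf
        # Memory snapshot created in 4.2 with no vmconf sent: fall back to the
        # libvirt XML from hibernation instead of failing on KeyError('vmName').
        return src_dom_xml

Either way, the key point is that vdsm's old/new detection hinges on the presence of "xml", so a 4.2-created memory snapshot with no vmconf must end up populating it from _srcDomXML.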
(In reply to Raz Tamir from comment #3)
> Sure.
Thanks, it would greatly help with these cases.
> Do you want me to reproduce with DEBUG level for virt?
Currently there's no need to; I believe I was able to deduce what's happening. Let's wait for a fix and try then! :)
Applied the patch to one of the hosts and went through the reproducer steps to create the snapshot and switch to it. Attempts to run the VM fail on a non-patched host and succeed on the patched one.
Passed automation run on:
rhvm-4.2.0.2-0.1.el7.noarch
vdsm-4.20.9.2-1.el7ev.x86_64

Verified.
This bugzilla is included in the oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in that release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.