Bug 1507884 - VM fails to run after being suspended
Summary: VM fails to run after being suspended
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: ---
Assignee: Milan Zamazal
QA Contact: Mor
URL:
Whiteboard:
Duplicates: 1507966
Depends On:
Blocks:
Reported: 2017-10-31 11:36 UTC by Mor
Modified: 2019-04-28 11:13 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-12-20 11:34:09 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments
logs (libvirt, vdsm, engine) (12.79 MB, application/octet-stream)
2017-10-31 11:37 UTC, Mor


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 83521 0 master MERGED virt: Make sure Drive.path is not None for file devices 2017-11-06 12:28:35 UTC

Description Mor 2017-10-31 11:36:42 UTC
Description of problem:
After a running VM is suspended, it fails to run again.

Version-Release number of selected component (if applicable):
4.2.0-0.0.master.20171029154613.git19686f3.el7.centos
vdsm-4.20.4-13.gitec7ffb6.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Run a VM.
2. Suspend the VM.
3. Try to run it again.

Actual results:
The VM fails to run and an error is reported (see the logs below).

Expected results:
The VM should run.

Additional info:

engine.log:
-----------
2017-10-31 13:28:48,580+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] Failed in 'DestroyVDS' method
2017-10-31 13:28:48,591+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-15) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command DestroyVDS failed: General Exception: ("VM '696da21b-693a-4b57-abfa-743a76f4cb4e' was not defined yet or was undefined",)
2017-10-31 13:28:48,592+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand' return value 'StatusOnlyReturn [status=Status [code=100, message=General Exception: ("VM '696da21b-693a-4b57-abfa-743a76f4cb4e' was not defined yet or was undefined",)]]'
2017-10-31 13:28:48,592+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] HostName = host_mixed_1
2017-10-31 13:28:48,592+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] Command 'DestroyVDSCommand(HostName = host_mixed_1, DestroyVmVDSCommandParameters:{hostId='9d81c1b8-2d74-431c-81eb-881cc64095a6', vmId='696da21b-693a-4b57-abfa-743a76f4cb4e', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = General Exception: ("VM '696da21b-693a-4b57-abfa-743a76f4cb4e' was not defined yet or was undefined",), code = 100
2017-10-31 13:28:48,592+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-15) [] FINISH, DestroyVDSCommand, log id: 13cc05f7
2017-10-31 13:28:48,593+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-15) [] VM '696da21b-693a-4b57-abfa-743a76f4cb4e'(golden_env_mixed_virtio_1) moved from 'RestoringState' --> 'Down'
2017-10-31 13:28:48,755+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-15) [] EVENT_ID: VM_DOWN_ERROR(119), VM golden_env_mixed_virtio_1 is down with error. Exit message: Wake up from hibernation failed:cannot serialize None (type NoneType).
2017-10-31 13:28:48,757+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-15) [] add VM '696da21b-693a-4b57-abfa-743a76f4cb4e'(golden_env_mixed_virtio_1) to rerun treatment
2017-10-31 13:28:48,763+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-15) [] Rerun VM '696da21b-693a-4b57-abfa-743a76f4cb4e'. Called from VDS 'host_mixed_1'
2017-10-31 13:28:48,883+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-4485) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM golden_env_mixed_virtio_1 on Host host_mixed_1.

vdsm.log:
---------
2017-10-31 13:28:48,577+0200 ERROR (jsonrpc/3) [api] FINISH destroy error=VM '696da21b-693a-4b57-abfa-743a76f4cb4e' was not defined yet or was undefined (api:127)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 312, in destroy
    res = self.vm.destroy(gracefulAttempts)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4984, in destroy
    self._deleteVm()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4973, in _deleteVm
    self._undefine_domain()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2296, in _undefine_domain
    self._dom.undefine()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
    % self.vmid)
NotConnectedError: VM '696da21b-693a-4b57-abfa-743a76f4cb4e' was not defined yet or was undefined
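
The exit message in engine.log, "cannot serialize None (type NoneType)", has the same wording as the TypeError that Python's xml.etree.ElementTree raises when an attribute value is None at serialization time. Whether vdsm hit exactly this code path is an assumption here, not something the logs confirm; the helper names and the disk/source element layout below are illustrative only. A minimal sketch of how such an error can arise:

```python
import xml.etree.ElementTree as ET


def serialize_disk(path):
    """Build a tiny <disk> element and serialize it (illustrative only)."""
    disk = ET.Element("disk", type="file")
    # ElementTree does not validate attribute values when the element is
    # created; a None value is only rejected when the tree is serialized.
    ET.SubElement(disk, "source", file=path)
    return ET.tostring(disk, encoding="unicode")


def try_serialize(path):
    """Return the XML string, or the error text if serialization fails."""
    try:
        return serialize_disk(path)
    except TypeError as exc:
        return "error: %s" % exc


if __name__ == "__main__":
    print(try_serialize("/tmp/disk.img"))  # hypothetical path
    print(try_serialize(None))
```

With a string path the element serializes normally; with None the serializer raises a TypeError, which would surface late, at the point where the XML is generated rather than where the None was introduced.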

Comment 1 Mor 2017-10-31 11:37:59 UTC
Created attachment 1345841
logs (libvirt, vdsm, engine)

Comment 2 Mor 2017-10-31 19:57:05 UTC
Correction to step 1 on 'steps to reproduce':
1. Use Run VM once, and choose a specific host to run on.

Comment 3 Yaniv Kaul 2017-11-01 08:07:52 UTC

*** This bug has been marked as a duplicate of bug 1507966 ***

Comment 4 Yaniv Kaul 2017-11-01 09:45:00 UTC
*** Bug 1507966 has been marked as a duplicate of this bug. ***

Comment 5 Michal Skrivanek 2017-11-01 10:11:48 UTC
At first glance the error looks the same as bug 1507511.
Either way the logging is not sufficient here...

Comment 6 Mor 2017-11-01 12:17:40 UTC
(In reply to Michal Skrivanek from comment #5)
> At first glance the error looks the same as bug 1507511.
> Either way the logging is not sufficient here...

BZ 1507511 has different steps to reproduce. I have included vdsm, libvirt and engine logs. What other logs are needed?

Comment 7 Michal Skrivanek 2017-11-02 09:56:43 UTC
better logging is needed, not logs:)

We were not able to reproduce this problem anywhere, but now we have better logging (https://gerrit.ovirt.org/#/c/83506/) and it would be great to reproduce and see more details then

Comment 8 Mor 2017-11-02 11:35:13 UTC
(In reply to Michal Skrivanek from comment #7)
> better logging is needed, not logs:)
> 
> We were not able to reproduce this problem anywhere, but now we have better
> logging (https://gerrit.ovirt.org/#/c/83506/) and it would be great to
> reproduce and see more details then

It is still reproducible in my environment: 4.2.0-0.0.master.20171030210714.gitef6bb9c.el7.centos
Let me know if you need access to it (through IRC, email).

Comment 9 Milan Zamazal 2017-11-02 12:37:24 UTC
I managed to reproduce the bug and a fix has been posted.
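
The linked gerrit change is titled "virt: Make sure Drive.path is not None for file devices". The sketch below only illustrates the general idea that title suggests and is not the actual patch: the Drive class here is a hypothetical stand-in, and the choice of an empty string as the fallback value is an assumption.

```python
import xml.etree.ElementTree as ET


class Drive:
    """Hypothetical stand-in for a drive device; not vdsm's Drive class."""

    def __init__(self, disk_type, path=None):
        self.disk_type = disk_type
        # Assumed guard: a file-backed drive always carries a string path,
        # so later XML serialization never sees None.
        if disk_type == "file" and path is None:
            path = ""
        self.path = path

    def to_xml(self):
        disk = ET.Element("disk", type=self.disk_type)
        ET.SubElement(disk, "source", file=self.path)
        return ET.tostring(disk, encoding="unicode")


if __name__ == "__main__":
    # A file drive created without a path still serializes.
    print(Drive("file").to_xml())
```

Normalizing the value where the object is constructed keeps the check in one place, instead of requiring every consumer of the path to handle None.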

Comment 10 Red Hat Bugzilla Rules Engine 2017-11-03 09:11:03 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 11 Yaniv Kaul 2017-11-16 08:25:29 UTC
Can the bug move to MODIFIED state?

Comment 12 Milan Zamazal 2017-11-16 08:58:11 UTC
Yes, done.

Comment 13 Mor 2017-11-16 09:56:55 UTC
I am sure that I am getting the right behaviour with the patch. 

Is the suspend action supposed to take a memory snapshot?
Currently, when I run the VM after suspending it, the VM comes up without restoring its previous state (like a restart).

Comment 14 Milan Zamazal 2017-11-16 12:40:57 UTC
Suspend should resume the previous state. Currently it doesn't, which is an (unrelated) bug; I'll file a new bug for that.

Comment 15 Milan Zamazal 2017-11-16 13:08:12 UTC
Reported as https://bugzilla.redhat.com/1513996. Thanks for catching that!

Comment 16 Mor 2017-11-16 14:03:55 UTC
Thanks for filing a bug.

Verified on:
Version 4.2.0-0.0.master.20171114111003.git7aa1b91.el7.centos

Comment 17 Sandro Bonazzola 2017-12-20 11:34:09 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

