Bug 1290361 - "VDSM <host name> command failed: Virtual machine does not exist" error appears every time when shutting down a running VM
"VDSM <host name> command failed: Virtual machine does not exist" error appea...
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt (Show other bugs)
3.6.1.2
x86_64 Linux
unspecified Severity medium (vote)
: ovirt-3.6.3
: 3.6.3
Assigned To: Arik
sefi litmanovich
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-12-10 05:32 EST by Michael Burman
Modified: 2016-02-18 06:02 EST (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 06:02:06 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑3.6.z+
rule-engine: planning_ack+
michal.skrivanek: devel_ack+
rule-engine: testing_ack+


Attachments (Terms of Use)
engine log (241.00 KB, application/x-gzip)
2015-12-10 05:32 EST, Michael Burman
no flags Details
qemu 'vm' log (3.00 KB, application/x-gzip)
2015-12-10 05:49 EST, Michael Burman
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 50398 master ABANDONED core: set better fetch time for events 2015-12-16 10:36 EST
oVirt gerrit 50529 master MERGED core: ignore noVm error for internal destroy ops Never
oVirt gerrit 51083 ovirt-engine-3.6 MERGED core: ignore noVm error for internal destroy ops 2015-12-28 04:20 EST

  None (edit)
Description Michael Burman 2015-12-10 05:32:27 EST
Created attachment 1104296 [details]
engine log

Description of problem:
"VDSM <host name> command failed: Virtual machine does not exist" error appears every time when shutting down a running VM.
This error shown in the engine log and in the event UI log.

2015-12-10 12:22:28,531 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-12) [] Command 'DestroyVDSCommand(HostName = silver-vdsa.qa.lab.tlv.redhat.com, DestroyVmVDSCommandParameters:{runAsync='true', hostId='6e8c9954-3287-4192-8522-25fa55271889', vmId='404f96db-b224-4163-a21e-eeb8eb084d7b', force='false', secondsToWait='0', gracefully='false', reason=''})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1

2015-12-10 12:22:28,531 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-12) [] FINISH, DestroyVDSCommand, log id: 2179476e
2015-12-10 12:22:28,614 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-12) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM v4 is down. Exit message: Admin shut down from the engine
2015-12-10 12:22:28,618 INFO  [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (ForkJoinPool-1-worker-12) [] VM '404f96db-b224-4163-a21e-eeb8eb084d7b(v4) is running in db and not running in VDS 'silver-vdsa.qa.lab.tlv.redhat.com'

Version-Release number of selected component (if applicable):
3.6.1.2-0.1.el6

How reproducible:
almost 100%

Steps to Reproduce:
1. Run VM in latest 3.6.1.2 engine with vdsm 4.17.13
2. Shut down the VM


Actual results:
"VDSM <host name> command failed: Virtual machine does not exist" error appears every time when shutting down a running VM.

Expected results:
Shouldn't see such errors.

Additional info:
I noticed this on previous 3.6.z builds as well. Can't found any report for this issue.
Comment 1 Michael Burman 2015-12-10 05:49 EST
Created attachment 1104299 [details]
qemu 'vm' log
Comment 2 Arik 2015-12-13 10:34:05 EST
Unfortunately known race (100% reproducible with breakpoints):
1. engine calls destroy on stop-vm at X1 (The VM was UP)
2. vdsm generates Down event at X2
3. the event is received by the engine at X3 and at this time the VM is still UP
4. engine sets the VM status to down at X4 and completes the stop-vm (lock is freed)
5. the engine calls vms-monitoring to process the event at X5
6. the vms-monitoring sees that the previous state of the was UP and now Down so it triggers redundant destroy

- In regular flow the 'previous status' of the VM is PoweringDown so the Down event is ignored
- In polling flow we are able to set a point in time in which we know that the information we will get is relevant for (fetch-time). we have no way to say that for events though.

Possible solution would be to set the fetch-time for events just before processing the event - it is not as good as the solution we have for polling but it will solve this race.
Comment 3 Red Hat Bugzilla Rules Engine 2015-12-15 10:37:08 EST
Fixed bug tickets must have version flags set prior to fixing them. Please set the correct version flags and move the bugs back to the previous status after this is corrected.
Comment 4 Red Hat Bugzilla Rules Engine 2015-12-27 05:34:33 EST
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 5 sefi litmanovich 2016-02-09 12:05:28 EST
Verified according to description with rhevm-3.6.3-0.1.el6.noarch.

shutdown process did not report 'Vm does not exist' in audit log (whereas in engine.log it does).
Comment 6 sefi litmanovich 2016-02-09 12:05:45 EST
Verified according to description with rhevm-3.6.3-0.1.el6.noarch.

shutdown process did not report 'Vm does not exist' in audit log (whereas in engine.log it does).

Note You need to log in before you can comment on or make changes to this bug.