Bug 1332039 - VM can be "down" and "migrating" at the same time
Summary: VM can be "down" and "migrating" at the same time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.6.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.0-alpha
Target Release: 4.0.0
Assignee: Arik
QA Contact: Israel Pinto
URL:
Whiteboard:
Duplicates: 1348847
Depends On:
Blocks:
Reported: 2016-05-01 14:54 UTC by Barak Korren
Modified: 2016-08-23 20:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 20:38:10 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Attachments
Engine log from around the time of the problem (102.12 KB, application/x-bzip)
2016-05-01 14:54 UTC, Barak Korren
vdsm log (373.96 KB, application/x-xz)
2016-05-02 05:57 UTC, Barak Korren


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:1743 0 normal SHIPPED_LIVE Red Hat Virtualization Manager 4.0 GA Enhancement (ovirt-engine) 2016-09-02 21:54:01 UTC
oVirt gerrit 57104 0 master MERGED core: handle failed internal migrations properly 2016-05-08 08:38:54 UTC

Description Barak Korren 2016-05-01 14:54:11 UTC
Created attachment 1152774
Engine log from around the time of the problem

Description of problem:
A VM can be "down" and "migrating" at the same time, which prevents bringing the VM back up.

Version-Release number of selected component (if applicable):
3.6.5.3-0.1.el6

Steps to Reproduce:
1. Set up a cluster with 2 hosts and a VM such that VM migration is impossible (for example, with a large memory difference between the hosts and a VM that needs more RAM than is available on the smaller host).
2. Try to bring the bigger host into maintenance mode
3. Wait for the VM migration to be attempted and to fail
4. Note the lock icon on the VM in the admin UI
5. SSH into the VM and shut it down - That will also cause the host to go into maintenance
6. Bring the host out of maintenance
7. Try to start the VM

Actual results:
An error message shows up:
'Cannot run VM. VM _foo_ is being migrated.'

Expected results:
The VM should just start up

Additional info:
Engine log attached, the name of the VM for which this happened is: "nested-lab4-rgi-7.2-20160302.0-builder-4"
This occurred on 2016-05-01 around 15:34.

Comment 1 Michal Skrivanek 2016-05-02 05:11:19 UTC
Please attach vdsm logs from both hosts as well

Comment 2 Barak Korren 2016-05-02 05:57:36 UTC
Created attachment 1152858
vdsm log

Please note that in practice the system was much bigger than two hosts, and the migration did not fail because of missing RAM but for other reasons I have not yet investigated (something to do with the host being a different model than the rest).

I've attached the VDSM log of the host that the VM was migrating from and on which it remained running until it was shut down.
Please say if you need logs from other hosts. I can also provide direct access to the system if needed.

Comment 3 Arik 2016-05-02 11:27:56 UTC
The problem is that InternalMigrateVmCommand always reports success, even when it has failed. In this particular case, although I don't have a supporting log message or audit log entry, I'm convinced that no host the VM could migrate to was found, so MigrateVDSCommand was never called. Normally that would cause the command to finish unsuccessfully, but InternalMigrateVmCommand overrides that result. The infrastructure then reasons: "I have a command that implements IVdsAsyncCommand, the lock was defined for the whole command execution, and the execute phase finished successfully, so the lock should not be released".

We should fix InternalMigrateVmCommand.
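
To illustrate the failure mode described in this comment, here is a minimal toy model in Python. It is not ovirt-engine code: the engine is written in Java, and every class, method and string below (LockManager, MigrateCommand, BuggyInternalMigrateCommand, reported_success, run_vm) is a hypothetical stand-in. The only assumption taken from the comment is that a command holding a VM lock releases it when it reports failure and keeps it when it reports success.

```python
class LockManager:
    """Toy in-memory lock table keyed by VM id (hypothetical, not engine code)."""

    def __init__(self):
        self._locks = {}

    def acquire(self, vm_id, reason):
        if vm_id in self._locks:
            return False
        self._locks[vm_id] = reason
        return True

    def release(self, vm_id):
        self._locks.pop(vm_id, None)

    def reason(self, vm_id):
        return self._locks.get(vm_id)


class MigrateCommand:
    """Toy async command: the lock is kept only if the command reports success."""

    def __init__(self, locks, vm_id, candidate_hosts):
        self.locks = locks
        self.vm_id = vm_id
        self.candidate_hosts = candidate_hosts

    def execute(self):
        # With no destination host, the migration is never actually started.
        return bool(self.candidate_hosts)

    def reported_success(self, executed_ok):
        # Default behaviour: report what really happened.
        return executed_ok

    def run(self):
        if not self.locks.acquire(self.vm_id, "migrating"):
            raise RuntimeError("VM is already locked")
        ok = self.reported_success(self.execute())
        if not ok:
            # A failed command gives the lock back immediately.
            self.locks.release(self.vm_id)
        # On reported success the lock stays until the migration ends,
        # which never happens if no migration was started.
        return ok


class BuggyInternalMigrateCommand(MigrateCommand):
    """Models the reported bug: success is reported unconditionally."""

    def reported_success(self, executed_ok):
        return True


def run_vm(locks, vm_id):
    """Toy 'start VM' check that refuses while a migration lock is held."""
    if locks.reason(vm_id) == "migrating":
        return f"Cannot run VM. VM {vm_id} is being migrated."
    return "VM started"
```

In this model, running BuggyInternalMigrateCommand with an empty host list leaves the "migrating" lock in place, so a later run_vm call is rejected even though nothing is migrating. The base MigrateCommand with the same input releases the lock and the VM can start.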

Comment 5 Israel Pinto 2016-07-11 12:00:54 UTC
Verified with:
RHEVM Version: 4.0.2-0.2.rc1.el7ev
Hosts:
OS Version: RHEL - 7.2 - 9.el7_2.1
Kernel Version: 3.10.0 - 327.22.2.el7.x86_64
KVM Version: 2.3.0 - 31.el7_2.16
LIBVIRT Version: libvirt-1.2.17-13.el7_2.5
VDSM Version: vdsm-4.18.5.1-1.el7ev
SPICE Version: 0.12.4 - 15.el7_2.1

Steps:
1. Migrate VM
2. Shut down the VM during migration, both from the engine and from within the VM.
3. VM is down
4. Start VM

Results: PASS
VM is up after shut down.

Comment 6 Michal Skrivanek 2016-07-12 13:14:05 UTC
*** Bug 1348847 has been marked as a duplicate of this bug. ***

Comment 9 errata-xmlrpc 2016-08-23 20:38:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-1743.html

