Created attachment 1152774
Engine log from around the time of the problem

Description of problem:
A VM can be "down" and "migrating" at the same time, which prevents bringing the VM back up.

Version-Release number of selected component (if applicable):
3.6.5.3-0.1.el6

Steps to Reproduce:
1. Set up a cluster with 2 hosts and a VM so that VM migration will be impossible (for example, with large memory differences between the hosts and a VM which needs more RAM than is available on the smaller host).
2. Try to bring the bigger host into maintenance mode.
3. Wait for VM migration to be attempted and fail.
4. Note the lock icon on the VM in the admin UI.
5. SSH into the VM and shut it down. That will also cause the host to go into maintenance.
6. Bring the host out of maintenance.
7. Try to start the VM.

Actual results:
An error message shows up: 'Cannot run VM. VM _foo_ is being migrated.'

Expected results:
The VM should just start up.

Additional info:
Engine log attached. The name of the VM for which this happened is "nested-lab4-rgi-7.2-20160302.0-builder-4". This occurred on 1/5/16 around 15:34.
Please attach vdsm logs from both hosts as well
Created attachment 1152858
vdsm log

Please note that the system in practice was much bigger than two hosts, and the migration did not fail because of missing RAM but for other reasons I've not yet investigated (something to do with the host being a different model than the rest). I've attached the VDSM log of the host the VM was migrating from and remained running on until it was shut down. Please specify if you need logs from other hosts. I can also provide direct access to the system if needed.
The problem is that InternalMigrateVmCommand always reports success, even when the migration did not succeed. In this particular case, although I don't have a supporting log message or audit log entry, I'm convinced that no host the VM could migrate to was found, so MigrateVDSCommand was never called. Normally that would cause the command to finish unsuccessfully, but InternalMigrateVmCommand overrides the result. The infrastructure then reasons: "I have a command that implements IVdsAsyncCommand, the lock was defined for the whole command execution, and the execute phase finished successfully, so the lock should not be released." We should fix InternalMigrateVmCommand.
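To illustrate the mechanism described in this comment, here is a minimal, self-contained sketch. It is not the engine code: LockTable, AsyncCommand, MigrateCommand, AlwaysSucceedingWrapper and run() are all hypothetical stand-ins, and the rule "keep the lock when the execute phase of an async command succeeds" is modeled only as the comment states it. It shows that when a failed execute phase is reported as success, the lock is never released and a later attempt to run the VM is rejected.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical names throughout; this is an illustration, not the oVirt engine code.
class MigrationLockSketch {

    /** Stand-in for the engine's in-memory lock table, keyed by VM name. */
    static final class LockTable {
        private final Set<String> locked = new HashSet<>();
        boolean acquire(String vm) { return locked.add(vm); }
        void release(String vm) { locked.remove(vm); }
        boolean isLocked(String vm) { return locked.contains(vm); }
    }

    /** A command whose lock is meant to span an asynchronous operation. */
    interface AsyncCommand {
        /** Returns true if the execute phase succeeded. */
        boolean execute();
    }

    /** Migration attempt; fails in the execute phase when no destination host is found. */
    static final class MigrateCommand implements AsyncCommand {
        private final boolean destinationFound;
        MigrateCommand(boolean destinationFound) { this.destinationFound = destinationFound; }
        @Override public boolean execute() { return destinationFound; }
    }

    /** Models the reported bug: the inner result is discarded and success is always returned. */
    static final class AlwaysSucceedingWrapper implements AsyncCommand {
        private final AsyncCommand inner;
        AlwaysSucceedingWrapper(AsyncCommand inner) { this.inner = inner; }
        @Override public boolean execute() { inner.execute(); return true; }
    }

    /**
     * Assumed infrastructure rule: if the execute phase succeeded, keep the lock
     * (an async completion callback would release it later); otherwise release it now.
     */
    static void run(LockTable locks, String vm, AsyncCommand cmd) {
        if (!locks.acquire(vm)) {
            throw new IllegalStateException("Cannot run VM. VM " + vm + " is being migrated.");
        }
        boolean succeeded = cmd.execute();
        if (!succeeded) {
            locks.release(vm);
        }
    }

    /** Runs a migration that cannot find a destination and reports whether the lock is still held. */
    static boolean lockLeaksAfterFailedMigration(boolean maskFailure) {
        LockTable locks = new LockTable();
        AsyncCommand cmd = new MigrateCommand(false);
        if (maskFailure) {
            cmd = new AlwaysSucceedingWrapper(cmd);
        }
        run(locks, "vm-1", cmd);
        return locks.isLocked("vm-1");
    }

    public static void main(String[] args) {
        System.out.println("masked failure leaves lock held: " + lockLeaksAfterFailedMigration(true));
        System.out.println("reported failure leaves lock held: " + lockLeaksAfterFailedMigration(false));
    }
}
```

In this sketch the lock stays held only when the failure is masked, which matches the symptom in the report: no migration is actually in flight, so nothing ever arrives to release the lock.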
Verified with:
RHEVM Version: 4.0.2-0.2.rc1.el7ev
Hosts:
OS Version: RHEL - 7.2 - 9.el7_2.1
Kernel Version: 3.10.0 - 327.22.2.el7.x86_64
KVM Version: 2.3.0 - 31.el7_2.16
LIBVIRT Version: libvirt-1.2.17-13.el7_2.5
VDSM Version: vdsm-4.18.5.1-1.el7ev
SPICE Version: 0.12.4 - 15.el7_2.1

Steps:
1. Migrate the VM.
2. Shut down the VM during migration, both from the engine and from within the VM.
3. The VM is down.
4. Start the VM.

Results: PASS
The VM is up after the shutdown.
*** Bug 1348847 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-1743.html