Created attachment 1044952: logs

Description of problem:
Migrating one VM in a cluster with two hosts: the migration succeeds, but the VM is down, and after ~2 it is up on the destination.

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.6.0-0.0.master.20150627185750.git6f063c1.el6 (3.6.0-03)

How reproducible:
2 of 5 times manually; also reproduced in automation.

Steps to Reproduce:
Migrate the VM

Actual results:
The VM is down although the migration succeeded

Expected results:
The VM is up and the migration succeeded

Additional info:
Event log:
The problem is in the new events infrastructure for the VM stats mechanism: an event sent from one host is received by the engine on all of the hosts. Once the migration has completed and the VM is Up on the destination, the engine receives this event on the destination host, but also on the source host. This makes the engine think the migration failed (because the event says the VM moved to Up on the source); later the engine discovers the real status. I was able to reproduce this easily on latest master and verified the above with extra logging of the events. Adding a link to a patch that was merged earlier today and should fix this.
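To make the failure mode above concrete, here is a minimal sketch in Python. It is not the engine's code and not the merged patch; `VmStatusEvent`, `HostMonitor`, `dispatch_broadcast` and `dispatch_by_origin` are hypothetical names invented for illustration. It only shows how delivering one host's event to every host's monitor makes the source host appear to report the VM as Up, and how routing by originating host would avoid that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmStatusEvent:
    origin_host: str  # host that actually emitted the event
    vm_id: str
    status: str       # e.g. "Up", "Down"


class HostMonitor:
    """Hypothetical per-host view of VM statuses, as the engine might keep it."""

    def __init__(self, host):
        self.host = host
        self.seen = {}  # vm_id -> last status reported "by" this host

    def on_event(self, event):
        self.seen[event.vm_id] = event.status


def dispatch_broadcast(event, monitors):
    # Faulty behaviour described above: every host's monitor gets the event,
    # so the source host also appears to report the VM as Up.
    for monitor in monitors.values():
        monitor.on_event(event)


def dispatch_by_origin(event, monitors):
    # Assumed fix for illustration: deliver only to the originating host's monitor.
    monitor = monitors.get(event.origin_host)
    if monitor is not None:
        monitor.on_event(event)


def demo():
    # The VM has just come Up on the destination host after migration.
    event = VmStatusEvent(origin_host="dst", vm_id="vm1", status="Up")

    broken = {h: HostMonitor(h) for h in ("src", "dst")}
    dispatch_broadcast(event, broken)

    fixed = {h: HostMonitor(h) for h in ("src", "dst")}
    dispatch_by_origin(event, fixed)

    # What each dispatch strategy made the *source* host's monitor believe.
    return broken["src"].seen, fixed["src"].seen
```

With broadcast delivery the source host's monitor records `vm1` as Up, which is the signal the engine would read as a failed migration; with origin-based routing the source monitor records nothing for this event.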
Now that the bug in the events is fixed, I tried to verify that it also fixes the migration scenario. I can see there is also a bug in the code that reads the event and executes the monitoring code, which still causes the reported issue. Moving back to virt to handle the new issue.
Verified with version 3.6.0-5 (3.6.0-0.0.master.20150804111407.git122a3a0.el6), VDSM: vdsm-4.17.0-1239.git6575e3f.el7.noarch. Checked with automation and manually: https://rhev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/3.6-GE-compute/144/