Created attachment 1242331 [details] engine.log

Description of problem:
VM crashes during migration with the error "Failed in MigrateBrokerVDS" at 10:24:54 PM.
Created attachment 1242332 [details] vdsm-node01.log
Created attachment 1242333 [details] vdsm-node02.log
Could you please also attach the libvirt and qemu logs?
Where are the logs located? (/var/log/libvirt/qemu/vm.log?)
Yeah, that, and then /var/log/messages. If there is nothing interesting there, we can enable debug logging for libvirt and look again.
Created attachment 1243854 [details] messages
Created attachment 1243855 [details] vm.log
hmm:

2017-01-19 03:23:22.986+0000: initiating migration
2017-01-19 03:24:38.939+0000: shutting down
2017-01-19T03:24:39.464773Z qemu-kvm: terminating on signal 15 from pid 2669

@Francesco: any idea?
*** Bug 1413847 has been marked as a duplicate of this bug. ***
It looks like there are a couple of bugs in Vdsm.

1. Vdsm fails to retrieve the progress from the libvirt job stats. This is an issue per se: we fail to update the downtime, which could make the migration fail to converge, or converge more slowly.
2. There is a race in the migration progress reporting. This could make the progress meter go backward, but it is much easier to trigger if we hit bug #1. In this case, the race confused the migration source Vdsm, leading it to believe the migration was NOT completed - while it actually was.

What happened:
2.a. migration attempt #1 completed, despite the lack of downtime adjustment
2.b. due to bug #1 and the race, the progress report was not correctly set to 100% after the migration completed
2.c. the migration source handler misdetected the completed migration (because the progress was not at 100% when it ended) and started a new one, which failed
2.d. the Engine only saw the last, failed migration (whose error was bogus) and acted accordingly

We will fix both issues.
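To make the failure mode concrete, here is a small, self-contained Python sketch. It is NOT Vdsm code and all names (MigrationMonitor, SourceHandler, the stats keys) are hypothetical; it only models how a failed progress read (bug #1) plus the completion race (bug #2) can leave the source side believing a finished migration did not complete, so it retries:

# Hypothetical, simplified model of the race described above -- not Vdsm code.
class MigrationMonitor:
    """Tracks migration progress from (possibly failing) job-stats reads."""

    def __init__(self):
        self.progress = 0  # last successfully computed percentage

    def update_from_stats(self, stats):
        # Bug #1: if the stats read fails (None), progress silently keeps
        # its previous value and the downtime is never adjusted.
        if stats is None:
            return
        total = stats["data_total"]
        remaining = stats["data_remaining"]
        if total:
            self.progress = int(100 * (total - remaining) / total)


class SourceHandler:
    """Decides, on the source host, whether the migration succeeded."""

    def __init__(self, monitor):
        self.monitor = monitor

    def on_migration_finished(self):
        # Bug #2 (race): the completion event can arrive before the final
        # stats update, so progress may still be < 100 here even though
        # the guest is already running on the destination.
        if self.monitor.progress < 100:
            return "retry"   # starts a second migration, which then fails
        return "done"


if __name__ == "__main__":
    monitor = MigrationMonitor()
    monitor.update_from_stats({"data_total": 1000, "data_remaining": 400})  # 60%
    monitor.update_from_stats(None)   # bug #1: final read fails, stays at 60%
    handler = SourceHandler(monitor)
    # The migration actually completed, but the handler sees 60% and retries.
    print(handler.on_migration_finished())  # -> "retry"

In this toy model the fix corresponds to forcing the progress to 100% on the completion event (or treating the completion event itself as authoritative) instead of trusting the last stats sample.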
The bug is actually in Vdsm, and has been fixed there. The Engine reacted according to the (false) information it was given, so it's innocent.
Target release should be set once a package build is known to fix an issue. Since this bug is not in MODIFIED, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
No doc_text; this is just a plain bug caused by an unusual, but possible, sequence of events.
patches merged in the stable branch -> MODIFIED
Verified with:
Red Hat Virtualization Manager Version: 4.1.1.2-0.1.el7

Ran the migration sanity tests; all pass.