Merged u/s to ovirt-3.3 as http://gerrit.ovirt.org/gitweb?p=vdsm.git;a=commit;h=9369b370369057832eff41793075fc1a63c42279
Verified in vdsm-4.13.2-0.8.el6ev.x86_64 (is34).
1. Preparation: on the destination migration host, set 'migration_destination_timeout' to '120' in VDSM's config.py (located at /usr/lib64/python2.6/site-packages/vdsm/config.py).
This shortens the verification time; the default is 6 hours.
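A minimal sketch of the preparation step; the sed pattern and the vdsmd restart are assumptions about the config.py layout, not taken from the original report:

```shell
#!/bin/sh
# Sketch: drop migration_destination_timeout from the 6-hour default to
# 120 s in VDSM's config.py (path from step 1; edit pattern assumed).
CONF=${CONF:-/usr/lib64/python2.6/site-packages/vdsm/config.py}
if [ -f "$CONF" ]; then
    # Back up the file, then rewrite the option's default value in place.
    sed -i.bak \
        "s/\('migration_destination_timeout'[^0-9]*\)[0-9][0-9]*/\1120/" \
        "$CONF"
    grep 'migration_destination_timeout' "$CONF"
    # service vdsmd restart   # pick up the new value (service name assumed)
else
    echo "config.py not found at $CONF"
fi
```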
2. Have a running VM (F19 in my case) with an ongoing memory-stressing workload (I used the `memtester` utility). This keeps the migration running long enough to give us time in step 4 to simulate the error-prone environment.
3. Migrate the VM from source host1 to destination host2.
4. Immediately after the migration starts, block on the source host1:
- the connection to the destination host's VDSM (simulating connection loss to the destination VDSM)
`iptables -I OUTPUT 1 -p tcp -d <host2> --dport 54321 -j DROP`
- the connection to the storage (simulating a migration error)
`iptables -I OUTPUT 1 -d <storage> -j DROP`
5. Wait `migration_destination_timeout` seconds (120).
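The two iptables rules above can be wrapped in a small script. HOST2 and STORAGE are placeholder addresses, and the sketch defaults to printing the commands rather than applying them:

```shell
#!/bin/sh
# Sketch of the blocking step on the source host (host1).
# HOST2/STORAGE are placeholders -- substitute the real addresses.
HOST2=${HOST2:-192.0.2.12}        # destination host
STORAGE=${STORAGE:-192.0.2.50}    # storage server
RUN=echo                          # dry run; set RUN= to really apply

# Block source -> destination VDSM (port 54321) and source -> storage.
$RUN iptables -I OUTPUT 1 -p tcp -d "$HOST2" --dport 54321 -j DROP
$RUN iptables -I OUTPUT 1 -d "$STORAGE" -j DROP

# To undo after verification, delete the same rules:
# iptables -D OUTPUT -p tcp -d "$HOST2" --dport 54321 -j DROP
# iptables -D OUTPUT -d "$STORAGE" -j DROP
```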
The migration fails (because we blocked the storage) and is aborted.
On the destination host, the migrating VM is destroyed (the host shows 0 running VMs and no migrating VM).
The VM stays on the source host, paused due to the inaccessible storage; after unblocking the storage, the VM should run as if nothing happened.
The source host shows 1 running VM and no migrating VM.
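The VM counts above can be checked from each host's shell with `vdsClient` (the vdsm CLI). A sketch parsing illustrative `list table` output; the sample line is made up, not captured from this verification run:

```shell
#!/bin/sh
# Sketch: count running vs. migrating VMs from `vdsClient -s 0 list table`
# output (one line per VM, status in the last column). Sample is illustrative.
list_output='f3a9c1d2-5e6b-4a7c-9d0e-112233445566  4321  f19-vm  Up'
# On a real host:  list_output=$(vdsClient -s 0 list table)

running=$(printf '%s\n' "$list_output" | grep -c 'Up')
migrating=$(printf '%s\n' "$list_output" | grep -c 'Migration' || true)
echo "running=$running migrating=$migrating"   # running=1 migrating=0
```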
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.