Description of problem:
rhevm reports the VM as "up" on the destination host even after the VM migration failed.

Version-Release number of selected component (if applicable):
RHEV 4.1.5
vdsm-4.19.15-1.el7ev.x86_64

How reproducible:
N/A

Steps to Reproduce:
1.
2.
3.

Actual results:
Host A was moved to maintenance mode after facing issues with its storage connections. The VM was in an unknown state. A VM migration was triggered from host A to host B, which failed, but rhevm reported the VM as "up" on host B and "not responding" on host A.

Expected results:
Upon VM migration failure, the VM should not be reported as up on the destination host.

Additional info:
We have fixed what sounds like a very similar issue in 4.1.6. See https://bugzilla.redhat.com/show_bug.cgi?id=1487913 .
Meital, can your team please check whether this scenario is reproducible, or was fixed in 4.1.6 as part of bz#1487913? From the two bug descriptions it sounds like two different issues. Thank you!
(In reply to Marina from comment #5)
> Meital, can your team please check if this scenario reproducible or fixed in
> 4.1.6 as part of bz#1487913.
> From the two bug descriptions it sounds like 2 different issues.
>
> Thank you!

Yes, we can try. Israel, can you please try to reproduce?
I will reproduce with the following steps (not an HA VM):
1. Start the VM on the source host
2. Block the storage connection to the source host
3. Wait for the VM to become unknown
4. Switch the source host to maintenance
5. Check that the VM has migrated and is up on the destination host, and no longer exists on the source host, using "virsh -r list"
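The steps above can be sketched as a shell script. The storage address, the iSCSI port, and the use of iptables to block the connection are assumptions on my side, not values from this bug; with DRY_RUN=1 (the default) the script only prints each command instead of executing it:

```shell
# Sketch of the reproduction steps, assuming iSCSI storage over TCP 3260.
# STORAGE_IP is a placeholder; DRY_RUN=1 (the default) only echoes commands.
DRY_RUN=${DRY_RUN:-1}
STORAGE_IP=${STORAGE_IP:-192.0.2.10}   # assumed storage server address

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Step 2: block the storage connection on the source host
run iptables -A OUTPUT -d "$STORAGE_IP" -p tcp --dport 3260 -j DROP
# Steps 3-4: wait for the VM to become unknown, then move the source host
# to maintenance from the engine UI or REST API (not scripted here).
# Step 5: confirm the VM no longer exists on the source host
run virsh -r list
```

Steps 3 and 4 go through the engine, so they are left as comments rather than commands.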
Where do you see post-copy migration?
Please approve the steps at: https://bugzilla.redhat.com/show_bug.cgi?id=1534664#c11
It's a bit more tricky. You'd need to reproduce exactly what happened in comment #17: start a migration; during the migration, fencing restarts vdsm (or maybe you can do that manually); recovery finishes while that VM is still migrating; and the engine migrates the VM again to a different host (that should happen automatically once the vdsm recovery finishes in the previous step). Now the second migration should fail after some time, and the first one concludes (either successfully or not; that should not matter).
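The key manual step in that sequence is restarting vdsm on the source host while the first migration is still in flight. A minimal sketch of that step, assuming shell access to the source host and that vdsm runs as the systemd unit vdsmd (DRY_RUN=1, the default, only prints the commands):

```shell
# Restart vdsm mid-migration to stand in for the fencing restart from
# comment #17. DRY_RUN=1 (the default) only echoes each command.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# While the engine-initiated migration is running on the source host:
run systemctl restart vdsmd
# Watch vdsm recovery finish while the first migration is still going;
# the engine should then retry the migration to a different host on its own.
run journalctl -u vdsmd -n 20
```

The second migration attempt itself is triggered by the engine, so it is not part of the script.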
Verified with engine version: 4.2.2.2-0.1.el7
Host:
OS Version: RHEL - 7.5 - 8.el7
Kernel Version: 3.10.0-858.el7.x86_64
KVM Version: 2.10.0-21.el7
LIBVIRT Version: libvirt-3.9.0-14.el7
VDSM Version: vdsm-4.20.19-1.el7ev

For the "Post-copy" and "Minimal downtime" migration policies, ran the following cases:

Case 1:
1. Run the VM (with load on the VM to slow the migration)
2. Migrate the VM (wait about 30 sec)
3. Block the storage connection on the source host (wait about 60 sec)
4. Reconnect the storage connection
5. Restart vdsm on both the source and destination host

Case 2:
1. Run the VM (with load on the VM to slow the migration)
2. Migrate the VM (wait about 30 sec)
3. Block the storage connection on the destination host (wait about 60 sec)
4. Reconnect the storage connection
5. Restart vdsm on both the source and destination host

Results:
With post-copy, the VM was up for the whole migration and the migration succeeded.
With Minimal downtime, the VM became "not responding" after the storage connection was blocked; the migration succeeded once the connection was back.
Did not see the VM running on both hosts after the migration was done; the migration did not fail.
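Case 1 above can be sketched as a shell script run on the source host (Case 2 is identical with the blocking step run on the destination host instead). The storage address, the iSCSI port, and iptables as the blocking mechanism are assumptions, and DRY_RUN=1 (the default) only prints each command:

```shell
# Sketch of Case 1, assuming iSCSI storage over TCP 3260.
# STORAGE_IP is a placeholder; DRY_RUN=1 (the default) only echoes commands.
DRY_RUN=${DRY_RUN:-1}
STORAGE_IP=${STORAGE_IP:-192.0.2.10}   # assumed storage server address

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Step 2: migrate the VM from the engine, wait ~30 s, then:
# Step 3: block the storage connection on the source host for ~60 s
run iptables -A OUTPUT -d "$STORAGE_IP" -p tcp --dport 3260 -j DROP
run sleep 60
# Step 4: reconnect the storage connection (remove the DROP rule)
run iptables -D OUTPUT -d "$STORAGE_IP" -p tcp --dport 3260 -j DROP
# Step 5: restart vdsm (repeat on the destination host as well)
run systemctl restart vdsmd
```

Steps 1 and 2 (running the VM under load and starting the migration) go through the engine, so they are left as comments.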
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed: [No relevant external trackers attached] For more info please contact: rhv-devops
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:1488