Created attachment 1082306 [details] libvirtd log from target host

Description of problem:
When a guest whose machine type is pc-i440fx-rhel7.2.0 is migrated without the --live option, the guest OS doesn't respond after migration, although "virsh list" indicates that the guest is running:

# virsh list
 Id    Name                           State
----------------------------------------------------
 10    rhel7.2                        running

After I run "virsh suspend rhel7.2" and "virsh resume rhel7.2", the guest OS responds again.

# virsh suspend rhel7.2
Domain rhel7.2 suspended

# virsh resume rhel7.2
Domain rhel7.2 resumed

Version-Release number of selected component (if applicable):
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-30.el7.x86_64

How reproducible:
maybe >80%

Steps to Reproduce:
1. Prepare a running guest with machine type 'pc-i440fx-rhel7.2.0'.

2. Do migration without --live:
# virsh migrate rhel7.2 qemu+ssh://10.66.5.20/system --verbose
Migration: [100 %]

3. Check the guest status on the target:
# virsh list
 Id    Name                           State
----------------------------------------------------
 10    rhel7.2                        running

But when I try to do some operation in the guest, the guest OS is not responding.

4. Suspend and resume the guest.
# virsh suspend rhel7.2
Domain rhel7.2 suspended

# virsh resume rhel7.2
Domain rhel7.2 resumed

Now when I try to do some operation in the guest, the guest OS responds again.

Actual results:
As in steps 3~4.

Expected results:
The guest works normally after migration.

Additional info:
1) When migrating with the --live option, there is no such issue.
2) Machine types "pc-i440fx-rhel7.1.0" and "pc-i440fx-rhel7.0.0" have no such issue.
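The suspend/resume workaround from step 4 can be scripted. The following is only a minimal sketch: the helper name `kick_guest` and the injectable `run` parameter are hypothetical and not part of libvirt; the only real commands used are the `virsh suspend` and `virsh resume` calls shown in the report.

```python
import subprocess
from typing import Callable, List, Sequence

# Type of the function that actually executes a command line.
Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Execute a command and raise if it exits with a non-zero status."""
    subprocess.run(list(cmd), check=True)


def kick_guest(domain: str, run: Runner = _run) -> List[List[str]]:
    """Suspend and then resume a domain (hypothetical helper).

    Mirrors step 4 of the report. The runner is injectable so the
    sequence can be inspected without a libvirt host.
    """
    cmds = [
        ["virsh", "suspend", domain],
        ["virsh", "resume", domain],
    ]
    for cmd in cmds:
        run(cmd)
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    kick_guest("rhel7.2", run=lambda cmd: print(" ".join(cmd)))
```

Calling `kick_guest("rhel7.2")` with the default runner would execute the two commands on the host where the guest is stuck; this only works around the symptom and does not address the cause.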
Created attachment 1082307 [details] libvirtd log from source host
Created attachment 1082308 [details] qemu log from source host
Created attachment 1082309 [details] qemu log from target host
Created attachment 1082310 [details] The guest XML
Please also have a look at this test case: my guest paused unexpectedly when I tested managedsave. The behaviour is similar, but I'm not sure whether it is the same issue, and it's not 100% reproducible. What I did, generally, is:

How reproducible:
< 10%

Versions:
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-29.el7.x86_64

Steps:
1. Prepare a running guest.

2. Restart service libvirt-guests.

3. Check the state of the guest:
# virsh list
 Id    Name                           State
----------------------------------------------------
 2     rhel7.2                        running

But when I try to do some operation in the guest, the guest OS is not responding.

4. After I run "virsh suspend rhel7.2" and "virsh resume rhel7.2", the guest responds again.
Created attachment 1082312 [details] libvirtd log when test managedsave
There is a relevant bug in libvirt which has been resolved: 1265902. Refer to this comment: https://bugzilla.redhat.com/show_bug.cgi?id=1265902#c29
migration.c/process_incoming_migration_co sends the COMPLETED event before the self-announce, before bdrv_invalidate_cache_all, and before the global_state handling, so there's a race there.
quintela: Any reason not to just move the generate_event to the bottom of that function (and add a failed event in the other exit path)?
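To illustrate the ordering problem described above, here is a toy model in Python. It is not QEMU code: the step names are placeholders loosely named after the steps mentioned in the comment, and the listener stands in for any management layer that reacts to the event as soon as it arrives.

```python
from typing import Callable, List


def incoming_migration(
    emit: Callable[[str], None], log: List[str], event_last: bool
) -> None:
    """Toy model of the destination-side finish sequence (not QEMU code)."""
    if not event_last:
        # Ordering described in the report: event first, cleanup after.
        emit("COMPLETED")
    # Placeholder names for the remaining finish steps.
    log.append("announce_self")
    log.append("invalidate_block_caches")
    log.append("restore_global_state")
    if event_last:
        # Ordering proposed in the comment: event at the very end.
        emit("COMPLETED")


def run(event_last: bool) -> List[str]:
    """Run the model and return the observed order of actions."""
    log: List[str] = []

    def on_event(name: str) -> None:
        # The listener acts immediately when the event is delivered.
        log.append(f"manager_saw_{name}")

    incoming_migration(on_event, log, event_last)
    return log


if __name__ == "__main__":
    print(run(event_last=False))
    print(run(event_last=True))
```

With `event_last=False` the listener is notified before the remaining steps have run, which is the window for the race; with `event_last=True` the notification is the last action recorded.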
David: nothing wrong that I can see. I stopped there because the migration "properly" has finished, but I have nothing against moving it to the end of the function.
Posted fix upstream: Migration: Generate the completed event only when we complete
Fix included in qemu-kvm-rhev-2.3.0-31.el7
According to comment 17, set this issue as verified. Best Regards, Junyi
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2546.html