Bug 1271145
Summary: | Guest OS paused after migration. | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | Fangge Jin <fjin>
Component: | qemu-kvm-rhev | Assignee: | Dr. David Alan Gilbert <dgilbert>
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 7.2 | CC: | amit.shah, dgilbert, dyuan, hhuang, huding, juzhang, knoel, lmiksik, mrezanin, mzhan, pezhang, quintela, virt-maint, xfu, zpeng
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-rhev-2.3.0-31.el7 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2015-12-04 16:47:06 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1265902 | |
Attachments: |
|
Created attachment 1082307 [details]
libvirtd log from source host
Created attachment 1082308 [details]
qemu log from source host
Created attachment 1082309 [details]
qemu log from target host
Created attachment 1082310 [details]
The guest XML
Please also have a look at this test case:
My guest paused unexpectedly when I tested managedsave. The behaviour is similar, but I'm not sure whether it is the same issue, and it's not 100% reproducible. What I did was roughly:

How reproducible:
< 10%

Versions:
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-29.el7.x86_64

Steps:
1. Prepare a running guest.
2. Restart the libvirt-guests service.
3. Check the state of the guest:
# virsh list
 Id    Name                           State
----------------------------------------------------
 2     rhel7.2                        running
But when I try to do some operation in the guest, the guest OS does not respond.
4. After I run "virsh suspend rhel7.2" and "virsh resume rhel7.2", the guest responds again.

Created attachment 1082312 [details]
libvirtd log when testing managedsave
There is a relevant bug in libvirt which has been resolved: 1265902. Refer to this comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1265902#c29

migration.c/process_incoming_migration_co sends the COMPLETED event before the announce self, before the bdrv_invalidate_cache_all, and before the global_state stuff, so there's a race there.
quintela: Any reason not to just move the generate_event to the bottom of that function? (and add a failed event in the other exit path)

David: nothing wrong that I can see. I stopped there because the migration "properly" has finished, but I have nothing against moving it to the end of the function.

Posted fix upstream:
Migration: Generate the completed event only when we complete

Fix included in qemu-kvm-rhev-2.3.0-31.el7

According to comment 17, set this issue as verified.

Best Regards,
Junyi

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html |
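
The ordering race discussed in the comments above can be illustrated with a small toy model. This is only a sketch and not QEMU code: the step names loosely mirror the ones mentioned for process_incoming_migration_co, while the function `run_incoming`, the two order lists, and the listener behaviour are hypothetical. It assumes that a management layer resumes the guest as soon as it sees the COMPLETED event, and that the global-state step then applies the runstate saved by the source (taken to be "paused" for a non-live migration). Neither assumption is confirmed in this report.

```python
# Toy model of the event-ordering race. NOT QEMU code; all names are hypothetical.

BUGGY_ORDER = [
    "emit_completed_event",     # event sent first, as described in the comment
    "announce_self",
    "invalidate_block_caches",
    "restore_global_state",
]

FIXED_ORDER = [
    "announce_self",
    "invalidate_block_caches",
    "restore_global_state",
    "emit_completed_event",     # event moved to the bottom of the function
]


def run_incoming(steps, saved_runstate="paused"):
    """Walk through the destination-side steps and return the final runstate.

    Assumptions (not taken from the report):
    - a listener resumes the guest immediately on the COMPLETED event;
    - restoring the global state overwrites the runstate with the one
      recorded on the source.
    """
    runstate = "inmigrate"
    for step in steps:
        if step == "emit_completed_event":
            # Hypothetical listener reacts to the event by resuming the guest.
            runstate = "running"
        elif step == "restore_global_state":
            # The saved runstate is applied, whatever happened before.
            runstate = saved_runstate
        # The other steps do not touch the runstate in this model.
    return runstate


if __name__ == "__main__":
    print("buggy order:", run_incoming(BUGGY_ORDER))
    print("fixed order:", run_incoming(FIXED_ORDER))
```

In this model the early event leaves the guest paused even though the listener believes it resumed it, which would match the symptom in the summary. With the event emitted last, the resume is the final action and the guest ends up running.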
Created attachment 1082306 [details]
libvirtd log from target host

Description of problem:
After migrating a guest whose machine type is pc-i440fx-rhel7.2.0 without the --live option, the guest OS doesn't respond, although "virsh list" indicates that the guest is running:
# virsh list
 Id    Name                           State
----------------------------------------------------
 10    rhel7.2                        running

After I run "virsh suspend rhel7.2" and "virsh resume rhel7.2", the guest OS responds again.
# virsh suspend rhel7.2
Domain rhel7.2 suspended
# virsh resume rhel7.2
Domain rhel7.2 resumed

Version-Release number of selected component (if applicable):
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-30.el7.x86_64

How reproducible:
maybe >80%

Steps to Reproduce:
1. Prepare a running guest with machine type='pc-i440fx-rhel7.2.0'.
2. Do the migration without --live:
# virsh migrate rhel7.2 qemu+ssh://10.66.5.20/system --verbose
Migration: [100 %]
3. Check the guest status on the target:
# virsh list
 Id    Name                           State
----------------------------------------------------
 10    rhel7.2                        running
But when I try to do some operation in the guest, the guest OS does not respond.
4. Suspend and resume the guest.
# virsh suspend rhel7.2
Domain rhel7.2 suspended
# virsh resume rhel7.2
Domain rhel7.2 resumed
Now the guest OS responds to operations again.

Actual results:
As in steps 3~4.

Expected results:
The guest works normally after migration.

Additional info:
1) When migrating with the --live option, there is no such issue.
2) Machine types "pc-i440fx-rhel7.1.0" and "pc-i440fx-rhel7.0.0" have no such issue.
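
The check and the workaround from the steps above could be scripted roughly as follows. This is a minimal sketch, not a verified tool: the helper names `parse_virsh_list` and `suspend_resume` are made up for illustration, and the parser assumes the three-column `virsh list` layout shown in this report, with domain names that contain no spaces. Note that, as this bug shows, a state of "running" in `virsh list` does not prove that the guest OS is responsive.

```python
import subprocess


def parse_virsh_list(output):
    """Return a {name: state} dict from the text printed by `virsh list`.

    Assumes the Id / Name / State layout shown in this report and
    domain names without spaces.
    """
    domains = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with a numeric Id ("-" for inactive domains);
        # the header and the dashed separator are skipped by this check.
        if len(fields) >= 3 and (fields[0].isdigit() or fields[0] == "-"):
            # The state may consist of several words, e.g. "shut off".
            domains[fields[1]] = " ".join(fields[2:])
    return domains


def suspend_resume(domain, run=subprocess.run):
    """Apply the workaround from the report: suspend, then resume.

    `run` is injectable so the function can be exercised without virsh.
    """
    for action in ("suspend", "resume"):
        run(["virsh", action, domain], check=True)


if __name__ == "__main__":
    listing = subprocess.run(
        ["virsh", "list"], check=True, capture_output=True, text=True
    ).stdout
    states = parse_virsh_list(listing)
    print(states)
    if states.get("rhel7.2") == "running":
        # Only useful if the guest is reported as running but is unresponsive.
        suspend_resume("rhel7.2")
```

Passing `run` as a parameter keeps the workaround testable on a host without libvirt installed. The default simply shells out to the same `virsh suspend` and `virsh resume` commands used in step 4.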