Bug 536943
| Summary: | RFE: migration enhancements - make sure live migration ends | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Dor Laor <dlaor> |
| Component: | libvirt | Assignee: | Daniel Veillard <veillard> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.4 | CC: | berrange, dallan, jdenemar, jjarvis, juzhang, kelvin.zhao, llim, sputhenp, syeghiay, tburke, virt-maint, weizhan, xen-maint |
| Target Milestone: | rc | Keywords: | FutureFeature |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-0.8.2-1.el5 | Doc Type: | Enhancement |
| Doc Text: | Story Points: | --- | |
| Clone Of: | RHEVmigration | Environment: | |
| Last Closed: | 2011-01-13 22:53:15 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Comment 3
Jiri Denemark
2010-09-02 11:59:29 UTC
Sure, 5.6 doesn't support JSON, so you won't see the errors coming from the broken JSON implementation. Even for JSON, the log content mentioned in comment 21 is not the expected behavior. The expected outcome is described in other comments.

Basically, live migration works as follows: first, all of the guest's memory is transferred to the target host. In each following iteration, only the memory pages changed since the last iteration are transferred. When the amount of changed pages is low enough to be transferred within maxdowntime, the guest is paused and the migration is finished offline. To test it, you need to start a guest which continuously changes lots of memory pages so that the migration never gets to the last step. Once you increase maxdowntime enough, the migration will finish. However, the implementation in qemu is black magic and cannot really be done accurately, which makes testing this a bit harder.

---

To add even more complexity to this, it was recently discovered that the way QEMU itself handles 'max migration downtime' is completely & utterly broken. Even if you set a max downtime of 25ms, migrating a 200 GB guest may still see a downtime of 30 *minutes*. Basically this is untestable as far as I can see, unless you can attach a debugger to QEMU and watch for QEMU processing the monitor command from libvirt.

---

Hi Jiri, I tested setting the bandwidth in virt-manager and it seems to work. I set the bandwidth to 1M and to 50M, and there is an obvious difference in migration speed. For testing setmaxdowntime I also used the bandwidth: with the bandwidth set to 1M the migration takes a long time, but after I run setmaxdowntime with 1000, it stops immediately. Is this the right method to verify the bug?

---

weizhang, see Daniel's comment #10.

---

(In reply to comment #10)
> To add even more complexity to this, it was recently discovered that the way
> QEMU itself handles 'max migration downtime' is completely & utterly broken.
> Even if you set max downtime of 25ms, migrating a 200 GB guest may still see a
> downtime of 30 *Minutes*. Basically this is untestable as far as I can see,
> unless you can attach a debugger to QEMU and watch for QEMU processing the
> monitor command from libvirt.

How do I "attach a debugger to QEMU and watch for QEMU processing the monitor command from libvirt"? Thanks.

---

Test steps:

1. Set `log_outputs="1:file:/var/log/libvirt/libvirt.log"` in the libvirtd.conf file and restart libvirtd.
2. Start a migration in virt-manager with bandwidth = 1M.
3. While the migration is running, run `# virsh migrate-setmaxdowntime mig 1000`.
4. Check the log.

Expected result: you see the following messages in /var/log/libvirt/libvirt.log and no error messages during the migration:

```
...
17:09:55.003: debug : qemuMonitorCommandWithHandler:231 : Send command 'migrate_set_speed 1m' for write with FD -1
...
17:09:58.349: debug : virDomainMigrateSetMaxDowntime:12141 : domain=0x1abdd4b0, downtime=1000, flags=0
17:09:58.349: debug : qemuDomainMigrateSetMaxDowntime:11611 : Requesting migration downtime change to 1000ms
...
17:09:58.388: debug : qemuDomainWaitForMigrationComplete:4843 : Setting migration downtime to 1000ms
17:09:58.388: debug : qemuMonitorSetMigrationDowntime:1262 : mon=0x1abbb770 downtime=1000
17:09:58.388: debug : qemuMonitorCommandWithHandler:231 : Send command 'migrate_set_downtime 1000ms' for write with FD -1
...
```

This bug can be verified.

```
# uname -r
2.6.18-228.el5
# rpm -qa libvirt
libvirt-0.8.2-8.el5
# rpm -qa | grep kvm
kvm-83-205.el5
kvm-qemu-img-83-205.el5
kmod-kvm-83-205.el5
```

---

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0060.html
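The iterative pre-copy loop described in comment 3 (each pass re-sends the pages dirtied during the previous pass; the guest is only paused once the remainder fits within maxdowntime) can be sketched as a small simulation. This is an illustrative model only, not libvirt or QEMU code; the `migrate` helper and all RAM, bandwidth, and dirty-rate figures are invented for the sketch:

```python
# Toy model of QEMU-style iterative pre-copy live migration.
# Each pass copies the pages dirtied during the previous pass; migration
# converges only when the remaining dirty set can be copied within
# max_downtime_ms. All numbers are made up for illustration.

def migrate(ram_mb, dirty_mb_per_s, bw_mb_per_s, max_downtime_ms, max_iters=30):
    """Return the number of passes until convergence, or None if the guest
    dirties memory too fast for the migration to ever finish."""
    remaining = ram_mb                       # first pass transfers all RAM
    for i in range(1, max_iters + 1):
        transfer_s = remaining / bw_mb_per_s
        downtime_ms = transfer_s * 1000.0
        if downtime_ms <= max_downtime_ms:   # final pass: pause guest, copy rest
            return i
        # pages dirtied while this pass ran must be re-sent next pass
        remaining = dirty_mb_per_s * transfer_s
    return None

# Guest dirtying memory slower than link bandwidth: converges.
print(migrate(4096, 20, 100, 30))       # a handful of passes
# Guest dirtying memory faster than bandwidth: never converges...
print(migrate(4096, 120, 100, 30))      # None
# ...until maxdowntime is raised enough, as comment 3 describes.
print(migrate(4096, 120, 100, 60000))   # finishes on the first pass
```

This mirrors the testing advice in the comments: a guest that continuously rewrites lots of pages keeps the migration looping, and raising maxdowntime (e.g. via `virsh migrate-setmaxdowntime`) is what lets it complete.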