Bug 713392
| Summary: | Increasing migration max_downtime or speed causes guest stalls | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Mike Cao <bcao> |
| Component: | kvm | Assignee: | Juan Quintela <quintela> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.7 | CC: | bcao, bgollahe, cww, dyasny, ehabkost, gcosta, iheim, juzhang, knoel, llim, lyarwood, michen, mkalinin, mkenneth, mshao, quintela, syeghiay, tburke, virt-maint, vromanov, xfu |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kvm-83-239.el5 | Doc Type: | Bug Fix |
| Doc Text: | Due to a regression, when the values for maximum downtime or maximum speed were increased during a migration, the guests experienced heavy stalls and the migration did not finish in a reasonable time. With this update, a patch has been provided and the migration process finishes successfully in the described scenario. | | |
| Story Points: | --- | | |
| Clone Of: | 690521 | Environment: | |
| Last Closed: | 2011-07-21 08:50:05 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 690521 | | |
| Bug Blocks: | 580949, 696155, 707606 | | |
Comment 3
Mike Cao
2011-06-22 07:59:13 UTC
Quoting Juan's earlier advice:

> The test is half wrong; you need to do both:
>
> ping <guest> -> outside
> ping <outside> -> guest
>
> In the outside -> guest direction you see the stalls easily. Inside -> outside only happens sometimes (the guest is paused, after all).
>
> Later, Juan.

Tried the steps provided by Juan.

Actual results:

- from the guest side: `ping 8.8.8.8 -i 0.2` -> no packets lost
- from the host side: `ping <guest ip> -i 0.2` -> 24 packets lost

bcao -> Juan: does the packet loss above mean I reproduced this issue?

Best Regards,
Mike

Talked with Juan via IRC. Comment #5 means this issue was reproduced, because the guest was not able to answer a ping for that long (24 * 0.2 = 4.8 sec). Based on the above, providing qa_ack+.

Tried on kvm-83-239.el5; I found that Bug 690521 had regressed, and it blocks me from verifying this bug.

Steps (I tried several times, with the image located on an NFS server or on LVM, and still can *not* reproduce):

1. Start the guest with -m 1G -smp 4, e.g.:

```
/usr/libexec/qemu-kvm -m 1G -smp 4,sockets=4,cores=1,threads=1 -name RHEL5u7 \
    -uuid 13bd47ff-7458-a214-9c43-d311ed5ca5a3 -monitor stdio \
    -no-kvm-pit-reinjection -boot c \
    -drive file=/mnt/RHEL5.7-virtio.qcow2,if=virtio,format=qcow2,cache=none,boot=on \
    -net nic,macaddr=54:52:00:52:ed:61,vlan=0,model=virtio \
    -net tap,script=/etc/qemu-ifup,downscript=no,vlan=0 \
    -serial pty -parallel none -usb -vnc :1 -k en-us -vga cirrus \
    -balloon none -M rhel5.6.0 -usbdevice tablet
```

2. In the guest:

```
# ping 8.8.8.8 -i 0.1
# stress -c 1 -m 1
```

3. `(qemu) migrate_set_speed 1G`
4. `(qemu) migrate -d tcp:<hostB>:5888`

Actual results: waited for more than 30 minutes; the migration never finished, so I cannot verify this bug.

Juan, could you give some suggestions on how to verify this issue without Bug 690521 fixed?

Best Regards,
Mike
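As an aside on the measurement: the outside -> guest check described above can be scripted so that lost probes translate directly into stall time. A minimal sketch; the guest address, probe count, and the parsing of the Linux iputils summary line are illustrative assumptions, not part of the original report:

```bash
#!/bin/bash
# Ping the guest from outside during migration and convert lost probes
# into approximate stall time. GUEST_IP and COUNT are placeholders.
GUEST_IP=${1:-192.168.122.100}   # assumed guest address on the test network
COUNT=300                        # 300 probes at 0.2 s spacing ~= a 60 s window

summary=$(ping -c "$COUNT" -i 0.2 "$GUEST_IP" | grep 'packets transmitted')
tx=$(echo "$summary" | awk '{print $1}')   # probes sent
rx=$(echo "$summary" | awk '{print $4}')   # replies received
lost=$((tx - rx))

# Each lost reply at -i 0.2 is roughly 0.2 s of stall: 24 lost probes
# ~= 4.8 s, matching the figure quoted from comment #5 above.
echo "lost=$lost probes (~$(awk "BEGIN {print $lost * 0.2}") s of stall)"
```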
When doing local migration, the default migration transfer speed is about 35 MB/sec; after changing migrate_set_speed to 1G, the transfer speed is about 160 MB/sec.

Default speed info:

```
(qemu) info migrate
Migration status: active
transferred ram: 90881 kbytes
remaining ram: 3993092 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 127135 kbytes
remaining ram: 3956908 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 179874 kbytes
remaining ram: 3904272 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 212834 kbytes
remaining ram: 3871376 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 245791 kbytes
remaining ram: 3838484 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 291936 kbytes
remaining ram: 3792428 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 324897 kbytes
remaining ram: 3759532 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 361151 kbytes
remaining ram: 3723348 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 397149 kbytes
remaining ram: 3634812 kbytes
total ram: 4214796 kbytes
```

After setting migrate speed to 1G, migration info:

```
(qemu) info migrate
Migration status: active
transferred ram: 782433 kbytes
remaining ram: 3237796 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 944833 kbytes
remaining ram: 3074260 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1165022 kbytes
remaining ram: 2854524 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1301184 kbytes
remaining ram: 2718660 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1456158 kbytes
remaining ram: 2564552 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1596031 kbytes
remaining ram: 2424972 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1749662 kbytes
remaining ram: 2271892 kbytes
total ram: 4214796 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 1898016 kbytes
remaining ram: 2124208 kbytes
total ram: 4214796 kbytes
```

(In reply to comment #12)
> Juan, could you give some suggestions on how to verify this issue without
> Bug 690521 fixed?
>
> Best Regards,
> Mike

Does https://bugzilla.redhat.com/show_bug.cgi?id=690521#c70 help?

(In reply to comment #14)
> Does https://bugzilla.redhat.com/show_bug.cgi?id=690521#c70 help?

I am afraid not; the reason is as follows:

This bug was mainly about *increasing migration_max_speed* also increasing the migration downtime itself (from my results, it increased to 4 s); that is what caused the customer's application to fail.

If I can only verify this bug by increasing migration_down_time, the migration may finish, but that is not the original bug.

Actually, QE found that after `(qemu) migrate_set_speed 1G`, the speed increase on kvm-239 was much smaller than on kvm-238.

Mike

(In reply to comment #15)
> Actually, QE found that after (qemu) migrate_set_speed 1G, the speed
> increase on kvm-239 was much smaller than on kvm-238.

That's because we reverted a buggy change. Before that revert, the bandwidth and migration convergence looked fine, but when you expected a downtime of 0.1 s you actually got several seconds of downtime. With the patch reverted, a migration configured for 0.1 s takes a long time because we are now accurate. If you want to achieve similar convergence/bandwidth, you will need to increase the configuration.
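In practice, the tuning Juan describes maps onto the two monitor knobs already used in this report. A sketch of a session that trades a longer permitted downtime for convergence; the 1-second value is an illustrative choice, and the assumption that this build's monitor accepts `migrate_set_downtime` in seconds (as the bug summary's max_downtime knob suggests) is not taken from the report itself:

```
(qemu) migrate_set_speed 1G         # raise the bandwidth cap, as in the steps above
(qemu) migrate_set_downtime 1       # illustrative: allow ~1 s for the final stop-and-copy
(qemu) migrate -d tcp:<hostB>:5888
(qemu) info migrate                 # poll until "Migration status: completed"
```

The underlying arithmetic: migration converges only when the remaining dirty RAM can be sent within the configured downtime at the configured speed, roughly remaining_ram / bandwidth <= max_downtime. With accurate downtime accounting, a 0.1 s budget at the ~160 MB/s observed above tolerates only about 16 MB of residual dirty memory, so a guest running `stress` never converges unless one of the two knobs is raised.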
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Due to a regression, when the values for maximum downtime or maximum speed were increased during a migration, the guests experienced heavy stalls and the migration did not finish in a reasonable time. With this update, a patch has been provided and the migration process finishes successfully in the described scenario.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1068.html