Bug 1289290 - [RFE] Live Migration dynamic cpu throttling for auto-convergence (RHEV)
Summary: [RFE] Live Migration dynamic cpu throttling for auto-convergence (RHEV)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ovirt-4.1.0-alpha
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On: 1289285 1289291
Blocks: 1289288 1358141
Reported: 2015-12-07 20:18 UTC by Hai Huang
Modified: 2017-04-25 01:00 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Previously, if a live migration was performed with extremely memory-write-intensive workloads, the migration could never complete because QEMU could not transfer the memory changes fast enough, so the migration never reached the non-live finishing phase. In this release, in these situations, RHV restricts the amount of CPU given to the guest to reduce the rate at which memory is changed, allowing the migration to complete.
Clone Of: 1289285
Environment:
Last Closed: 2017-04-25 01:00:46 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:
ipinto: testing_plan_complete+




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:0997 0 normal SHIPPED_LIVE Red Hat Virtualization Manager (ovirt-engine) 4.1 GA 2017-04-18 20:11:26 UTC

Description Hai Huang 2015-12-07 20:18:43 UTC
+++ This bug was initially created as a clone of Bug #1289285 +++

Description of problem:

With extremely memory-write-intensive workloads, normal live migration will never complete because the guest writes to memory faster than QEMU can transfer the memory changes to the destination system. In this case, normal migration continues forever, never making enough progress to stop the guest and proceed to the non-live "finishing up" phase of migration.
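
The convergence problem described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions (constant dirty rate, constant bandwidth, a fixed downtime budget); the function name and parameters are hypothetical and do not correspond to any QEMU interface.

```python
def precopy_rounds(memory_mb, dirty_mb_per_s, bandwidth_mb_per_s,
                   max_downtime_s=0.3, max_rounds=50):
    """Toy model of iterative pre-copy migration.

    Returns the number of copy passes completed before the remaining
    dirty memory fits into the downtime budget, or None if the model
    does not converge within max_rounds passes.
    """
    remaining = float(memory_mb)  # the first pass must send all guest memory
    for completed in range(max_rounds + 1):
        # The guest can be paused once the rest can be sent within the budget.
        if remaining / bandwidth_mb_per_s <= max_downtime_s:
            return completed
        # While this pass is being sent, the guest keeps dirtying pages.
        transfer_time = remaining / bandwidth_mb_per_s
        remaining = min(float(memory_mb), dirty_mb_per_s * transfer_time)
    return None


# Dirty rate below bandwidth: remaining memory halves on every pass.
print(precopy_rounds(4096, 64, 128))
# Dirty rate above bandwidth: the remaining memory never shrinks.
print(precopy_rounds(4096, 200, 128))
```

In this model the migration converges only when the dirty rate is lower than the transfer bandwidth, which is the condition the feature below tries to force.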

This feature provides a method for slowing down guest execution speed and thus, hopefully, also slowing down guest memory write speed. As time advances, autoconverge continually increases the amount of guest CPU throttling until guest memory write speed slows enough to allow the guest to be stopped and the migration to finish.

As of QEMU 2.5, dynamic throttling has been added to autoconverge, dramatically increasing its effectiveness.
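
As a rough illustration of the stepwise throttling described above, the sketch below walks an increasing throttle schedule until the guest's dirty rate drops below the transfer bandwidth. The starting value of 20%, the 10% increment, and the 99% ceiling are assumptions modeled on QEMU's `cpu-throttle-initial` and `cpu-throttle-increment` migration parameters; the linear relation between throttle percentage and dirty rate is a simplification, and the function names are hypothetical.

```python
def throttle_steps(initial=20, increment=10, ceiling=99):
    """Yield throttle percentages, increasing until the ceiling is reached.

    The default values are assumptions, not taken from this bug report.
    """
    pct = initial
    while True:
        yield min(pct, ceiling)
        if pct >= ceiling:
            return
        pct += increment


def first_converging_throttle(dirty_mb_per_s, bandwidth_mb_per_s):
    """Return the first throttle percentage at which the (assumed linear)
    throttled dirty rate falls below the bandwidth, or None if even the
    ceiling is not enough."""
    for pct in throttle_steps():
        effective_dirty = dirty_mb_per_s * (100 - pct) / 100
        if effective_dirty < bandwidth_mb_per_s:
            return pct
    return None


print(list(throttle_steps()))
print(first_converging_throttle(200, 128))
```

Real guests do not slow their memory writes exactly in proportion to CPU throttling, so the percentage at which an actual migration converges may differ from what this model suggests.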

This feature will be available in RHEL 7.3 qemu-kvm-rhev with the rebase to QEMU 2.5.

The QEMU feature page can be found at:
http://wiki.qemu.org/Features/AutoconvergeLiveMigration


Version-Release number of selected component (if applicable):

  qemu-kvm-rhev


How reproducible:
Always.


Steps to Reproduce:
Please refer to the QEMU feature page above.


Actual results:
Live migration fails due to a high page dirty rate (i.e. intensive memory writes).


Expected results:
Live migration completes successfully.


Additional info:

Comment 3 Michal Skrivanek 2016-01-15 16:17:01 UTC
Tentatively setting 4.0, based on the tentative 7.3 target on the RHEL side.
This should be an enhancement/replacement to the 3.6 autoconvergence.

Comment 4 Michal Skrivanek 2016-09-07 12:50:30 UTC
The base autoconvergence mechanism is in place and the new algorithm will be used in 7.3. We don't plan to configure increments at the moment; we are awaiting feedback on the recent migration improvement, bug 1252426.

Comment 5 Israel Pinto 2017-02-08 16:38:36 UTC
Verified with:
Engine: 4.1.0.3-0.1.el7
Host:
OS Version: RHEL - 7.3 - 7.el7
Kernel Version: 3.10.0 - 550.el7.x86_64
KVM Version: 2.6.0 - 28.el7_3.3.1
LIBVIRT Version: libvirt-2.0.0-10.el7_3.4
VDSM Version: vdsm-4.19.4-1.el7ev
SPICE Version: 0.12.4 - 20.el7_3

Ran migration sanity and load cases.
All passed.

