Bug 1289290 - [RFE] Live Migration dynamic cpu throttling for auto-convergence (RHEV)
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: unspecified
Hardware: Unspecified Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: ovirt-4.1.0-alpha
Target Release: ---
Assigned To: Michal Skrivanek
QA Contact: Israel Pinto
Keywords: FutureFeature
Depends On: 1289285 1289291
Blocks: 1289288 1358141
Reported: 2015-12-07 15:18 EST by Hai Huang
Modified: 2017-04-24 21:00 EDT

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Previously, if a live migration was performed with extremely memory-write-intensive workloads, the migration could never complete because QEMU could not transfer the memory changes fast enough. In this case, the migration could not reach the non-live finishing phase. In this release, in these situations, RHV restricts the amount of CPU given to the guest to reduce the rate at which memory is changed and allow the migration to complete.
Story Points: ---
Clone Of: 1289285
Environment:
Last Closed: 2017-04-24 21:00:46 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ipinto: testing_plan_complete+


Description Hai Huang 2015-12-07 15:18:43 EST
+++ This bug was initially created as a clone of Bug #1289285 +++

Description of problem:

With extremely memory-write-intensive workloads, normal live migration will never complete because the guest is writing to memory faster than QEMU can transfer the memory changes to the destination system. In this case, normal migration will continue forever, never making enough progress to stop the guest and proceed to the non-live "finishing up" phase of migration.
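
To illustrate why such a migration never converges, here is a minimal sketch of iterative pre-copy under simplifying assumptions: a constant dirty rate, a constant transfer bandwidth, and a fixed downtime budget. The function name, parameters, and default values are hypothetical and are not taken from QEMU.

```python
def precopy_rounds(ram_mb, dirty_mb_s, bw_mb_s, max_downtime_s=0.3, max_rounds=100):
    """Return the number of pre-copy rounds needed before the remaining
    data fits in the downtime budget, or None if that never happens.

    Toy model (assumption): dirty rate and bandwidth are constant.
    """
    remaining = float(ram_mb)  # first round sends all guest memory
    for rounds in range(max_rounds + 1):
        # The guest can be stopped once the rest transfers within the budget.
        if remaining / bw_mb_s <= max_downtime_s:
            return rounds
        # While this round is being sent, the guest keeps dirtying memory.
        transfer_time = remaining / bw_mb_s
        remaining = min(float(ram_mb), dirty_mb_s * transfer_time)
    return None  # no convergence within max_rounds


if __name__ == "__main__":
    print(precopy_rounds(4096, dirty_mb_s=50, bw_mb_s=100))
    print(precopy_rounds(4096, dirty_mb_s=150, bw_mb_s=100))
```

In this model, a dirty rate at or above the bandwidth means the amount left to send never shrinks, so the function returns None; that is the situation this bug describes.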

This feature provides a method for slowing down guest execution speed and thus, hopefully, guest memory write speed as well. As time advances, autoconverge continually increases the amount of guest CPU throttling until the guest memory write speed slows enough to allow the guest to be stopped and the migration to finish.
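
The escalation described above can be sketched as follows. This is an illustrative model, not QEMU's implementation: it assumes the guest's dirty rate falls linearly with the CPU throttle percentage, and the starting percentage, step, and cap are example values chosen for illustration. All names are hypothetical.

```python
def throttle_until_converging(dirty_mb_s, bw_mb_s,
                              initial_pct=20, increment_pct=10, max_pct=99):
    """Raise the guest CPU throttle step by step until the (modelled)
    dirty rate drops below the transfer bandwidth.

    Returns (schedule, converging): the throttle percentages applied in
    order, and whether the final setting lets the migration make progress.
    Assumption: dirty rate scales linearly with (100 - throttle) percent.
    """
    schedule = []
    pct = 0  # no throttling at the start
    while True:
        effective_dirty = dirty_mb_s * (100 - pct) / 100
        if effective_dirty < bw_mb_s:
            return schedule, True   # migration can now catch up
        if pct >= max_pct:
            return schedule, False  # throttle is capped, still not enough
        # First step uses the initial value; later steps add the increment.
        pct = initial_pct if not schedule else min(max_pct, pct + increment_pct)
        schedule.append(pct)


if __name__ == "__main__":
    print(throttle_until_converging(dirty_mb_s=150, bw_mb_s=100))
```

A guest that already dirties memory more slowly than the link can carry it is never throttled in this sketch, which matches the intent that throttling applies only to migrations that fail to make progress.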

As of QEMU 2.5, dynamic throttling has been added to autoconverge, dramatically increasing its effectiveness.

This feature will be available in RHEL 7.3 qemu-kvm-rhev with the rebase to QEMU 2.5.

The QEMU feature page can be found at:
http://wiki.qemu.org/Features/AutoconvergeLiveMigration


Version-Release number of selected component (if applicable):

  qemu-kvm-rhev


How reproducible:
Always.


Steps to Reproduce:
Please refer to the QEMU feature page above.


Actual results:
Live migration fails to complete because of the high page dirty rate
(i.e. intensive memory writes).


Expected results:
Live migration completes successfully.


Additional info:
Comment 3 Michal Skrivanek 2016-01-15 11:17:01 EST
tentatively setting 4.0 based on tentative 7.3 on RHEL side
should be an enhancement/replacement to the 3.6 autoconvergence
Comment 4 Michal Skrivanek 2016-09-07 08:50:30 EDT
the base autoconvergence mechanism is in place and the new algorithm will be used in 7.3. we don't plan to configure increments at the moment - awaiting feedback on recent migration improvement bug 1252426
Comment 5 Israel Pinto 2017-02-08 11:38:36 EST
Verify with:
Engine: 4.1.0.3-0.1.el7
Host:
OS Version:RHEL - 7.3 - 7.el7
Kernel Version:3.10.0 - 550.el7.x86_64
KVM Version:2.6.0 - 28.el7_3.3.1
LIBVIRT Version:libvirt-2.0.0-10.el7_3.4
VDSM Version:vdsm-4.19.4-1.el7ev
SPICE Version:0.12.4 - 20.el7_3

Run migration sanity and load cases
All Pass
