
Bug 1289290

Summary: [RFE] Live Migration dynamic cpu throttling for auto-convergence (RHEV)
Product: Red Hat Enterprise Virtualization Manager
Reporter: Hai Huang <hhuang>
Component: ovirt-engine
Assignee: Michal Skrivanek <michal.skrivanek>
Status: CLOSED ERRATA
QA Contact: Israel Pinto <ipinto>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: dmoessne, eheftman, fdeutsch, gklein, ipinto, juzhang, lsurette, mavital, mgoldboi, michal.skrivanek, pstehlik, rbalakri, Rhev-m-bugs, srevivo, virt-bugs, virt-maint, ycui, ykaul
Target Milestone: ovirt-4.1.0-alpha
Keywords: FutureFeature
Target Release: ---
Flags: ipinto: testing_plan_complete+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Previously, live migration of a guest running an extremely memory-write-intensive workload could never complete, because QEMU could not transfer the memory changes fast enough for the migration to reach the non-live finishing phase. In this release, RHV restricts the amount of CPU given to the guest in these situations, which reduces the rate at which memory changes and allows the migration to complete.
Story Points: ---
Clone Of: 1289285
Environment:
Last Closed: 2017-04-25 01:00:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1289285, 1289291
Bug Blocks: 1289288, 1358141

Description Hai Huang 2015-12-07 20:18:43 UTC
+++ This bug was initially created as a clone of Bug #1289285 +++

Description of problem:

With extremely memory-write-intensive workloads, normal live migration never completes because the guest writes to memory faster than QEMU can transfer the memory changes to the destination system. In this case normal migration continues forever, never making enough progress to stop the guest and proceed to the non-live "finishing up" phase of migration.
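
The non-convergence described above can be illustrated with a deliberately simplified model of pre-copy migration. This is only a sketch under stated assumptions, not QEMU's actual accounting: each round is assumed to resend all memory dirtied during the previous round, the dirty rate and bandwidth are assumed constant, and the function name `precopy_rounds` and the 0.3 s downtime limit are hypothetical choices made for illustration.

```python
def precopy_rounds(ram_mb, dirty_mb_s, bandwidth_mb_s,
                   downtime_limit_s=0.3, max_rounds=30):
    """Count pre-copy rounds needed before the guest can be stopped.

    Returns the number of completed rounds, or None if the remaining
    dirty memory never becomes small enough within max_rounds.
    All parameters are illustrative assumptions, not QEMU settings.
    """
    remaining = float(ram_mb)  # the first round sends all guest RAM
    for round_no in range(1, max_rounds + 1):
        transfer_time = remaining / bandwidth_mb_s
        # If what is left fits in the allowed downtime, the guest can be
        # stopped and the "finishing up" phase can begin.
        if transfer_time <= downtime_limit_s:
            return round_no - 1
        # While this round is being sent, the guest dirties more memory,
        # capped at the total RAM size.
        remaining = min(float(ram_mb), dirty_mb_s * transfer_time)
    return None


if __name__ == "__main__":
    print(precopy_rounds(4096, 50, 100))   # dirty rate below bandwidth
    print(precopy_rounds(4096, 150, 100))  # dirty rate above bandwidth
```

In this model the remaining memory shrinks by the ratio of dirty rate to bandwidth each round, so it only converges when that ratio is below 1.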

This feature provides a method for slowing down guest execution and thus, hopefully, the guest's memory write rate as well. As time advances, autoconverge continually increases the amount of guest CPU throttling until the guest's memory write rate slows enough to allow the guest to be stopped and the migration to finish.
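
The escalating throttle described above can be sketched as a small loop. This is a toy model, not the QEMU implementation: it assumes the dirty rate falls linearly with the CPU time taken away from the guest, and the function name `throttle_steps` and the default percentages (20 initial, 10 increment, 99 maximum) are assumptions chosen for illustration.

```python
def throttle_steps(dirty_mb_s, bandwidth_mb_s,
                   initial_pct=20, increment_pct=10, max_pct=99):
    """Return the throttle percentages applied, in order, until the
    modelled dirty rate drops below the available bandwidth.

    Assumes a linear relation between CPU throttling and dirty rate,
    which real workloads will not follow exactly.
    """
    steps = []
    pct = 0
    # Keep raising the throttle while the guest still dirties memory
    # at least as fast as it can be transferred.
    while dirty_mb_s * (1 - pct / 100) >= bandwidth_mb_s:
        if pct >= max_pct:
            break  # cannot throttle any further in this model
        pct = initial_pct if pct == 0 else min(pct + increment_pct, max_pct)
        steps.append(pct)
    return steps


if __name__ == "__main__":
    # A guest dirtying 150 MB/s over a 100 MB/s link.
    print(throttle_steps(150, 100))
```

An empty list means no throttling was needed; a list ending at the maximum means the model could not bring the dirty rate under the bandwidth.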

As of QEMU 2.5, dynamic throttling has been added to autoconverge, dramatically increasing its effectiveness.

This feature will be available in RHEL 7.3 qemu-kvm-rhev with the rebase
to QEMU 2.5.
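
For readers who want to try the feature at the QEMU level, the sketch below builds the QMP messages that enable the `auto-converge` capability and set the throttle parameters. It only constructs JSON and does not talk to a running QEMU. The helper name `autoconverge_qmp` is hypothetical, and the parameter names are an assumption to check against the QEMU version in use: to my knowledge they are `cpu-throttle-initial` and `cpu-throttle-increment` in later releases, while early releases exposed them with an experimental `x-` prefix.

```python
import json


def autoconverge_qmp(initial_pct=20, increment_pct=10,
                     experimental_prefix=False):
    """Build QMP commands enabling auto-converge with throttle settings.

    experimental_prefix=True emits the "x-"-prefixed parameter names
    that, as an assumption, older QEMU releases expect.
    """
    prefix = "x-" if experimental_prefix else ""
    return [
        {
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [
                    {"capability": "auto-converge", "state": True}
                ]
            },
        },
        {
            "execute": "migrate-set-parameters",
            "arguments": {
                prefix + "cpu-throttle-initial": initial_pct,
                prefix + "cpu-throttle-increment": increment_pct,
            },
        },
    ]


if __name__ == "__main__":
    # Print one JSON message per line, ready to send on a QMP socket.
    for command in autoconverge_qmp():
        print(json.dumps(command))
```

In a RHEV deployment these settings are driven through VDSM and libvirt rather than set by hand, so this is mainly useful for reproducing the behaviour on a standalone QEMU.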

The QEMU feature page can be found at:
http://wiki.qemu.org/Features/AutoconvergeLiveMigration


Version-Release number of selected component (if applicable):

  qemu-kvm-rhev


How reproducible:
Always.


Steps to Reproduce:
Please refer to the QEMU feature page above.


Actual results:
Live migration fails due to a high page dirty rate
(i.e. intensive memory writes).


Expected results:
Live migration completes successfully.


Additional info:

Comment 3 Michal Skrivanek 2016-01-15 16:17:01 UTC
Tentatively setting 4.0, based on the tentative 7.3 target on the RHEL side.
This should be an enhancement to, or replacement of, the 3.6 autoconvergence.

Comment 4 Michal Skrivanek 2016-09-07 12:50:30 UTC
The base autoconvergence mechanism is in place, and the new algorithm will be used in 7.3. We don't plan to configure increments at the moment; awaiting feedback on the recent migration improvements in bug 1252426.

Comment 5 Israel Pinto 2017-02-08 16:38:36 UTC
Verified with:
Engine: 4.1.0.3-0.1.el7
Host:
OS Version: RHEL - 7.3 - 7.el7
Kernel Version: 3.10.0 - 550.el7.x86_64
KVM Version: 2.6.0 - 28.el7_3.3.1
LIBVIRT Version: libvirt-2.0.0-10.el7_3.4
VDSM Version: vdsm-4.19.4-1.el7ev
SPICE Version: 0.12.4 - 20.el7_3

Ran the migration sanity and load cases.
All passed.