Bug 867453 - [RFE] Help live migration converge when guest dirties pages too fast
Summary: [RFE] Help live migration converge when guest dirties pages too fast
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Martin Betak
QA Contact: Israel Pinto
URL: http://www.ovirt.org/Features/XBZRLE_...
Whiteboard:
Duplicates: 863264 1098291
Depends On: 863264 990319 1151001
Blocks: 1255223
Reported: 2012-10-17 14:35 UTC by Karen Noel
Modified: 2016-03-09 20:27 UTC (History)

Fixed In Version: ovirt-3-6-0-2
Doc Type: Enhancement
Doc Text:
QEMU capabilities for auto-convergence and/or Xor Binary Zero Run-Length-Encoding (XBZRLE) can be used to reduce virtual machine downtime and improve convergence during migration. This is supported by hierarchical configuration in 3 levels: global (engine-config), cluster, and virtual machine.
Clone Of: 863264
Environment:
Last Closed: 2016-03-09 20:27:34 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:
sherold: Triaged+
ylavi: testing_beta_priority+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2016:0376 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise Virtualization Manager 3.6.0 | 2016-03-10 01:20:52 UTC

Description Karen Noel 2012-10-17 14:35:32 UTC
RHEV-M should take advantage of this feature through libvirt.

+++ This bug was initially created as a clone of Bug #863264 +++

This new feature will help live migration converge when the guest is dirtying pages too fast for the network throughput.

The performance team is using a script like this:

# cat migrated_cgroup.sh
# Enter guest name $1
#
# Try for 120 seconds with 10 percent of 1-cpu
echo 10000 > /cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us
echo 10000 > /cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us
sleep 120
# Try for 2 minutes with 1% of the cpu of 1-cpu
echo 1000 > /cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us
echo 1000 > /cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us
sleep 120
# Increase the period to then reduce to .1% of 1-cpu.
echo 1000000 > /cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_period_us
echo 1000000 > /cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_period_us
echo 1000 > /cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us
echo 1000 > /cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us

This logic should be refined and put into libvirt so it's automatic for the customer.
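
As a rough illustration of how the script above might be generalized, here is a minimal sketch that applies the same quota/period writes to every vcpuN directory of a guest instead of hard-coding vcpu0 and vcpu1. The function names (cpu_share_permille, throttle_vcpus) and the CGROUP_ROOT variable are hypothetical, and the 100000 us default period is an assumption about the CFS default. This is not how libvirt or RHEV-M implements the feature.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# CGROUP_ROOT mirrors the cgroup path used in the script above.
CGROUP_ROOT="${CGROUP_ROOT:-/cgroup/cpu/libvirt/qemu}"

# Print the CPU share of one vCPU in tenths of a percent (quota / period).
# The period defaults to 100000 us (assumed CFS default).
cpu_share_permille() {
    local quota_us=$1
    local period_us=${2:-100000}
    echo $(( quota_us * 1000 / period_us ))
}

# Write a quota (and optionally a period) to every vcpuN directory of a guest.
throttle_vcpus() {
    local guest=$1
    local quota_us=$2
    local period_us=${3:-}
    local dir
    for dir in "$CGROUP_ROOT/$guest"/vcpu*/; do
        # Skip the unexpanded pattern when no vcpu directories exist.
        [ -d "$dir" ] || continue
        # Same order as the original script: period first, then quota.
        if [ -n "$period_us" ]; then
            echo "$period_us" > "${dir}cpu.cfs_period_us"
        fi
        echo "$quota_us" > "${dir}cpu.cfs_quota_us"
    done
}
```

With these helpers, the three stages of the original script would read: throttle_vcpus "$1" 10000; sleep 120; throttle_vcpus "$1" 1000; sleep 120; throttle_vcpus "$1" 1000 1000000. The helper cpu_share_permille only makes the arithmetic in the script's comments explicit (quota divided by period).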

--- Additional comment from eblake on 2012-10-04 19:28:03 EDT ---

Additionally, libvirt should be using the qemu 1.2 feature of XBZRLE migration, if it can determine that both sides of the migration support it (which also means that XBZRLE needs to be backported into RHEL qemu).

--- Additional comment from jdenemar on 2012-10-08 08:15:58 EDT ---

XBZRLE support is requested by bug 842857

Comment 4 Itamar Heim 2013-06-19 14:14:20 UTC
*** Bug 863264 has been marked as a duplicate of this bug. ***

Comment 6 Doron Fediuck 2014-05-16 04:38:13 UTC
*** Bug 1098291 has been marked as a duplicate of this bug. ***

Comment 8 Eldad Marciano 2014-05-20 09:51:45 UTC
+1

I reproduced the bug too, even when pages are dirtied only once every 30 seconds, which is not especially fast, and even when the pages are dynamically freed.

Comment 12 Israel Pinto 2015-10-07 15:06:54 UTC
Verified with versions:
RHEVM: 3.6 build: rhevm-3.6.0-0.18.master.el6.noarch
VDSM: vdsm-4.17.8-1.el7ev
libvirt: libvirt-1.2.17-5.el7

1. Performance test passed; see comment 11.
2. Functionality test checked with the latest build (rhevm-3.6.0-0.18).

Comment 15 errata-xmlrpc 2016-03-09 20:27:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

