Bug 867453 - [RFE] Help live migration converge when guest dirties pages too fast
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assigned To: Martin Betak
QA Contact: Israel Pinto
URL: http://www.ovirt.org/Features/XBZRLE_...
Keywords: FutureFeature
Duplicates: 863264 1098291
Depends On: 863264 990319 1151001
Blocks: 1255223
Reported: 2012-10-17 10:35 EDT by Karen Noel
Modified: 2016-03-09 15:27 EST
Fixed In Version: ovirt-3-6-0-2
Doc Type: Enhancement
Doc Text:
QEMU capabilities for auto-convergence and Xor Based Zero Run Length Encoding (XBZRLE) compression, used separately or together, can reduce virtual machine downtime and improve convergence during migration. They are configured hierarchically at three levels: global (engine-config), cluster, and virtual machine.
Clone Of: 863264
Last Closed: 2016-03-09 15:27:34 EST
Type: Bug
oVirt Team: Virt
sherold: Triaged+
ylavi: testing_beta_priority+

Attachments: None

Description Karen Noel 2012-10-17 10:35:32 EDT
RHEV-M should take advantage of this feature through libvirt.

+++ This bug was initially created as a clone of Bug #863264 +++

This new feature will help live migration converge when the guest is dirtying pages too fast for the network throughput.

The performance team is using a script like this:

# cat migrated_cgroup.sh
# Usage: migrated_cgroup.sh <guest-name>
#
# Run for 120 seconds with each vCPU capped at 10% of one CPU
# (quota of 10000 us against the default 100000 us period).
echo 10000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 10000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"
sleep 120
# Run for another 120 seconds with each vCPU capped at 1% of one CPU.
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"
sleep 120
# Raise the period tenfold and keep the 1000 us quota,
# which caps each vCPU at 0.1% of one CPU.
echo 1000000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_period_us"
echo 1000000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_period_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"

This logic should be refined and put into libvirt so it's automatic for the customer.
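
As a rough illustration of what "refined" could mean, the throttling steps in the script can be folded into one helper that covers every vCPU directory instead of hard-coding vcpu0 and vcpu1. This is only a sketch, not what libvirt implements: the function name throttle_guest and the CGROUP_ROOT variable are hypothetical, and the cgroup layout is assumed to match the paths used in the script.

```shell
#!/bin/sh
# Hypothetical helper, not part of libvirt or RHEV-M.
# CGROUP_ROOT is an assumed, overridable base path taken from the script above.
CGROUP_ROOT=${CGROUP_ROOT:-/cgroup/cpu/libvirt/qemu}

# throttle_guest GUEST QUOTA_US [PERIOD_US]
# Caps every vCPU of GUEST at QUOTA_US microseconds of CPU time per period.
# If PERIOD_US is given, the period is updated as well.
throttle_guest() {
    guest=$1
    quota_us=$2
    period_us=${3:-}
    for vcpu_dir in "$CGROUP_ROOT/$guest"/vcpu*/; do
        # Skip the unexpanded pattern when the guest has no vcpu directories.
        [ -d "$vcpu_dir" ] || continue
        # Set the period before the quota, in the same order as the original script.
        if [ -n "$period_us" ]; then
            echo "$period_us" > "${vcpu_dir}cpu.cfs_period_us"
        fi
        echo "$quota_us" > "${vcpu_dir}cpu.cfs_quota_us"
    done
}

# Schedule equivalent to the script above (shown as comments, not executed):
#   throttle_guest myguest 10000;        sleep 120   # 10% of one CPU per vCPU
#   throttle_guest myguest 1000;         sleep 120   # 1%
#   throttle_guest myguest 1000 1000000              # 0.1%
```

Neither this sketch nor the original script restores the limits once migration finishes; writing -1 to cpu.cfs_quota_us removes the cap.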

--- Additional comment from eblake@redhat.com on 2012-10-04 19:28:03 EDT ---

Additionally, libvirt should be using the qemu 1.2 feature of XBZRLE migration, if it can determine that both sides of the migration support it (which also means that XBZRLE needs to be backported into RHEL qemu).
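
For illustration, later libvirt releases expose both behaviours as virsh migrate flags: --auto-converge, and --compressed, which with QEMU is expected to default to XBZRLE. The sketch below only prints such a command line rather than running it. The function name migrate_cmd, the guest name, and the destination URI are placeholders, and whether the flags are accepted depends on the libvirt and QEMU versions on both hosts.

```shell
#!/bin/sh
# Hypothetical wrapper: prints, but does not execute, a virsh command that
# requests a live migration with auto-convergence and page compression.
# migrate_cmd GUEST DEST_URI
migrate_cmd() {
    guest=$1
    dest_uri=$2
    printf 'virsh migrate --live --auto-converge --compressed %s %s\n' \
        "$guest" "$dest_uri"
}

# Placeholder guest name and destination URI.
migrate_cmd myguest qemu+ssh://dest.example.com/system
```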

--- Additional comment from jdenemar@redhat.com on 2012-10-08 08:15:58 EDT ---

XBZRLE support is requested by bug 842857
Comment 4 Itamar Heim 2013-06-19 10:14:20 EDT
*** Bug 863264 has been marked as a duplicate of this bug. ***
Comment 6 Doron Fediuck 2014-05-16 00:38:13 EDT
*** Bug 1098291 has been marked as a duplicate of this bug. ***
Comment 8 Eldad Marciano 2014-05-20 05:51:45 EDT
+1

I reproduced the bug too, even when pages are dirtied only every 30 seconds, which is not very fast, and even when the pages are dynamically freed.
Comment 12 Israel Pinto 2015-10-07 11:06:54 EDT
Verified with versions:
RHEVM: 3.6 build: rhevm-3.6.0-0.18.master.el6.noarch
VDSM: vdsm-4.17.8-1.el7ev
libvirt: libvirt-1.2.17-5.el7

1. Performance test passed; see comment 11.
2. Functionality test checked with the latest build (rhevm-3.6.0-0.18).
Comment 15 errata-xmlrpc 2016-03-09 15:27:34 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html
