867453 – [RFE] Help live migration converge when guest dirties pages too fast

Summary: [RFE] Help live migration converge when guest dirties pages too fast

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	RFEs
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	ovirt-3.6.0-rc
Target Release:	3.6.0
Assignee:	Martin Betak
QA Contact:	Israel Pinto
Docs Contact:
URL:	http://www.ovirt.org/Features/XBZRLE_...
Whiteboard:
Duplicates (2):	863264 1098291
Depends On:	863264 990319 1151001
Blocks:	1255223

Reported:	2012-10-17 14:35 UTC by Karen Noel
Modified:	2016-03-09 20:27 UTC
CC List:	26 users
Fixed In Version:	ovirt-3-6-0-2
Doc Type:	Enhancement
Doc Text:	QEMU capabilities for auto-convergence and/or Xor Binary Zero Run-Length-Encoding (XBZRLE) can be used to reduce virtual machine downtime and improve convergence during migration. This is supported by hierarchical configuration in 3 levels: global (engine-config), cluster, and virtual machine.
Clone Of:	863264
Environment:
Last Closed:	2016-03-09 20:27:34 UTC
oVirt Team:	Virt
Target Upstream Version:
Embargoed:
Flags:	sherold: Triaged+ ylavi: testing_beta_priority+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2016:0376	0	normal	SHIPPED_LIVE	Red Hat Enterprise Virtualization Manager 3.6.0	2016-03-10 01:20:52 UTC

Description Karen Noel 2012-10-17 14:35:32 UTC

RHEV-M should take advantage of this through libvirt

+++ This bug was initially created as a clone of Bug #863264 +++

This new feature will help live migration converge when the guest is dirtying pages too fast for the network throughput.

The performance team is using a script like this:

# cat migrated_cgroup.sh
# Usage: migrated_cgroup.sh <guest name>   (the guest name is passed as $1)
#
# Try for 120 seconds with 10% of one CPU
echo 10000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 10000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"
sleep 120
# Try for another 120 seconds with 1% of one CPU
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"
sleep 120
# Increase the period, then set the quota again to get down to 0.1% of one CPU
echo 1000000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_period_us"
echo 1000000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_period_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu0/cpu.cfs_quota_us"
echo 1000 > "/cgroup/cpu/libvirt/qemu/$1/vcpu1/cpu.cfs_quota_us"

This logic should be refined and put into libvirt so it's automatic for the customer.
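
As a rough illustration of how the script above could be generalized to any number of vCPUs, here is a minimal sketch. The function names (cpu_share_percent, throttle_vcpus) and the CGROUP_ROOT override are hypothetical and exist only for this example; the default path is the one used in the script above. The percentages follow from quota / period, assuming the kernel default cfs_period_us of 100000 when the period is not changed. This is not what libvirt implements.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.
# CGROUP_ROOT defaults to the cgroup path used in the original script.
CGROUP_ROOT="${CGROUP_ROOT:-/cgroup/cpu/libvirt/qemu}"

# Print the percentage of one CPU granted by a quota/period pair
# (both values in microseconds).
cpu_share_percent() {
    local quota=$1 period=$2
    awk -v q="$quota" -v p="$period" 'BEGIN { printf "%g\n", q * 100 / p }'
}

# Apply a quota (and optionally a new period) to every vcpuN directory
# of the given guest. The period is written first, as in the original.
throttle_vcpus() {
    local guest=$1 quota=$2 period=${3:-}
    local dir
    for dir in "$CGROUP_ROOT/$guest"/vcpu*/; do
        [ -d "$dir" ] || continue   # skip if the glob matched nothing
        if [ -n "$period" ]; then
            echo "$period" > "${dir}cpu.cfs_period_us"
        fi
        echo "$quota" > "${dir}cpu.cfs_quota_us"
    done
}
```

With these helpers, the three steps of the script become `throttle_vcpus "$1" 10000; sleep 120; throttle_vcpus "$1" 1000; sleep 120; throttle_vcpus "$1" 1000 1000000`, which corresponds to 10%, 1%, and 0.1% of one CPU per vCPU.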

--- Additional comment from eblake on 2012-10-04 19:28:03 EDT ---

Additionally, libvirt should be using the qemu 1.2 feature of XBZRLE migration, if it can determine that both sides of the migration support it (which also means that XBZRLE needs to be backported into RHEL qemu).
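
For readers who want to try these options by hand, the sketch below builds (but does not run) a virsh command line with the --auto-converge and --compressed options of virsh migrate. The helper name build_migrate_cmd, the guest name, and the destination URI are hypothetical. It assumes a libvirt new enough to offer both options, and that --compressed selects XBZRLE, which I believe is the default compression method but which should be checked against the installed version. It does not perform the capability check on both sides that the comment above asks libvirt to do.

```shell
#!/bin/bash
# Hypothetical helper: print a virsh live-migration command line.
#   $1 = guest name, $2 = destination URI (illustrative placeholders)
#   $3 = "yes" to request auto-converge (default: yes)
#   $4 = "yes" to request compression  (default: yes)
build_migrate_cmd() {
    local guest=$1 dest=$2 auto_converge=${3:-yes} compressed=${4:-yes}
    local cmd="virsh migrate --live"
    [ "$auto_converge" = "yes" ] && cmd="$cmd --auto-converge"
    [ "$compressed" = "yes" ] && cmd="$cmd --compressed"
    echo "$cmd $guest $dest"
}
```

For example, `build_migrate_cmd vm1 qemu+ssh://dst/system` prints the full command so it can be reviewed before it is run.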

--- Additional comment from jdenemar on 2012-10-08 08:15:58 EDT ---

XBZRLE support is requested by bug 842857

Comment 4 Itamar Heim 2013-06-19 14:14:20 UTC

*** Bug 863264 has been marked as a duplicate of this bug. ***

Comment 6 Doron Fediuck 2014-05-16 04:38:13 UTC

*** Bug 1098291 has been marked as a duplicate of this bug. ***

Comment 8 Eldad Marciano 2014-05-20 09:51:45 UTC

+1

I reproduced the bug too, even when pages are dirtied only once every 30 seconds, which is not very fast, and even when the pages are freed dynamically.

Comment 12 Israel Pinto 2015-10-07 15:06:54 UTC

Verified with versions:
RHEVM: 3.6, build rhevm-3.6.0-0.18.master.el6.noarch
VDSM: vdsm-4.17.8-1.el7ev
libvirt: libvirt-1.2.17-5.el7

1. Performance tests pass; see comment 11.
2. Functionality tests were checked with the latest build (rhevm-3.6.0-0.18).

Comment 15 errata-xmlrpc 2016-03-09 20:27:34 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

ahoness
bgraveno
ctusa
cwei
dyuan
eblake
emarcian
iheim
istein
jrfuller
knoel
lbopf
lpeer
lyarwood
mavital
mbetak
michal.skrivanek
mkalinin
mzhan
pablo.iranzo
pdwyer
perfbz
rbalakri
sherold
weizhan
zpeng