1139217 – [RFE] [scale] improve resource usage during sampling

Summary: [RFE] [scale] improve resource usage during sampling

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	vdsm
Classification:	oVirt
Component:	RFEs
Sub Component:
Version:	---
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	ovirt-3.6.0-rc
Target Release:	4.17.0
Assignee:	Francesco Romani
QA Contact:	Eldad Marciano
Docs Contact:
URL:
Whiteboard:
Depends On:	1181653
Blocks:

Reported:	2014-09-08 12:20 UTC by Francesco Romani
Modified:	2019-04-28 13:23 UTC
CC List:	11 users
Fixed In Version:	ovirt-3.6.0-alpha1.2
Clone Of:
Environment:
Last Closed:	2016-03-11 07:18:11 UTC
oVirt Team:	Virt
Embargoed:
Dependent Products:
Flags:	rule-engine: ovirt-3.6.0+ ylavi: planning_ack+ rule-engine: devel_ack+ rule-engine: testing_ack+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
oVirt gerrit	36722	0	master	MERGED	virt: stats: periodic sampling using bulk stats	Never

Description Francesco Romani 2014-09-08 12:20:33 UTC

Description of problem:
During normal usage, VDSM monitors the VMs to gather statistics and report them to Engine. To do so, it must use as few host resources as possible,
in order to leave them for the VMs.

This is a generic tracker bug for improvements in this area.

Comment 1 Francesco Romani 2015-01-09 13:37:41 UTC

After long discussion, many failed attempts, and a lot of tinkering, patches have been posted.

Comment 2 Francesco Romani 2015-01-13 14:55:52 UTC

The new libvirt bulk stats API is an improvement, even more so in the long term.
But the biggest source of load is the disk usage threshold check.

This alone drives up the frequency of polling to very high rates.

Once we get events that notify us when a disk usage threshold is exceeded, we can reduce the polling frequency to sane values, which greatly reduces the load on the system and improves resource usage.
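
For illustration only (not taken from the merged patches): a minimal sketch of what bulk sampling can look like with the libvirt Python bindings. It assumes `conn` is a `libvirt.virConnect`; the helper names `sample_bulk` and `main`, the `qemu:///system` URI, and the choice to key results by VM UUID are hypothetical. One `getAllDomainStats` call replaces several per-VM calls.

```python
def sample_bulk(conn, stats_mask=0):
    """Collect stats for every domain with a single libvirt call.

    `conn` is assumed to be a libvirt.virConnect, or any object whose
    getAllDomainStats() returns a list of (domain, stats_dict) pairs.
    """
    # A stats mask of 0 asks libvirt for every stats group it supports;
    # the second argument is the flags value (0 = no filtering).
    records = conn.getAllDomainStats(stats_mask, 0)
    # Key each stats dictionary by the UUID of the VM it belongs to.
    return {dom.UUIDString(): stats for dom, stats in records}


def main(uri="qemu:///system"):
    # Hypothetical usage; needs libvirt-python and a running libvirtd.
    import libvirt

    conn = libvirt.openReadOnly(uri)
    try:
        for vm_id, stats in sample_bulk(conn).items():
            # "cpu.time" is reported by the CPU stats group when available.
            print(vm_id, stats.get("cpu.time"))
    finally:
        conn.close()
```

Calling `main()` on a host with libvirt would print one line per VM. The disk usage threshold check discussed above is not covered by this sketch.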

Comment 3 Michal Skrivanek 2015-03-26 10:16:29 UTC

Changing to RFE; the improvement is very significant with a high number of VMs per host.
The estimated improvement is on the order of 2-4 times less CPU usage.

Comment 4 Francesco Romani 2015-05-19 06:38:12 UTC

All VDSM patches are merged for 4.17.0 (oVirt 3.6.0).

MOM needs to be updated (work in progress on that front)

Moving to MODIFIED

Comment 5 Red Hat Bugzilla Rules Engine 2015-10-18 08:34:56 UTC

Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.
