1402859 – [downstream clone - 3.6.10] [RFE] Optimize performance of host monitoring

Summary: [downstream clone - 3.6.10] [RFE] Optimize performance of host monitoring

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	ovirt-engine
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	ovirt-3.6.10
Target Release:	---
Assignee:	Ravi Nori
QA Contact:	Eldad Marciano
Docs Contact:
URL:
Whiteboard:
Depends On:	1388536
Blocks:

Reported:	2016-12-08 13:31 UTC by rhev-integ
Modified:	2017-01-17 18:06 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Enhancement
Doc Text:
Clone Of:	1388536
Environment:
Last Closed:	2017-01-17 18:06:07 UTC
oVirt Team:	Infra
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2017:0108	normal	SHIPPED_LIVE	Red Hat Enterprise Virtualization Manager 3.6.10	2017-01-17 22:48:34 UTC
oVirt gerrit	65924	master	MERGED	engine: Optimize performance of host monitoring	2016-12-08 13:32:00 UTC
oVirt gerrit	67999	ovirt-engine-4.0	POST	engine: Optimize performance of host monitoring	2016-12-08 13:32:00 UTC
oVirt gerrit	68018	None	None	None	2016-12-09 09:31:57 UTC

Description rhev-integ 2016-12-08 13:31:48 UTC

+++ This bug is an upstream to downstream clone. The original bug is: +++
+++   bug 1388536 +++
======================================================================

Description of problem:

Host monitoring currently updates database data inefficiently, and we have seen it cause heavy database load on larger setups. We need to clean up this code and optimize the database updates, similar to what has already been done for VM monitoring.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

(Originally by Martin Perina)

Comment 1 Lukas Svaty 2016-12-08 13:55:51 UTC

Adding CodeChange. For testing, a basic sanity check of DB performance with host and VM statuses is sufficient. With this change, writes to vds_dynamic went from one every few seconds to 1-2 per minute.

Comment 2 Lukas Svaty 2016-12-08 13:57:12 UTC

Changed the QA contact by accident, reverting :)

Comment 3 Yaniv Kaul 2016-12-08 14:21:29 UTC

The title says 4.0.z, but the target milestone is 3.6.10. Is it going to be in 3.6.10 somehow? I'd be happy to see it, although I would not wait for it.

Comment 4 Martin Perina 2016-12-08 15:04:54 UTC

Yes, this is the 4.0.7 clone of upstream BZ1388536. Once I receive all acks, I'm going to clone it for 3.6.10.

Comment 6 Eldad Marciano 2016-12-19 08:50:03 UTC

(In reply to rhev-integ from comment #0)

Ravi, can you specify which asset profile you used? How many hosts, VMs and SDs?

Comment 7 Ravi Nori 2016-12-19 16:58:29 UTC

I tested with 2 hosts, 4 VMs and 1 NFS storage domain on my dev env.

Comment 8 Eldad Marciano 2016-12-20 13:29:47 UTC

(In reply to Ravi Nori from comment #7)
> I tested with 2 hosts, 4 VMs and 1 NFS storage domain on my dev env.

How did you measure the DB load? For how long did you monitor it?
Can you also specify which table grows, or in which table you saw the load?

Comment 9 Ravi Nori 2016-12-20 13:49:19 UTC

I set a breakpoint in VdsDynamicDaoImpl.updateIfNeeded and saw that the vds_dynamic table was updated less frequently than before this patch. Before this patch the table was updated every 3 seconds; with this patch it is updated only when there is a real change in the VdsDynamic information.
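
For readers unfamiliar with the pattern, the behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual patch: HostDynamicData, HostDynamicDao and their fields are hypothetical stand-ins rather than the real VdsDynamic or VdsDynamicDaoImpl classes, and which fields count as a "real change" is an assumption. The idea is that each monitoring cycle hands the current data to the DAO, which writes only if it differs from what was written last.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class UpdateIfNeededSketch {

    /** Hypothetical stand-in for dynamic host data; not the real VdsDynamic entity. */
    static final class HostDynamicData {
        final String hostId;
        final String status;
        final int pendingVcpus;

        HostDynamicData(String hostId, String status, int pendingVcpus) {
            this.hostId = hostId;
            this.status = status;
            this.pendingVcpus = pendingVcpus;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof HostDynamicData)) {
                return false;
            }
            HostDynamicData other = (HostDynamicData) o;
            return pendingVcpus == other.pendingVcpus
                    && hostId.equals(other.hostId)
                    && status.equals(other.status);
        }

        @Override
        public int hashCode() {
            return Objects.hash(hostId, status, pendingVcpus);
        }
    }

    /** Hypothetical DAO that skips the write when nothing has changed. */
    static final class HostDynamicDao {
        private final Map<String, HostDynamicData> lastWritten = new HashMap<>();
        private int writeCount = 0;

        /** Returns true if a write was performed, false if it was skipped. */
        boolean updateIfNeeded(HostDynamicData current) {
            HostDynamicData previous = lastWritten.get(current.hostId);
            if (current.equals(previous)) {
                return false; // identical to the last written state: no DB round trip
            }
            lastWritten.put(current.hostId, current);
            writeCount++; // a real DAO would issue the UPDATE statement here
            return true;
        }

        int getWriteCount() {
            return writeCount;
        }
    }

    public static void main(String[] args) {
        HostDynamicDao dao = new HostDynamicDao();
        // Five simulated monitoring cycles; the data changes only in the fourth.
        String[] statuses = {"Up", "Up", "Up", "Maintenance", "Maintenance"};
        for (String status : statuses) {
            dao.updateIfNeeded(new HostDynamicData("host-1", status, 0));
        }
        System.out.println("writes: " + dao.getWriteCount()); // prints "writes: 2"
    }
}
```

In this sketch five monitoring cycles result in two writes (the initial state and the status change) instead of five.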

Comment 10 Eldad Marciano 2016-12-20 14:02:15 UTC

(In reply to Ravi Nori from comment #9)
> I set a breakpoint in VdsDynamicDaoImpl.updateIfNeeded and saw that the
> vds_dynamic table was updated less frequently than before this patch.
> Before this patch the table was updated every 3 seconds; with this patch it
> is updated only when there is a real change in the VdsDynamic information.

Will a CPU utilization change trigger the update?

Or is it more about status changes?

Comment 11 Yaniv Kaul 2016-12-20 14:09:39 UTC

(In reply to Eldad Marciano from comment #10)
> (In reply to Ravi Nori from comment #9)
> > I set a breakpoint in VdsDynamicDaoImpl.updateIfNeeded and saw that the
> > vds_dynamic table was updated less frequently than before this patch.
> > Before this patch the table was updated every 3 seconds; with this patch it
> > is updated only when there is a real change in the VdsDynamic information.
> 
> Will a CPU utilization change trigger the update?
> 
> Or is it more about status changes?

Eldad - this is CodeChange - I don't see a reason for QE to test this.

Comment 12 Eldad Marciano 2016-12-20 14:14:24 UTC

(In reply to Yaniv Kaul from comment #11)
> (In reply to Eldad Marciano from comment #10)
> > (In reply to Ravi Nori from comment #9)
> > > I set a breakpoint in VdsDynamicDaoImpl.updateIfNeeded and saw that the
> > > vds_dynamic table was updated less frequently than before this patch.
> > > Before this patch the table was updated every 3 seconds; with this patch it
> > > is updated only when there is a real change in the VdsDynamic information.
> > 
> > Will a CPU utilization change trigger the update?
> > 
> > Or is it more about status changes?
> 
> Eldad - this is CodeChange - I don't see a reason for QE to test this.

OK, so is it OK to close it or move it to VERIFIED?

Comment 13 Yaniv Kaul 2016-12-20 14:17:34 UTC

Code Change, moving to VERIFIED.

Comment 15 errata-xmlrpc 2017-01-17 18:06:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0108.html
