Bug 1560132 - [RFE] Add finer grained monitoring thresholds for memory consumption on Hypervisors to RHV
Summary: [RFE] Add finer grained monitoring thresholds for memory consumption on Hypervisors to RHV
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.3.0
Target Release: 4.3.0
Assignee: Ravi Nori
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On: 1629592
Blocks:
 
Reported: 2018-03-24 00:39 UTC by Ameya Charekar
Modified: 2019-05-08 12:37 UTC
CC List: 9 users

Fixed In Version: ovirt-engine-4.3.0_alpha
Doc Type: Enhancement
Doc Text:
In the Administration Portal, it is possible to set a threshold for cluster-level memory monitoring as a percentage or an absolute value, for example, 95% or 2048 MB. When usage exceeds 95% or free memory falls below 2048 MB, a "high memory usage" or "low memory available" event is logged. This reduces log clutter for clusters with large (e.g., 1.5 TB) amounts of memory.
Clone Of:
Environment:
Last Closed: 2019-05-08 12:37:22 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3433391 0 None None None 2018-05-06 08:20:35 UTC
Red Hat Product Errata RHEA-2019:1085 0 None None None 2019-05-08 12:37:42 UTC
oVirt gerrit 93345 0 master MERGED engine : Add finer grained monitoring thresholds for memory consumption on Hypervisors 2020-06-08 01:52:11 UTC
oVirt gerrit 93444 0 master MERGED webadmin: Add finer grained monitoring thresholds for memory consumption on Hypervisors 2020-06-08 01:52:11 UTC
oVirt gerrit 93486 0 master MERGED Added LogMaxMemoryUsedThresholdType enum and LogMaxMemoryUsedThreshold to Cluster. 2020-06-08 01:52:10 UTC
oVirt gerrit 93487 0 master MERGED restapi: Add finer grained monitoring thresholds for memory consumption on Hypervisors 2020-06-08 01:52:10 UTC
oVirt gerrit 94004 0 master MERGED restapi: Update to model 4.3.16 2020-06-08 01:52:09 UTC
oVirt gerrit 94281 0 master MERGED engine : Set default value for memory consumption threshold 2020-06-08 01:52:09 UTC
oVirt gerrit 94332 0 master MERGED engine : Change the type of logMaxMemoryUsedThreshold to Integer 2020-06-08 01:52:09 UTC
oVirt gerrit 94337 0 master MERGED engine : Remove min annotation on logMaxMemoryUsedThreshold 2020-06-08 01:52:09 UTC

Description Ameya Charekar 2018-03-24 00:39:43 UTC
Description of problem:

RHV Manager spams the Events tab with the warning "Used memory of host hostname [98%] exceeded defined threshold [95%]" for hosts that have a huge amount of memory.

For example, on a host with 1.5 TB of memory, 98% usage still leaves approximately 30 GB available, so there is no need to warn about this host. However, there is no mechanism to exclude a particular host or cluster so that it does not spam the events.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. In an RHV environment that has hosts with terabytes of memory in one cluster and hosts with gigabytes of memory in another cluster, check the Events tab.

Actual results:
Hosts with terabytes of memory flood the Events tab with warnings.

Expected results:
It should be possible to set different memory thresholds for different clusters/hypervisors.

Additional info:
Currently, adjusting the memory threshold percentage via LogMaxPhysicalMemoryUsedThresholdInPercentage applies to every host.

Comment 2 Martin Perina 2018-03-26 13:06:15 UTC
The threshold can be changed using:

 engine-config -s LogMaxPhysicalMemoryUsedThresholdInPercentage=<VALUE>

where <VALUE> is the percentage above which the message is logged. By default it is set to 95%.
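
For example, a typical workflow is to read the current value with -g, set a new one with -s, and restart the engine so the change takes effect (standard engine-config usage; the value 98 below is just an illustration):

 # Show the current global threshold (default: 95)
 engine-config -g LogMaxPhysicalMemoryUsedThresholdInPercentage

 # Raise the threshold to 98% for all hosts
 engine-config -s LogMaxPhysicalMemoryUsedThresholdInPercentage=98

 # Most engine-config changes require an engine restart to take effect
 systemctl restart ovirt-engine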

Comment 3 Yaniv Kaul 2018-03-26 19:14:16 UTC
(In reply to Martin Perina from comment #2)
> The threshold can be changed using:
> 
>  engine-config -s LogMaxPhysicalMemoryUsedThresholdInPercentage=<VALUE>
> 
> where <VALUE> is the percentage above which the message is logged. By default
> it is set to 95%.

The request is:
1. Allow the threshold to be set in absolute numbers as well
2. Allow it to be set per cluster

I think the 1st request is far more important than the 2nd one.

Comment 7 Pavol Brilla 2018-09-20 18:24:15 UTC
Ravi,

shouldn't those new LogMaxMemoryUsedThresholdType and LogMaxMemoryUsedThreshold options be visible in engine-config?


# engine-config -a | grep MemoryUsed
LogMaxPhysicalMemoryUsedThresholdInPercentage: 95 version: general
[root@10-37-140-204 ~]# yum list ovirt-engine
Last metadata expiration check: 0:13:06 ago on Thu 20 Sep 2018 08:05:54 PM CEST.
Installed Packages
ovirt-engine.noarch                                4.3.0-0.0.master.20180919131647.git219e0985a3b.fc28                                 @ovirt-master-snapshot

Comment 8 Martin Perina 2018-09-20 19:19:48 UTC
(In reply to Pavol Brilla from comment #7)
> Ravi,
> 
> shouldn't those new LogMaxMemoryUsedThresholdType and
> LogMaxMemoryUsedThreshold options be visible in engine-config?
> 
> 
> # engine-config -a | grep MemoryUsed
> LogMaxPhysicalMemoryUsedThresholdInPercentage: 95 version: general
> [root@10-37-140-204 ~]# yum list ovirt-engine
> Last metadata expiration check: 0:13:06 ago on Thu 20 Sep 2018 08:05:54 PM CEST.
> Installed Packages
> ovirt-engine.noarch  4.3.0-0.0.master.20180919131647.git219e0985a3b.fc28  @ovirt-master-snapshot

No, engine-config still contains the global default in percentage, with 95 as its value. But users can override that value per cluster when creating a new cluster or updating an existing one.
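
For reference, the gerrit patches above also expose the new fields through the REST API, so the per-cluster override should be scriptable as well. Below is a minimal curl sketch, assuming the XML element names follow the LogMaxMemoryUsedThreshold/LogMaxMemoryUsedThresholdType model attributes and that absolute_value_in_mb is one of the enum values (names inferred from the patch titles, not verified against the final model; the engine URL and cluster ID are placeholders):

 # Hypothetical sketch: set an absolute threshold of 2048 MB on one cluster.
 # Verify the element names against your engine's API model before use.
 curl -k -u admin@internal:password -X PUT \
      -H 'Content-Type: application/xml' \
      -d '<cluster>
            <log_max_memory_used_threshold>2048</log_max_memory_used_threshold>
            <log_max_memory_used_threshold_type>absolute_value_in_mb</log_max_memory_used_threshold_type>
          </cluster>' \
      'https://engine.example.com/ovirt-engine/api/clusters/123'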

Comment 9 Pavol Brilla 2018-09-25 12:56:07 UTC
UI elements found; I will test further with a host to check that the triggers fire correctly.

Comment 11 Petr Matyáš 2019-01-17 16:38:20 UTC
Verified on ovirt-engine-4.3.0-0.8.rc2.el7.noarch

Comment 13 errata-xmlrpc 2019-05-08 12:37:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085

