Bug 1807860

Summary: [RFE] Allow resource allocation options to be customized
Product: Red Hat Enterprise Virtualization Manager Reporter: Guilherme Santos <gdeolive>
Component: ovirt-engine-metrics Assignee: Shirly Radco <sradco>
Status: CLOSED ERRATA QA Contact: Guilherme Santos <gdeolive>
Severity: high Docs Contact:
Priority: low    
Version: unspecified CC: emarcus, mtessun, sradco
Target Milestone: ovirt-4.3.11 Keywords: FutureFeature, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-metrics-1.3.8 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-08-04 13:21:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Metrics RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1709295    

Description Guilherme Santos 2020-02-27 11:41:37 UTC
Description of problem:
When deploying metrics, many resources have their values set by default and cannot be customized. This becomes a problem when deploying on vhosts, as the default values might not reflect the host's resource allocation and settings, so the installation can fail in many ways, some of them hard to debug. Also, some default values may make the installation slower and more likely to fail.

The bad boys are:
- CPU allocation:
Both metrics VMs are set to have 4 cores by default, through the variable *cores* in */usr/share/ansible/roles/oVirt.metrics/roles/oVirt.origin-on-ovirt/templates/vars.yaml.template*
This variable by itself doesn't specify which combination of vcores, threads and vsockets will form the 4 final cores - depending on how these are pre-allocated in the (v)host, this may cause a kernel panic or other issues.
The variables *cpu_sockets* and *cpu_threads* should be set and available to be customized (or maybe could be discovered by the installation)

- Memory allocation:
Also in */usr/share/ansible/roles/oVirt.metrics/roles/oVirt.origin-on-ovirt/templates/vars.yaml.template* the memory is set to 8 GiB; however, as the guaranteed memory is not set, it takes its default value of 1 GiB, which can cause a lot of trouble and slowness.
The variable *memory_guaranteed* for both VMs should be at least 2 GiB (maybe 4 for better performance)

- Hosts allocation on vhost:
This one is the most problematic for vhosts. When a single vhost needs to run both the metrics-store-installer and master0 VMs (as happens in the middle of the deployment), the vhost can get far too slow, disrupting and eventually failing the installation, or even panicking the kernel. This happens even if the vhost has plenty of resources available. Having two vhosts (one for each VM) solved the issue.
The workaround I used to achieve this was to first put each vhost in a different cluster and set the clusters manually through the *cluster* variable in */usr/share/ansible/roles/oVirt.metrics/roles/oVirt.origin-on-ovirt/templates/vars.yaml.template*
Then, match the cluster that was set with the cluster condition on line 23 in */usr/share/ansible/roles/oVirt.metrics/roles/oVirt.origin-on-ovirt/tasks/create_openshift_bastion_vm.yml*
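
To illustrate the CPU and memory points above, here is a rough Python sketch. It is not part of the metrics role: the function names *cpu_topologies* and *memory_problems* are made up for this example. It assumes the final vCPU count is sockets x cores per socket x threads per core, and it treats the 2 GiB figure suggested above as the minimum guaranteed memory.

```python
from itertools import product


def cpu_topologies(total_vcpus):
    """List every (sockets, cores, threads) triple whose product is total_vcpus.

    Assumption: total vCPUs = sockets * cores per socket * threads per core.
    """
    divisors = [d for d in range(1, total_vcpus + 1) if total_vcpus % d == 0]
    return [
        (sockets, cores, threads)
        for sockets, cores, threads in product(divisors, repeat=3)
        if sockets * cores * threads == total_vcpus
    ]


def memory_problems(memory_gib, memory_guaranteed_gib, minimum_guaranteed_gib=2):
    """Return human-readable problems with a memory / guaranteed-memory pair.

    The 2 GiB default minimum is the value suggested in this report,
    not a limit enforced by the product.
    """
    problems = []
    if memory_guaranteed_gib < minimum_guaranteed_gib:
        problems.append(
            f"memory_guaranteed is {memory_guaranteed_gib} GiB, "
            f"below the suggested {minimum_guaranteed_gib} GiB"
        )
    if memory_guaranteed_gib > memory_gib:
        problems.append("memory_guaranteed exceeds the VM memory")
    return problems


if __name__ == "__main__":
    # Every way a 4-vCPU VM could be laid out.
    for sockets, cores, threads in cpu_topologies(4):
        print(f"cpu_sockets={sockets} cores={cores} cpu_threads={threads}")
    # Defaults described in this report: 8 GiB memory, 1 GiB guaranteed.
    print(memory_problems(8, 1))
```

A 4-vCPU VM can be laid out in six different ways under that assumption, which is why setting only *cores* leaves the topology ambiguous.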

Version-Release number of selected component (if applicable):
all

How reproducible:
always

Steps to Reproduce:
1. deploy metrics based on the tutorial
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Sandro Bonazzola 2020-03-20 13:36:08 UTC
Shirly, this is in MODIFIED and targeted to 4.4.1, but the code for this is included in the 4.3.9-1 package.
Should this bug move to 4.3.9-1 and to ON_QA status?

Comment 4 Sandro Bonazzola 2020-07-01 12:10:23 UTC
Moved to 4.3.11 only because we are removing the metrics store deployment in 4.4 with bug #1827177

Comment 5 Guilherme Santos 2020-07-23 14:38:13 UTC
Verified on:
ovirt-engine-4.3.11-0.1.el7.noarch

Deployed metrics successfully on a virtual host, fully customizing the CPU threads, cores, RAM guarantee, etc.

Comment 7 errata-xmlrpc 2020-08-04 13:21:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3247