Bug 1935219 - [CNV-2.5] Set memory and CPU request on hco-operator and hco-webhook deployments
Summary: [CNV-2.5] Set memory and CPU request on hco-operator and hco-webhook deployments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Installation
Version: 2.5.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Simone Tiraboschi
QA Contact: Debarati Basu-Nag
URL:
Whiteboard:
Depends On: 1931519
Blocks: 1935217 1935218
 
Reported: 2021-03-04 14:45 UTC by sgott
Modified: 2021-11-02 15:57 UTC
CC List: 6 users

Fixed In Version: hco-bundle-registry-container-v4.9.0-32
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1931519
Environment:
Last Closed: 2021-11-02 15:57:28 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github kubevirt hyperconverged-cluster-operator pull 1335 0 None closed Reduce the memory footprint using cache selectors 2021-06-16 07:24:08 UTC
Github kubevirt hyperconverged-cluster-operator pull 1405 0 None closed Tune resources requests 2021-07-06 15:41:18 UTC
Red Hat Product Errata RHSA-2021:4104 0 None None None 2021-11-02 15:57:41 UTC

Description sgott 2021-03-04 14:45:53 UTC
+++ This bug was initially created as a clone of Bug #1931519 +++

This is a clone to track items specifically related to the Installation component

------------

Description of problem:

Most of the deployments and daemonsets stored in the openshift-cnv namespace don't specify resource requests in their manifests. Only daemonset/kube-cni-linux-bridge-plugin and deployment/kubemacpool-mac-controller-manager have them defined, as follows:


Kind       | Name                               | CPU Req/Limits | Mem Req/Limits
---------- | ---------------------------------- | -------------- | ---------------
daemonset  | kube-cni-linux-bridge-plugin       | 60m/0m         | 30Mi/0Mi
deployment | kubemacpool-mac-controller-manager | 100m/300m      | 300Mi/600Mi
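
For reference, a resources stanza in a container spec looks like the following. This is an illustrative sketch only, reproducing the kubemacpool-mac-controller-manager values from the table above rather than quoting the actual manifest:

=================
# Sketch of a resources stanza matching the table row above (not the real manifest)
resources:
  requests:
    cpu: 100m
    memory: 300Mi
  limits:
    cpu: 300m
    memory: 600Mi
=================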


The following manifests don't define any resource requirements:

Kind       | Name
---------- | ---- 
daemonset  | bridge-marker
daemonset  | nmstate-handler
daemonset  | ovs-cni-amd64
daemonset  | kubevirt-node-labeller
deployment | cdi-uploadproxy
deployment | cdi-apiserver
deployment | nmstate-webhook
deployment | hostpath-provisioner-operator
deployment | virt-api
deployment | virt-controller
deployment | virt-handler
deployment | virt-operator
deployment | virt-template-validator
deployment | vm-import-controller
deployment | vm-import-operator
deployment | cdi-deployment
deployment | cluster-network-addons-operator
deployment | cdi-operator
deployment | kubevirt-ssp-operator
deployment | hco-operator


Version-Release number of selected component (if applicable):
CNV 2.5.3 and onward.

How reproducible:



Steps to Reproduce:
1. Create the CNV namespace
2. Create the CNV Operator Group
3. Create the HCO subscription and deploy from the stable channel (see the example manifests sketched below)
4. Wait for the HCO operator deployment to complete
5. Check for resource requests in the deployed manifests.
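
For steps 1-3, manifests along the following lines can be used. This is a sketch only: the object names, catalog source, and channel shown here are assumptions and may differ per environment and version:

=================
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-cnv
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kubevirt-hyperconverged-group   # assumed name
  namespace: openshift-cnv
spec:
  targetNamespaces:
    - openshift-cnv
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hco-operatorhub   # assumed name
  namespace: openshift-cnv
spec:
  source: redhat-operators            # assumed catalog source
  sourceNamespace: openshift-marketplace
  name: kubevirt-hyperconverged
  channel: stable
=================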

Actual results:
Only 2 of the deployed manifests define resource requests, and only 1 defines resource limits (see the list above).

Expected results:
All deployed manifests define their resource requirements.

Additional info:
N/A

Comment 1 Simone Tiraboschi 2021-03-10 14:47:10 UTC
This is a subset of a larger effort (https://bugzilla.redhat.com/1931519); in this specific bug we are focusing only on setting memory and CPU limits on the hco-operator and hco-webhook deployments.

Comment 2 Nico Schieder 2021-03-22 10:42:48 UTC
While working on this, we found that HCO is watching ConfigMaps (and Services) across the whole cluster, leading to unpredictable memory consumption that depends on the size of the cluster.
To rectify this we are looking into filtering our caches for those objects.

We will update this issue as soon as we have agreed on how to tackle it.

Comment 3 Simone Tiraboschi 2021-04-19 08:29:43 UTC
We are now waiting this change on controller-runtime:
https://github.com/kubernetes-sigs/controller-runtime/pull/1435

to have a predictable memory consumption.
Only at that time we will be able to really implement a memory limit.
This is probably not going to happen in 4.8 timeframe.

Comment 4 Simone Tiraboschi 2021-04-30 12:58:48 UTC
https://github.com/kubernetes-sigs/controller-runtime/pull/1435 got merged; we can start consuming it as soon as a new release of controller-runtime is available.

Comment 5 Simone Tiraboschi 2021-06-16 08:52:42 UTC
According to the https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#resources-and-limits guidelines, which state:
"
Therefore, cluster components SHOULD NOT be configured with resource limits.
However, cluster components MUST declare resource requests for both CPU and memory.
"

we are going to set resource requests for both CPU and memory but not resource limits.
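
In practice this means each container spec declares a requests stanza and omits limits entirely. A minimal sketch (the container name here is hypothetical; the values preview what was later verified in comment 6):

=================
      containers:
        - name: hyperconverged-cluster-operator   # hypothetical container name
          resources:
            requests:
              cpu: 10m
              memory: 96Mi
            # no "limits" stanza, per the convention quoted above
=================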

Comment 6 Debarati Basu-Nag 2021-08-09 15:10:16 UTC
Validated against a 4.9 cluster:

For hco-operator:
=================
        resources:
          requests:
            cpu: 10m
            memory: 96Mi
=================

For hco-webhook:
=================
        resources:
          requests:
            cpu: 5m
            memory: 48Mi
=================

Based on the above results, marking this ticket as verified.

Comment 9 errata-xmlrpc 2021-11-02 15:57:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104

