+++ This bug was initially created as a clone of Bug #1931519 +++

This is a clone to track items specifically related to the install component.

------------

Description of problem:

Most of the deployments and daemonsets in the openshift-cnv namespace don't specify resource requests in their manifests. Only daemonset/kube-cni-linux-bridge-plugin and deployment/kubemacpool-mac-controller-manager have them defined, as follows:

Kind       | Name                               | CPU Req/Limits | Mem Req/Limits
---------- | ---------------------------------- | -------------- | ---------------
daemonset  | kube-cni-linux-bridge-plugin       | 60m/0m         | 30Mi/0Mi
deployment | kubemacpool-mac-controller-manager | 100m/300m      | 300Mi/600Mi

The following manifests don't define resource requirements:

Kind       | Name
---------- | ----
daemonset  | bridge-marker
daemonset  | nmstate-handler
daemonset  | ovs-cni-amd64
daemonset  | bridge-marker
daemonset  | nmstate-handler
daemonset  | ovs-cni-amd64
daemonset  | kubevirt-node-labeller
daemonset  | ovs-cni-amd64
daemonset  | nmstate-handler
deployment | cdi-uploadproxy
deployment | cdi-apiserver
deployment | nmstate-webhook
deployment | hostpath-provisioner-operator
deployment | virt-api
deployment | virt-controller
deployment | virt-handler
deployment | virt-operator
deployment | virt-template-validator
deployment | vm-import-controller
deployment | vm-import-operator
deployment | cdi-deployment
deployment | cluster-network-addons-operator
deployment | cdi-operator
deployment | cluster-network-addons-operator
deployment | kubevirt-ssp-operato
deployment | hco-operator

Version-Release number of selected component (if applicable):
CNV 2.5.3 and onward.

How reproducible:

Steps to Reproduce:
1. Create CNV namespace
2. Create CNV Operator Group
3. Create HCO subscription and deploy stable
4. Wait for deployment of HCO operator to complete
5. Check for resource requests in deployed manifests.

Actual results:
Only 2 deployed manifests define their resource requirements, and only 1 defines resource limits (see list above).

Expected results:
All deployed manifests define the resource requirements.

Additional info:
N/A
This is a subset of a larger effort ( https://bugzilla.redhat.com/1931519 ); in this specific bug we are focusing only on setting memory and CPU limits on the hco-operator and hco-webhook deployments.
While working on this, we found out that HCO is watching ConfigMaps (and Services) across the whole cluster, so its memory consumption is unpredictable and grows with the size of the cluster. To rectify this, we are looking into filtering our caches for those objects. We will update this issue as soon as we have agreed on how to tackle it.
We are now waiting for this change in controller-runtime: https://github.com/kubernetes-sigs/controller-runtime/pull/1435 to get predictable memory consumption. Only then will we be able to actually implement a memory limit. This is probably not going to happen in the 4.8 timeframe.
https://github.com/kubernetes-sigs/controller-runtime/pull/1435 has been merged; we can start consuming it as soon as we get a new release of controller-runtime.
The guidelines at https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#resources-and-limits state: "Therefore, cluster components SHOULD NOT be configured with resource limits. However, cluster components MUST declare resource requests for both CPU and memory." Accordingly, we are going to set resource requests for both CPU and memory, but no resource limits.
Validated against a 4.9 cluster:

For hco-operator:
=================
resources:
  requests:
    cpu: 10m
    memory: 96Mi
=================

For hco-webhook:
=================
resources:
  requests:
    cpu: 5m
    memory: 48Mi
=================

Based on the above results, marking this ticket as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:4104