2072989 – Fix the resources (limits and resources) for few CSI containers (rook fix already in place Bug 2062559 )


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat OpenShift Data Foundation
Classification:	Red Hat Storage
Component:	odf-managed-service
Sub Component:
Version:	4.10
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	urgent
Target Milestone:	---
Target Release:	---
Assignee:	Dhruv Bindra
QA Contact:	Itzhak
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-04-07 12:25 UTC by Neha Berry
Modified:	2023-08-09 17:00 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-03-14 15:37:54 UTC
Embargoed:


Description Neha Berry 2022-04-07 12:25:16 UTC

Description of problem:
==================================

Raising this bug as a continuation of the discussion on Bug 2062559, where it was observed while verifying that bug that the following containers are missing limits

1. No limits and requests specified for

Name:     "driver-registar",
			Resource: utils.GetResourceRequirements("driver-registrar"),

http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/rook-bug-2069795/must-gather.local.8892466999213447117/quay-io-rhceph-dev-ocs-must-gather-sha256-2657d3206a2ff2a742043b6432387fe63fba26c056bf8afd0296ffc42cf29fe1/namespaces/openshift-storage/pods/csi-rbdplugin-j8c9m/csi-rbdplugin-j8c9m.yaml

2. No limits specified for csi-cephfsplugin container in csi-cephfsplugin-provisioner

    name: csi-cephfsplugin
        resources:
          requests:
            cpu: 20m
            memory: 160Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
+++++++++++++++++++++++++++++++++++++++++++++++++++++


3. The csi-snapshotter container also has no limits or requests set. Is this expected?

oc get pod csi-cephfsplugin-provisioner-55f4775bc4-hnmg6 -o yaml

env:
    - name: ADDRESS
    name: csi-snapshotter
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /csi
      name: socket-dir
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-7m4v5
      readOnly: true
  - args:


Update from engineering:

1. No limits or requests are set for driver-registrar because of a typo at https://github.com/red-hat-storage/ocs-osd-deployer/blob/f9915c0851908048843a8a6c6775ab7996ed9396/controllers/managedocs_controller.go#L1308. The name should be driver-registrar, not driver-registar.

2. No limits are specified for the csi-cephfsplugin container in csi-cephfsplugin-provisioner because no limit is defined for it at https://github.com/red-hat-storage/ocs-osd-deployer/blob/a3d080b2bb27b2a011bb78794d575560f14630a8/utils/resources.go#L204-L210
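
To illustrate item 1, here is a minimal sketch of how a misspelled container name can leave a container without resources. It assumes that the per-container settings are matched to containers by exact name; that matching step is an assumption, not something confirmed in this bug. The types Resources and Override, the functions applyOverrides and isEmpty, and the cpu/memory values are all hypothetical placeholders, not the deployer's real code or values.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a container's
// resource requirements (not the real Kubernetes type).
type Resources struct {
	Requests map[string]string
	Limits   map[string]string
}

// Override pairs a container name with the resources intended for it.
type Override struct {
	Name     string
	Resource Resources
}

// applyOverrides returns, for each container, the resources of the override
// whose Name matches exactly. Containers with no match get a zero value.
func applyOverrides(containers []string, overrides []Override) map[string]Resources {
	byName := make(map[string]Resources, len(overrides))
	for _, o := range overrides {
		byName[o.Name] = o.Resource
	}
	out := make(map[string]Resources, len(containers))
	for _, c := range containers {
		out[c] = byName[c] // missing key yields an empty Resources
	}
	return out
}

// isEmpty reports whether neither requests nor limits are set.
func isEmpty(r Resources) bool {
	return len(r.Requests) == 0 && len(r.Limits) == 0
}

func main() {
	// Placeholder values for illustration only.
	placeholder := Resources{
		Requests: map[string]string{"cpu": "10m", "memory": "16Mi"},
		Limits:   map[string]string{"cpu": "10m", "memory": "16Mi"},
	}
	containers := []string{"driver-registrar"}

	typo := applyOverrides(containers, []Override{{Name: "driver-registar", Resource: placeholder}})
	fixed := applyOverrides(containers, []Override{{Name: "driver-registrar", Resource: placeholder}})

	fmt.Println("with typo, resources empty:", isEmpty(typo["driver-registrar"]))
	fmt.Println("after fix, resources empty:", isEmpty(fixed["driver-registrar"]))
}
```

Under that assumption, the misspelled entry silently matches nothing, which would be consistent with the empty resources observed for the container.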


Logs for reference

comment  - https://bugzilla.redhat.com/show_bug.cgi?id=2062559#c6

For more details - see pod yamls in log - http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/rook-bug-2069795/must-gather.local.8892466999213447117/quay-io-rhceph-dev-ocs-must-gather-sha256-2657d3206a2ff2a742043b6432387fe63fba26c056bf8afd0296ffc42cf29fe1/namespaces/openshift-storage/pods/


Version-Release number of selected component (if applicable):
=================================================================
ODF 4.10.0-219
Deployer -2.0.0-8

How reproducible:
==================
Always

Steps to Reproduce:
1. Create an ODF-to-ODF cluster with a provider and a consumer
2. Install ODF add-ons on both
3. Check whether correct limits and requests are set for all CSI pods
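
Step 3 can be automated. The following is a rough sketch, not the script used by QE: it parses pod JSON in the shape produced by `oc get pods -n openshift-storage -o json` and lists containers that lack requests or limits. The embedded sample is built from the two containers quoted in this description; the "csi-" pod-name prefix filter and the names podList and findMissing are assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// podList models only the fields of a Kubernetes pod list that this check needs.
type podList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			Containers []struct {
				Name      string `json:"name"`
				Resources struct {
					Limits   map[string]interface{} `json:"limits"`
					Requests map[string]interface{} `json:"requests"`
				} `json:"resources"`
			} `json:"containers"`
		} `json:"spec"`
	} `json:"items"`
}

// sample mirrors the containers quoted in this bug description.
const sample = `{"items":[{"metadata":{"name":"csi-cephfsplugin-provisioner-55f4775bc4-hnmg6"},
"spec":{"containers":[
 {"name":"csi-cephfsplugin","resources":{"requests":{"cpu":"20m","memory":"160Mi"}}},
 {"name":"csi-snapshotter","resources":{}}]}}]}`

// findMissing returns one line per container in a "csi-" pod that has
// no requests, no limits, or neither.
func findMissing(data []byte) ([]string, error) {
	var pods podList
	if err := json.Unmarshal(data, &pods); err != nil {
		return nil, err
	}
	var findings []string
	for _, p := range pods.Items {
		if !strings.HasPrefix(p.Metadata.Name, "csi-") {
			continue // assumption: CSI pods are identified by this name prefix
		}
		for _, c := range p.Spec.Containers {
			var missing []string
			if len(c.Resources.Requests) == 0 {
				missing = append(missing, "requests")
			}
			if len(c.Resources.Limits) == 0 {
				missing = append(missing, "limits")
			}
			if len(missing) > 0 {
				findings = append(findings, fmt.Sprintf("%s/%s: missing %s",
					p.Metadata.Name, c.Name, strings.Join(missing, " and ")))
			}
		}
	}
	return findings, nil
}

func main() {
	findings, err := findMissing([]byte(sample))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse error:", err)
		os.Exit(1)
	}
	for _, f := range findings {
		fmt.Println(f)
	}
}
```

To run the check against a real cluster, the embedded sample would be replaced by the saved JSON output of the oc command above.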

Actual results:
================
Some containers in some pods are missing relevant settings



Expected results:
===================
All containers should have limits and requests specified

Please check all the pods in case I missed anything.


Additional info:
===================
Chat: https://chat.google.com/room/AAAAREGEba8/I4Yy17D243w

Comment 2 Dhruv Bindra 2022-06-28 07:56:13 UTC

We can try reproducing the bug but I think this bug should be raised in ocs-operator because the deployer doesn't update the grantedCapacity, it is a product feature.

Comment 3 Dhruv Bindra 2022-06-28 08:00:54 UTC

Please ignore my comment above (https://bugzilla.redhat.com/show_bug.cgi?id=2072989#c2); it was meant for a different bug.

Comment 14 Itzhak 2023-02-23 17:42:30 UTC

In the comment above, I attached the limits and requests set for all CSI pods of the consumer cluster.

Cluster Versions:


Ceph version:
ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)

OC version:
Client Version: 4.10.24
Server Version: 4.11.27
Kubernetes Version: v1.24.6+263df15

OCS version:
ocs-operator.v4.10.9                      OpenShift Container Storage   4.10.9            ocs-operator.v4.10.8                      Succeeded

Cluster version
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.27   True        False         25h     Cluster version is 4.11.27

Rook version:
rook: v4.10.9-0.b7b3a0044169fd9364683e2e4e6968361f8f3c08
go: go1.16.12


CSV version:
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.10                     NooBaa Operator               4.10.10           mcg-operator.v4.10.9                      Succeeded
observability-operator.v0.0.20            Observability Operator        0.0.20            observability-operator.v0.0.19            Succeeded
ocs-operator.v4.10.9                      OpenShift Container Storage   4.10.9            ocs-operator.v4.10.8                      Succeeded
ocs-osd-deployer.v2.0.11                  OCS OSD Deployer              2.0.11-11         ocs-osd-deployer.v2.0.10                  Succeeded
odf-csi-addons-operator.v4.10.9           CSI Addons                    4.10.9            odf-csi-addons-operator.v4.10.8           Succeeded
odf-operator.v4.10.9                      OpenShift Data Foundation     4.10.9            odf-operator.v4.10.8                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.461-dbddf1f   Route Monitor Operator        0.1.461-dbddf1f   route-monitor-operator.v0.1.456-02ea942   Succeeded

Comment 15 Itzhak 2023-02-28 10:42:57 UTC

Please let me know if the output in the comment https://bugzilla.redhat.com/show_bug.cgi?id=2072989#c13 looks fine. 
I used a script to gather all the limits and requests set for all CSI pods of the consumer cluster.

Comment 16 Dhruv Bindra 2023-03-13 04:49:07 UTC

(In reply to Itzhak from comment #15)
> Please let me know if the output in the comment
> https://bugzilla.redhat.com/show_bug.cgi?id=2072989#c13 looks fine. 
> I used a script to gather all the limits and requests set for all CSI pods
> of the consumer cluster.

Yes, the output looks good to me.

Comment 17 Itzhak 2023-03-13 10:17:05 UTC

Okay, thanks. I am moving the BZ to Verified.

Comment 18 Ritesh Chikatwar 2023-03-14 15:37:54 UTC

Closing this bug as fixed in v2.0.11 and tested by QE.
