Bug 1962751 - ocs operator does not marshal resource requests and limits for ceph crashcollector pods


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenShift Container Storage
Classification:	Red Hat Storage
Component:	ocs-operator
Sub Component:
Version:	4.8
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	OCS 4.8.0
Assignee:	umanga
QA Contact:	suchita
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1964365

Reported:	2021-05-20 14:59 UTC by Ohad
Modified:	2023-09-15 01:06 UTC
CC List:	9 users
Fixed In Version:	4.8.0-406.ci
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Clones:	1964365
Environment:
Last Closed:	2021-08-03 18:16:39 UTC
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift ocs-operator pull 1185	None	open	fix merge strategy for newCephDaemonResources	2021-05-21 12:17:53 UTC
Github	openshift ocs-operator pull 1188	None	open	Bug 1962751: [release-4.8] fix merge strategy for newCephDaemonResources	2021-05-25 11:21:24 UTC
Red Hat Product Errata	RHBA-2021:3003	None	None	None	2021-08-03 18:17:03 UTC

Description Ohad 2021-05-20 14:59:43 UTC

Description of problem (please be as detailed as possible and provide log
snippets):

For OCS dedicated (ocs-converged offering), we need to set resource requests and limits on all containers deployed by OCS. 

For the rook-ceph-crashcollector pods/containers, the way to do it is by setting requests and limits on the cephcluster resource (like all other ceph based components: mons, mgr, etc.)

Today, OCS does this by setting the requests and limits for Ceph containers on the storagecluster resource; ocs-operator reads them and sets them on the underlying cephcluster resource it creates.

The gap is that ocs-operator does not support setting crashcollector requests and limits on the storagecluster resource.


Version of all relevant components (if applicable):


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Possibly, as setting the requests and limits on all containers is an acceptance criterion for onboarding OCS as a managed offering in the OpenShift Dedicated production environment.


Is there any workaround available to the best of your knowledge?
No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1 - the ability to do so is just missing


Is this issue reproducible?
Yes

Can this issue reproduce from the UI?
N/A

If this is a regression, please provide more details to justify this:
No

Steps to Reproduce:
1. Deploy OCS
2. Set requests and limits in the resources section of the storagecluster resource, under the crashcollector key


Actual results:
The requests and limits are ignored (not marshaled to the cephcluster resource)

Expected results:
The requests and limits should be marshaled to the cephcluster resource


Additional info:

Comment 3 umanga 2021-05-21 08:06:04 UTC

Was the original intention of "StorageCluster.Spec.Resources" to allow users to provide resource requirements only for mgr and mon pods?
If yes, then we need to agree on allowing it for "crashcollector" and probably identify a sensible default.
If it was always meant to be used for all daemons/pods, then this bug goes back to OCS 4.2 from what I observed.

Either way, the fix is simple: https://github.com/openshift/ocs-operator/pull/1185
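
To illustrate the kind of merge strategy the PR title refers to, here is a minimal sketch in Python (the operator itself is written in Go). It is not the code from the PR: the function name mirrors newCephDaemonResources from the PR title, but the default values and the dictionary layout are assumptions made for illustration. The idea is that defaults are kept for daemons the user did not configure, and every key present in StorageCluster.Spec.Resources, including crashcollector, is passed through to the cephcluster.

```python
from copy import deepcopy

# Hypothetical defaults for illustration only; the operator's real
# default values are not stated in this bug report.
DEFAULT_DAEMON_RESOURCES = {
    "mon": {
        "requests": {"cpu": "1", "memory": "2Gi"},
        "limits": {"cpu": "1", "memory": "2Gi"},
    },
    "mgr": {
        "requests": {"cpu": "1", "memory": "3Gi"},
        "limits": {"cpu": "1", "memory": "3Gi"},
    },
}


def new_ceph_daemon_resources(custom):
    """Build the resources map for the cephcluster.

    Starts from the defaults, then overlays every daemon key the user set
    on the storagecluster, so keys such as "crashcollector" are not dropped.
    """
    resources = deepcopy(DEFAULT_DAEMON_RESOURCES)
    for daemon, requirements in (custom or {}).items():
        resources[daemon] = deepcopy(requirements)
    return resources


# Values taken from the verification snippet in this report.
storagecluster_resources = {
    "crashcollector": {
        "limits": {"cpu": "50m", "memory": "80Mi"},
        "requests": {"cpu": "50m", "memory": "80Mi"},
    },
}
cephcluster_resources = new_ceph_daemon_resources(storagecluster_resources)
```

With this approach a user-supplied entry replaces the default for that daemon as a whole; merging individual requests and limits fields would be a different design choice.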

Comment 6 suchita 2021-07-15 16:24:28 UTC


First, with ocs-operator.v4.8.0-452.ci on a vSphere cluster:
On the vSphere platform, 'limits' and 'requests' are not set by default,
so I edited the storagecluster resource and added 'limits' and 'requests' under the crashcollector key:
---------------------------snippet------------
  ...
  resources:
    crashcollector:
      limits:
        cpu: 50m
        memory: 80Mi
      requests:
        cpu: 50m
        memory: 80Mi
  storageDeviceSets:
  ...
----------------------------------------------
After a few seconds, all crashcollector pods were respun. I then verified the resource 'requests' and 'limits' in the rook-ceph-crashcollector-* pod YAML:
----------------------------------------------
  - name: make-container-crash-dir
    resources:
      limits:
        cpu: 50m
        memory: 80Mi
      requests:
        cpu: 50m
        memory: 80Mi
----------------------------------------------
Secondly, with ocs-operator.v4.7.2 on Red Hat ODF Managed Service on the OSD platform, 'limits' and 'requests' are expected to exist on a freshly deployed cluster, so I verified that both the storagecluster YAML and the crashcollector pod YAML contain requests and limits under resources.

Verified the result on 2 versions:
1. ocs-operator.v4.8.0-452.ci on vSphere
2. ocs-operator.v4.7.2 on Red Hat ODF Managed Service

Hence marking this BZ as Verified
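
The manual check described in this comment could be scripted roughly as follows. This is a sketch under assumptions: the pod is represented as a plain dictionary (for example, the parsed output of `oc get pod <name> -o json`), the helper name is hypothetical, and the main container name "ceph-crash" is assumed, since only the make-container-crash-dir container appears in the snippet above. It also assumes every container in the pod is expected to carry the same values.

```python
def containers_with_unexpected_resources(pod, expected):
    """Return the names of containers whose resources differ from `expected`.

    Both init containers and regular containers of the pod are inspected.
    """
    spec = pod.get("spec", {})
    containers = spec.get("initContainers", []) + spec.get("containers", [])
    return [c["name"] for c in containers if c.get("resources") != expected]


# Values set under the crashcollector key in this verification.
expected = {
    "limits": {"cpu": "50m", "memory": "80Mi"},
    "requests": {"cpu": "50m", "memory": "80Mi"},
}

# Trimmed-down stand-in for a rook-ceph-crashcollector pod.
pod = {
    "spec": {
        "initContainers": [
            {"name": "make-container-crash-dir", "resources": dict(expected)},
        ],
        "containers": [
            # Container name assumed for illustration.
            {"name": "ceph-crash", "resources": dict(expected)},
        ],
    }
}

mismatches = containers_with_unexpected_resources(pod, expected)
```

An empty list means every container in the pod carries the requests and limits that were set on the storagecluster.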

Comment 8 errata-xmlrpc 2021-08-03 18:16:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3003

Comment 9 Red Hat Bugzilla 2023-09-15 01:06:58 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
