Description of problem (please be as detailed as possible and provide log snippets):
For OCS dedicated (ocs-converged offering), we need to set resource requests and limits on all containers deployed by OCS. For the rook-ceph-crashcollector pods/containers, the way to do it is by setting requests and limits on the cephcluster resource (as for all other Ceph-based components: mons, mgr, etc.). The way we do it today in OCS is by setting the requests and limits for Ceph containers on the storagecluster resource, which ocs-operator reads and sets on the underlying cephcluster it creates. The gap/issue is that ocs-operator does not support setting crash collector limits and requests on the storagecluster resource.

Version of all relevant components (if applicable):

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Possibly, as setting the requests and limits on all containers is an acceptance criterion for onboarding OCS as a managed offering for the OpenShift Dedicated production environment.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1 - the ability to do so is just missing

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
N/A

If this is a regression, please provide more details to justify this:
No

Steps to Reproduce:
1. Deploy OCS
2. Set requests and limits in the proper section on the storagecluster resource under the crashcollector key

Actual results:
The requests and limits are ignored (not marshaled to the cephcluster resource)

Expected results:
The requests and limits should be marshaled to the cephcluster resource

Additional info:
Was the original intention of "StorageCluster.Spec.Resources" to allow users to provide resource requirements only for mgr and mon pods? If yes, then we need to agree on allowing it for "crashcollector" and probably identify a sensible default. If it was always meant to be used for all daemons/pods, then this bug goes back to OCS 4.2 from what I observed. Either way, the fix is simple: https://github.com/openshift/ocs-operator/pull/1185
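To make the question concrete, here is a minimal sketch of the "all daemons" behaviour: every key set under the StorageCluster resources map, including crashcollector, is carried over to the CephCluster, and defaults apply only to keys the user left unset. This is not the ocs-operator code (which is written in Go) and not the contents of the linked PR. The function name `build_cephcluster_resources`, the `DEFAULTS` table, and its values are hypothetical and chosen only for illustration.

```python
import copy

# Hypothetical per-daemon defaults. The values are illustrative only and
# are NOT the defaults shipped by OCS.
DEFAULTS = {
    "mon": {"requests": {"cpu": "1", "memory": "2Gi"},
            "limits": {"cpu": "1", "memory": "2Gi"}},
    "mgr": {"requests": {"cpu": "1", "memory": "3Gi"},
            "limits": {"cpu": "1", "memory": "3Gi"}},
}


def build_cephcluster_resources(storagecluster_resources, defaults=DEFAULTS):
    """Return the resources map to place on the CephCluster.

    Every daemon key set by the user (for example "crashcollector") is
    passed through unchanged; defaults fill in only the missing keys.
    """
    merged = copy.deepcopy(defaults)
    for daemon, requirements in (storagecluster_resources or {}).items():
        merged[daemon] = copy.deepcopy(requirements)
    return merged


# The values below mirror the ones used later in this bug for verification.
user_resources = {
    "crashcollector": {
        "limits": {"cpu": "50m", "memory": "80Mi"},
        "requests": {"cpu": "50m", "memory": "80Mi"},
    }
}
ceph_resources = build_cephcluster_resources(user_resources)
```

With a pass-through like this there is no per-daemon allow-list to maintain, so a new key does not need an operator change. Whether crashcollector should also get an entry in the defaults table is the open question raised above.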
In ocs-operator.v4.8.0-452.ci on a vSphere cluster:
As this is the vSphere platform, 'limits' and 'requests' do not exist by default, so I edited the storage cluster resource and added 'limits' and 'requests' under the crashcollector key.

---------------------------snippet------------
...
  resources:
    crashcollector:
      limits:
        cpu: 50m
        memory: 80Mi
      requests:
        cpu: 50m
        memory: 80Mi
  storageDeviceSets
...
----------------------------------------------

After a few seconds all crash collector pods respun. I then verified the resource 'requests' and 'limits' in the rook-ceph-crashcollector-* pod YAML:

----------------------------------------------
    name: make-container-crash-dir
    resources:
      limits:
        cpu: 50m
        memory: 80Mi
      requests:
        cpu: 50m
        memory: 80Mi
----------------------------------------------

Secondly, in ocs-operator.v4.7.2 on Red Hat ODF Managed Service on the OSD platform, 'limits' and 'requests' are expected to exist on a freshly deployed cluster, so I verified that the storage cluster YAML and the crashcollector pod YAML both contain requests and limits under resources.

Verified the result on 2 versions:
1. ocs-operator.v4.8.0-452.ci on vSphere
2. ocs-operator.v4.7.2 on Red Hat ODF Managed Service

Hence marking this BZ as Verified.
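The manual check above could be scripted roughly as follows. This is only a sketch: the helper `containers_missing_resources` is hypothetical, and the sample manifest is hand-written from the snippet in this comment rather than captured from a cluster (placing the container under `initContainers` is an assumption). On a live cluster the manifest could instead come from `oc get pod <pod-name> -o json`.

```python
import json

# Values set on the storagecluster resource in the verification above.
EXPECTED = {
    "limits": {"cpu": "50m", "memory": "80Mi"},
    "requests": {"cpu": "50m", "memory": "80Mi"},
}


def containers_missing_resources(pod, expected):
    """Return names of containers whose resources differ from `expected`.

    Both init containers and regular containers are inspected.
    """
    spec = pod.get("spec", {})
    all_containers = spec.get("initContainers", []) + spec.get("containers", [])
    return [c["name"] for c in all_containers if c.get("resources") != expected]


# Hand-written sample modeled on the pod YAML snippet above; not real output.
SAMPLE_POD_JSON = """
{
  "spec": {
    "initContainers": [
      {
        "name": "make-container-crash-dir",
        "resources": {
          "limits": {"cpu": "50m", "memory": "80Mi"},
          "requests": {"cpu": "50m", "memory": "80Mi"}
        }
      }
    ]
  }
}
"""

pod = json.loads(SAMPLE_POD_JSON)
missing = containers_missing_resources(pod, EXPECTED)
print("containers without the expected resources:", missing)
```

An empty list means every container in the manifest carries the requested values. Any name in the list points at a container where the settings were not propagated.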
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3003
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days