Bug 1885313
Summary: | noobaa-endpoint HPA fires KubeHpaReplicasMismatch alert after installation | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Filip Balák <fbalak> |
Component: | Multi-Cloud Object Gateway | Assignee: | Nimrod Becker <nbecker> |
Status: | CLOSED NOTABUG | QA Contact: | Raz Tamir <ratamir> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.6 | CC: | ebenahar, etamir, nberry, ocs-bugs, omitrani, tunguyen |
Target Milestone: | --- | ||
Target Release: | OCS 4.6.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-10-20 09:38:55 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1885320 | ||
Bug Blocks: |
Description
Filip Balák
2020-10-05 15:09:31 UTC
Filip, have we got this alert with previous OCS versions? If not, it's a regression IMO.

We found the cause of the issue: ocs-ci deploys the cluster with the following resources configuration on the storagecluster CR (under spec.resources):

    resources:
      mds: {}
      mgr: {}
      mon: {}
      noobaa-core: {}
      noobaa-db: {}
      noobaa-endpoint: {}
      rgw: {}

This causes all pods related to these Deployments/StatefulSets to configure their pod templates with a resources value of {}. HPA, per its specification, cannot work without specific values set in the resources section of the pods it observes, which causes the issue described here.

As is, this is not a bug in the OCS product, or at least not in the official/default/supported deployment of the product. I am not sure whether this configuration (set via the ocs-ci deployment scripts) is intended or is a bug.

@Filip (or any other QE representative), can you please check and explain the reasoning behind this setup, and what we can do to mitigate the problem?

After discussing the issue with Elad and Petr, it seems this is done in order to run OCS on clusters with low resources.

@Filip, can we please verify that deploying the cluster with specific values (maybe via the UI) solves the issue? That way we could verify that this is not a bug in the product and close the BZ.

I confirm that a cluster with a supported configuration installed manually doesn't trigger this alert. -> NOTABUG

Although, I don't see resources defined in the storagecluster CR as provided in comment 7. Excerpt from the ocs-storagecluster instance of the StorageCluster CR:

(...)
    spec:
      encryption: {}
      externalStorage: {}
      managedResources:
        cephBlockPools: {}
        cephFilesystems: {}
        cephObjectStoreUsers: {}
        cephObjectStores: {}
        snapshotClasses: {}
        storageClasses: {}
      storageDeviceSets:
      - config: {}
        count: 1
        dataPVCTemplate:
          metadata:
            creationTimestamp: null
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 2Ti
            storageClassName: gp2
            volumeMode: Block
          status: {}
        name: ocs-deviceset-gp2
        placement: {}
        portable: true
        replica: 3
        resources: {}
      version: 4.6.0

(...)

Filip, it is OK to not set anything. The problem arises when you set them to empty objects or nulls.
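
For anyone who wants to check a deployment for this condition, the distinction drawn in this thread (an absent spec.resources is fine; entries explicitly set to {} or null are the problem) can be sketched in a few lines of Python. This is only an illustration: the function name find_empty_resource_overrides and the sample dictionaries are hypothetical, and the spec is assumed to have already been loaded into a plain dict (for example from `oc get storagecluster -o json`). It is not part of OCS or ocs-ci.

```python
from typing import Any, Dict, List


def find_empty_resource_overrides(spec: Dict[str, Any]) -> List[str]:
    """Return the component names under spec.resources set to {} or null.

    An absent (or null) spec.resources yields an empty list, matching the
    observation in this bug that leaving resources unset is harmless.
    """
    overrides = spec.get("resources") or {}
    # An empty dict and None are both falsy, so `not value` catches both.
    return sorted(name for name, value in overrides.items() if not value)


# Hypothetical sample mirroring the ocs-ci configuration quoted above.
ci_spec = {
    "resources": {
        "mds": {},
        "mgr": {},
        "mon": {},
        "noobaa-core": {},
        "noobaa-db": {},
        "noobaa-endpoint": {},
        "rgw": {},
    }
}

# Hypothetical sample mirroring the manually installed cluster: no
# top-level spec.resources key at all.
manual_spec = {"encryption": {}, "externalStorage": {}, "version": "4.6.0"}

flagged = find_empty_resource_overrides(ci_spec)
clean = find_empty_resource_overrides(manual_spec)
```

With the samples above, flagged lists all seven components from the ocs-ci configuration (including noobaa-endpoint, the one the HPA observes), while clean is empty.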