Bug 2297097
| Summary: | Alert 'CephClusterCriticallyFull' not triggered when Ceph filled 85% of its capacity. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Nagendra Reddy <nagreddy> |
| Component: | ceph-monitoring | Assignee: | arun kumar mohan <amohan> |
| Status: | ASSIGNED | QA Contact: | Harish NV Rao <hnallurv> |
| Severity: | medium | Priority: | unspecified |
| Version: | 4.16 | CC: | amohan, odf-bz-bot |
| Target Milestone: | --- | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Flags: | amohan: needinfo? (nagreddy) | Type: | Bug |
| Doc Type: | If docs needed, set a value | Attachments: | attachment 2039415: Pools 100% utilised when raw 85% used |
Moving the non-blocker BZs out of ODF-4.17.0 as part of Development Freeze.
Created attachment 2039415 [details]
Pools 100% utilised when raw 85% used

Description of problem (please be detailed as possible and provide log snippets):
I filled 85% of the cluster's raw capacity on 100Gi OSDs using benchmark-operator io, and observed that the 'CephClusterCriticallyFull' alert was not triggered in the cluster.

Version of all relevant components (if applicable):
ocp: 4.16.0-0.nightly-2024-07-09-093958
odf: 4.16.0-rhodf

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Can this issue be reproduced?
Intermittent

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Fill the cluster to 85% of raw capacity using benchmark-operator io.
2. Observe that no 'CephClusterCriticallyFull' alert fires.
3. Tested on an IBM Cloud cluster with 100Gi OSDs.

A similar automated test [tests/cross_functional/system_test/test_cluster_full_and_recovery.py] exists and may help in reproducing this issue.

Actual results:
The 'CephClusterCriticallyFull' alert is not triggered.

Expected results:
The 'CephClusterCriticallyFull' alert should be triggered when the cluster is 85% full. (A sketch for inspecting the shipped alert expression is below.)

Additional info:
I observed that the pools showed 100% utilisation while RAW used capacity was at 85%. Is this the reason the alert did not trigger? Not sure. (See the second sketch below for comparing raw vs. per-pool utilisation.)
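For triage, it may help to confirm what expression the shipped alert actually evaluates and what the underlying metrics report on the affected cluster. The exact PrometheusRule and threshold shipped with ODF may differ from what is sketched here; the `openshift-storage` and `openshift-monitoring` namespaces, the `thanos-querier` route, and the `ceph_cluster_total_used_raw_bytes` / `ceph_cluster_total_bytes` exporter metrics are the usual defaults and are assumed for illustration:

```sh
# Dump the shipped alert expression (namespace assumed: openshift-storage)
oc -n openshift-storage get prometheusrules -o yaml | grep -B2 -A6 CephClusterCriticallyFull

# Evaluate the raw-capacity ratio such rules are typically built on;
# route/token handling is environment specific, illustrative only
TOKEN=$(oc whoami -t)
HOST=$(oc -n openshift-monitoring get route thanos-querier -o jsonpath='{.spec.host}')
curl -sk -H "Authorization: Bearer $TOKEN" \
  "https://$HOST/api/v1/query" \
  --data-urlencode 'query=ceph_cluster_total_used_raw_bytes / ceph_cluster_total_bytes'
```

If the ratio returned here is at or above the rule's threshold while the alert is absent, the problem is in rule evaluation or delivery rather than in the metrics themselves.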
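On the pools-at-100%-while-raw-at-85% observation: Ceph derives a pool's MAX AVAIL from the fullest OSD it maps to (scaled by the pool's replication), so on an unevenly balanced cluster per-pool utilisation can reach 100% before the global raw-used ratio crosses the alert threshold. A minimal sketch for comparing the two views, assuming the rook-ceph toolbox is enabled under its default deployment name:

```sh
# Global RAW USED vs per-pool USED/MAX AVAIL (toolbox name assumed: rook-ceph-tools)
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph df detail

# Per-OSD fill levels; pool MAX AVAIL is bounded by the most-full OSD
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph osd df tree
```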