Bug 2297097 - Alert 'CephClusterCriticallyFull' not triggered when ceph filled 85% of its capacity. [NEEDINFO]
Summary: Alert 'CephClusterCriticallyFull' not triggered when ceph filled 85% of its capacity.
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: arun kumar mohan
QA Contact: Harish NV Rao
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-07-10 13:58 UTC by Nagendra Reddy
Modified: 2024-09-17 10:05 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
amohan: needinfo? (nagreddy)


Attachments
Pools 100% utilised when raw 85% used (42.09 KB, image/png)
2024-07-10 13:58 UTC, Nagendra Reddy


Links
Red Hat Issue Tracker OCSBZM-8668 (last updated 2024-07-23 07:37:57 UTC)

Description Nagendra Reddy 2024-07-10 13:58:37 UTC
Created attachment 2039415 [details]
Pools 100% utilised when raw 85% used

Description of problem (please be as detailed as possible and provide log snippets):

I filled 85% of the cluster capacity on 100Gi OSDs using benchmark-operator io
and observed that the 'CephClusterCriticallyFull' alert was not triggered in the cluster.
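
For reference, below is a minimal Python sketch for checking the same two things by hand against the cluster's Prometheus: the RAW used ratio the alert is presumably evaluating, and whether CephClusterCriticallyFull is actually firing. The route host and bearer token are placeholders, and the shipped rule's exact expression and threshold should be read from the ocs-operator PrometheusRule rather than from this sketch.

# Hedged sketch: query in-cluster Prometheus for the RAW used ratio and for
# currently firing alerts. PROM_URL and TOKEN are placeholders (assumptions),
# and the ratio below is only assumed to match what the shipped rule evaluates.
import requests

PROM_URL = "https://<prometheus-k8s-route-host>"   # placeholder
TOKEN = "<bearer-token>"                           # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

ratio_query = "ceph_cluster_total_used_raw_bytes / ceph_cluster_total_bytes"
r = requests.get(f"{PROM_URL}/api/v1/query",
                 params={"query": ratio_query},
                 headers=HEADERS, verify=False)
for sample in r.json()["data"]["result"]:
    print("RAW used ratio:", sample["value"][1])

alerts = requests.get(f"{PROM_URL}/api/v1/alerts",
                      headers=HEADERS, verify=False).json()
firing = {a["labels"].get("alertname") for a in alerts["data"]["alerts"]
          if a["state"] == "firing"}
print("CephClusterCriticallyFull firing:", "CephClusterCriticallyFull" in firing)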


Version of all relevant components (if applicable):
ocp: 4.16.0-0.nightly-2024-07-09-093958
odf: 4.16.0-rhodf

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Y

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Intermittent

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Fill the cluster to 85% of its capacity using benchmark-operator io (see the polling sketch below this section).
2. Observe that no CephClusterCriticallyFull alert is raised.
3. This was tried on an IBM Cloud cluster with 100Gi OSDs.

There is a similar automated test [tests/cross_functional/system_test/test_cluster_full_and_recovery.py] which may help in reproducing this issue.
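
To make step 1 easier to follow while the fill workload runs, the small helper below polls the RAW utilisation from the rook-ceph-tools pod until the cluster crosses 85%. The tools-pod label and the ceph df JSON field names are assumptions based on a typical Rook/Ceph deployment, not taken from this bug.

# Hedged helper: poll RAW utilisation while benchmark-operator fills the cluster.
# The tools-pod label and JSON field names are assumptions; adjust for your setup.
import json
import subprocess
import time

def raw_used_percent():
    pod = subprocess.check_output(
        ["oc", "-n", "openshift-storage", "get", "pod",
         "-l", "app=rook-ceph-tools",
         "-o", "jsonpath={.items[0].metadata.name}"],
        text=True).strip()
    out = subprocess.check_output(
        ["oc", "-n", "openshift-storage", "rsh", pod,
         "ceph", "df", "--format", "json"], text=True)
    stats = json.loads(out)["stats"]
    return 100.0 * stats["total_used_raw_bytes"] / stats["total_bytes"]

while True:
    pct = raw_used_percent()
    print(f"RAW used: {pct:.1f}%")
    if pct >= 85.0:
        break
    time.sleep(60)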


Actual results:
 Alert 'CephClusterCriticallyFull' not triggered.

Expected results:
The 'CephClusterCriticallyFull' alert should be triggered when the cluster is 85% full.

Additional info:

I observed that the pools were 100% utilised when the RAW used capacity was 85% (see the attached screenshot). I am not sure whether this is the reason the alert was not triggered.
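
One way to narrow this down is to compare the RAW utilisation with the per-pool utilisation reported by the ceph-mgr Prometheus module. The sketch below does that using the standard metric names (ceph_cluster_total_used_raw_bytes, ceph_cluster_total_bytes, ceph_pool_stored, ceph_pool_max_avail); PROM_URL and TOKEN are placeholders as in the earlier sketch.

# Hedged sketch: compare RAW utilisation against per-pool utilisation to see
# whether pools report ~100% while RAW used is ~85%, as described above.
# PROM_URL and TOKEN are placeholders (assumptions).
import requests

PROM_URL = "https://<prometheus-k8s-route-host>"   # placeholder
TOKEN = "<bearer-token>"                           # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def query(expr):
    r = requests.get(f"{PROM_URL}/api/v1/query",
                     params={"query": expr}, headers=HEADERS, verify=False)
    return r.json()["data"]["result"]

raw = query("ceph_cluster_total_used_raw_bytes / ceph_cluster_total_bytes")
print("RAW used ratio:", raw[0]["value"][1] if raw else "n/a")

pools = query("ceph_pool_stored / (ceph_pool_stored + ceph_pool_max_avail)")
for p in pools:
    print("pool", p["metric"].get("pool_id", "?"), "used ratio:", p["value"][1])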

Comment 6 Sunil Kumar Acharya 2024-09-17 10:05:02 UTC
Moving the non-blocker BZs out of ODF-4.17.0 as part of Development Freeze.

