Document URL
============
Red Hat OpenShift Container Storage 4.3
Troubleshooting OpenShift Container Storage

Section Number and Name
=======================
Chapter 5. Troubleshooting alerts and errors in OpenShift Container Storage
5.1. Resolving alerts and errors

Describe the issue
==================
In the list of OCS alerts, I see entries for CephClusterCriticallyFull and CephClusterNearFull, but their descriptions are insufficient and lack a clear and precise meaning. What happens when no action is taken is not discussed.

Suggestions for improvement
===========================
For all storage utilization alerts (such as CephClusterNearFull and CephClusterCriticallyFull), we should provide the following details in a clear way:

- Exact definition of the alert, and how to understand it with respect to cluster state. What is it based on? How does it relate to cluster vs. usable storage? Does it mean I will be able to write another 25% of data before hitting an out-of-space issue when the alert states that utilization has crossed 75%?
- What is going to happen when the alert is not acted upon (include the worst-case scenario).
- Impact on OCP Prometheus monitoring when its storage is backed by OCS.

We should also make sure that all storage utilization alerts are listed.

Additional information
======================
Exact content depends on the engineering resolution of BZ 1809248. Please reach out to the dev team when BZ 1809248 is resolved so that doc changes can be drafted.

The action items for the admin listed in the Procedure section also need to be revisited if changes in the engineering BZs make it necessary.

Other related engineering bugs include BZ 1818736 and BZ 1775432.
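
To illustrate why the "75% used means 25% left to write" reading needs an explicit answer in the docs, here is a rough sketch of the arithmetic. Everything in it is an assumption made for illustration only: that the alert percentage refers to raw capacity, that data is stored with 3 replicas, and that writes stop at a hypothetical 85% raw utilization. None of these values are confirmed in this bug; the real ones depend on the engineering resolution of the BZs referenced above.

```python
# Illustrative arithmetic only. The replica count and the "writes stop"
# threshold below are hypothetical and are NOT taken from the OCS docs.

TIB = 2**40


def remaining_usable_bytes(raw_total, raw_used_fraction, replicas, stop_fraction):
    """Return how many bytes of user data could still be written before
    raw utilization reaches stop_fraction, given the replication factor."""
    raw_headroom = max(0.0, stop_fraction - raw_used_fraction) * raw_total
    # Each byte of user data consumes `replicas` bytes of raw capacity.
    return raw_headroom / replicas


raw_total = 6 * TIB   # hypothetical raw cluster capacity
at_alert = 0.75       # utilization at which the alert fires (from this report)

# Naive reading: 25% of the capacity is still freely writable.
naive_tib = remaining_usable_bytes(raw_total, at_alert, replicas=1, stop_fraction=1.0) / TIB

# Reading under the stated assumptions: 3 replicas, writes stop at 85% raw.
assumed_tib = remaining_usable_bytes(raw_total, at_alert, replicas=3, stop_fraction=0.85) / TIB

print(f"naive reading:   {naive_tib:.2f} TiB still writable")
print(f"assumed reading: {assumed_tib:.2f} TiB still writable")
```

Under these assumed numbers the two readings differ by several times, which is the kind of gap the alert description should spell out for the admin.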
Marking BZ 1818736 as a blocker for this doc bug, as discussed in the Additional information section above.
Fixing copy-paste typo in a blocker bug.