Bug 2084534
| Summary: | OSD utilization notifications are mentioned in Cluster utilization alert message | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Filip Balák <fbalak> |
| Component: | odf-managed-service | Assignee: | Kaustav Majumder <kmajumde> |
| Status: | CLOSED EOL | QA Contact: | Itzhak <ikave> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.10 | CC: | aeyal, ebenahar, kmajumde, odf-bz-bot |
| Target Milestone: | --- | Keywords: | UserExperience |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 2.0.2 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2024-07-11 10:26:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2084014, 2136854 | ||
| Bug Blocks: | |||
Description
Filip Balák
2022-05-12 11:50:14 UTC
I tested the BZ with an MS provider cluster (OCP 4.11, ODF 4.11) and an MS consumer cluster (OCP 4.12, ODF 4.11). I performed the following steps:

1. Utilized the MS consumer cluster to 97 percent. To achieve this, I used a built-in fixture in the ocs-ci project.
2. Received three emails during the utilization:

   - First email, when utilization reached 75%:

     > Persistent Volume Usage is Nearly Full
     > The utilization of one or more of the PVs in your cluster (e20c0f51-9b43-4a28-b0bf-6fe8bb44845d) has exceeded 75%. Please free up some space or expand the PV if possible. Failure to address this issue may lead to service interruptions.
     > PVC Name: fio-target
     > Namespace: namespace-test-1f0e855b37b94c00a1ca48587

   - Second email, when utilization reached 85%:

     > Persistent Volume Usage Critical
     > The utilization of one or more of the PVs in your cluster (e20c0f51-9b43-4a28-b0bf-6fe8bb44845d) has exceeded 85%. Please free up some space immediately or expand the PV if possible. Failure to address this issue may lead to service interruptions.
     > PVC Name: fio-target
     > Namespace: namespace-test-1f0e855b37b94c00a1ca48587

   - Third email, again when utilization was at 85% or higher:

     > Ceph Cluster is Critically Full
     > Your storage cluster (96cf3749-ede7-453e-a652-e7e7ce6700c8) utilization has crossed 80% and will move into a read-only state at 85%! Please free up some space or if possible expand the storage cluster immediately to prevent any service access issues.

3. Checked the three emails above; they do not mention the OSDs or OSD devices.
4. Also checked (as part of the ocs-ci test) that the space was reclaimed successfully.

Link to the Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-odf-multicluster/1970/

One more thing about the deployment:
The OSD size was 4Ti:
```
$ oc rsh -n openshift-storage $(oc get pods -o wide -n openshift-storage | grep tool | awk '{print $1}') ceph osd status
ID  HOST                                         USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
0   ip-10-206-38-19.us-east-2.compute.internal   172G  3923G  81      79.4M    0       0        exists,up
1   ip-10-206-41-81.us-east-2.compute.internal   172G  3923G  27      105M     1       105      exists,up
2   ip-10-206-43-103.us-east-2.compute.internal  171G  3924G  15      56.0M    0       819      exists,up
```
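As a cross-check, per-OSD utilization can be computed from the USED and AVAIL columns of the `ceph osd status` output (USED / (USED + AVAIL)). The sketch below hard-codes the values captured above for illustration; on a live cluster the command output could be piped in directly.

```shell
# Compute per-OSD utilization from the captured `ceph osd status` columns.
# Values are hard-coded from this report; awk's numeric coercion ($3 + 0)
# strips the G suffix from USED/AVAIL.
osd_util=$(awk 'NR > 1 { used = $3 + 0; avail = $4 + 0;
                         printf "osd.%s %.1f%%\n", $1, 100 * used / (used + avail) }' <<'EOF'
ID HOST USED AVAIL WR_OPS WR_DATA RD_OPS RD_DATA STATE
0 ip-10-206-38-19.us-east-2.compute.internal 172G 3923G 81 79.4M 0 0 exists,up
1 ip-10-206-41-81.us-east-2.compute.internal 172G 3923G 27 105M 1 105 exists,up
2 ip-10-206-43-103.us-east-2.compute.internal 171G 3924G 15 56.0M 0 819 exists,up
EOF
)
echo "$osd_util"
```

Each OSD sits at roughly 4.2% utilization here, which is consistent with the cluster being freshly filled and then reclaimed.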
```
$ oc get pv
NAME                                       CAPACITY  ACCESS MODES  RECLAIM POLICY  STATUS  CLAIM                                                       STORAGECLASS  REASON  AGE
pvc-069d2401-9fc2-4b45-89c3-b35e4c8d3cd6   50Gi      RWO           Delete          Bound   openshift-storage/rook-ceph-mon-c                           gp2                   78m
pvc-0ac224a5-612e-42fe-b193-1a9982961d8b   4Ti       RWO           Delete          Bound   openshift-storage/default-2-data-0l749v                     gp2                   73m
pvc-23f208f0-e729-468a-b896-1b672ddb8ccf   50Gi      RWO           Delete          Bound   openshift-storage/rook-ceph-mon-a                           gp2                   80m
pvc-4d8dde2e-255a-4f16-881e-3a578827ae82   50Gi      RWO           Delete          Bound   openshift-storage/rook-ceph-mon-b                           gp2                   80m
pvc-505e6d7d-01bd-4543-95cb-8dc43254e68c   4Ti       RWO           Delete          Bound   openshift-storage/default-1-data-0dxgvw                     gp2                   74m
pvc-5788ee83-3dc5-4b7b-b561-a210e9acadda   10Gi      RWO           Delete          Bound   openshift-monitoring/alertmanager-data-alertmanager-main-0  gp3                   85m
pvc-91bb56f3-eeab-44b3-9071-f602b2a5ba58   10Gi      RWO           Delete          Bound   openshift-monitoring/alertmanager-data-alertmanager-main-1  gp3                   85m
pvc-a0b51a12-f426-4149-9385-ad5d400f6294   100Gi     RWO           Delete          Bound   openshift-monitoring/prometheus-data-prometheus-k8s-0       gp3                   85m
pvc-c39e951a-597d-4a40-b3ea-a341751cad0a   4Ti       RWO           Delete          Bound   openshift-storage/default-0-data-04g4gt                     gp2                   74m
pvc-f2a6e0aa-4268-4f5b-93d9-f4f428aa8ee2   100Gi     RWO           Delete          Bound   openshift-monitoring/prometheus-data-prometheus-k8s-1       gp3                   85m
```
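The three 4Ti OSD data PVs in the listing above give the raw storage capacity that the utilization alerts are measured against. A minimal sketch, with a subset of the listing hard-coded from this report (on a live cluster you could pipe `oc get pv --no-headers` in instead of the `pv_listing` helper, whose name is hypothetical):

```shell
# Hard-coded subset of the `oc get pv` listing from this report.
pv_listing() {
  cat <<'EOF'
pvc-069d2401-9fc2-4b45-89c3-b35e4c8d3cd6 50Gi RWO Delete Bound openshift-storage/rook-ceph-mon-c gp2 78m
pvc-0ac224a5-612e-42fe-b193-1a9982961d8b 4Ti RWO Delete Bound openshift-storage/default-2-data-0l749v gp2 73m
pvc-505e6d7d-01bd-4543-95cb-8dc43254e68c 4Ti RWO Delete Bound openshift-storage/default-1-data-0dxgvw gp2 74m
pvc-c39e951a-597d-4a40-b3ea-a341751cad0a 4Ti RWO Delete Bound openshift-storage/default-0-data-04g4gt gp2 74m
EOF
}
# OSD data PVs carry "data-0" in their claim names; the mon and monitoring
# PVs do not. Sum the Ti capacities of the matching lines.
total=$(pv_listing | grep 'data-0' | awk '{gsub(/Ti/, "", $2); sum += $2} END {print sum "Ti"}')
echo "$total"
```

Three 4Ti OSDs yield 12Ti of raw capacity, matching the "OSD size was 4Ti" note above.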
The ODF Managed Service project has sunset and is now considered obsolete.