Bug 1810525
| Summary: | [GSS][RFE] [Tracker for Ceph BZ #1910272] Deletion of data is not allowed after the Ceph cluster reaches the osd-full-ratio threshold. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Ashish Singh <assingh> |
| Component: | ceph | Assignee: | Kotresh HR <khiremat> |
| Status: | CLOSED ERRATA | QA Contact: | Anna Sandler <asandler> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.2 | CC: | bkunal, bniver, ebenahar, etamir, gmeno, hchiramm, jdurgin, kbg, khiremat, kramdoss, madam, mrajanna, muagarwa, ndevos, ocs-bugs, odf-bz-bot, owasserm, rcyriac, sostapov |
| Target Milestone: | --- | Keywords: | FutureFeature, Tracking |
| Target Release: | ODF 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | v4.9.0-182.ci | Doc Type: | Enhancement |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-12-13 17:44:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1897351 | | |
| Bug Blocks: | 1841426, 2011326 | | |

Doc Text:

.Deletion of data is allowed when the storage cluster is full

Previously, when the storage cluster was full, the Ceph Manager hung on checking pool permissions while reading the configuration file. The Ceph Metadata Server (MDS) did not allow write operations when the Ceph OSDs were full, resulting in an `ENOSPACE` error. As a result, when the storage cluster hit the full ratio, users could not delete data to free up space using the Ceph Manager volume plugin.

With this release, the new FULL capability is introduced. With the FULL capability, the Ceph Manager bypasses the Ceph OSD full check. The `client_check_pool_permission` option is now disabled by default, whereas in previous releases it was enabled. Because the Ceph Manager has the FULL capability, the MDS no longer blocks Ceph Manager calls, which allows the Ceph Manager to free up space by deleting subvolumes and snapshots when the storage cluster is full.
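
For illustration only, here is a minimal sketch of how the behaviour described in the Doc Text might be exercised from the rook-ceph toolbox once the fix is in place. The commands are standard `ceph` CLI calls; the filesystem, subvolume group, and subvolume names are hypothetical placeholders, not values taken from this bug.

```sh
# Assumption: run inside the rook-ceph toolbox pod, where the ceph CLI
# and the admin keyring are available.

# Confirm the cluster is reporting full OSDs/pools and inspect the full ratios.
ceph health detail
ceph osd dump | grep ratio      # full_ratio / backfillfull_ratio / nearfull_ratio

# The option referenced in the Doc Text; expected to be false (disabled) by
# default in this release (assumption: readable via `ceph config get`).
ceph config get client client_check_pool_permission

# Free space through the Ceph Manager volumes plugin.
# "ocs-storagecluster-cephfilesystem", "csi" and "csi-vol-example" are
# hypothetical example names.
ceph fs subvolume snapshot rm ocs-storagecluster-cephfilesystem csi-vol-example snap-example --group_name csi
ceph fs subvolume rm ocs-storagecluster-cephfilesystem csi-vol-example --group_name csi

# Verify that space was reclaimed.
ceph df
```

In an ODF deployment these subvolumes are normally created and removed by ceph-csi when PVCs are provisioned and deleted, so in practice the same `subvolume rm` path is exercised by deleting a PVC rather than by calling the CLI directly.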
Description
Ashish Singh
2020-03-05 12:40:29 UTC
As Ashish Singh is from the GSS team, re-adding the [GSS] tag. Or should it only be used for BZs with customer cases attached?

Also moving this out of 4.3. We can keep the title as is.

Is this something we could or would have to address in ceph itself?

(In reply to Michael Adam from comment #4)
> Is this something we could or would have to address in ceph itself?

Ceph itself can't add more capacity; OCS may be able to, so it should be addressed there.

FYI, deletion in Ceph is allowed when it is full. What version and which commands are not working?

It doesn't seem like this RFE was prioritized in 4.6, and now that we are nearly approaching dev freeze, I don't think we have a chance to fix it. Moving it out; please retarget if someone thinks otherwise. Also, we are in the early phase of 4.7, so if we don't want this BZ to drag on further, now is the time to prioritize it.

In order to enable deletion when the cluster is full, we need to enable it in Ceph MGR: https://bugzilla.redhat.com/show_bug.cgi?id=1910272. We require a change in Ceph-CSI as well. As we won't need any changes in the OCS operator, I am moving the BZ to Ceph-CSI.

(In reply to Orit Wasserman from comment #9)
> In order to enable deletion when the cluster is full, we need to enable it in Ceph
> MGR: https://bugzilla.redhat.com/show_bug.cgi?id=1910272
> We require a change in Ceph-CSI as well.
> As we won't need any changes in the OCS operator, I am moving the BZ to Ceph-CSI.

Marking this bug for the Ceph CSI component as a tracker until we get it addressed in the Ceph core components.

https://bugzilla.redhat.com/show_bug.cgi?id=1910272 is acked for 5.0z1.

This is getting fixed in 5.0z1.

Tested the workflow:
Wrote data to the cluster using the ocs-ci function `write_data_via_fio()` until the cluster was almost full:

    sh-4.4$ ceph -s
      cluster:
        id:     1ae93eb4-edd9-4942-a27e-13dba341f1f2
        health: HEALTH_ERR
                3 full osd(s)
                3 pool(s) full

Then deleted the data using `delete_fio_data()`. The data was deleted as expected:

    sh-4.4$ ceph -s
      cluster:
        id:     1ae93eb4-edd9-4942-a27e-13dba341f1f2
        health: HEALTH_OK

Moving to verified.
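
The ocs-ci helpers above wrap fio and Kubernetes operations. As a rough, plain-shell approximation of the same workflow (a sketch under assumptions, not the actual `write_data_via_fio()` / `delete_fio_data()` implementation), one could run something like the following from a test pod that has a CephFS-backed RWX PVC mounted. The mount path, PVC name, and sizes are hypothetical.

```sh
# Assumption: /mnt/cephfs-pvc is a CephFS-backed volume mounted in a test pod.
MOUNTPOINT=/mnt/cephfs-pvc

# Fill the volume with fio until the cluster approaches the osd-full-ratio threshold.
fio --name=fillup --directory="$MOUNTPOINT" --rw=write --bs=4M \
    --size=10G --numjobs=4 --ioengine=libaio --direct=1 --group_reporting

# From the toolbox, `ceph -s` should now report HEALTH_ERR with full osd(s)/pool(s).

# Delete the written files; with this fix, deleting data succeeds even while full.
rm -f "$MOUNTPOINT"/fillup.*

# Deleting the backing PVC itself (e.g. `oc delete pvc fio-test-pvc`, hypothetical
# name) would additionally exercise the ceph-csi -> Ceph Manager `subvolume rm`
# path that this enhancement targets.

# Afterwards, `ceph -s` should return to HEALTH_OK.
```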
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086