Bug 1600138
Summary: [Bluestore]: one of the osds flapped multiple times with 1525: FAILED assert(0 == "bluefs enospc")

Product: [Red Hat Storage] Red Hat Ceph Storage
Component: RADOS
Status: CLOSED ERRATA
Severity: high
Priority: high
Version: 3.1
Target Milestone: z2
Target Release: 3.2
Hardware: Unspecified
OS: Linux
Fixed In Version: RHEL: ceph-12.2.8-108.el7cp; Ubuntu: ceph_12.2.8-93redhat1xenial
Doc Type: Bug Fix
Reporter: Parikshith <pbyregow>
Assignee: Sage Weil <sweil>
QA Contact: Manohar Murthy <mmurthy>
CC: akupczyk, anharris, assingh, ceph-eng-bugs, dzafman, edonnell, gsitlani, hnallurv, jdurgin, kchai, linuxkidd, mamccoma, mhackett, mmurthy, nojha, pasik, shalygin.k, sweil, tchandra, tonay, tpetr, tserlin, vumrao
Doc Text:
RocksDB compaction no longer exhausts the free space of BlueFS.
Previously, the balancing of free space between main storage and the storage for RocksDB managed by BlueFS happened only while write operations were underway. This caused BlueFS to return an `ENOSPC` error when RocksDB compaction was triggered right before a long interval without write operations. With this update, the code periodically checks the free-space balance even when no write operations are ongoing, so compaction no longer exhausts the free space of BlueFS.
Last Closed: 2019-04-30 15:56:43 UTC
Type: Bug
Bug Blocks: 1629656
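The Doc Text above describes the fix at a high level: BlueFS free space is now rebalanced on a timer rather than only on the write path. The following is a minimal, self-contained C++ sketch of that idea only; it is not the actual BlueStore code (the real change is in the pull request referenced later in this bug), and the names `FakeBlueFS`, `maybe_gift_to_bluefs`, `min_free_bytes`, and `check_interval` are invented for illustration.

```cpp
// Illustrative sketch, NOT the BlueStore implementation.
// It models the behavior described in the Doc Text: a periodic check tops up
// BlueFS free space even when the OSD is idle, so RocksDB compaction running
// during a quiet period cannot drive BlueFS to ENOSPC.
#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <thread>

struct FakeBlueFS {
    // Pretend BlueFS starts with 2 GiB of free space.
    std::atomic<uint64_t> free_bytes{2ull << 30};
    void add_free(uint64_t n) { free_bytes += n; }
};

// Hypothetical knobs (cf. bluestore_bluefs_min_free mentioned in the comments).
constexpr uint64_t min_free_bytes = 1ull << 30;          // 1 GiB floor
constexpr auto check_interval = std::chrono::seconds(1); // timer period

// Hypothetical helper: gift space from the main device to BlueFS
// whenever BlueFS falls below the configured floor.
void maybe_gift_to_bluefs(FakeBlueFS& fs) {
    uint64_t cur = fs.free_bytes;
    if (cur < min_free_bytes) {
        uint64_t gift = min_free_bytes - cur;
        fs.add_free(gift);
        std::cout << "gifted " << gift << " bytes to BlueFS\n";
    }
}

int main() {
    FakeBlueFS fs;
    std::atomic<bool> stop{false};

    // Key point of the fix: this check runs on a timer, not only when
    // client writes flow through the OSD.
    std::thread balancer([&] {
        while (!stop) {
            maybe_gift_to_bluefs(fs);
            std::this_thread::sleep_for(check_interval);
        }
    });

    // Simulate idle-time RocksDB compaction consuming BlueFS space.
    fs.free_bytes = 100ull << 20;   // drops to 100 MiB
    std::this_thread::sleep_for(std::chrono::seconds(3));

    stop = true;
    balancer.join();
}
```

The design point the sketch tries to show is that the rebalance no longer depends on client writes arriving: an otherwise idle OSD whose RocksDB is compacting still gets space gifted to BlueFS before the floor is breached.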
Description
Parikshith
2018-07-11 13:57:13 UTC
Created attachment 1458772 [details]
ceph.log
Created attachment 1458773 [details]
mon logs
How reproducible is this? I'm looking at the luminous code, and I think the only way this would happen is if the device is fast and/or the OSD is idle, but RocksDB is doing compaction. We have a configurable that gives BlueFS a minimum of 1 GB of free space. I think the way to address this is to bump that to, say, 5 GB or 10 GB. It's hard to tell whether that will do the trick, though, without being able to reproduce it. Can you verify it reproduces, and then try again with bluestore_bluefs_min_free = 10737418240 # 10 GB (the current default is 1 GB)?

Moving to 3.* unless we can reproduce to test setting bluestore_bluefs_min_free to a higher value.

Tomas, what is the current state of the customer cluster? We are working on backporting this patch, but that might take some time. Adam (akupczyk) is investigating a way to get around this issue without applying the patch immediately.

Sounds good. Thanks Vikhyat.

We will be backporting https://github.com/ceph/ceph/pull/26735 to luminous and shipping it in 3.2z2.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911
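For reference, the workaround discussed in the comments above (raising bluestore_bluefs_min_free from its 1 GB default to 10 GB) would normally be applied in ceph.conf on the OSD hosts. This is an illustrative snippet of that setting, not an excerpt from the affected cluster's configuration:

```ini
# Illustrative ceph.conf fragment for the workaround suggested above.
# 10737418240 bytes = 10 GB; per the comments, the default is 1 GB.
# Assumption: the OSDs are restarted for the new value to take effect.
[osd]
bluestore_bluefs_min_free = 10737418240
```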