Bug 1576095
Summary: | Continually increasing memory consumption in ceph-osd | | |
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Michael J. Kidd <linuxkidd> |
Component: | RADOS | Assignee: | Kefu Chai <kchai> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | | |
Version: | 2.5 | CC: | acalhoun, anharris, bengland, bhubbard, brian.fife, ceph-eng-bugs, dzafman, jdurgin, kchai, linuxkidd, mhackett, vumrao |
Target Milestone: | rc | | |
Target Release: | 3.* | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-12-13 22:08:56 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Michael J. Kidd
2018-05-08 19:48:20 UTC
I have seen problems in the past with RHOSP12+RHCS2.4 where OSD memory increased while a lot of backfilling was occurring. There was a sort of chain reaction: OSDs got too big, ran past their cgroup limit, and died, which set off more backfilling and more OSD memory growth.

Here's the article about how we resolved it then, and how we might prevent it in the future:
https://docs.google.com/document/d/1e2jn8DbVbpwYcuhPG18tP3tDzaJ6DBHiacdzqiZcFow/edit#heading=h.x4uti0xeq736

Short summary:

The workaround for this problem was:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-May/040030.html

Josh Durgin's recommendation to prevent this problem was:
osd_max_pg_log_entries = 3000 (default 10000)
osd_min_pg_log_entries = 3000 (default 1500)

I have not tried this yet. We plan to try it in the scale lab in the next month or so, but I wanted you to know about it now. Let me know if this helps, or if it needs clarification.

Alex Calhoun reported some OSD memory growth in his RHCS 3.2 Bluestore testing here:
https://docs.google.com/document/d/1yiYxsxSP__SWMm-FkJjn1kFu1rkVLwaaVnUV4G720hA/edit#heading=h.e8tdgt9to8b2

This is an admittedly extreme I/O workload, but it still shouldn't happen.
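
For reference, the recommended PG log settings could be written as a ceph.conf fragment along these lines. This is only a sketch: the two option names and values are the ones quoted in this report, while placing them under the [osd] section is my assumption about where they would go, and the change is untested (as noted in the description).

```ini
# Sketch only: values are the ones recommended in this report, not yet tested.
# Assumption: the options are set for all OSDs via the [osd] section.
[osd]
# Cap the PG log length (default quoted in this report: 10000)
osd_max_pg_log_entries = 3000
# Minimum PG log length (default quoted in this report: 1500)
osd_min_pg_log_entries = 3000
```

Setting both options to the same value pins the PG log to a fixed length, which should bound the memory each PG spends on log entries during heavy backfill.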