Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1674017

Summary: user MUST NOT specify ceph_osd_docker_memory_limit, specify osd_memory_target
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Ben England <bengland>
Component: Documentation Assignee: Bara Ancincova <bancinco>
Status: CLOSED CURRENTRELEASE QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high Docs Contact: John Brier <jbrier>
Priority: high    
Version: 3.2CC: asriram, jbrier, jdurgin, jharriga, johfulto, kdreyer, mhackett, mnelson, pnguyen, tchandra, twilkins, vumrao
Target Milestone: z2   
Target Release: 3.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-05-13 16:19:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1685931    

Description Ben England 2019-02-08 18:23:07 UTC
Description of problem:

The documentation is badly wrong about how to control memory usage for a Ceph OSD. It says to change the ceph-ansible parameter ceph_osd_docker_memory_limit, but that is not the right mechanism. The way to control OSD memory consumption is the osd_memory_target ceph.conf parameter, which causes the OSD itself to limit its memory consumption. If you instead rely on the container CGroup limit, the daemon will suffer a memory allocation failure when it hits the limit, which will make it abort. That in turn will push other OSDs into recovery mode, and they may hit the same limit and abort as well.
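For reference, a minimal sketch of the intended pattern in ceph-ansible's group_vars/all.yml (the 4 GB value is illustrative only, not a sizing recommendation):

```yaml
# Limit OSD memory via the daemon's own autotuning, not the container CGroup.
# 4000000000 is an illustrative value; size osd_memory_target for your hosts.
ceph_conf_overrides:
  osd:
    osd_memory_target: 4000000000
# Do not set ceph_osd_docker_memory_limit to enforce this -- the OSD
# enforces osd_memory_target itself.
```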

Version-Release number of selected component (if applicable):

RHCS 3.2 documentation on customer portal here:

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/colocation-of-containerized-ceph-daemons#setting-dedicated-resources-for-colocated-daemons


See bz 1637153 for an example of how badly it can go if you use CGroup memory limit.

Comment 1 Ben England 2019-03-26 15:15:20 UTC
slight correction: the penalty for exceeding ceph_osd_docker_memory_limit is not a memory allocation failure; the penalty is that the OOM (out-of-memory) killer terminates the process. A program might conceivably recover from a memory allocation failure in some way, but it cannot recover from OOM process termination. Details are in:

https://github.com/torvalds/linux/blob/master/Documentation/cgroup-v1/memory.txt#L241

Comment 3 Ben England 2019-03-27 19:28:35 UTC
Bara, the section should be updated, not removed.   The section on ceph_osd_docker_cpu_limit is still valid (we need sections for ceph_rgw_docker_cpu_limit and ceph_mds_docker_cpu_limit too, different bzs).  It's just the part about memory that needs updating.  

As for removing the memory limit entirely, there may be one exception: hyperconverged infrastructure (HCI), where applications and other Ceph services have to be co-resident on the same physical hosts. In that case, I'd suggest setting ceph_osd_docker_memory_limit 50% higher than the osd_memory_target ceph.conf parameter, so that if a daemon's memory grows excessively, it can be stopped before it triggers an OOM (out-of-memory) kill that could affect other services or applications. This is a saner use of the memory limit than before, and was suggested in rook.io discussions concerning memory CGroup limits.
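The 50% headroom rule above can be sketched as simple arithmetic (values are decimal bytes, GB not GiB, matching the figures used in this bug; the function name is only illustrative):

```python
# Illustrative arithmetic for the "50% higher" HCI rule: the container
# CGroup limit sits well above the OSD's own target, so it only trips
# on runaway growth, not during normal operation.

def docker_memory_limit(osd_memory_target: int, headroom: float = 0.5) -> int:
    """Return a CGroup limit `headroom` (default 50%) above osd_memory_target."""
    return int(osd_memory_target * (1 + headroom))

target = 6_000_000_000                  # osd_memory_target = 6 GB
print(docker_memory_limit(target))      # 9000000000 -> ceph_osd_docker_memory_limit: 9g
```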

Comment 5 Ben England 2019-04-01 20:08:23 UTC
Bara, you are correct: osd_memory_target goes specifically in the ceph.conf overrides section of all.yml, so to set it you need something like the following in all.yml. In this example, we are asking each OSD to limit itself to 6 GB (not GiB) of memory. Normally the user should not have to set ceph_osd_docker_memory_limit at all, but suppose for an HCI configuration you wanted the Docker memory CGroup limit to be more constraining than its default. Using the 50% rule above, that would be 9 GB.

ceph_conf_overrides:
  osd:
    osd_memory_target: 6000000000
ceph_osd_docker_memory_limit: 9g

As for FileStore, this limit does not apply there. FileStore is gradually being phased out, and a procedure to migrate existing FileStore customer sites to BlueStore should be released with RHCS 4, so I'm not too worried about it.