Bug 1638148

Summary: [RFE][HCI] set osd max memory based on osd container memory
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vikhyat Umrao <vumrao>
Component: Ceph-Ansible
Assignee: Neha Ojha <nojha>
Status: CLOSED ERRATA
QA Contact: Parikshith <pbyregow>
Severity: medium
Docs Contact:
Priority: high
Version: 3.1
CC: acalhoun, aschoen, bengland, ceph-eng-bugs, gmeno, hnallurv, jdurgin, jharriga, johfulto, mhackett, nojha, nthomas, sankarshan, shan, tserlin, vakulkar
Target Milestone: rc
Keywords: FutureFeature
Target Release: 3.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.2.0-0.1.rc1.el7cp Ubuntu: ceph-ansible_3.2.0~rc1-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-03 19:02:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1637153
Bug Blocks: 1644347

Description Vikhyat Umrao 2018-10-10 22:08:47 UTC
Description of problem:
[RFE][HCI] set osd max memory based on osd container memory
This is an extension or a fix for this bug:

[RFE] set osd max memory based on host memory
https://bugzilla.redhat.com/show_bug.cgi?id=1595003
https://github.com/ceph/ceph-ansible/pull/3113

Version-Release number of selected component (if applicable):
RHCS 3.1

BZ 1595003 handles this well for a non-containerized (non-HCI) environment, but in a containerized (HCI) environment we should not use the full memory of the OSD host/node, because a containerized OSD only gets 5G by default from the ceph_osd_docker_memory_limit option.

https://bugzilla.redhat.com/show_bug.cgi?id=1591876
https://github.com/ceph/ceph-ansible/pull/2775

- I discussed this with Neha, and we think that in the HCI case the check should change as follows (see the sketch below):

- {% set _osd_memory_target = (ansible_memtotal_mb * hci_safety_factor / _num_osds) %}

+ {% set _osd_memory_target = (ceph_osd_docker_memory_limit * hci_safety_factor) %}

- ceph_osd_docker_memory_limit defaults to 5G, and we currently give 4G to the BlueStore cache by default, because there is no support for a BlueStore cache smaller than 4G.

- So we either need to lower the default osd_memory_target for the containerized (HCI) case below 4G, or bump the default ceph_osd_docker_memory_limit to 8G or so.

Maybe we need to talk to the performance team about these defaults. We also need to think about the default value of hci_safety_factor.

The reason for all of this is that a containerized OSD cannot use more memory than the ceph_osd_docker_memory_limit setting allows.
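
A minimal sketch of that derivation as a ceph-ansible style task, assuming ceph_osd_docker_memory_limit is expressed in bytes and hci_safety_factor is a fraction such as 0.8 (illustration only, not the change that was merged):

# Sketch only; assumes ceph_osd_docker_memory_limit is in bytes and
# hci_safety_factor is a fraction such as 0.8.
- name: derive osd_memory_target from the container memory limit
  set_fact:
    _osd_memory_target: "{{ ((ceph_osd_docker_memory_limit | int) * (hci_safety_factor | float)) | int }}"
  when: containerized_deployment | bool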

Comment 1 Ben England 2018-10-12 17:57:26 UTC
Do you have to set the container CGroup memory limit based on the size of the OSD cache, rather than the other way around as described above? That is, add a fixed amount to the OSD cache size and make that the CGroup limit, right?
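
For illustration only (not existing ceph-ansible code), that relationship would look something like the line below, where osd_fixed_overhead_bytes is an assumed headroom value:

{# illustration only; osd_fixed_overhead_bytes is an assumed value #}
{% set _container_memory_limit = (osd_memory_target | int) + (osd_fixed_overhead_bytes | int) %}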

Comment 5 Ben England 2018-10-19 18:50:37 UTC
During the BlueStore discussion we came up with the idea of preventing the OSD (or another daemon) from getting into a situation where the CGroup memory limit for its container is less than the amount of memory it needs. During daemon startup, if the daemon can determine its container ID (available from "docker inspect container-name", for example), it can read the memory limit at

/sys/fs/cgroup/memory/system.slice/docker-$containerid.scope/memory.limit_in_bytes

and exit with an error message if that limit is not sufficient (for example, BlueStore OSDs, which do caching in user space and need a lot of memory).

This doesn't fix orchestration software or prevent the daemon from consuming too much memory; it just ensures that we don't get into a situation where the daemon is likely to die simply because the CGroup limit was accidentally set too low, and it informs the sysadmin of the problem right away.
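
A rough preflight illustration of that check in ceph-ansible style (not what was implemented; osd_container_id and osd_required_memory_bytes are assumed variables):

# Illustration only; osd_container_id and osd_required_memory_bytes are assumed variables.
- name: read the CGroup memory limit applied to the OSD container
  command: cat /sys/fs/cgroup/memory/system.slice/docker-{{ osd_container_id }}.scope/memory.limit_in_bytes
  register: osd_cgroup_limit
  changed_when: false

- name: fail early if the CGroup limit is below what the OSD needs
  fail:
    msg: "CGroup limit {{ osd_cgroup_limit.stdout }} bytes is less than the required {{ osd_required_memory_bytes }} bytes"
  when: osd_cgroup_limit.stdout | int < osd_required_memory_bytes | int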

Comment 6 Sébastien Han 2018-10-30 13:03:19 UTC
Present in https://github.com/ceph/ceph-ansible/releases/tag/v3.2.0rc1

Comment 11 errata-xmlrpc 2019-01-03 19:02:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020