Description of problem:

MGR: The osd_memory_target_autotune option is not working as expected. Setting osd_memory_target_autotune = true should enable autotuning, but on the workload-dfg RHCS 5 cluster it is not taking effect.

What we see right now:

- osd_memory_target_autotune is enabled for all 192 OSDs; listing one of them below.

- We dumped the osd_memory_target value via `config show` for all 192 OSDs, and they still have the default value of 4G:

# ceph tell osd.1 config show | grep osd_memory_target
    "osd_memory_target": "4294967296",
    "osd_memory_target_autotune": "true",
    "osd_memory_target_cgroup_limit_ratio": "0.800000",

- We then ran `ceph orch ps --daemon-type osd --format json-pretty`:

{
    "container_id": "fedc4933cd26",
    "container_image_digests": [
        "registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:55cb1de88341300daa1ab6d59e4897edc733a3f90162149c21f18abe49ed87c7"
    ],
    "container_image_id": "2142b60d797408c7a0e9210489bd599cd7addb2d4c6e31da769eba248208ca44",
    "container_image_name": "registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:55cb1de88341300daa1ab6d59e4897edc733a3f90162149c21f18abe49ed87c7",
    "created": "2021-09-04T12:31:42.062065Z",
    "daemon_id": "1",
    "daemon_type": "osd",
    "hostname": "f23-h05-000-6048r.rdu2.scalelab.redhat.com",
    "is_active": false,
    "last_refresh": "2021-09-13T16:40:43.590086Z",
    "memory_usage": 9093519507,
    "osdspec_affinity": "defaultDG",
    "ports": [],
    "started": "2021-09-04T12:31:49.409595Z",
    "status": 1,
    "status_desc": "running",
    "version": "16.2.0-117.el8cp"
},

- We ran top to see the exact usage on the system for this particular OSD:

    PID     USER  PR  NI  VIRT     RES   SHR    S  %CPU  %MEM  TIME+      COMMAND
    576023  ceph  20  0   6609044  3.9g  15824  S  5.6   1.5   449:20.13  /usr/bin/ceph-osd -n osd.1 -f --setuser ceph --setgroup ceph --default-log-to-file=false --defau+

The currently set `osd_memory_target` appears to be limiting the OSD to the default 4G, yet `ceph orch` clearly reports usage of around 9G ("memory_usage": 9093519507). We will attach all the command dumps.

Version-Release number of selected component (if applicable):
RHCS 5 - 16.2.0-117.el8cp

How reproducible:
Always
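For context, when autotuning works, cephadm is expected to size osd_memory_target from the host's total memory, scaled by the mgr/cephadm/autotune_memory_target_ratio option (default 0.7), and split across the daemons on the host. The sketch below is a simplified estimate of that sizing, not the actual cephadm code: it ignores the budgets the real autotuner subtracts for non-OSD daemons (mon, mgr, etc.), and the helper name is our own.

```python
def estimate_osd_memory_target(total_host_memory: int,
                               num_osds: int,
                               autotune_ratio: float = 0.7) -> int:
    """Rough per-OSD memory target the autotuner would assign on a host.

    Simplifying assumption: only OSDs run on the host; the real cephadm
    autotuner also reserves memory for colocated non-OSD daemons.
    """
    if num_osds <= 0:
        raise ValueError("host carries no OSDs")
    budget = int(total_host_memory * autotune_ratio)
    return budget // num_osds

# Example: a 256 GiB host carrying 24 OSDs
target = estimate_osd_memory_target(256 * 2**30, 24)
print(target)  # roughly 7.5 GiB per OSD, well above the 4 GiB default
```

On a host with this much memory per OSD, the autotuned target should therefore land well above the stuck 4294967296 (4 GiB) default shown in the `config show` output above.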
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4105