Bug 2003776 - [Workload-DFG] MGR: The osd_memory_target_autotune option is not working as expected
Summary: [Workload-DFG] MGR: The osd_memory_target_autotune option is not working as expected
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.0z1
Assignee: Adam King
QA Contact: Pawan
Docs Contact: Mary Frances Hull
URL:
Whiteboard:
Depends On:
Blocks: 1959686 1973155
 
Reported: 2021-09-13 16:54 UTC by Vikhyat Umrao
Modified: 2023-06-22 13:36 UTC
CC List: 18 users

Fixed In Version: ceph-16.2.0-140.el8cp
Doc Type: Enhancement
Doc Text:
.{storage-product} can now automatically tune the Ceph OSD memory target
With this release, the `osd_memory_target_autotune` option is fixed and works as expected. Users can enable {storage-product} to automatically tune the Ceph OSD memory target for the Ceph OSDs in the storage cluster for improved performance, without explicitly setting the memory target for each Ceph OSD. {storage-product} sets the Ceph OSD memory target on a per-node basis by evaluating the total memory available and the daemons running on the node. Users can enable memory auto-tuning for the Ceph OSDs by running the following command:
----
ceph config set osd osd_memory_target_autotune true
----
Clone Of:
Environment:
Last Closed: 2021-11-02 16:39:21 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHCEPH-1683 0 None None None 2021-09-13 16:54:41 UTC
Red Hat Product Errata RHBA-2021:4105 0 None None None 2021-11-02 16:39:47 UTC

Description Vikhyat Umrao 2021-09-13 16:54:13 UTC
Description of problem:
MGR: The osd_memory_target_autotune option is not working as expected

- We set osd_memory_target_autotune = true to enable auto-tuning, and once the option is enabled we should see the autotuner start adjusting osd_memory_target (see the sketch after this list).
- However, in the workload-dfg RHCS 5 cluster, we are not seeing that happen.
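
For reference, a minimal sketch of the enable-and-verify steps as we understand them (the mgr/cephadm/autotune_memory_target_ratio option and its 0.7 default are taken from the cephadm documentation, not from this cluster):

# enable auto-tuning of osd_memory_target for all OSDs
ceph config set osd osd_memory_target_autotune true

# optionally adjust the fraction of host memory the autotuner may hand out (0.7 is the documented default)
ceph config set mgr mgr/cephadm/autotune_memory_target_ratio 0.7

# after the next cephadm refresh, the per-OSD target should no longer be the 4G default
ceph config get osd.1 osd_memory_target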

What we see right now:

- osd_memory_target_autotune is enabled for all 192 OSDs; the `config show` output for one of them is listed below.

- We also dumped the osd_memory_target value with `config show` for all 192 OSDs, and they still have the default value of 4G (a loop to repeat this check across every OSD is sketched after the sample output).


# ceph tell osd.1 config show | grep osd_memory_target
    "osd_memory_target": "4294967296",
    "osd_memory_target_autotune": "true",
    "osd_memory_target_cgroup_limit_ratio": "0.800000",

- We then ran `ceph orch ps --daemon-type osd --format json-pretty`; the entry for osd.1 is shown below (a jq one-liner to pull the reported memory_usage for every OSD is sketched after it).

  {
    "container_id": "fedc4933cd26",
    "container_image_digests": [
      "registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:55cb1de88341300daa1ab6d59e4897edc733
a3f90162149c21f18abe49ed87c7"
    ],
    "container_image_id": "2142b60d797408c7a0e9210489bd599cd7addb2d4c6e31da769eba248208ca44
",
    "container_image_name": "registry.redhat.io/rhceph/rhceph-5-rhel8@sha256:55cb1de8834130
0daa1ab6d59e4897edc733a3f90162149c21f18abe49ed87c7",
    "created": "2021-09-04T12:31:42.062065Z",
    "daemon_id": "1",
    "daemon_type": "osd",
    "hostname": "f23-h05-000-6048r.rdu2.scalelab.redhat.com",
    "is_active": false,
    "last_refresh": "2021-09-13T16:40:43.590086Z",
    "memory_usage": 9093519507,
    "osdspec_affinity": "defaultDG",
    "ports": [],
    "started": "2021-09-04T12:31:49.409595Z",
    "status": 1,
    "status_desc": "running",
    "version": "16.2.0-117.el8cp"
  },
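
For reference, the reported memory_usage for every OSD can be pulled out of that JSON in one pass; this is just a convenience sketch and assumes jq is available on the admin node:

# list reported memory usage (bytes) per OSD daemon
ceph orch ps --daemon-type osd --format json | \
    jq -r '.[] | "\(.daemon_type).\(.daemon_id) \(.memory_usage)"'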

- We also ran the top command to check the actual memory usage on the system for this particular OSD (host-level cross-checks are sketched after the output):


    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 576023 ceph      20   0 6609044   3.9g  15824 S   5.6   1.5 449:20.13 /usr/bin/ceph-osd -n osd.1 -f --setuser ceph --setgroup ceph --default-log-to-file=false --defau+
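
For comparison, the same process and container can be checked directly on the host; the PID and container ID below are taken from the outputs above, and podman availability on the OSD node is assumed:

# resident set size of the ceph-osd process as the kernel reports it
ps -o pid,rss,vsz,cmd -p 576023

# memory accounting for the OSD container itself
podman stats --no-stream fedc4933cd26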

It looks like the currently set `osd_memory_target` is limiting the OSD to the default value of 4G, yet the `ceph orch` output clearly reports its usage as around 9G:

"memory_usage": 9093519507


We will attach all the command dumps.


Version-Release number of selected component (if applicable):
RHCS 5 - 16.2.0-117.el8cp

How reproducible:
Always

Comment 27 errata-xmlrpc 2021-11-02 16:39:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4105

