Bug 1782253

Summary: [RFE] Support Deployment with Autoscaler Enabled
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: John Fulton <johfulto>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: medium
Priority: medium
Version: 4.1
CC: amsyedha, aschoen, ceph-eng-bugs, ceph-qe-bugs, fpantano, gcharot, gmeno, hyelloji, nthomas, tserlin, ykaul
Target Milestone: rc
Keywords: FutureFeature
Target Release: 4.1
Flags: hyelloji: needinfo-
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-ansible-4.0.16-1.el8, ceph-ansible-4.0.16-1.el7
Cloned To: 1812929
Last Closed: 2020-05-19 17:31:24 UTC
Type: Bug
Bug Blocks: 1760354, 1810321, 1812929, 1821886, 1871864    

Description John Fulton 2019-12-11 13:57:34 UTC
The PG autoscaler [1] will be supported in RHCS 4 (though it will not be on by default). This is a request for ceph-ansible to support deployments with this feature enabled, since manual PG management causes field escalations.

One way this could work is if there's a data structure passed like this:

openstack_pools:
    - {"name": "backups", "target_size_ratio": 0.1, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "volumes", "target_size_ratio": 0.5, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "vms",     "target_size_ratio": 0.2, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "images",  "target_size_ratio": 0.2, "pg_autoscale_mode": true, "application": "rbd"}

Then ceph-ansible could execute commands like the following for each pool (this example is for the volumes pool):

    $ ceph osd pool create volumes 16
    $ rbd pool init volumes
    $ ceph osd pool set volumes target_size_ratio .5
    $ ceph osd pool set volumes pg_autoscale_mode on
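
For illustration only, here is a minimal shell sketch of the per-pool loop this amounts to, using the pool names and ratios from the example structure above (ceph-ansible would of course drive this from the openstack_pools variable rather than hard-coded arrays):

    #!/bin/bash
    # Pool names and target_size_ratio values taken from the example openstack_pools above.
    pools=(backups volumes vms images)
    ratios=(0.1 0.5 0.2 0.2)

    for i in "${!pools[@]}"; do
        pool=${pools[$i]}
        ratio=${ratios[$i]}
        ceph osd pool create "$pool" 16                        # 16 initial PGs (see note below)
        rbd pool init "$pool"                                  # initialize the pool for the rbd application
        ceph osd pool set "$pool" target_size_ratio "$ratio"
        ceph osd pool set "$pool" pg_autoscale_mode on
    done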

The target_size_ratio is analogous to the percentages customers enter into pgcalc. Putting it directly into THT (TripleO Heat Templates) and skipping pgcalc is a better user experience.

The 16 in the first command above is meant as a hard-coded value, while the name and target_size_ratio would be variables. The choice of 16 is based on Sage's blog [1], where he used an initial pg_num of 1; 16 is the usual default, should work with replicated pools, and will not trigger the PG overdose protection.
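
As a possible sanity check after deployment (a suggestion, not something this RFE requires), the autoscaler status command described in [1] shows the target ratio and the pg_num the autoscaler intends for each pool:

    $ ceph osd pool autoscale-status
    $ ceph osd pool get volumes pg_num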

[1] https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/

Comment 1 RHEL Program Management 2019-12-11 13:57:40 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 9 errata-xmlrpc 2020-05-19 17:31:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231