Bug 1782253

Summary: [RFE] Support Deployment with Autoscaler Enabled
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: John Fulton <johfulto>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: medium
Priority: medium
Version: 4.1
CC: amsyedha, aschoen, ceph-eng-bugs, ceph-qe-bugs, fpantano, gcharot, gmeno, hyelloji, nthomas, tserlin, ykaul
Target Milestone: rc
Keywords: FutureFeature
Target Release: 4.1
Flags: hyelloji: needinfo-
Hardware: Unspecified
OS: Unspecified
Fixed In Version: ceph-ansible-4.0.16-1.el8, ceph-ansible-4.0.16-1.el7
Cloned To: 1812929
Last Closed: 2020-05-19 17:31:24 UTC
Type: Bug
Bug Blocks: 1760354, 1810321, 1812929, 1821886, 1871864    

Description John Fulton 2019-12-11 13:57:34 UTC
The PG autoscaler [1] will be supported in RHCS 4 (though it will not be on by default). This is a request for ceph-ansible to support deployments with this feature enabled, since manual PG management causes field escalations.

One way this could work is if there's a data structure passed like this:

openstack_pools:
    - {"name": "backups", "target_size_ratio": 0.1, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "volumes", "target_size_ratio": 0.5, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "vms",     "target_size_ratio": 0.2, "pg_autoscale_mode": true, "application": "rbd"}
    - {"name": "images",  "target_size_ratio": 0.2, "pg_autoscale_mode": true, "application": "rbd"}

Then ceph-ansible could execute commands like the following for each pool (this example is for the volumes pool):

    $ ceph osd pool create volumes 16
    $ rbd pool init volumes
    $ ceph osd pool set volumes target_size_ratio .5
    $ ceph osd pool set volumes pg_autoscale_mode on
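
For illustration only, here is a minimal shell sketch of the per-pool loop this amounts to, using the pool names and ratios from the example structure above (ceph-ansible would of course drive this from the openstack_pools variable rather than hard-coded arrays):

    #!/bin/bash
    # Pool names and target_size_ratio values taken from the example openstack_pools above.
    pools=(backups volumes vms images)
    ratios=(0.1 0.5 0.2 0.2)

    for i in "${!pools[@]}"; do
        pool=${pools[$i]}
        ratio=${ratios[$i]}
        ceph osd pool create "$pool" 16                        # 16 initial PGs (see note below)
        rbd pool init "$pool"                                  # initialize the pool for the rbd application
        ceph osd pool set "$pool" target_size_ratio "$ratio"
        ceph osd pool set "$pool" pg_autoscale_mode on
    done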

The target_size_ratio is analogous to the percentages customers enter into pgcalc. Putting it directly into THT (TripleO Heat Templates) and skipping pgcalc is a better user experience.

The 16 in the first command above is meant as a hard-coded value, while the name and target_size_ratio would be variables. The choice of 16 is based on Sage's blog [1], where he used an initial pg_num of 1; 16 is the usual default, should work with replicated pools, and will not trigger the PG overdose protection.
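
As a possible sanity check after deployment (a suggestion, not something this RFE requires), the autoscaler status command described in [1] shows the target ratio and the pg_num the autoscaler intends for each pool:

    $ ceph osd pool autoscale-status
    $ ceph osd pool get volumes pg_num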

[1] https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/

Comment 1 RHEL Program Management 2019-12-11 13:57:40 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 9 errata-xmlrpc 2020-05-19 17:31:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231