Bug 2021738

Summary: [RADOS] Have default autoscaler profile as scale-up in RHCS 5.1
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Pawan <pdhiran>
Component: RADOS    Assignee: Kamoltat (Junior) Sirivadhna <ksirivad>
Status: CLOSED ERRATA QA Contact: Pawan <pdhiran>
Severity: high Docs Contact: Ranjini M N <rmandyam>
Priority: unspecified    
Version: 5.1    CC: agunn, akupczyk, amathuri, bhubbard, ceph-eng-bugs, gsitlani, ksirivad, lflores, nojha, pdhange, rfriedma, rmandyam, rzarzyns, sseshasa, tserlin, vereddy, vumrao
Target Milestone: ---    Keywords: Rebase
Target Release: 5.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.2.7-4.el8cp Doc Type: Bug Fix
Doc Text:
.The autoscaler is set to scale-up by default in {storage-product} 5.1

With this release, the autoscaler is set to scale-up by default: it starts each pool with minimal placement groups (PGs) and scales up the PG count as usage in the pool grows. There is also a scale-down profile, in which each pool starts with ideal full-capacity PGs and only scales down when the usage ratio across the pools is uneven. In the scale-down profile, the autoscaler identifies any overlapping roots and prevents the pools with such roots from scaling, because overlapping roots can cause problems with the scaling process.
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-04 10:22:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2031073    

Comment 2 Kamoltat (Junior) Sirivadhna 2021-11-18 14:49:37 UTC
Hi all,

Neha, Josh, and I have discussed this and concluded that it is better for the autoscaler to start out with the scale-up profile by default. Scale-down was introduced to provide a better out-of-the-box experience; however, without a feature that allows us to impose a limit on the maximum number of PGs in the device_health_metrics or .mgr pool, scale-down mode can sometimes scale the PGs too large and run into issues such as failed pool creation due to exceeding the mon_max_pg_per_osd limit, e.g., https://bugzilla.redhat.com/show_bug.cgi?id=2023171.

Therefore, I have created a PR that will make scale-up the default profile for the autoscaler: https://github.com/ceph/ceph/pull/43999
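For reference, on a Pacific-based (5.1) cluster the active profile can be inspected and switched with the pg_autoscaler commands below. This is a minimal sketch, assuming a running cluster and admin credentials; the `autoscale-profile` subcommand is the Pacific-era interface:

```shell
# Show per-pool autoscaler state; the PROFILE column reports
# whether each pool is using scale-up or scale-down.
ceph osd pool autoscale-status

# Switch the global autoscaler profile (Pacific-era command).
ceph osd pool set autoscale-profile scale-up

# Operators who prefer the old default can switch back:
ceph osd pool set autoscale-profile scale-down
```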

Comment 3 Vikhyat Umrao 2021-11-19 17:58:14 UTC
(In reply to ksirivad from comment #2)
> Hi all,
> 
> So me Neha and Josh have discussed and come to a conclusion that is better
> for the autoscaler to start out with a scale-up profile by default. The
> reason for this is that scale-down was introduced to provide a better
> out-of-the-box experience. However, without a feature that allows us to
> impose a limit on the maximum number of PGs in the device_health_metrics or
> .mgr pool, scale-down mode can sometimes scale the PGs too large and might
> run into issues such as failed pool creation due to exceeding
> mon_max_pg_per_osd limit and etc, e.g.,
> https://bugzilla.redhat.com/show_bug.cgi?id=2023171.
> 
> Therefore, I have created a PR that will make scale-up the default profile
> for the autoscaler: https://github.com/ceph/ceph/pull/43999

Thanks, Junior. I have renamed the bug title and added an update to bz2023171.
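As a sketch of how the limit discussed above can be checked on a live cluster (assuming admin access; the upstream default for mon_max_pg_per_osd is 250):

```shell
# Read the current PG-per-OSD cap that pool creation and
# autoscaling must stay under.
ceph config get mon mon_max_pg_per_osd

# Compare against the actual per-OSD PG counts (PGS column).
ceph osd df
```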

Comment 11 Kamoltat (Junior) Sirivadhna 2022-01-27 04:53:35 UTC
Hey Aron,

Just added it. Let me know if you want me to change anything or add anything more.

Comment 13 Kamoltat (Junior) Sirivadhna 2022-02-16 15:06:00 UTC
Hi,

I would like to change the part ``starts with compliments of PGs`` --> ``starts with ideal full-capacity PGs``, since the word ``compliments`` might be confusing to some people.

Thank you,

Comment 15 errata-xmlrpc 2022-04-04 10:22:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1174