Bug 2021738

Summary: [RADOS] Have default autoscaler profile as scale-up in RHCS 5.1
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Pawan <pdhiran>
Component: RADOS    Assignee: Kamoltat (Junior) Sirivadhna <ksirivad>
Status: CLOSED ERRATA QA Contact: Pawan <pdhiran>
Severity: high Docs Contact: Ranjini M N <rmandyam>
Priority: unspecified    
Version: 5.1    CC: agunn, akupczyk, amathuri, bhubbard, ceph-eng-bugs, gsitlani, ksirivad, lflores, nojha, pdhange, rfriedma, rmandyam, rzarzyns, sseshasa, tserlin, vereddy, vumrao
Target Milestone: ---    Keywords: Rebase
Target Release: 5.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.2.7-4.el8cp Doc Type: Bug Fix
Doc Text:
.The autoscaler is set to scale-up by default in {storage-product} 5.1

With this release, the autoscaler is set to scale-up by default: it starts each pool with minimal placement groups (PGs) and scales up the PG count as usage in the pool grows. There is also a scale-down profile, in which each pool starts with ideal full-capacity PGs and only scales down when the usage ratio across the pools is uneven. In the scale-down profile, the autoscaler identifies any overlapping roots and prevents the pools with such roots from scaling, because overlapping roots can cause problems with the scaling process.
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-04 10:22:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2031073    

Comment 2 Kamoltat (Junior) Sirivadhna 2021-11-18 14:49:37 UTC
Hi all,

Neha, Josh, and I have discussed this and concluded that it is better for the autoscaler to start out with the scale-up profile by default. Scale-down was introduced to provide a better out-of-the-box experience; however, without a feature that allows us to impose a limit on the maximum number of PGs in the device_health_metrics or .mgr pool, scale-down mode can sometimes scale the PGs too large and run into issues such as failed pool creation due to exceeding the mon_max_pg_per_osd limit, e.g., https://bugzilla.redhat.com/show_bug.cgi?id=2023171.

Therefore, I have created a PR that will make scale-up the default profile for the autoscaler: https://github.com/ceph/ceph/pull/43999
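For reference, on a Pacific-based (5.1) cluster the active profile can be inspected and switched with the pg_autoscaler commands below. This is a minimal sketch, assuming a running cluster and admin credentials; the `autoscale-profile` subcommand is the Pacific-era interface:

```shell
# Show per-pool autoscaler state; the PROFILE column reports
# whether each pool is using scale-up or scale-down.
ceph osd pool autoscale-status

# Switch the global autoscaler profile (Pacific-era command).
ceph osd pool set autoscale-profile scale-up

# Operators who prefer the old default can switch back:
ceph osd pool set autoscale-profile scale-down
```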

Comment 3 Vikhyat Umrao 2021-11-19 17:58:14 UTC
(In reply to ksirivad from comment #2)
> Hi all,
> 
> So me Neha and Josh have discussed and come to a conclusion that is better
> for the autoscaler to start out with a scale-up profile by default. The
> reason for this is that scale-down was introduced to provide a better
> out-of-the-box experience. However, without a feature that allows us to
> impose a limit on the maximum number of PGs in the device_health_metrics or
> .mgr pool, scale-down mode can sometimes scale the PGs too large and might
> run into issues such as failed pool creation due to exceeding
> mon_max_pg_per_osd limit and etc, e.g.,
> https://bugzilla.redhat.com/show_bug.cgi?id=2023171.
> 
> Therefore, I have created a PR that will make scale-up the default profile
> for the autoscaler: https://github.com/ceph/ceph/pull/43999

Thanks, Junior. I have renamed the bug title and added an update to bz2023171.
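As a sketch of how the limit discussed above can be checked on a live cluster (assuming admin access; the upstream default for mon_max_pg_per_osd is 250):

```shell
# Read the current PG-per-OSD cap that pool creation and
# autoscaling must stay under.
ceph config get mon mon_max_pg_per_osd

# Compare against the actual per-OSD PG counts (PGS column).
ceph osd df
```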

Comment 11 Kamoltat (Junior) Sirivadhna 2022-01-27 04:53:35 UTC
Hey Aron,

Just added it. Let me know if you want me to change anything or add anything more.

Comment 13 Kamoltat (Junior) Sirivadhna 2022-02-16 15:06:00 UTC
Hi,

I would like to change the part ``starts with compliments of PGs`` --> ``starts with ideal full-capacity PGs``, since the word ``compliments`` might be confusing to some people.

Thank you,

Comment 15 errata-xmlrpc 2022-04-04 10:22:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1174