Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
This project is now read-only. Starting Monday, February 2, please use https://ibm-ceph.atlassian.net/ for all bug tracking.

Bug 2047429

Summary: [Workload-DFG] [RHCS 5.1] - release criteria testing - small objects aged measure(hybrid) workload drops ~20% as compared to upgrade measure workload
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Rachana Patel <racpatel>
Component: RADOS
Assignee: Kamoltat (Junior) Sirivadhna <ksirivad>
Status: CLOSED NOTABUG
QA Contact: Pawan <pdhiran>
Severity: high
Docs Contact:
Priority: unspecified
Version: 5.1
CC: akupczyk, amathuri, bhubbard, ceph-eng-bugs, choffman, ksirivad, lflores, mbenjamin, nojha, pdhange, rfriedma, rzarzyns, skanta, sseshasa, twilkins, vumrao
Target Milestone: ---
Keywords: Performance
Target Release: 6.1
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-03-23 17:53:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 3 Kamoltat (Junior) Sirivadhna 2022-01-28 21:18:52 UTC
As discussed with Vikhyat,

we want to enable debug logging for the autoscaler in the manager module; this can be done with the command:

 ``ceph config set mgr mgr/pg_autoscaler/log_level debug``

Now, what we want to investigate is the number of PGs reduced by the autoscaler as well as the number of PGs that are in backfill/recovery.
To do this we can use the command:

 ``ceph osd pool autoscale-status``

This will give us a table of all the pools in the cluster, including each pool's current PG_NUM and NEW PG_NUM.
From this, we can determine how many PGs each pool needs to gain or lose, which makes it easy to compare the total PG increase/decrease between 5.0 and 5.1.
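As a minimal sketch of that comparison, the snippet below summarizes per-pool PG deltas from the JSON form of the command (``ceph osd pool autoscale-status -f json``). The field names (``pool_name``, ``pg_num_target``, ``pg_num_final``) are assumptions based on typical autoscaler output and may differ by release:

```python
# Hypothetical sketch: summarize PG changes from `ceph osd pool autoscale-status -f json`.
# Field names ("pool_name", "pg_num_target", "pg_num_final") are assumed and may vary.
import json

def pg_deltas(status_json: str) -> dict:
    """Return {pool: new_pg_num - current_pg_num} for pools the autoscaler wants to change."""
    deltas = {}
    for pool in json.loads(status_json):
        current = pool["pg_num_target"]
        new = pool["pg_num_final"]
        if new != current:
            deltas[pool["pool_name"]] = new - current
    return deltas

# Sample data standing in for real autoscale-status output:
sample = json.dumps([
    {"pool_name": "rgw.buckets.data", "pg_num_target": 256, "pg_num_final": 128},
    {"pool_name": "rgw.meta", "pg_num_target": 8, "pg_num_final": 8},
])
print(pg_deltas(sample))  # {'rgw.buckets.data': -128}
```

Summing the values of the returned dict for a 5.0 run and a 5.1 run would give the total increase/decrease being compared.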

So I suggest we run ``ceph osd pool autoscale-status`` once immediately at the start (after deployment, once all pools are created), 2 more times during the fill workload, 3-4 more times during the hybrid workload, and once more at the end.
However, we need to keep the times at which we run ``ceph osd pool autoscale-status`` the same for both 5.0 and 5.1 so the results are as comparable as possible. Maybe a timer script?
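A timer script along these lines could capture the command output at fixed offsets, so the 5.0 and 5.1 runs sample at the same points in the workload. This is a hypothetical sketch, not the poller actually used for these tests; the command string is a parameter so it runs without a live cluster:

```python
# Hypothetical timer script: run a command at a fixed interval and append its
# timestamped output to a log file. For the tests this would be invoked with
# cmd="ceph osd pool autoscale-status" and interval_s=300 (5 minutes).
import datetime
import subprocess
import time

def capture(cmd: str, logfile: str, interval_s: int = 300, count: int = 4) -> None:
    for _ in range(count):
        stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        out = subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
        with open(logfile, "a") as f:
            f.write(f"--- {stamp} ---\n{out}\n")
        time.sleep(interval_s)
```

Fixing the interval and the start point (e.g. workload start) keeps the snapshots aligned between the two versions.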


cluster.log will give us the number of PGs in backfill/recovery at each timestamp, which we can then compare between 5.0 and 5.1.
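Tallying those counts from cluster.log could look like the sketch below. The pgmap summary line format is an assumption based on typical Ceph cluster logs and may vary by release:

```python
import re

# Sketch for tallying backfill/recovery PGs from cluster.log. Assumes pgmap
# summary lines of the (typical) form:
#   "<timestamp> ... pgmap v100: 512 pgs: 480 active+clean, 20 active+remapped+backfilling, ..."
PGMAP = re.compile(r"^(\S+) .*?pgs: (.+)")

def degraded_counts(log_lines):
    """Yield (timestamp, PGs in any backfill/recovery state) per pgmap summary line."""
    for line in log_lines:
        m = PGMAP.match(line)
        if not m:
            continue
        stamp, states = m.group(1), m.group(2).split(";")[0]
        total = 0
        for part in states.split(","):
            fields = part.strip().split(" ", 1)
            if len(fields) == 2 and ("backfill" in fields[1] or "recover" in fields[1]):
                total += int(fields[0])
        yield stamp, total

# Sample line standing in for a real cluster.log entry:
sample = [
    "2022-01-28T21:00:00+0000 mon.a [INF] pgmap v100: 512 pgs: "
    "480 active+clean, 20 active+remapped+backfilling, 12 active+recovering; 1.0 TiB data",
]
print(list(degraded_counts(sample)))  # [('2022-01-28T21:00:00+0000', 32)]
```

Running this over the 5.0 and 5.1 cluster.log files would give the two time series to compare.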

Comment 4 Vikhyat Umrao 2022-01-28 21:33:31 UTC
(In reply to ksirivad from comment #3)
> As discussed with Vikhyat,
> 
Thanks, Junior.

> we want the manager module to enable debug log for the autoscaler, this can
> be done by using the command:
> 
>  ``ceph config set mgr mgr/pg_autoscaler/log_level debug``
> 
> Now, what we want to investigate is the number of PGs reduced by the
> autoscaler as well as the number of PGs that are in backfill/recovery.
> To do this we can use the command:
> 
>  ``ceph osd pool autoscale-status``
> 
> This will give us a table of all the pools in the cluster, containing the
> current PG_NUM and NEW PG_NUM.
> From this, we can determine the number of PGs that each pool needs to
> increase or decrease. Here we can easily compare the total number of pgs
> that increase/decrease between 5.0 and 5.1.
> 
> So I guess we can do a ``ceph osd pool autoscale-status`` immediately at the
> start (after deployment and all pools are created), 2 more during a fill
> workload, 3-4 more during the hybrid workload, and 1 more at the end.
> However, we need to keep the time at which we do ``ceph osd pool
> autoscale-status`` the same for each 5.0 and 5.1 so we can get them as
> accurate results as possible. Maybe a timer script?

You got it. We already have a poller script that runs this and many other commands every 5 minutes, so we should be all set!

> 
> 
> cluster.log will provide us with the number of backfill/recovery pgs at a
> certain timestamp and we can compare between 5.0 and 5.1



Tim - the one extra thing you need to do is run the following command [1] before starting the tests:

[1] ceph config set mgr mgr/pg_autoscaler/log_level debug

Comment 5 Vikhyat Umrao 2022-01-28 21:36:15 UTC
(In reply to Vikhyat Umrao from comment #4)
> (In reply to ksirivad from comment #3)
> > As discussed with Vikhyat,
> > 
> Thanks, Junior.
> 
> > we want the manager module to enable debug log for the autoscaler, this can
> > be done by using the command:
> > 
> >  ``ceph config set mgr mgr/pg_autoscaler/log_level debug``
> > 
> > Now, what we want to investigate is the number of PGs reduced by the
> > autoscaler as well as the number of PGs that are in backfill/recovery.
> > To do this we can use the command:
> > 
> >  ``ceph osd pool autoscale-status``
> > 
> > This will give us a table of all the pools in the cluster, containing the
> > current PG_NUM and NEW PG_NUM.
> > From this, we can determine the number of PGs that each pool needs to
> > increase or decrease. Here we can easily compare the total number of pgs
> > that increase/decrease between 5.0 and 5.1.
> > 
> > So I guess we can do a ``ceph osd pool autoscale-status`` immediately at the
> > start (after deployment and all pools are created), 2 more during a fill
> > workload, 3-4 more during the hybrid workload, and 1 more at the end.
> > However, we need to keep the time at which we do ``ceph osd pool
> > autoscale-status`` the same for each 5.0 and 5.1 so we can get them as
> > accurate results as possible. Maybe a timer script?
> 
> You got it. We already have a poller script! to do run and many more
> commands every 5 minutes so we should be all set!

Tim - to avoid any confusion: if the `ceph osd pool autoscale-status` command is not already being logged every 5 minutes, please make sure it is logged every 5 minutes.

Comment 42 Neha Ojha 2023-03-23 17:53:13 UTC
No update since https://bugzilla.redhat.com/show_bug.cgi?id=2047429#c19, closing for now. Please re-open if the issue reproduces.