Bug 1366577 - Wrong calculation of PGs per OSD leads to cluster in HEALTH_WARN state with explanation "too many PGs per OSD (768 > max 300)"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Storage Console
Classification: Red Hat
Component: Ceph
Version: 2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 2
Assignee: Shubhendu Tripathi
QA Contact: Daniel Horák
URL:
Whiteboard:
Depends On: 1362403
Blocks: Console-2-Async
Reported: 2016-08-12 11:30 UTC by Daniel Horák
Modified: 2016-10-19 15:21 UTC (History)

Fixed In Version: rhscon-core-0.0.44-1.el7scon.x86_64, rhscon-ui-0.0.58-1.el7scon.noarch
Doc Type: Bug Fix
Doc Text:
Previously, the automatic PG calculation logic computed the PG count on a per-pool basis instead of at the cluster level based on the number of OSDs in the cluster. Because a large number of PGs was created with each new pool, this incorrect calculation triggered a cluster health warning. With this update, the automatic calculation of PGs is disabled. The administrator must manually provide the PG values per OSD by using the PG calculator tool from Ceph to ensure that the cluster remains in a healthy state.
Clone Of: 1362403
Environment:
Last Closed: 2016-10-19 15:21:41 UTC
Target Upstream Version:


Links
System ID                              Private  Priority     Status        Summary                                                               Last Updated
Red Hat Bugzilla 1375538               0        unspecified  CLOSED        PG count for pool creation is hard set and calculated in a wrong way  2021-02-22 00:41:40 UTC
Red Hat Product Errata RHSA-2016:2082  0        normal       SHIPPED_LIVE  Moderate: Red Hat Storage Console 2 security and bug fix update       2017-04-18 19:29:02 UTC

Internal Links: 1375538

Comment 2 Shubhendu Tripathi 2016-08-22 06:33:08 UTC
Hi John,

If I understand correctly, the final PG values should be as follows:

OSDs                    PGs
============================
<=5                     128
>5 && <=10              512
>10 && <=20             1024
>20 && <=30             2048
>30 && <=40             3072
>40 && <=50             4096
>50                     Use the formula to calculate PGs

Is this understanding correct, and should we go ahead and implement it?
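
For illustration, the tiers proposed in this comment can be sketched as a simple lookup. This is only a sketch of the proposal, not the code that shipped in rhscon-core. The names PG_TIERS and proposed_pg_count are hypothetical, and because the formula for more than 50 OSDs is not spelled out in this thread, the sketch returns None for that case instead of guessing.

```python
from typing import Optional

# (inclusive upper bound on OSD count, PG count), copied from the table above.
PG_TIERS = [
    (5, 128),
    (10, 512),
    (20, 1024),
    (30, 2048),
    (40, 3072),
    (50, 4096),
]


def proposed_pg_count(osds: int) -> Optional[int]:
    """Return the proposed PG count for a cluster with `osds` OSDs.

    Returns None for more than 50 OSDs, where the comment defers to a
    formula that is not stated in this thread.
    """
    if osds < 1:
        raise ValueError("a cluster needs at least one OSD")
    for max_osds, pgs in PG_TIERS:
        if osds <= max_osds:
            return pgs
    return None


if __name__ == "__main__":
    for n in (5, 6, 20, 21, 50, 51):
        print(n, proposed_pg_count(n))
```

Each boundary value (5, 10, 20, ...) belongs to the lower tier, matching the "<=" conditions in the table.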

Comment 3 Shubhendu Tripathi 2016-08-25 05:14:25 UTC
Sam, can you please confirm whether the understanding in comment#2 above is correct?

Comment 4 Samuel Just 2016-08-26 15:16:19 UTC
                   pgs/osd
osds    pgs    size 3  size 2
5       128     76      51
6       512     256     170
10      512     153     102
11      1024    279     186
20      1024    153     102
21      2048    292     195
30      2048    204     136
31      3072    297     198
40      3072    230     153
41      4096    299     199
50      4096    245     163

With replication set to 2, these numbers work pretty well (between 100 and 200).  With replication set to 3, it's a little on the high side, but still between 150 and 300.  That's probably ok if we need to use the same guideline for both.  300 pgs/osd is on the high side, but not horrible.
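
The pgs/osd figures in the table are consistent with pgs * size / osds, rounded down. For example, 5 OSDs with 128 PGs gives 76 at replica size 3 and 51 at size 2. A minimal sketch that recomputes them follows. The function name pgs_per_osd is hypothetical, and the calculation assumes a single pool with PGs spread evenly across all OSDs.

```python
def pgs_per_osd(pgs: int, size: int, osds: int) -> int:
    """PG replicas hosted per OSD: every PG is stored `size` times,
    and the copies are assumed to be spread evenly over `osds` OSDs."""
    return pgs * size // osds


# (osds, pgs) pairs taken from the table above.
ROWS = [
    (5, 128), (6, 512), (10, 512), (11, 1024), (20, 1024), (21, 2048),
    (30, 2048), (31, 3072), (40, 3072), (41, 4096), (50, 4096),
]

if __name__ == "__main__":
    print("osds  pgs   size 2  size 3")
    for osds, pgs in ROWS:
        print(f"{osds:<5} {pgs:<5} "
              f"{pgs_per_osd(pgs, 2, osds):<7} {pgs_per_osd(pgs, 3, osds)}")
```

The worst cases sit just above each tier boundary (6, 11, 21, 31 and 41 OSDs), which is where the size 3 values approach the 300 limit quoted in the bug title.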

Comment 5 Shubhendu Tripathi 2016-08-26 15:58:18 UTC
So it looks like the logic in comment#2 is good. I will go ahead with this.
Thanks, Sam, for the explanation.

Comment 6 Shubhendu Tripathi 2016-09-08 08:38:37 UTC
As confirmed in comment#5, the changes below will be made for this issue:

1. Update backend logic to calculate the pg nums as per comment#2
2. Update UI slider logic to adhere to logic in comment#2

Comment 7 Karnan 2016-09-30 08:42:54 UTC
Fixed as per comments 2, 3 in https://bugzilla.redhat.com/show_bug.cgi?id=1375538

Comment 9 Daniel Horák 2016-10-03 13:33:07 UTC
What is the final resolution of this Bug?
According to comment 5, the logic for automatic PG calculation should be updated, but according to comment 7 (pointing to Bug 1375538 comments 2 and 3), the automatic calculation should be removed completely.

Comment 10 Shubhendu Tripathi 2016-10-04 03:58:17 UTC
Daniel, as per the latest discussions with Michael Kidd, we will no longer auto-calculate PG nums; instead, the UI provides a text box for the admin to enter the value. The UI will also provide a link to the PG Calc tool.

Refer https://bugzilla.redhat.com/show_bug.cgi?id=1375538 for more details on this decision.

Comment 11 Daniel Horák 2016-10-04 06:16:58 UTC
I'm verifying this bug: as per comment 10, the PG calculation logic was completely removed and there is just a box to set the number of PGs manually. Additional details related to the change will be verified in bug 1375538.

Red Hat Enterprise Linux Server release 7.3 (Maipo)

rhscon-ceph-0.0.43-1.el7scon.x86_64
rhscon-core-0.0.45-1.el7scon.x86_64
rhscon-core-selinux-0.0.45-1.el7scon.noarch
rhscon-ui-0.0.59-1.el7scon.noarch

>> VERIFIED

Comment 13 Shubhendu Tripathi 2016-10-17 12:17:53 UTC
Doc-text looks good.

Comment 14 errata-xmlrpc 2016-10-19 15:21:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:2082

