Bug 2155766 - (RHCS 6.0) [RADOS] Check for number of pgs is not accurate
Summary: (RHCS 6.0) [RADOS] Check for number of pgs is not accurate
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 6.1z1
Assignee: Matan Breizman
QA Contact: Pawan
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2221020
 
Reported: 2022-12-22 09:58 UTC by Matan Breizman
Modified: 2023-08-03 16:46 UTC
CC List: 16 users

Fixed In Version: ceph-17.2.6-1.el9cp
Doc Type: Bug Fix
Doc Text:
.Placement group counts per OSD are now checked accurately against the crush rule

Previously, the `check_pg_num()` function did not take into account the OSDs under the root used by the crush rule, which resulted in an inaccurate placement group count per OSD.

With this fix, `check_pg_num()` counts only the projected placement groups of the pools affected by the crush rule, and divides that total by the number of OSDs used by the crush rule instead of by all OSDs in the `osdmap`.
Clone Of:
Environment:
Last Closed: 2023-08-03 16:45:09 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 58288 0 None None None 2022-12-22 09:58:42 UTC
Red Hat Issue Tracker RHCEPH-5834 0 None None None 2022-12-22 10:07:12 UTC
Red Hat Product Errata RHBA-2023:4473 0 None None None 2023-08-03 16:46:06 UTC

Description Matan Breizman 2022-12-22 09:58:43 UTC
Description of problem:

pg_num_check() should take into account the crush rule of the pool.

Bug 2153654 reported an error when setting the pool size to 1 with a specified crush rule. In the interest of the dev freeze timelines, the PR that introduced the regression was reverted.
Since the PR that made pg_num_check() work accurately was reverted, we need to fix pg_num_check() so that it works accurately again.
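
As a rough illustration of the intended accounting, here is a minimal Python sketch (not Ceph's actual C++ implementation; the data model and names below are hypothetical): the check should sum projected PGs only for pools governed by the same crush rule and divide by the OSDs under that rule's root.

```python
# Hypothetical, simplified model of the intended pg_num_check() accounting.
# Ceph implements this in C++ in the OSD monitor; names here are illustrative only.

def pg_num_check(new_pg_num, new_size, crush_rule, pools, rule_osds, max_pg_per_osd):
    """Return True if the projected PGs-per-OSD stays within the limit.

    pools:      iterable of (pg_num, size, crush_rule) for existing pools
    rule_osds:  mapping of crush_rule -> set of OSD ids under that rule's root
    """
    # Count projected PG replicas only for pools affected by this crush rule,
    # plus the pool being created or resized.
    projected = new_pg_num * new_size
    for pg_num, size, rule in pools:
        if rule == crush_rule:
            projected += pg_num * size

    # Divide by the OSDs the rule can actually place data on, not by every
    # OSD in the osdmap (which is what the buggy check did).
    osds = max(len(rule_osds.get(crush_rule, ())), 1)
    return projected / osds <= max_pg_per_osd
```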

How reproducible:
Always


Steps to Reproduce:
1. Using a cluster with 6 OSDs, set up a new crush rule whose root contains only 1 OSD. (See "Additional info".)

2. Assuming the `mon_max_pg_per_osd` configuration value is 250, create a pool with a pg/pgp num of 256 using the new crush rule mentioned above.

Actual results:
Although we exceed the `mon_max_pg_per_osd` limit, the pool is created successfully.

Since the crush rule is not taken into account during the check, the projected pg num is divided by *all* of the OSDs in the cluster; it should be divided only by the OSDs under the crush rule's root.

```
pool 'pool_test ' created
```
Notice osd.0:
```
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS
 0    ssd  0.09859   1.00000  101 GiB  1.0 GiB  780 KiB   0 B   23 MiB  100 GiB  0.99  1.00  257      up
 1    ssd  0.09859   1.00000  101 GiB  1.0 GiB  812 KiB   0 B   23 MiB  100 GiB  0.99  1.00    3      up
 2    ssd  0.09859   1.00000  101 GiB  1.0 GiB  780 KiB   0 B   19 MiB  100 GiB  0.99  1.00    2      up
 3    ssd  0.09859   1.00000  101 GiB  1.0 GiB  360 KiB   0 B   18 MiB  100 GiB  0.99  1.00    1      up
 4    ssd  0.09859   1.00000  101 GiB  1.0 GiB  328 KiB   0 B   18 MiB  100 GiB  0.99  1.00    1      up
 5    ssd  0.09859   1.00000  101 GiB  1.0 GiB  360 KiB   0 B   22 MiB  100 GiB  0.99  1.00    1      up
```

Expected results:
pg_num_check() should disallow the creation of this pool (since its pg num would exceed the per-OSD limit with this crush rule):

```
Error ERANGE: pg_num 256 size 3 for this pool would result in 256 cumulative PGs per OSD (768 total PG replicas on 3 'in' root OSDs by crush rule) which exceeds the mon_max_pg_per_osd value of 250
```
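
Plugging in the numbers from this report (pg_num 256, size 3, `mon_max_pg_per_osd` 250, 6 OSDs in the cluster) shows why the divisor matters; whether the rule's root holds 1 OSD (as in the reproduction step) or 3 (as in the sample error text above), the per-OSD count exceeds the limit once only the rule's OSDs are counted. A quick back-of-the-envelope check:

```python
pg_num, size = 256, 3
mon_max_pg_per_osd = 250

replicas = pg_num * size   # 768 projected PG replicas

print(replicas / 6)        # 128.0 -> old divisor (all 6 OSDs): under 250, wrongly allowed
print(replicas / 3)        # 256.0 -> 3 OSDs under the rule's root: over 250, rejected
print(replicas / 1)        # 768.0 -> 1 OSD under the rule's root: over 250, rejected
```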

Additional info:
`ceph osd tree` output; note the new crush root `osd_test` used by the new rule:

```
ID  CLASS  WEIGHT   TYPE NAME         STATUS  REWEIGHT  PRI-AFF
-5         0.19717  root osd_test
0    ssd  0.09859      osd.0             up   1.00000  1.00000
-1         0.59151  root default
-3         0.59151      host folio
0    ssd  0.09859          osd.0         up   1.00000  1.00000
1    ssd  0.09859          osd.1         up   1.00000  1.00000
2    ssd  0.09859          osd.2         up   1.00000  1.00000
3    ssd  0.09859          osd.3         up   1.00000  1.00000
4    ssd  0.09859          osd.4         up   1.00000  1.00000
5    ssd  0.09859          osd.5         up   1.00000  1.00000
```

Comment 24 errata-xmlrpc 2023-08-03 16:45:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4473

