Bug 1879950 - OCS 4.6: Document restriction on pool count & PG numbers for multiple pool scenarios
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: documentation
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: OCS 4.6.0
Assignee: Disha Walvekar
QA Contact: Shay Rozen
URL:
Whiteboard:
Depends On:
Blocks: 1880918 1882363
 
Reported: 2020-09-17 12:36 UTC by Neha Berry
Modified: 2020-12-18 11:54 UTC
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-18 11:54:03 UTC
Embargoed:



Description Neha Berry 2020-09-17 12:36:40 UTC
Describe the issue:
-----------------------
With https://issues.redhat.com/browse/KNIP-1462, OCS 4.6 will gain the ability to create custom block pools (or any pool).

But a 3 OSD cluster can have a maximum of 750 PGs, so for an on-prem deployment a maximum of 12 pools (of which 10 are default) can be created before we start hitting the max PG count.

Since RHCS 4.1.z1, each new pool is created with 32 PGs, i.e. 32 * 3 (replication factor) = 96 PG instances for each new pool.
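To sanity-check the per-pool PG count on a live cluster, the pools can be inspected from the ceph CLI (a quick sketch, assuming toolbox access; <pool-name> is a placeholder):

# ceph osd pool ls detail               # shows pg_num/pgp_num for every pool
# ceph osd pool get <pool-name> pg_num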

We need to document this in the OCS 4.6 guides. Please request the content from Ceph or OCS engineering.


Current details from a 3 OSD cluster
-----------------------------------------
For VMware installs, we get 10 pools, 176 PGs in a default deployment
For AWS (no RGW) - 3 pools, 96 PGs in a default deployment

So, for on-prem clusters:

for 10 pools, total PGs = 176 * 3 (replication) = 528
for 2 new custom pools, PGs = 2 * 32 * 3 (replication) = 192

total PGs = 528 + 192 = 720

Creating one more pool adds 32 * 3 (replication) = 96 PGs, and 720 + 96 = 816, which is > 750.
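The same budget check as a minimal shell sketch (176 and 32 are the deployment defaults quoted above, not universal constants):

# echo $(( 176 * 3 ))            # PGs for the 10 default pools, replica 3
528
# echo $(( 528 + 2 * 32 * 3 ))   # plus two custom pools: still under 750
720
# echo $(( 720 + 32 * 3 ))       # plus a 13th pool: over the 750 budget
816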


Chat threadlink - https://chat.google.com/room/AAAAREGEba8/-5EEyRcDfxw



Document URL:

We probably need to document this in the section that describes creation of custom storage classes (SCs) and pools in OCS 4.6.


Product Version:
-----------------
OCS 4.6



Additional information:
-------------------------------
# ceph daemon /var/run/ceph/ceph-mon.dell-r740-001.asok config show |grep mon_max_pg_per_osd 
    "mon_max_pg_per_osd": "250",

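On RHCS 4 (Nautilus) the same limit can also be read from the central config store; a sketch, assuming the ceph CLI is reachable:

# ceph config get mon mon_max_pg_per_osd
250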

As per the chat thread:

"""
The recommended maximum is 200 PGs per OSD when each OSD has a dedicated CPU and sufficient RAM.

So if you want more pools, which means more PGs, then you should have more OSDs.

If the amount of data is low and can be satisfied by the given storage, you can instead reduce the PG count in each pool without adding OSDs; that should accommodate the extra pool.

The 750-PG limit (250 per OSD on a 3-OSD cluster) was set to avoid excessive resource consumption and cluster instability.
"""

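Lowering the PG count of an existing pool, as the thread suggests, is possible on Nautilus-based RHCS 4 via PG merging; a sketch with a placeholder pool name (assuming the pg_autoscaler is not managing the pool; the mons rebalance gradually after the change):

# ceph osd pool set <pool-name> pg_num 16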
Comment 7 Neha Berry 2020-11-24 14:48:32 UTC
I would also request @vikhyat to confirm whether the above content looks good.

Comment 10 Travis Nielsen 2020-11-30 19:33:29 UTC
How many Ceph pools can be created before hitting the max PG count depends on a number of factors, so any details we give users about pool-creation limits will be very confusing. We really need the UI to compute this automatically and warn users when they are about to hit the limit, so they know exactly when they have reached it. Until we do that on the backend (with the admission controller), we need a simple message for the user.

@Eran Can we just keep the docs simple and generic for now and say we only support adding two pools? But if we really want to support more pools, the docs would need to expose the full formula Ceph uses.

Comment 12 Travis Nielsen 2020-12-01 18:56:24 UTC
Yes that should work and keep it simple, although the number of pools that can be created could technically be more than that. We could easily support three more pools per expansion. So I'm understanding we will support the following:
3 OSDs --> OCS creates one default pool and RGW/MDS with their pools. The user can create two additional pools/storage classes.
6 OSDs --> Three additional pools supported
9 OSDs --> Three additional pools supported

@Josh Any concerns with this?
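For context, a rough sketch of the headroom behind these numbers, assuming the 250-PGs-per-OSD budget from this bug and 3 OSDs per expansion (the three-pools-per-expansion cap is deliberately conservative, not the arithmetic maximum):

# echo $(( 250 * 3 ))    # PG-instance budget added per 3-OSD expansion
750
# echo $(( 32 * 3 ))     # raw PG cost of one new pool at replica 3
96

750 / 96 leaves room for seven new pools per expansion in principle; OCS supports three.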

Comment 13 Josh Durgin 2020-12-02 01:16:48 UTC
(In reply to Travis Nielsen from comment #12)
> Yes that should work and keep it simple, although the number of pools that
> can be created could technically be more than that. We could easily support
> three more pools per expansion. So I'm understanding we will support the
> following:
> 3 OSDs --> OCS creates one default pool and RGW/MDS with their pools. The
> user can create two additional pools/storage classes.
> 6 OSDs --> Three additional pools supported
> 9 OSDs --> Three additional pools supported
> 
> @Josh Any concerns with this?

That's fine at a small scale. I'd avoid encouraging users to create tons of pools; they are not the right tool for, e.g., multitenancy. You lose many of the scalability advantages of Ceph if you create three new pools for every OSD.

Comment 15 Shay Rozen 2020-12-07 09:47:33 UTC
From https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.6/html-single/deploying_and_managing_openshift_container_storage_using_red_hat_openstack_platform/index?lb_target=preview:

"NOTE
With a minimal cluster of a single device set, only two new storage classes can be created. Every storage cluster expansion allows two new additional storage classes."

Moving to verify.

Comment 16 Rejy M Cyriac 2020-12-18 11:54:03 UTC
OCS 4.6.0 GA completed on 17 December 2020

