Bug 1731148 - multisite pg_num on site2 pools should use site1/source values
Summary: multisite pg_num on site2 pools should use site1/source values
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: 4.1
Assignee: Ali Maredia
QA Contact: Vasishta
Depends On:
Blocks: 1727980
Reported: 2019-07-18 13:04 UTC by John Harrigan
Modified: 2020-05-19 17:31 UTC
CC: 11 users

Fixed In Version: ceph-ansible-4.0.15-1.el8, ceph-ansible-4.0.15-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2020-05-19 17:30:41 UTC
Target Upstream Version:

Attachments
pgcalcl-2019-07-18.png — Ceph PGCalc sample (95.78 KB, image/png), 2019-07-18 16:01 UTC, John Harrigan

Links
GitHub ceph/ceph-ansible pull 4445 (closed): rgw: extend automatic rgw pool creation capability — last updated 2020-05-15 15:38:21 UTC
Red Hat Product Errata RHSA-2020:2231 — last updated 2020-05-19 17:31:02 UTC

Description John Harrigan 2019-07-18 13:04:07 UTC
Description of problem:
Using RHCS 3.2z2 and ceph-ansible to deploy multisite, the pools on site2
are created with default pg_num values. This can result in very poor performance.

The pg_num values on the site2 pools should inherit the pg_num values from the
existing pools on site1.

Version-Release number of selected component (if applicable):
ceph-ansible.noarch      3.2.15-1.el7cp

How reproducible:

Steps to Reproduce:
1. deploy site1
2. create pools on site1 with pg_num values as suggested by pg num calc ( https://access.redhat.com/labsinfo/cephpgc )
3. deploy site2
4. edit all.yaml for multisite values
5. run ceph-ansible
6. view pg_num values on site1 and site2
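The linked pull request extends ceph-ansible's rgw_create_pools variable so that zone pools can be declared with explicit pg_num values instead of falling back to the defaults. A minimal sketch of such a group_vars fragment, assuming ceph-ansible 4.x syntax and reusing the pool names and pg_num values from this report:

```yaml
# Hypothetical group_vars fragment for the site2 deployment: pre-declare the
# zone pools with the pg_num values chosen for site1, so ceph-ansible creates
# them at matching sizes instead of the 8-PG defaults shown under Actual results.
rgw_create_pools:
  site2.rgw.buckets.data:
    pg_num: 1024
  site2.rgw.buckets.index:
    pg_num: 128
  site2.rgw.meta:
    pg_num: 64
  site2.rgw.log:
    pg_num: 64
  site2.rgw.control:
    pg_num: 64
```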

Actual results:
# for i in `rados lspools` ; do echo -ne $i"\t" ; ceph osd pool get $i pg_num ; done
default.rgw.users.keys pg_num: 64
default.rgw.data.root pg_num: 64
.rgw.root pg_num: 64
default.rgw.control pg_num: 64
default.rgw.gc pg_num: 64
default.rgw.buckets.data pg_num: 1024
default.rgw.buckets.index pg_num: 128
default.rgw.buckets.extra pg_num: 64
default.rgw.log pg_num: 64
default.rgw.meta pg_num: 64
default.rgw.intent-log pg_num: 64
default.rgw.usage pg_num: 64
default.rgw.users pg_num: 64
default.rgw.users.email pg_num: 64
default.rgw.users.swift pg_num: 64
default.rgw.users.uid pg_num: 64
site2.rgw.meta pg_num: 8
site2.rgw.log pg_num: 8
site2.rgw.control pg_num: 8
site2.rgw.buckets.index pg_num: 8
site2.rgw.buckets.data pg_num: 8

Expected results:
The pg_num values on the site1 default.rgw pools and the site2.rgw pools match.
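Until the site2 pools are created with matching values, one manual workaround is to raise pg_num on them after deployment; a sketch, assuming the site1 values quoted above (on RHCS 3.x / Luminous, pgp_num must be raised as well before data actually rebalances):

```
# Run against the site2 cluster after ceph-ansible finishes.
ceph osd pool set site2.rgw.buckets.data pg_num 1024
ceph osd pool set site2.rgw.buckets.data pgp_num 1024
ceph osd pool set site2.rgw.buckets.index pg_num 128
ceph osd pool set site2.rgw.buckets.index pgp_num 128
```

Note that pg_num can only be increased, not decreased, on these Ceph releases, so it is still preferable to create the pools at the right size up front.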

Additional info:

Comment 1 John Harrigan 2019-07-18 16:00:13 UTC
Added attachment 'pgcalc2019-07-18.png'
This is the output from the pg calculator at https://ceph.com/pgcalc/
I selected "Ceph Use Case Selector: Rados Gateway Only - Jewel or later"
All other values are defaults, including "Size", "OSD #" and "Target PGs per OSD"

There are three distinct pg_num values here:
* default.rgw.buckets.data = 4096
* default.rgw.buckets.index = 128
* default.rgw.*             = 64
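The arithmetic behind these suggestions can be sketched as follows (illustrative only; the real calculator at https://ceph.com/pgcalc/ applies additional rounding rules and per-pool minimums, and the input values below are assumed examples, not the calculator's defaults):

```shell
# Approximate PGCalc suggestion: the pool's share of the cluster-wide PG
# budget (osd_count * target_pgs_per_osd / replica_size), rounded up to a
# power of two.
# usage: suggested_pg_num <osd_count> <target_pgs_per_osd> <size> <data_percent>
suggested_pg_num() {
    awk -v osds="$1" -v target="$2" -v size="$3" -v pct="$4" 'BEGIN {
        raw = osds * target * (pct / 100) / size   # pool share of the PG budget
        pg = 1
        while (pg < raw) pg *= 2                   # round up to a power of two
        print pg
    }'
}

# e.g. 100 OSDs, 100 target PGs per OSD, 3x replication:
suggested_pg_num 100 100 3 94.5   # bulk data pool -> 4096
suggested_pg_num 100 100 3 1.0    # index pool     -> 64
```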

Comment 2 John Harrigan 2019-07-18 16:01:44 UTC
Created attachment 1591827 [details]
pgcalcl-2019-07-18.png Ceph PGCalc sample

Comment 3 Giridhar Ramaraju 2019-08-05 13:11:57 UTC
Updating the QA Contact to Hemant. Hemant will reroute them to the appropriate QE Associate.



Comment 13 errata-xmlrpc 2020-05-19 17:30:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

