Description of problem:

Using RHCS 3.2z2 and ceph-ansible to deploy multisite, the pools on site2 are created using default pg_num values. This can result in very poor performance. The pg_num values on site2 should inherit the pg_num values from the existing pools on site1.

Version-Release number of selected component (if applicable):
ceph-ansible.noarch 3.2.15-1.el7cp

How reproducible:
Always

Steps to Reproduce:
1. Deploy site1.
2. Create pools on site1 with pg_num values as suggested by the PG calculator (https://access.redhat.com/labsinfo/cephpgc).
3. Deploy site2.
4. Edit all.yaml for the multisite values.
5. Run ceph-ansible.
6. View the pg_num values on site1 and site2.

Actual results:

root@f18-h14-000-r620:~ # for i in `rados lspools` ; do echo -ne $i"\t" ; ceph osd pool get $i pg_num ; done
default.rgw.users.keys     pg_num: 64
default.rgw.data.root      pg_num: 64
.rgw.root                  pg_num: 64
default.rgw.control        pg_num: 64
default.rgw.gc             pg_num: 64
default.rgw.buckets.data   pg_num: 1024
default.rgw.buckets.index  pg_num: 128
default.rgw.buckets.extra  pg_num: 64
default.rgw.log            pg_num: 64
default.rgw.meta           pg_num: 64
default.rgw.intent-log     pg_num: 64
default.rgw.usage          pg_num: 64
default.rgw.users          pg_num: 64
default.rgw.users.email    pg_num: 64
default.rgw.users.swift    pg_num: 64
default.rgw.users.uid      pg_num: 64
site2.rgw.meta             pg_num: 8
site2.rgw.log              pg_num: 8
site2.rgw.control          pg_num: 8
site2.rgw.buckets.index    pg_num: 8
site2.rgw.buckets.data     pg_num: 8

Expected results:
The pg_num values on the site1 default.rgw.* pools and the site2.rgw.* pools match.

Additional info:
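Possible manual workaround (a sketch only, not verified against this ceph-ansible build): since pg_num can be increased but not decreased, the site2 pools created with 8 PGs could be grown to match their site1 counterparts. The default.rgw.* -> site2.rgw.* name mapping is taken from the pool listing above; the --cluster names "site1"/"site2" are an assumption, and in practice the "get" would be run on a site1 admin node and the "set" commands on a site2 admin node.

  # Grow each site2 pool to the pg_num of the matching site1 pool.
  # Assumes both clusters' conf/keyrings are present on this host as
  # /etc/ceph/site1.conf and /etc/ceph/site2.conf; otherwise run the
  # "get" on site1 and the "set" commands on site2 separately.
  for suffix in meta log control buckets.index buckets.data ; do
      pg=$(ceph --cluster site1 osd pool get default.rgw.$suffix pg_num | awk '{print $2}')
      ceph --cluster site2 osd pool set site2.rgw.$suffix pg_num "$pg"
      ceph --cluster site2 osd pool set site2.rgw.$suffix pgp_num "$pg"
  done

Note that raising pg_num/pgp_num triggers backfill, so this is best done before the secondary zone holds much data.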
Added attachment 'pgcalc2019-07-18.png'.

This is the output from the PG calculator at https://ceph.com/pgcalc/

I selected "Ceph Use Case Selector: Rados Gateway Only - Jewel or later". All other values are defaults, including "Size", "OSD #" and "Target PGs per OSD".

There are three distinct pg_num values here:
* default.rgw.buckets.data = 4096
* default.rgw.buckets.index = 128
* default.rgw.* = 64
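For comparison, these are the pool creates that would apply the calculator's values to the secondary zone up front (a sketch; the pool names mirror the site2.* pools in the listing above, and it assumes the pre-existing pools are reused rather than re-created). In practice an admin would substitute the pg_num values actually in use on site1 (e.g. 1024 for buckets.data in the listing above) rather than the calculator defaults:

  # Pre-create the secondary zone's RGW pools with explicit pg_num/pgp_num
  # values so they are not auto-created with the default of 8 PGs.
  ceph osd pool create site2.rgw.buckets.data 4096 4096
  ceph osd pool create site2.rgw.buckets.index 128 128
  ceph osd pool create site2.rgw.meta 64 64
  ceph osd pool create site2.rgw.log 64 64
  ceph osd pool create site2.rgw.control 64 64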
Created attachment 1591827: pgcalcl-2019-07-18.png (Ceph PGCalc sample)
Updating the QA Contact to Hemant. Hemant will reroute this bug to the appropriate QE Associate.

Regards,
Giri
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231