In Replica-1 we support increasing the number of OSDs per failure domain. However, even after the number of OSDs per failure domain is increased, the data always goes to one particular OSD, which results in a large imbalance of data among the OSDs within a failure domain. This happens because pg_num and pgp_num always stay at 1 for the replica-1 pools:

pool 5 'ocs-storagecluster-cephblockpool-us-east-1b' replicated size 1 min_size 1 crush_rule 8 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 126 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 6 'ocs-storagecluster-cephblockpool-us-east-1c' replicated size 1 min_size 1 crush_rule 10 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 128 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 7 'ocs-storagecluster-cephblockpool-us-east-1a' replicated size 1 min_size 1 crush_rule 13 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 123 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd

Is there any workaround available to the best of your knowledge?
Yes, disable the reconciliation of the cephblockpool and add:

spec:
  parameters:
    pg_num: '16'
    pgp_num: '16'
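For reference, below is a minimal sketch of what applying this workaround could look like, assuming the Rook CephBlockPool API (ceph.rook.io/v1) and the default openshift-storage namespace. The pool name and the pg_num/pgp_num values are taken from this report; the way reconciliation is paused here (scaling down the ocs-operator deployment) is an assumption and may differ per deployment.

# Pause reconciliation first so the operator does not revert the manual edit
# (assumed approach):
#   oc scale deployment ocs-operator -n openshift-storage --replicas=0
#
# Then set explicit PG counts on the replica-1 pool, for example:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ocs-storagecluster-cephblockpool-us-east-1a   # one of the replica-1 pools listed above
  namespace: openshift-storage
spec:
  replicated:
    size: 1
    requireSafeReplicaSize: false   # Rook requires this for size-1 pools
  parameters:
    pg_num: "16"
    pgp_num: "16"

After the change is applied, the pool listing (e.g. ceph osd pool ls detail) should report pg_num 16 and pgp_num 16 for the pool, and data should spread across the OSDs in that failure domain.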
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:4591