Bug 2219229
| Summary: | PGs are not being autoscaled to desired levels when the cluster has OSDs of multiple device classes with Custom CRUSH rules | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Pawan <pdhiran> |
| Component: | RADOS | Assignee: | Kamoltat (Junior) Sirivadhna <ksirivad> |
| Status: | NEW --- | QA Contact: | Pawan <pdhiran> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.1 | CC: | bhubbard, ceph-eng-bugs, cephqe-warriors, dparkes, nojha, sostapov, vimishra, vumrao |
| Target Milestone: | --- | ||
| Target Release: | 6.1z2 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Missed the 6.1 z1 window. Retargeting to 6.1 z2.
Description of problem:

PGs are not being autoscaled to desired levels when the cluster has OSDs of multiple device classes.

Pool details for the affected pools:

    pool 10 'ecpool-42-new' erasure profile ec42 size 6 min_size 5 crush_rule 2 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 280 flags hashpspool stripe_width 16384 application rados
    pool 11 'test' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 432 flags hashpspool stripe_width 0 application rados
    pool 12 'test2' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 437 flags hashpspool stripe_width 0 application rados

    # ceph osd tree
    ID  CLASS  WEIGHT   TYPE NAME                                      STATUS  REWEIGHT  PRI-AFF
    -1         0.78070  root default
    -7         0.19518      host ceph-pdhiran-rnxgw8-node1-installer
    19    hdd  0.02440          osd.19                                     up   1.00000  1.00000
    23    hdd  0.02440          osd.23                                     up   1.00000  1.00000
    26    hdd  0.02440          osd.26                                     up   1.00000  1.00000
    30    hdd  0.02440          osd.30                                     up   1.00000  1.00000
     3    ssd  0.02440          osd.3                                      up   1.00000  1.00000
     7    ssd  0.02440          osd.7                                      up   1.00000  1.00000
    11    ssd  0.02440          osd.11                                     up   1.00000  1.00000
    15    ssd  0.02440          osd.15                                     up   1.00000  1.00000
    -3         0.19518      host ceph-pdhiran-rnxgw8-node2
    16    hdd  0.02440          osd.16                                     up   1.00000  1.00000
    20    hdd  0.02440          osd.20                                     up   1.00000  1.00000
    24    hdd  0.02440          osd.24                                     up   1.00000  1.00000
    28    hdd  0.02440          osd.28                                     up   1.00000  1.00000
     0    ssd  0.02440          osd.0                                      up   1.00000  1.00000
     4    ssd  0.02440          osd.4                                      up   1.00000  1.00000
     8    ssd  0.02440          osd.8                                      up   1.00000  1.00000
    12    ssd  0.02440          osd.12                                     up   1.00000  1.00000
    -9         0.19518      host ceph-pdhiran-rnxgw8-node3
    18    hdd  0.02440          osd.18                                     up   1.00000  1.00000
    22    hdd  0.02440          osd.22                                     up   1.00000  1.00000
    27    hdd  0.02440          osd.27                                     up   1.00000  1.00000
    31    hdd  0.02440          osd.31                                     up   1.00000  1.00000
     2    ssd  0.02440          osd.2                                      up   1.00000  1.00000
     6    ssd  0.02440          osd.6                                      up   1.00000  1.00000
    10    ssd  0.02440          osd.10                                     up   1.00000  1.00000
    14    ssd  0.02440          osd.14                                     up   1.00000  1.00000
    -5         0.19518      host ceph-pdhiran-rnxgw8-node4
    17    hdd  0.02440          osd.17                                     up   1.00000  1.00000
    21    hdd  0.02440          osd.21                                     up   1.00000  1.00000
    25    hdd  0.02440          osd.25                                     up   1.00000  1.00000
    29    hdd  0.02440          osd.29                                     up   1.00000  1.00000
     1    ssd  0.02440          osd.1                                      up   1.00000  1.00000
     5    ssd  0.02440          osd.5                                      up   1.00000  1.00000
     9    ssd  0.02440          osd.9                                      up   1.00000  1.00000
    13    ssd  0.02440          osd.13                                     up   1.00000  1.00000

    # ceph df
    --- RAW STORAGE ---
    CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
    hdd    400 GiB  390 GiB  10 GiB   10 GiB    2.53
    ssd    400 GiB  394 GiB  6.0 GiB  6.0 GiB   1.50
    TOTAL  800 GiB  784 GiB  16 GiB   16 GiB    2.02

    --- POOLS ---
    POOL                 ID  PGS  STORED   OBJECTS  USED     %USED  MAX AVAIL
    .mgr                  1    1  449 KiB        2  1.3 MiB      0    211 GiB
    cephfs.cephfs.meta    2   16  4.6 KiB       22  96 KiB       0    211 GiB
    cephfs.cephfs.data    3   32  0 B            0  0 B          0    211 GiB
    .rgw.root             4   32  1.3 KiB        4  48 KiB       0    211 GiB
    default.rgw.log       5   32  3.6 KiB      209  408 KiB      0    211 GiB
    default.rgw.control   6   32  0 B            8  0 B          0    211 GiB
    default.rgw.meta      7   32  382 B          3  24 KiB       0    211 GiB
    ecpool-21             8   32  0 B            0  0 B          0    211 GiB
    ecpool-42-new        10    1  0 B            0  0 B          0    423 GiB
    test                 11    1  3.7 GiB   39.31k  11 GiB    1.74    211 GiB
    test2                12    1  0 B            0  0 B          0    211 GiB

Version-Release number of selected component (if applicable):
ceph version 17.2.6-76.el9cp (7d277f1e8500eb73e50260771e11b7bd7d6f34af) quincy (stable)

How reproducible:
3/3 times for pool creation on the same cluster

Steps to Reproduce:
1. Deploy an RHCS cluster with all the services.
2. For some OSDs, remove the existing device class and set 'ssd' as the device class (see the sketch after these steps):

    ceph osd crush rm-device-class 12
    ceph osd crush set-device-class ssd 12

3. Create pools after this operation. PGs are not autoscaled to the desired levels, even after I/O.
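One workaround worth testing, assuming the rules currently in use (crush_rule 0 and 2 in the pool details above) map the shared default root across both device classes: pin each pool to a device-class-specific rule so the autoscaler sees non-overlapping roots. The commands below are standard Ceph CLI; the rule and profile names are illustrative, and this is a sketch rather than a confirmed fix.

    # Sketch of a possible workaround (rule/profile names are illustrative, not from this cluster):
    # give every pool a CRUSH rule that is pinned to a single device class.

    # Replicated rules, one per device class.
    ceph osd crush rule create-replicated replicated_hdd default host hdd
    ceph osd crush rule create-replicated replicated_ssd default host ssd

    # EC profile pinned to one class; a matching EC rule is generated when a pool
    # is created with this profile.
    ceph osd erasure-code-profile set ec42_hdd k=4 m=2 crush-failure-domain=host crush-device-class=hdd

    # Point the existing replicated test pools at a class-specific rule.
    ceph osd pool set test crush_rule replicated_hdd
    ceph osd pool set test2 crush_rule replicated_hdd

    # Re-check whether the autoscaler now reports and scales the pools.
    ceph osd pool autoscale-status
    ceph osd pool ls detail

If autoscale-status starts returning rows once the pools are moved to class-specific rules, that would support the overlapping-roots theory; if it stays empty, the problem lies elsewhere.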
Also, the output of autoscale-status is empty on the cluster. Not sure whether this deserves a separate bug.

    # ceph osd pool autoscale-status -f json-pretty
    []

    [ceph: root@ceph-pdhiran-rnxgw8-node1-installer /]# ceph -s
      cluster:
        id:     0acee9a0-141b-11ee-b0ba-fa163eb5f775
        health: HEALTH_OK

      services:
        mon: 3 daemons, quorum ceph-pdhiran-rnxgw8-node1-installer,ceph-pdhiran-rnxgw8-node3,ceph-pdhiran-rnxgw8-node2 (age 6d)
        mgr: ceph-pdhiran-rnxgw8-node1-installer.ceobbf(active, since 6d), standbys: ceph-pdhiran-rnxgw8-node4.qpjdxs, ceph-pdhiran-rnxgw8-node2.zvsusw
        mds: 1/1 daemons up, 2 standby
        osd: 32 osds: 32 up (since 6d), 32 in (since 6d)
        rgw: 4 daemons active (4 hosts, 1 zones)

      data:
        volumes: 1/1 healthy
        pools:   11 pools, 212 pgs
        objects: 39.56k objects, 3.7 GiB
        usage:   16 GiB used, 784 GiB / 800 GiB avail
        pgs:     212 active+clean

CRUSH map on the cluster: http://pastebin.test.redhat.com/1103986

Actual results:
PGs are not scaling up and the autoscaler command output is empty.

Expected results:
PGs should be scaled automatically.

Additional info:
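A possible triage sequence, using only standard Ceph CLI commands; the suspicion that rules spanning both device classes under the shared default root suppress the autoscaler is an assumption on our part, not something confirmed by this report:

    # Confirm the pg_autoscaler mgr module is active.
    ceph mgr module ls | grep -i pg_autoscaler

    # Dump the CRUSH rules referenced by the pools and compare which roots and
    # device classes each rule takes; rules mixing device classes under the
    # same root are the suspected trigger for the empty autoscale-status output.
    ceph osd crush rule dump
    ceph osd pool ls detail

    # Capture health detail and the autoscale status in JSON for the bug record.
    ceph health detail
    ceph osd pool autoscale-status -f json-pretty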