Description of problem:
Changing the size (number of replicas) in a pool should cause the pgs to have a new interval.

Version-Release number of selected component (if applicable):
hammer and earlier

How reproducible:
Very

Steps to Reproduce:
1. Create a crush map with 6 osds on two hosts
2. Create a pool with host-level replication, size 3, min_size 2
3. Observe the pgs are some flavor of remapped and/or degraded (normal so far, 3 replicas on 2 hosts)
4. Set the pool size to 2

Actual results:
Some pgs do not go active+clean.

Expected results:
All pgs go active+clean.

Additional info:
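For anyone reproducing this, the steps above map onto the ceph CLI roughly as below. This is a sketch only: the pool name "repro" and the pg count of 128 are arbitrary placeholders, and it assumes a cluster already laid out as 6 osds across two hosts with host-level replication (chooseleaf ... type host).

  # pool with size 3 / min_size 2; with only two hosts in the failure
  # domain the third replica cannot be placed, so pgs stay
  # remapped/degraded rather than active+clean
  sudo ceph osd pool create repro 128 128
  sudo ceph osd pool set repro size 3
  sudo ceph osd pool set repro min_size 2
  sudo ceph pg dump | grep -E 'remapped|degraded'

  # drop the replica count; on a build with the fix this change starts
  # a new interval, the pgs re-peer, and they should end up active+clean
  sudo ceph osd pool set repro size 2
  sudo ceph -s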
ff79959c037a7145f7104b06d9e6a64492fdb95f
https://github.com/ceph/ceph/pull/5691 was merged upstream and shipped in v0.94.4. It will be in RHCS 1.3.2.
Verified on RHEL with the following packages:

  ceph-common-0.94.5-4.el7cp.x86_64
  ceph-0.94.5-4.el7cp.x86_64
  ceph-mon-0.94.5-4.el7cp.x86_64
  ceph-selinux-0.94.5-4.el7cp.x86_64

Verification procedure:

I have 3 nodes with 3 osds each, but I changed the crushmap so that it will choose only 2 hosts (3 osds each):

root another {
        id -5
        alg straw
        hash 0
        item magna109 weight 2.7000
        item magna110 weight 2.7000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule my_ruleset {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take another
        step chooseleaf firstn 0 type host
        step emit
}

Created a pool using the new ruleset:

  sudo ceph osd pool create newpool 128 128 replicated my_ruleset

The pgs in this pool were in a degraded state:

12.e 0 0 0 0 0 0 0 0 active+undersized+degraded 2016-02-01 17:48:27.580444 0'0 183:5 [3,2] 3 [3,2] 3 0'0 2016-02-01 17:48:05.637550 0'0 2016-02-01 17:48:05.637550
12.d 0 0 0 0 0 0 0 0 active+undersized+degraded 2016-02-01 17:48:09.133262 0'0 183:5 [1,5] 1 [1,5] 1 0'0 2016-02-01 17:48:05.637549 0'0 2016-02-01 17:48:05.637549
12.c 0 0 0 0 0 0 0 0 active+undersized+degraded 2016-02-01 17:48:31.020987 0'0 183:5 [5,0] 5 [5,0] 5 0'0 2016-02-01 17:48:05.637548 0'0 2016-02-01 17:48:05.637548

Changed the size of the pool:

  sudo ceph osd pool set newpool size 2
  set pool 12 size to 2

Then the pgs became active+clean:

12.f 0 0 0 0 0 0 0 0 active+clean 2016-02-01 17:52:30.296461 0'0 185:12 [3,2] 3 [3,2] 3 0'0 2016-02-01 17:48:05.637550 0'0 2016-02-01 17:48:05.637550
12.e 0 0 0 0 0 0 0 0 active+clean 2016-02-01 17:52:30.318035 0'0 185:12 [3,2] 3 [3,2] 3 0'0 2016-02-01 17:48:05.637550 0'0 2016-02-01 17:48:05.637550
12.d 0 0 0 0 0 0 0 0 active+clean 2016-02-01 17:52:10.462606 0'0 185:12 [1,5] 1 [1,5] 1 0'0 2016-02-01 17:48:05.637549 0'0 2016-02-01 17:48:05.637549
12.c 0 0 0 0 0 0 0 0 active+clean 2016-02-01 17:52:30.304240 0'0 185:12 [5,0] 5 [5,0] 5 0'0 2016-02-01 17:48:05.637548 0'0 2016-02-01 17:48:05.637548

Hence marking this as verified.
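The crushmap change mentioned above ("choose only 2 hosts") is done outside the ceph CLI; the usual round trip is to pull the map, decompile it, add the "another" root and "my_ruleset" rule quoted above, then recompile and inject it. A sketch, with arbitrary placeholder file names:

  # fetch and decompile the current crush map
  sudo ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # edit crushmap.txt by hand: add the "another" root (magna109/magna110)
  # and the "my_ruleset" rule shown in the comment above

  # recompile the edited map and inject it into the cluster
  crushtool -c crushmap.txt -o crushmap.new
  sudo ceph osd setcrushmap -i crushmap.new

After that, "ceph osd crush rule dump" should include my_ruleset, and the pool-creation command from the procedure above can reference it.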
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0313