Bug 1255882 - Changing the size (number of replicas) in a pool should cause the pgs to have a new interval.
Summary: Changing the size (number of replicas) in a pool should cause the pgs to have a new interval.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: 1.3.2
Assignee: Samuel Just
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks: 1255887
 
Reported: 2015-08-21 19:01 UTC by Samuel Just
Modified: 2017-07-30 15:08 UTC
CC List: 7 users

Fixed In Version: RHEL: ceph-0.94.5-2.el7cp Ubuntu: ceph_0.94.5-2redhat1
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1255887 (view as bug list)
Environment:
Last Closed: 2016-02-29 14:43:12 UTC
Embargoed:




Links:
Ceph Project Bug Tracker 11771
Red Hat Product Errata RHBA-2016:0313 (SHIPPED_LIVE): Red Hat Ceph Storage 1.3.2 bug fix and enhancement update, 2016-02-29 19:37:43 UTC

Description Samuel Just 2015-08-21 19:01:52 UTC
Description of problem:

Changing the size (number of replicas) in a pool should cause the pgs to have a new interval.

Version-Release number of selected component (if applicable): hammer and earlier


How reproducible:

Very

Steps to Reproduce (a hedged command sketch follows the list):
1. Create a crush map with 6 OSDs on two hosts.
2. Create a pool with host-level replication, size 3, min_size 2.
3. Observe that the PGs are some flavor of remapped and/or degraded (normal so far: 3 replicas on 2 hosts).
4. Set the pool size to 2.
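
A hedged sketch of the reproduction commands, assuming the 6-OSD / two-host crush map from step 1 is already in place and a default rule that replicates across hosts; the pool name testpool is illustrative:

sudo ceph osd pool create testpool 128 128 replicated
sudo ceph osd pool set testpool size 3
sudo ceph osd pool set testpool min_size 2
sudo ceph pg dump | grep -E 'remapped|degraded'   # some PGs stay remapped/degraded (expected: 3 replicas, 2 hosts)
sudo ceph osd pool set testpool size 2
sudo ceph pg stat                                 # before the fix, some PGs never return to active+clean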

Actual results:

Observe that some pgs did not go active+clean

Expected results:

All pgs go active+clean

Additional info:

Comment 2 Samuel Just 2015-08-21 19:04:12 UTC
ff79959c037a7145f7104b06d9e6a64492fdb95f

Comment 3 Ken Dreyer (Red Hat) 2015-12-11 21:55:36 UTC
https://github.com/ceph/ceph/pull/5691 was merged upstream and shipped in v0.94.4 upstream. It will be in RHCS 1.3.2.
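
(A quick, hedged way to check whether a given node carries a build containing the fix, i.e. upstream v0.94.4 or downstream ceph-0.94.5-2.el7cp and later:)

ceph --version   # version of the installed ceph binaries
rpm -q ceph      # on RHEL, the exact downstream package build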

Comment 5 shylesh 2016-02-02 05:57:50 UTC
Verified on RHEL with the following packages:

ceph-common-0.94.5-4.el7cp.x86_64
ceph-0.94.5-4.el7cp.x86_64
ceph-mon-0.94.5-4.el7cp.x86_64
ceph-selinux-0.94.5-4.el7cp.x86_64


Verification procedure:

I have 3 nodes with 3 OSDs each, but I changed the crush map so that it chooses only 2 hosts (3 OSDs each); a sketch of the crushtool workflow that can inject such a change follows the rules below.


root another {
        id  -5
        alg straw
        hash 0
        item magna109 weight 2.7000
        item magna110 weight 2.7000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

rule my_ruleset {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take another
        step chooseleaf firstn 0 type host
        step emit
}
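
(A hedged sketch of the standard crushtool round-trip that can be used to inject such a custom root and rule; the file names are illustrative:)

# extract and decompile the current crush map
sudo ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# add the "another" root and the my_ruleset rule shown above to crushmap.txt,
# then recompile and inject the edited map
crushtool -c crushmap.txt -o crushmap.new
sudo ceph osd setcrushmap -i crushmap.new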


Created a pool using the new ruleset:

sudo ceph osd pool create newpool 128 128 replicated my_ruleset
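
(Optionally, a hedged check that the pool picked up the custom rule; crush_ruleset is the hammer-era option name:)

sudo ceph osd pool get newpool crush_ruleset   # expected to report crush_ruleset: 1 (my_ruleset)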

The PGs in this pool were in a degraded state:

12.e    0       0       0       0       0       0       0       0       active+undersized+degraded      2016-02-01 17:48:27.580444        0'0     183:5   [3,2]   3       [3,2]   3       0'0     2016-02-01 17:48:05.637550      0'0     2016-02-01 17:48:05.637550
12.d    0       0       0       0       0       0       0       0       active+undersized+degraded      2016-02-01 17:48:09.133262        0'0     183:5   [1,5]   1       [1,5]   1       0'0     2016-02-01 17:48:05.637549      0'0     2016-02-01 17:48:05.637549
12.c    0       0       0       0       0       0       0       0       active+undersized+degraded      2016-02-01 17:48:31.020987        0'0     183:5   [5,0]   5       [5,0]   5       0'0     2016-02-01 17:48:05.637548      0'0     2016-02-01 17:48:05.637548



Changed the size of the pool:

sudo ceph osd pool set newpool size 2
set pool 12 size to 2


Then the PGs became active+clean:

12.f    0       0       0       0       0       0       0       0       active+clean    2016-02-01 17:52:30.296461      0'0       185:12  [3,2]   3       [3,2]   3       0'0     2016-02-01 17:48:05.637550      0'0     2016-02-01 17:48:05.637550
12.e    0       0       0       0       0       0       0       0       active+clean    2016-02-01 17:52:30.318035      0'0       185:12  [3,2]   3       [3,2]   3       0'0     2016-02-01 17:48:05.637550      0'0     2016-02-01 17:48:05.637550
12.d    0       0       0       0       0       0       0       0       active+clean    2016-02-01 17:52:10.462606      0'0       185:12  [1,5]   1       [1,5]   1       0'0     2016-02-01 17:48:05.637549      0'0     2016-02-01 17:48:05.637549
12.c    0       0       0       0       0       0       0       0       active+clean    2016-02-01 17:52:30.304240      0'0       185:12  [5,0]   5       [5,0]   5       0'0     2016-02-01 17:48:05.637548      0'0     2016-02-01 17:48:05.637548
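
A hedged cluster-level double check (both commands exist in hammer):

sudo ceph pg stat    # all PGs should now be reported active+clean
sudo ceph -s         # overall health should return to HEALTH_OK (barring unrelated warnings)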


Hence, marking this bug as verified.

Comment 7 errata-xmlrpc 2016-02-29 14:43:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0313

