Description of problem:
EC pools do not use the IsRecoverablePredicate to determine whether a particular interval could have gone active. For example, in a 4+2 EC pool with min_size=2 (which would be a dumb configuration, but a possible one), an interval with 3 OSDs would be marked as maybe_rw even though the primary OSD could not have gone active.

Version-Release number of selected component (if applicable):

How reproducible:
Easy artificially, less so in the wild. Not serious, since you can work around it by marking the missing OSDs as lost without losing any data.

Steps to Reproduce:
1. Create a 6-OSD 4+2 EC pool with min_size = 1
2. Kill 5 OSDs and mark them down
3. Wait 1 minute
4. Kill the 6th
5. Revive the first 5

Actual results:
PG goes to the down state.

Expected results:
PG goes active+degraded, since the PG could not have gone active in that state.

Additional info:
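The missed check can be sketched as follows. This is a simplified illustration, not the actual Ceph code: the function name is hypothetical, and the real logic lives in Ceph's past-intervals handling, where IsRecoverablePredicate knows the erasure-code parameters.

```python
# Hypothetical sketch of the fix described above: an EC interval can only
# have gone active (maybe_rw) if at least k shards were up, regardless of
# how low min_size was set.
def could_have_gone_active(up_osds, k, min_size):
    # With erasure coding, k shards are required to reconstruct data,
    # so the effective floor is max(min_size, k).
    return len(up_osds) >= max(min_size, k)

# 4+2 pool with min_size=2, interval with only 3 OSDs up:
interval = [0, 1, 2]
buggy_maybe_rw = len(interval) >= 2                         # min_size-only check: True
fixed_maybe_rw = could_have_gone_active(interval, k=4, min_size=2)  # False
```

Under the min_size-only check the 3-OSD interval is treated as maybe_rw, forcing the PG down; with the k floor applied, the interval is known to have been unwritable and peering can proceed.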
Merged to master, pending backports.
This does *not* need to be added for 1.3.1.
Shipped in v0.94.4 upstream - will be in RHCS 1.3.2
Verified on ceph-0.94.5-4.el7cp.x86_64.

The CLI now includes a check for EC pools which ensures that min_size is never less than k and never greater than k+m. For example, I created an EC pool with a 4+2 configuration; trying to change the pool's min_size:

sudo ceph osd pool set ecpool min_size 1
Error EINVAL: pool min_size must be between 4 and 6

Hence marking this as verified.
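The verified validation rule can be sketched as a small check. The helper name is hypothetical; the actual check is performed monitor-side when handling `osd pool set`, but the rule it enforces is the one shown in the error above: k <= min_size <= k+m.

```python
# Hypothetical sketch of the min_size bounds check for an EC pool
# with k data shards and m coding shards.
def validate_ec_min_size(k, m, min_size):
    if not (k <= min_size <= k + m):
        raise ValueError(f"pool min_size must be between {k} and {k + m}")
    return min_size

validate_ec_min_size(4, 2, 5)   # accepted: 4 <= 5 <= 6
# validate_ec_min_size(4, 2, 1) raises, matching the EINVAL shown above
```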
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:0313