Bug 1576674 - EC pool pgs are getting into incomplete state after killing "M" number of OSDs.
Summary: EC pool pgs are getting into incomplete state after killing "M" number of OSDs.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 5.0
Assignee: Josh Durgin
QA Contact: Pawan
Docs Contact: Ranjini M N
URL:
Whiteboard:
Depends On:
Blocks: 1959686
 
Reported: 2018-05-10 06:36 UTC by Parikshith
Modified: 2021-08-30 08:23 UTC (History)
12 users

Fixed In Version: ceph-16.0.0-8633.el8cp
Doc Type: Enhancement
Doc Text:
.The {storage-product} recovers with fewer OSDs available in an erasure coded (EC) pool
Previously, erasure coded (EC) pools of size `k+m` required at least `k+1` copies for recovery to function. If only `k` copies were available, recovery would be incomplete. With this release, the {storage-product} cluster now recovers with `k` or more copies available in an EC pool. For more information on erasure coded pools, see the link:{storage-strategies-guide}#erasure_code_pools[_Erasure coded pools_] chapter in the _{storage-product} Storage Strategies Guide_.
Clone Of:
Environment:
Last Closed: 2021-08-30 08:22:53 UTC
Embargoed:


Attachments (Terms of Use)
Crush map (8.65 KB, text/plain)
2018-05-10 06:36 UTC, Parikshith
no flags


Links
Github ceph/ceph pull 17619 (closed): osd: allow EC PGs to do recovery below min_size - last updated 2021-02-16 04:25:02 UTC
Red Hat Issue Tracker RHCEPH-798 - last updated 2021-08-19 16:42:27 UTC
Red Hat Product Errata RHBA-2021:3294 - last updated 2021-08-30 08:23:38 UTC

Description Parikshith 2018-05-10 06:36:53 UTC
Created attachment 1434216 [details]
Crush map

Description of problem:
EC pool pgs are getting into incomplete state after killing "M" number of OSDs.

Version-Release number of selected component (if applicable):
ceph version 12.2.4-10.el7cp

My Setup:
3 Mons, 3 OSD hosts with 8 OSDs in total
EC pool: k=3, m=2, OSD-level failure domain

Steps done:
1. Configured a Ceph cluster.
2. Created an EC pool (5 PGs, k=3, m=2) configured with an OSD-level failure domain (commands sketched after the profile output below).

Profile: sudo ceph osd erasure-code-profile get myprofile --cluster slave
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=3
m=2
plugin=jerasure
technique=reed_sol_van
w=8
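
For reference, a minimal sketch of the commands that would produce a profile and pool like the one above (the profile name, pool name, and PG count match this report; the exact invocations used are an assumption):

# create the EC profile with k=3, m=2 and an OSD-level failure domain
sudo ceph osd erasure-code-profile set myprofile k=3 m=2 crush-failure-domain=osd --cluster slave
# create the EC pool with 5 placement groups using that profile
sudo ceph osd pool create ecpool 5 5 erasure myprofile --cluster slave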

Crush rule dump of this pool: sudo ceph osd crush rule dump ecpool --cluster slave
{
    "rule_id": 1,
    "rule_name": "ecpool",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 5,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "choose_indep",
            "num": 0,
            "type": "osd"
        },
        {
            "op": "emit"
        }
    ]
}

3. Killed "m" (2) of the OSDs.
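
A minimal sketch of how the OSDs can be taken down, assuming they run as systemd services; <id1> and <id2> are placeholders for the two chosen OSD IDs (the specific IDs are not recorded in this report):

# stop two ceph-osd daemons on their host(s)
sudo systemctl stop ceph-osd@<id1>
sudo systemctl stop ceph-osd@<id2>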

Actual results:
After killing 2 OSDs, some of the PGs of this ecpool went into the incomplete state.

sudo ceph pg dump --cluster slave | grep "^12."
dumped all
12.4          0                  0        0         0       0          0    0        0          active+undersized 2018-05-09 15:09:52.831363        0'0       502:19    [NONE,2,1,3,5]          2    [NONE,2,1,3,5]              2        0'0 2018-05-09 15:07:37.577427             0'0 2018-05-09 15:07:37.577427             0 
12.0          0                  0        0         0       0          0    0        0          active+undersized 2018-05-09 15:09:52.835938        0'0       502:30    [0,3,1,2,NONE]          0    [0,3,1,2,NONE]              0        0'0 2018-05-09 15:07:37.577427             0'0 2018-05-09 15:07:37.577427             0 
12.1          0                  0        0         0       0          0    0        0               active+clean 2018-05-09 15:07:39.630869        0'0       502:19       [3,1,5,2,0]          3       [3,1,5,2,0]              3        0'0 2018-05-09 15:07:37.577427             0'0 2018-05-09 15:07:37.577427             0 
12.2          0                  0        0         0       0          0    0        0                 incomplete 2018-05-09 15:09:57.773180        0'0       502:27 [NONE,2,0,3,NONE]          2 [NONE,2,0,3,NONE]              2        0'0 2018-05-09 15:07:37.577427             0'0 2018-05-09 15:07:37.577427             0 
12.3          0                  0        0         0       0          0    0        0                 incomplete 2018-05-09 15:09:57.771912        0'0       502:27 [NONE,3,1,NONE,5]          3 [NONE,3,1,NONE,5]              3        0'0 2018-05-09 15:07:37.577427             0'0 2018-05-09 15:07:37.577427             0 
12     0 0     0 0 0           0      0      0 


Expected results:
None of the PGs should get into the incomplete state, since k=3 and m=2 means up to 2 OSDs can go down with an OSD-level failure domain.

Additional info:
By default, this ecpool was created with a min_size of '4'.
sudo ceph osd pool get ecpool min_size --cluster slave
min_size: 4
I am not sure whether this is applicable to erasure coded pools, but after manually reducing min_size to '3', the incomplete PGs cleared.
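
For clarity, the min_size change described above corresponds to a command along these lines (pool and cluster names match this report; the exact invocation used is an assumption):

# lower min_size on the EC pool from its default here (4) to 3
sudo ceph osd pool set ecpool min_size 3 --cluster slave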


I have attached the crush map of my cluster.

Comment 10 Giridhar Ramaraju 2019-08-05 13:11:09 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri

Comment 11 Giridhar Ramaraju 2019-08-05 13:12:10 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri

Comment 13 Josh Durgin 2020-04-22 18:32:13 UTC
This is in all 5.0 builds - needs QA ack.

Comment 22 errata-xmlrpc 2021-08-30 08:22:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294

