Created attachment 1434216 [details]
Crush map

Description of problem:
EC pool PGs are getting into the incomplete state after killing "M" number of OSDs.

Version-Release number of selected component (if applicable):
ceph version 12.2.4-10.el7cp

My Setup: 3 Mons, 3 OSD hosts with 8 OSDs in total
EC pool: k=3, m=2, OSD-level failure domain

Steps done:
1. Configured a Ceph cluster.
2. Created an EC pool (5 PGs, k=3, m=2) configured with an OSD-level failure domain.

Profile:
sudo ceph osd erasure-code-profile get myprofile --cluster slave
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=3
m=2
plugin=jerasure
technique=reed_sol_van
w=8

Crush rule dump of this pool:
sudo ceph osd crush rule dump ecpool --cluster slave
{
    "rule_id": 1,
    "rule_name": "ecpool",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 5,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "choose_indep",
            "num": 0,
            "type": "osd"
        },
        {
            "op": "emit"
        }
    ]
}

3. Killed "M" (2) OSDs.

Actual results:
After killing 2 OSDs, some of the PGs of this EC pool went into the incomplete state.

sudo ceph pg dump --cluster slave | grep "^12."
dumped all
12.4  0 0 0 0 0 0 0 0  active+undersized  2018-05-09 15:09:52.831363  0'0  502:19  [NONE,2,1,3,5]     2  [NONE,2,1,3,5]     2  0'0  2018-05-09 15:07:37.577427  0'0  2018-05-09 15:07:37.577427  0
12.0  0 0 0 0 0 0 0 0  active+undersized  2018-05-09 15:09:52.835938  0'0  502:30  [0,3,1,2,NONE]     0  [0,3,1,2,NONE]     0  0'0  2018-05-09 15:07:37.577427  0'0  2018-05-09 15:07:37.577427  0
12.1  0 0 0 0 0 0 0 0  active+clean       2018-05-09 15:07:39.630869  0'0  502:19  [3,1,5,2,0]        3  [3,1,5,2,0]        3  0'0  2018-05-09 15:07:37.577427  0'0  2018-05-09 15:07:37.577427  0
12.2  0 0 0 0 0 0 0 0  incomplete         2018-05-09 15:09:57.773180  0'0  502:27  [NONE,2,0,3,NONE]  2  [NONE,2,0,3,NONE]  2  0'0  2018-05-09 15:07:37.577427  0'0  2018-05-09 15:07:37.577427  0
12.3  0 0 0 0 0 0 0 0  incomplete         2018-05-09 15:09:57.771912  0'0  502:27  [NONE,3,1,NONE,5]  3  [NONE,3,1,NONE,5]  3  0'0  2018-05-09 15:07:37.577427  0'0  2018-05-09 15:07:37.577427  0
12    0 0 0 0 0 0 0 0

Expected results:
No PGs should go into the incomplete state, since k=3 and m=2 mean the pool should tolerate the loss of up to 2 OSDs with an OSD-level failure domain.

Additional info:
By default this EC pool was created with a min_size of '4'.

sudo ceph osd pool get ecpool min_size --cluster slave
min_size: 4

I am not sure whether this is applicable to erasure-coded pools, but after manually reducing min_size to '3' the incomplete PGs cleared.

I have attached the crush map of my cluster.
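
For reference, a minimal sketch of the commands that would reproduce this setup and the workaround described above. The pool name, profile name, PG count, and the --cluster slave alias are taken from the report; the exact commands used by the reporter are an assumption.

# Create the EC profile with k=3, m=2 and an OSD-level failure domain
sudo ceph osd erasure-code-profile set myprofile k=3 m=2 crush-failure-domain=osd --cluster slave

# Create the EC pool with 5 PGs using that profile
sudo ceph osd pool create ecpool 5 5 erasure myprofile --cluster slave

# After stopping 2 OSDs, check the pool's min_size (reported as 4, i.e. k+1, by default)
sudo ceph osd pool get ecpool min_size --cluster slave

# Workaround from the report: lower min_size to k so the incomplete PGs clear
sudo ceph osd pool set ecpool min_size 3 --cluster slave

With min_size left at k+1 (4), PGs that are reduced to only k (3) shards cannot serve I/O, which is consistent with the incomplete PGs observed above and with why lowering min_size to 3 cleared them.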
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri
This is in all 5.0 builds - needs QA ack.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294